
[Request] Analyzing the roll-out of temp accounts on major pilots as it impacts anti-abuse work
Closed, Resolved (Public)

Description

Please provide all the following information:

  • Context. Provide a short paragraph with some background context for your request, please include links to relevant material.

Temporary accounts are a paradigm shift for anonymous editing. We have some insights from the rollout to small wikis, but in June/July we are rolling out to large wikis. After the rollout, we want to understand 1) the impact on anti-abuse workflows, so we can determine whether we should improve existing tooling or introduce new features, and 2) the ways temporary accounts are being used for abuse, so we can determine whether software-based solutions could mitigate it.

We also want to understand how the IP reveal right is manually granted and how it is used across user groups, including the possibility that those who hold the right may abuse it.

  • Description. What is your request about?

Analyzing patterns in abuse related to temporary accounts to figure out where we need improvements in tooling to support communities.

  • Expected Deliverable. What is the ideal outcome or result of your request?

An analysis of abuse related to temporary accounts that points to specific software interventions that will reduce the burden of anti-abuse efforts on communities. Collection of comments or case studies where temporary accounts have impacted anti-abuse efforts.

  • Estimated Effort. Please provide an estimate of the amount of work needed to complete this task, if known.

Estimated 4 week analysis effort. This work should start after enough time has passed since roll-out of temporary accounts T340001 (roll-out begins in mid-June), in order to collect enough data for analysis. Ideally this should be finished by end of Q1, before further roll-out of temporary accounts.

  • Priority. Please indicate a priority for your task and a short description of what it would unlock for you. We ask you to leave this task as “needs triage” since your request will go through a Backlog refinement process where our team will prioritize the work.

I need this task resolved in:

  • 1 month.
  • 3 months.
  • 6 months.
  • Whenever you get to it :-)
  • Other. Do you have any other questions or comments ?

For use by WMF Research team; please leave everything below as it is:

  1. Does the request serve one of the existing Research team's audiences? If yes, choose the primary audience. (1 of 4)
  2. What is the type of work requested?
  3. What is the impact of responding to this request?
    • Support a technology or policy need of one or more WM projects
    • Advance the understanding of the WM projects.
    • Something else. If you choose this option, please explain briefly the impact below.

Event Timeline

Assigning to Claudia to follow up with Kosta about this request to confirm and determine what UX research approach is needed for this.

Kate Z says: Eric walked me through this one and the way he described it, it sounds like they're looking for a mix of quant & qual

So Kate's assessment is that this project should be addressed by both Product Analytics & Design Research. Morten is supporting WE4 from Product Analytics. If Design Research takes on this request, Kate wants the report to be a shared report between Product Analytics & Design Research.

Updated task description, especially around estimated effort and priority, after more discussion with stakeholder group.

After discussion with @nettrom_WMF , I believe this task would be better scoped with this set of primary research goals:

  1. Understand impact of deploying Temporary Accounts (TA) on pilot wikis, to inform development of anti-abuse features
    • Is the broad loss of IP address information significantly detrimental to community anti-abuse efforts?
  2. Document uses of TAs for abuse, to inform development of software-based mitigations
    • Are there documented cases where Temporary Accounts have broken existing anti-abuse workflows?
    • If yes, what are potential product recommendations to address these scenarios?
  3. Document how IP reveal rights are being rolled out, as they now must be manually granted

With our previous goals, two major methodological hurdles were that we did not have an operationalized definition of "anti-abuse efficacy", and that any major qualitative study focused on perceptions of anti-abuse efficacy would need to cover at least 11 different languages. Scoping this down to concrete examples where Temporary Accounts are significantly detrimental to anti-abuse efforts will help us manage these challenges most effectively.

Weekly update: research brief review meeting scheduled for next week.

Weekly update: Research brief alignment meetings have concluded, so we can begin in earnest. Morten and I will work out the details of our approaches. We've also begun soliciting examples from the community where temporary accounts have helped or hindered attempts to stop bad behaviour, using existing comms channels for the Temporary Accounts project.

Update: We're still waiting for more time to pass before we begin quantitative data collection and analysis. On the qualitative front, I have been working with @sgrabarczuk to reach out to ambassadors and find good recruiting venues to get feedback.

Update: We're still waiting for July's data to be available. In the meantime, I have compiled a table of the 31 wikis where Temporary Accounts are enabled, and policies on the temporary-account-viewer usergroup. 60% of the wikis do not currently have a local page for the usergroup (the usergroup page is a red link or redirects to the Meta-Wiki policy); 75% of them use the WMF's local access minimum thresholds for the group.

Update on findings. @nettrom_WMF has retraced T395618 for the major project rollout wikis, comparing a 30-day window prior to the first deployment week with a 30-day window after the last deployment (see "Methodological details" below for more).

We found no significant changes in the following:

  • the rate of account registrations
  • the rate of edits, either overall or when looking at the Main and Talk namespaces in particular
  • the proportion of edits made by non-logged in users, again looking across all edits and Main & Talk specifically
  • the proportion of reverts of edits made by non-logged in users, with similar namespace-limitations as before
  • the overall number of blocks issued
  • the number of blocks issued to registered accounts
  • the number of checkuser checks run in July

We found a significant change in the proportion of blocks that are issued to IP addresses/ranges. Namely, after the introduction of temporary accounts, far fewer blocks are issued to IP addresses/ranges than before, similar to what we saw with the minor project rollout of Temporary Accounts.

The extent to which temporary accounts are blocked varies greatly from wiki to wiki. Wikis with a large article count or a large total number of blocks issued (e.g. German and Japanese) have a low proportion of blocks going to temp accounts (1.39% and 0.89%, respectively). A middle group, made up of Chinese, Persian, and French, falls in the 7.5% to 12.0% range. At the higher end, Czech, Turkish, and Polish all come in above 20%.

Changes in user rights

One of our hypotheses was that communities would respond to the loss of easy IP address information by granting more users either the checkuser or Temporary Account IP viewer rights. The rollout does not appear to be associated with a significant immediate increase in users looking to get checkuser rights. Apart from the two new accounts being granted checkuser rights on Hebrew Wikipedia, all other wikis are stable.

With the introduction of Temporary Accounts also comes the new "Temporary Account IP Viewer" (TAIV) user group, which allows non-admins, non-bureaucrats and non-checkusers local access to see the IP addresses used by temporary accounts. The Foundation has set a minimum threshold for local users to be a part of this group, which I will summarize as "6 months, 300 edits, active in the past year". Wikis are free to set more stringent thresholds locally. Looking at the major rollout wikis:

  • 7 out of 18 wikis have set more stringent local thresholds (Czech Wikipedia does not use the group at all; I am interpreting this as "must be an admin to see temporary account IPs", which represents a stricter threshold)
  • 11 out of 18 wikis have a local page on the TAIV group; the remaining wikis either redirect to the Meta-Wiki page on TAIVs or have a redlink

Of the seven wikis with a more stringent local threshold, a common theme is requiring proven and recent participation in anti-vandalism practices. Otherwise the standards range from "slightly stricter" (e.g. French, 12 month old account and 500 edits) to "much stricter" (e.g. Ukrainian, which only allows active ArbCom members and potentially interface admins to get TAIV rights).

Currently, the total number of TAIV users is low. 9 of the 18 rollout wikis have no members of the TAIV group. Of the other 9, only four wikis have more than 6 TAIVs. These are French (19), Hebrew (33), Chinese (79), and German (112).

Overall, I would conclude that the need for IP information among non-administrators and non-functionaries seems low. We do not see evidence to support the hypothesis that the Temporary Accounts rollout significantly negatively impacts blocking behavior or causes an immediate increase in CheckUser requests, at this early stage. (Hooray for null results.)

Methodological details: We used a 30-day window prior to the first deployment week, and a 30-day window after the last deployment. The former ends at midnight on Jun 16, 2025, so anything happening from Monday that week onwards is excluded from analysis. The latter starts on Jul 1, 2025, which is the day after the last deployment. While that means that we didn’t exclude three whole weeks, it allows us to get 30 days of data using the July 2025 MediaWiki History snapshot. As with the Minor Pilots analysis, we used a Difference-in-Differences analysis approach because it’s straightforward and quick. Improving on this analysis by applying causal inference methods is left for future work.
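To make the window boundaries concrete, here is a minimal Python sketch of the two 30-day windows described above. It assumes that "ends at midnight on Jun 16, 2025" means an exclusive end date of 2025-06-16 and that windows are half-open date ranges; the variable and function names are illustrative and not taken from the actual analysis code.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)

# Assumption: the pre-deployment window ends (exclusively) at the start of
# Jun 16, 2025, so the Monday of the first deployment week is left out.
pre_end = date(2025, 6, 16)
pre_start = pre_end - WINDOW

# The post-deployment window starts the day after the last deployment.
post_start = date(2025, 7, 1)
post_end = post_start + WINDOW  # exclusive end

def in_window(day, start, end):
    """True if `day` falls in the half-open range [start, end)."""
    return start <= day < end

print("pre-deployment: ", pre_start, "to", pre_end, "(exclusive)")
print("post-deployment:", post_start, "to", post_end, "(exclusive)")
```

Under these assumptions the pre-deployment window covers May 17 through Jun 15, 2025, and the post-deployment window covers Jul 1 through Jul 30, 2025, which fits inside a snapshot covering July.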

Reduction in proportion of blocks issued to IP addresses/ranges:

As mentioned above, in our analysis we find a significant decrease in the proportion of blocks that are issued to IPs. We have previously reported on this in our analysis of the minor pilots deployment, and to make the decrease more tangible for the major pilots rollout we repeat that part as well.

Across the years and pre-/post-deployment periods, the average proportion of blocks that are issued to IPs is as follows:

Year    Pre-deployment    Post-deployment
2019    41.6%             37.2%
2023    38.2%             35.3%
2024    37.8%             32.6%
2025    46.3%             20.4%

The table shows that we generally see a reduction in IP-based blocks post-deployment. One relatively straightforward explanation for this is that the post-deployment period corresponds to summer holidays, when we typically see reduced activity on the wikis. We also see that the proportion in the post-deployment period is typically in the 32%–37% range, while in the deployment year (2025) the proportion is well below that (20.4%).
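As a rough illustration of the difference-in-differences idea, the sketch below applies it to the averages in the table above, comparing the 2025 pre/post change against the same seasonal change in each earlier year. This is simple arithmetic on the reported averages, not the model used in the analysis, and it says nothing about statistical significance; the names `ip_block_share` and `did` are illustrative.

```python
# Average proportion of blocks issued to IPs, in percent, as
# (pre-deployment, post-deployment), copied from the table above.
ip_block_share = {
    2019: (41.6, 37.2),
    2023: (38.2, 35.3),
    2024: (37.8, 32.6),
    2025: (46.3, 20.4),
}

def did(treated, baseline):
    """Change in the treated year minus change in the baseline year."""
    t_pre, t_post = treated
    b_pre, b_post = baseline
    return (t_post - t_pre) - (b_post - b_pre)

# Compare the deployment year against each earlier year in turn.
estimates = {
    year: round(did(ip_block_share[2025], ip_block_share[year]), 1)
    for year in (2019, 2023, 2024)
}

for year, estimate in estimates.items():
    print(f"2025 vs {year}: {estimate:+.1f} percentage points")
```

With these inputs the estimates are -21.5, -23.0, and -20.7 percentage points against 2019, 2023, and 2024 respectively, so the 2025 drop is roughly 21 to 23 points larger than the usual seasonal decline.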

As we did for the minor pilots, we investigated whether there was a significant change in the denominator of this proportion: the overall number of blocks issued. As mentioned previously, we did not find any significant change there, which means that the difference is mainly attributable to a change in the number of IP blocks issued.

Lastly, we can plot the distribution of these proportions on a graph:

[Figure: temp-accounts-minor-pilots-ipblock-proportion-2024.png]

While the change in this graph is less pronounced than it was for the minor pilots, we can still see the same kind of trend that we saw there. The pre-deployment periods are very similar for both years (as they are for 2023 and 2019, which are left out for brevity). In the post-deployment period we see that in 2025 the distribution becomes more compressed. Some wikis still have a relatively high proportion, but the outliers disappear and there's more of a concentration in the <25% range (and we note that in the minor pilots all wikis were in this <25% range).

Some final updates on elements brought up in this study's research brief:

  • Attempts to garner specific comments and feedback on Temp Accounts largely failed, despite efforts from MoveComms and outreach to specific editors
  • In the research brief, we initially considered sending out a short survey to gather opinions about the Temporary Accounts rollout. T402277 instead fulfilled this role, so this study did not incorporate its own survey so as to avoid duplicating effort.

Marking as closed.