[WE4.4] Additional monitoring capabilities for rollout of Temporary Account
Open, High, Public

Description

Background

We want to understand the impact of the temporary accounts rollout on our projects. This is a significant change, and we want to be careful about how temporary accounts affect the ecosystem, especially with pilot wiki deployments coming up soon.

Request

Metrics brainstormed by Product, Design Research, and Comms:

  • Edits rollbacks T371404, T377516
  • Revert rate of temp account edits
  • Number of temp accounts being created

Note: Dropped from this ticket, as the source table is currently only available in MariaDB. After discussion with an engineer, it will be monitored in Grafana instead (T375505).

  • Number of temp accounts being blocked versus IP addresses T376080, T377516
  • (Edit) Traffic from IPs versus temp accounts: number of edits by IP and number of edits by temp accounts

Note: deferred, Reason: T372481#10119239

  • New checkuser admins

    Note: Dropped from this request. Reason: T372481#10119239. Propose measuring number of checkuser admins instead.
  • New regular account creations T377516
  • Number of IP reveals: Need to request access to the logging table

    Note: Dropped from this request. Reason: T372481#10119239
  • Number of rate limit trips for temporary account creations per IP (default is 6 temp account creations per IP address per day).

    Note: Dropped from this request and will be tracked under another task based on the discussion in T371402#10116224
  • NEW: number of checkuser admins T377516

This list will change based on what is feasible to include. It should also be reviewed, and the order/priority of the metrics decided.

Timeline

Temp Accounts-related metrics should be in place and ready to be measured by the time Temp Accounts is deployed to pilot wikis.

For non-Temp Accounts-related metrics, we should have measurements by October (or November at the latest) to have baselines available before deployment.

Acceptance Criteria

Related Objects

Event Timeline

mpopov triaged this task as High priority.
mpopov moved this task from Triage to Current Quarter on the Product-Analytics board.
mpopov edited projects, added Product-Analytics (Kanban); removed Product-Analytics.
mpopov renamed this task from Additional monitoring capabilities for rollout of Temporary Account to [WE4.4] Additional monitoring capabilities for rollout of Temporary Account. Jul 30 2024, 2:51 PM

After exploration, I have found that the user table in MariaDB captures temp account creation. Temp accounts can be identified using the regular expression filter user_name regexp '^[~]2'.

Hi @Tchanders, is this the way to identify temp account creation? Are there other schemas that also capture temporary account names and related events?

If you have access to the DB, you can use the user_is_temp column on the user table.

As for whether other schemas also capture temporary account names and related events: I'm not sure I understand the question. Could you please add some more detail on what you are looking for?
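
The two approaches mentioned above (the user_name pattern '^[~]2' and the user_is_temp column) can be compared with a small sketch. This is only an illustration: it uses an in-memory SQLite table as a stand-in for the MariaDB user table, keeps just the two columns named in this thread, and the sample rows are made up.

```python
import re
import sqlite3

# Pattern taken from the comment above: user names that start with "~2".
TEMP_NAME_PATTERN = re.compile(r"^[~]2")


def is_temp_by_name(user_name: str) -> bool:
    """Return True if the name matches the '^[~]2' filter."""
    return TEMP_NAME_PATTERN.search(user_name) is not None


def build_demo_db() -> sqlite3.Connection:
    """Create a simplified stand-in for the user table (assumption: only
    user_name and user_is_temp are modeled; the rows are invented)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (user_name TEXT, user_is_temp INTEGER)')
    conn.executemany(
        'INSERT INTO "user" VALUES (?, ?)',
        [
            ("~2024-12345-1", 1),   # hypothetical temp account
            ("~2024-67890-2", 1),   # hypothetical temp account
            ("ExampleEditor", 0),   # hypothetical regular account
            ("~OldStyleName", 0),   # starts with "~" but not "~2"
        ],
    )
    return conn


def count_temp_accounts(conn: sqlite3.Connection) -> tuple:
    """Count temp accounts by name pattern and by the user_is_temp flag."""
    rows = conn.execute('SELECT user_name, user_is_temp FROM "user"').fetchall()
    by_name = sum(1 for name, _ in rows if is_temp_by_name(name))
    by_flag = sum(1 for _, flag in rows if flag == 1)
    return by_name, by_flag
```

If the two counts ever disagree on real data, the user_is_temp flag is the one suggested in the reply above; the name pattern is only a heuristic.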

Thanks @kostajh for the info.
Are there other schemas which also have a user_is_temp column, or a similar column indicating a temp account? For context, I'm exploring the schemas to determine how we can extract data for the requested metrics by joining them. It would be helpful to know which schemas already include the 'temp account' column.

Looking at codesearch, I see a few other schemas. Here is also the list of extensions that use the property:

Hi @kostajh, thank you for doing the code search. Can you also give more info on how "IP reveals" are tracked? Any documents about it?

We log IP reveals to Special:Log, e.g. https://en.wikipedia.org/wiki/Special:Log?type=ipinfo&user=&page=&wpdate=&tagfilter=&wpfilters[]=newusers&wpFormIdentifier=logeventslist. Those logs are visible to users with CheckUser level access.

It's likely that we'll implement an auto-reveal mode (T358853#9704846, cc @Niharika) and we may also then log IPs that are revealed when this mode is enabled to an EventLogging schema.

@Niharika another thing to consider: we'll be shifting the logged-out, anonymous IP editing experience to a logged-in, temporary account. Accounts generally have slower page load times because they will bypass some edge caches we have in place for anonymous traffic. Given the impact of web performance on conversion rates, we might want to keep track of page load times and editor interactions for temporary accounts. (In this case, I think the conversion rate we might be concerned with is subsequent edits after the first edit, but we might also see a decrease in pages browsed after initial edit for temporary accounts, compared with anonymous IP editors.)

Example:

Number of rate limit trips for temporary account creations per IP (default is 6 temp account creations per IP address per day)

@kostajh , to measure it, we need a schema which captures the IP address of the temp account. Do you know which schema that is?

We don't capture rate limit trips in an event logging schema, as far as I know. This metric is probably best done via T357763: [Epic] Create a temporary accounts initiative Grafana dashboard.
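
To make the metric definition concrete, here is a minimal sketch of how rate limit trips could be counted. It is not how the Grafana dashboard works: it assumes a hypothetical list of (IP, day) creation attempts, which per the reply above is not captured in any event logging schema, and it assumes a "trip" is any attempt beyond the limit for the same IP on the same day. Only the default of 6 comes from the task description.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Default from the task description: 6 temp account creations per IP per day.
DEFAULT_LIMIT = 6


def count_rate_limit_trips(
    attempts: Iterable[Tuple[str, str]],
    limit: int = DEFAULT_LIMIT,
) -> Dict[Tuple[str, str], int]:
    """Count attempts beyond the limit for each (ip, day) pair.

    `attempts` is a hypothetical input: one (ip, day) tuple per temp
    account creation attempt. Pairs at or under the limit are omitted.
    """
    per_ip_day = Counter(attempts)
    return {
        key: count - limit
        for key, count in per_ip_day.items()
        if count > limit
    }
```

For example, eight attempts from one IP on one day would count as two trips under this definition, while exactly six attempts would count as none.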

jwang updated the task description.
jwang closed subtask T371404: Measuring Edits Rollbacks as Resolved.
jwang updated the task description.
Status update

Completed
All metrics we planned have been deployed except for 'Revert rate of temp account edits'.

  • Edits Rollbacks
  • Number of temp accounts being blocked versus IP addresses
  • New regular account creations
  • Checkuser admins

Next steps:

  • Revert rate of temp account edits

Update the revert rate graph in the temp account deployment dashboard after the source schema is ready (tracked in subtask T377516).

The source schema edit_hourly does not have a field to indicate temporary account users. This graph will be updated to reflect the revert rate of edits made by temporary accounts once the schema is updated by the data engineering team.
Dependency: T377767, T377768