User Details
- User Since
- Aug 21 2018, 8:23 PM (390 w, 6 d)
- Roles
- Disabled
- IRC Nick
- Nettrom
- LDAP User
- Unknown
- MediaWiki User
- MWang (WMF) [ Global Accounts ]
Fri, Feb 6
I used queries that we developed for the hCaptcha account creation A/B test analysis for a snapshot analysis, with data from the first three weeks of January 2026. We find that the data allows us to easily answer the first two questions, and that we're unfortunately unable to answer the last question, which might be the most important one to answer. I'll be filing a subtask of T394744 to update the instrumentation to fix that problem.
Jan 7 2026
Jan 6 2026
Dec 12 2025
Dec 10 2025
Nov 25 2025
Nov 10 2025
Tagging this with Test Kitchen as it affects MP PHP library usage, and partly because I want to make sure this bug doesn't get lost.
Oct 30 2025
Oct 28 2025
I did some investigation into the data today, and found three issues:
Oct 20 2025
My investigation found that there doesn't appear to be anything related to IRS in https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/tree/main/analytics_product/dags. I'll build upon an existing team's DAGs and submit a merge request.
Reassigning this to myself. I'll look into whether there's code for the DAG somewhere, but I did notice that this is not in the PA Airflow dashboard.
Oct 14 2025
Sep 24 2025
Sep 16 2025
Sep 9 2025
Aug 26 2025
Reduction in proportion of blocks issued to IP addresses/ranges:
Aug 25 2025
Aug 21 2025
Jul 18 2025
For future reference, I've uploaded the notebooks used for data gathering and analysis to GitLab: https://gitlab.wikimedia.org/nettrom/2025-temp-accounts
Jul 2 2025
The notebook that I created to understand the extent of this is now available on GitLab: T366222-global-local-issue-investigation.ipynb
@JJMC89 : Thank you for chiming in with information about this! Those were great points, particularly the fact that a block might just be a continuation of a previous block and therefore naturally have longer duration than previously.
Jun 18 2025
For documentation purposes, here are the key topics we discussed and made decisions about:
Jun 17 2025
@kostajh asked me to review the contextual attributes. I'd add agent_client_platform_family to the list, since that makes it easy to identify desktop/mobile web usage. That's all, looks good to me!
Jun 16 2025
Jun 12 2025
I'll take another look at the Measurement plan tomorrow when I have more time. In the meantime I wanted to expand on a comment I left in the Instrumentation plan where I suggested we use the funnel_name and funnel_entry_token fields in the schema.
Jun 10 2025
Jun 9 2025
I tagged the Editing Team since from what I understand they created the original instrumentation in T310390, but T&SP is my touch point at the moment, so I'll be chatting with them first.
Jun 5 2025
I left this open for a week in case there were questions or comments that needed addressing. Wrapping it up now as resolved.
Jun 4 2025
This has been completed by rerunning the Airflow DAGs for that snapshot. Thanks to the Data Engineering ops team!
Might be easier to do this through Airflow. Waiting for the DAG errors to go away first, though.
May 30 2025
Summary: Our analysis compared the deployment period to the same time period in two reference years (2022 and 2019). We find no significant impact of Temporary Accounts on any metric except the proportion of blocks issued that are IP-based blocks, where we find that Temporary Accounts are associated with a strong decrease. This indicates that wikis change from blocking IP addresses or ranges to blocking temporary accounts.
May 29 2025
May 27 2025
May 22 2025
Closing this as declined as there hasn't been a push for prioritizing this. Can be reopened and reassigned if needs change.
May 21 2025
Thanks for answering my questions @Tchanders, it's good to get confirmations about those things! No worries that the reminder feature isn't available at launch; it's trivial to plan for it being added later.
May 13 2025
I've taken a look at this and would like to apologize in advance about the number of questions and comments I have. They'll help me understand the context and what decisions we're trying to make, so that the instrumentation and measurements can help support those. They're also to make sure we've covered all the bases we're interested in.
May 2 2025
Apr 29 2025
I've completed an initial investigation into this, finding that 2% of users who signed up for one or more campaigns encountered at least one blocked edit attempt during those campaigns.
Apr 25 2025
Assigning this to myself as I'll pick it up and do a quick investigation.
Apr 23 2025
@AUgolnikova-WMF : There's a gap in the dashboard between Dec 2022 and Feb 2025 because the dashboard was no longer in use, and because the underlying data source doesn't go back further than 90 days. The "Daily Media Searches in VE" chart is different: it'll always show the last 90 days because it queries event data directly (and that's the same data source that's used for the aggregations shown in all the other charts).
Apr 21 2025
I've completed backfilling the data from Feb 1 onwards. There's now a daily cron job that updates the underlying data; it runs at 05:20 UTC every day. The dashboard has been updated with filters so that it shows the last quarter's data by default. Older data is still available; it's just not shown immediately.
Feb 3 2025
I've uploaded the notebook that I used for gathering data on the usage of mw.track() to this GitLab repo.
Dec 20 2024
This analysis has been completed. We found no significant difference between the group of users who saw the Community Updates module and the control group. The primary reason for this is that the number of users who registered on Spanish Wikipedia during the experiment and signed up for a campaign is too low.
Fixing subscribers.
Dec 11 2024
Nov 14 2024
Resolving this task as the work has been published in the Newcomer Homepage paper, presented at CSCW 2023. We also presented this work in the May 2024 Research Showcase.
Nov 12 2024
Merge request submitted: https://gitlab.wikimedia.org/repos/data-engineering/wmfdata-python/-/merge_requests/65
Oct 28 2024
Oct 15 2024
Confirmed by generating events on testwiki and querying logged events in the Data Lake that performer.id is now captured correctly in the database table. Closing this as resolved, thanks everyone!
Oct 9 2024
Sep 19 2024
Sep 18 2024
I've verified that the impression and click events are firing as expected for the module on testwiki.
Sep 16 2024
Sep 13 2024
@KStoller-WMF : Sounds good to me! Ping me when the overview is on MW and I'll take a look?
The estimate has been done and the associated Google Doc shared with stakeholders for review. In short, we need about 10k visitors to the Homepage, which in a worst-case scenario we expect to get in 1.5 months. We have valid reasons to expect that worst-case scenario not to come true, but we've also defined leading indicators to tell us what is happening just in case.
I've taken a look at it and couldn't find anything to add or change. Moving it to "done". Maybe @mpopov wants to have the pleasure of closing this?
I checked out the data and the associated dashboard, and this looks good to me.
Awesome example dashboard, great work!
Confirmed that these click events are firing correctly on both desktop and mobile web, moving to "done".
When verifying these events, I don't see action_context set for these (but it's set for the subsequent click event). Moving it back to "blocked" for now, but that might not be the right column?
Looks good to me, moving to "Done"
Sep 12 2024
Sep 11 2024
While I know a decision has been made, I wanted to add a few notes about option 1 in case someone comes back to this in the future. From the current task description:
Sep 10 2024
Sep 4 2024
I think the last time we used this data to make changes was in 2019, when clicks on these links were one of the Help Panel Leading Indicators. My recollection, and @Trizek-WMF should correct me if I'm wrong, is that afterwards we had a fairly standard set of suggested links to go in the Help Panel (and subsequently the Help module on the Homepage) that reflected those stats.