User Details
- User Since
- Oct 28 2022, 3:06 PM (188 w, 4 d)
- Availability
- Available
- LDAP User
- Hghani
- MediaWiki User
- HGhani-WMF [ Global Accounts ]
Wed, May 13
I noticed that most of the pageviews to Nahui Ollin are referred from www.vakarta.com. Some links under https://vakarta.com/vaccines/combination/ redirect to https://en.wikipedia.org/wiki/Nahui_Ollin; for example, https://vakarta.com/vaccines/combination/Ipol%2BVerorab redirects to that Wikipedia page. I wonder whether this site was being crawled and the pageviews are a byproduct of broken redirects?
May 7 2026
@Khantstop I think this is a reasonable assumption! Previous analysis shows that search traffic and internal traffic exhibit fewer signs of bot behavior, and as work in SDS 1.5.5 develops we will have a better idea of a confidence interval for our bot traffic.
Apr 23 2026
Feb 26 2026
@OSefu-WMF A few last changes were needed, but the sheet is now updated and was shared on Slack in the #general channel.
Dec 17 2025
Nov 14 2025
Oct 31 2025
Updated my last notebook with some additional findings:
Oct 23 2025
Following up from the previous update:
@Snwachukwu Based on our discussion, it sounds like using March 20 to October 15, 2025 to compute the quantiles, and then alerting whenever user share exceeds the 95th percentile, works reasonably well as a starting point. Standard deviation might also work if we want a more specific signal, possibly as a future candidate.
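For anyone following along, here is a minimal sketch of the percentile-based threshold described above. It is not the code from the notebook: the function names, the `daily_shares` input (a dict of date to user share), and the choice of the "inclusive" quantile method are all assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import quantiles


def percentile_threshold(baseline_shares, pct=95):
    """Return the pct-th percentile of the baseline user-share values."""
    # quantiles(..., n=100) returns 99 cut points; index pct - 1 is the pct-th percentile.
    cut_points = quantiles(baseline_shares, n=100, method="inclusive")
    return cut_points[pct - 1]


def flag_alerts(daily_shares, start, end, pct=95):
    """Compute the threshold on [start, end] and list days whose share exceeds it.

    daily_shares is a hypothetical dict mapping datetime.date -> user share.
    """
    baseline = [share for day, share in daily_shares.items() if start <= day <= end]
    threshold = percentile_threshold(baseline, pct)
    alerts = sorted(day for day, share in daily_shares.items() if share > threshold)
    return threshold, alerts


# Made-up example data: a flat baseline window followed by one spike.
window_start = date(2025, 3, 20)
window_end = date(2025, 10, 15)
example_shares = {window_start + timedelta(days=i): 0.5 for i in range(210)}
example_shares[date(2025, 10, 20)] = 0.9
example_threshold, example_alerts = flag_alerts(example_shares, window_start, window_end)
```

Because the threshold is recomputed from whatever baseline window is passed in, the same function could be re-run on a newer window if the heuristics change.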
Oct 21 2025
I think the dynamic/static mixed approach makes sense. As our heuristics change and improve and get closer to a true user/automated split, we would want the alert thresholds to become more accurate along with them.
Oct 17 2025
The Wikimedia Descriptive Statistics was updated.
Oct 15 2025
@Snwachukwu Thanks for testing these methods.
Thanks for reviewing. I have summarised the observations so far below and added a new notebook to the repo with streamlined tables, which should hopefully be easier to interpret.
Oct 3 2025
Dashboard has been created and can be viewed here.
Oct 2 2025
Oct 1 2025
Sep 18 2025
In the team meeting we decided to go with:
Sep 10 2025
@OSefu-WMF Everything should be up to date now except the pageviews-related items, which I intend to update after the backfill completes. I've put a placeholder note in those cells in the meantime in case anyone takes a look.
Aug 29 2025
@JAllemandou Thanks for generating the test data. The domain data looks okay to me. I agree with @Mayakp.wiki that we don't want to lose the Turnilo data for the reasons she mentioned. I don't think Superset can be treated as a substitute for the easy-to-use Turnilo charts yet.
Aug 21 2025
Hi @JAllemandou, we also think it would be a good idea to add the access_method field to the domain tables.
Aug 15 2025
Jul 24 2025
duplicate
Jul 22 2025
Jun 17 2025
A sysop list was generated here (data up until June 16 2025) using this notebook. As discussed with @Qgil, the notebook can be run on demand and will generate a CSV list of sysops current as of the day the notebook is run.
Jun 13 2025
Update of some new findings:
Jun 11 2025
Jun 3 2025
Hi @Qgil, no problem. I have one question: how should I interpret the starting date of the constraint, 2024-12-15? I can run the notebook up until 2025-03-31 (all sysops until that point), but I am not sure what the starting date represents.
May 30 2025
May 28 2025
Hi @Qgil, I should be able to generate new data by the end of the week. I can share the data on this ticket once it is ready. Would it be useful or practical for you or your team to be able to alter a single line of code to generate the data on demand for whatever date range you're interested in? If so, I can include that customization capability once I pull the new data.
May 22 2025
May 21 2025
Apr 26 2025
Apr 18 2025
Apr 15 2025
A first draft was created here.
Apr 10 2025
Mar 26 2025
Mar 19 2025
@Mayakp.wiki Yes I updated the pageviews data.
Mar 13 2025
Updated the notes above with the following next steps for clarity:
Mar 12 2025
Notes outlining the initial scope and objectives for this project were documented here:
Feb 28 2025
@nshahquinn-wmf Updated the GitHub repo and archived it.
Feb 26 2025
Feb 24 2025
Jan 2025 snapshot was posted.
@Samwalton9-WMF Since we will be updating the wiki-comparison tool very shortly, I just wanted to provide an update regarding the admins. Our contributor's pipeline uses the same definition of active admins as the wiki-comparison tool, so to maintain consistency and avoid confusion, this year's wiki-comparison tool will continue using that definition. However, the feedback regarding local sysops is very useful and will be taken as a point of improvement for the existing metric in the contributor's pipeline.
Feb 21 2025
Data for Jan 2025 and Dec 2024 was added.
Feb 17 2025
Feb 13 2025
We've also decided to add some additional data this year to group monthly active editors by the following edit buckets: 5 to 24, 25 to 99 and 100 plus. These will be in addition to the total monthly active editors that we already include in the tool.
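As a rough illustration of the bucketing, here is a minimal sketch. It is not the pipeline's code: the function names and the `monthly_edit_counts` input (a dict of editor to edit count for one month) are hypothetical, and it assumes that editors with fewer than 5 edits in the month fall outside all three buckets.

```python
from collections import Counter


def edit_bucket(edit_count):
    """Map a monthly edit count to its bucket label, or None if below 5 edits."""
    if edit_count >= 100:
        return "100+"
    if edit_count >= 25:
        return "25-99"
    if edit_count >= 5:
        return "5-24"
    return None  # assumption: fewer than 5 edits is not counted in any bucket


def active_editors_by_bucket(monthly_edit_counts):
    """Count editors per bucket from a hypothetical {editor: edit_count} dict."""
    counts = Counter()
    for edit_count in monthly_edit_counts.values():
        bucket = edit_bucket(edit_count)
        if bucket is not None:
            counts[bucket] += 1
    return dict(counts)


# Made-up example input for one month.
example_counts = active_editors_by_bucket(
    {"editor_a": 3, "editor_b": 5, "editor_c": 24, "editor_d": 25, "editor_e": 150}
)
```

The bucket counts sum to the total of editors with at least 5 edits, so they can sit alongside an existing total without double counting.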
Jan 31 2025
Jan 17 2025
Jan 14 2025
Jan 10 2025
Jan 9 2025
Jan 8 2025
@cmadeo Hi, I've pulled data for everything except the number of 'total new editors and total editors in 2024', which doesn't appear to be available yet (I will check again this week). Please review and let me know if there are any questions/concerns.
Dec 20 2024
Nov and Oct 2024 data was added.
Dec 18 2024
Thank you all for the quick turn around on the backfill!
Dec 4 2024
Yes, using research.article_topics. @Mayakp.wiki
Nov 28 2024
I've looked at the additional analysis questions mentioned in the last update and put the results in this Google Doc, along with the underlying queries for review/replication.
