I have started working on WE 1.5.1a and am posting my main updates in Asana. I will keep track of all the other things I am working on in this phab task (as much as I can!)
Sat, Apr 11
Fri, Apr 10
Thu, Apr 9
- Further investigation:
Tue, Mar 31
hey @amastilovic , confirming, this is for the backfill use case we discussed yesterday, right?
Also, you may want to change the Q3 tag to Q4 since it's already March 31 :)
Fri, Mar 20
Update: we tried this today and the test failed. We discovered a bug where the same dbt model run succeeds when run manually but fails when run from within the Skein operator. DE is working on a fix.
Thu, Mar 19
Hi @JAllemandou, the metrics have been developed in accordance with the new Contributor measurement strategy. You can see the definitions in the following links:
Core and health metrics
Indicator metrics
Please feel free to reach out if you have any questions.
@JAllemandou, yes, that's correct.
Mon, Mar 16
Mar 3 2026
Feb 27, 2026
Progress update
- Bi-weekly Movement Trends brief report
- Ran the notebooks and paired with Omari to look at trends and produce the narrative
- Monthly Movement metrics
- Published the January report and had conversations with the WMDE team about Wikidata
- WMDE has requested Wikidata contributor data to measure impact from activities
- Coordinated with Contributors strategy
Feb 26 2026
Feb 20 2026
We continue to see inflation in user pageviews coming from Vietnam (superset) in January 2026.
Feb 19 2026
I ran sqlfluff on the cluster today using the steps @JMonton-WMF provided 2 days ago, and the current version of the model passes all the linter rules.
I also updated the sqlfluff ignore file and submitted a merge request for the change.
Feb 18 2026
I was reporting the Contributors dashboard updates on Asana and skipped updating it here. The hypothesis was closed on Jan 30, 2026 so let me add updates from the last 2 weeks here.
- Both Snapshots have been updated on the site - https://foundation.wikimedia.org/wiki/Legal:EU_DSA_Userbase_Statistics
- We have provided the numbers via GitLab instead of updating the Excel sheet, and will continue doing so going forward
- A few observations on the data
- The latest snapshot is seeing huge drops in the following smaller projects: wikinews, wikiversity, wikivoyage, mediawiki
- And big increases in wikifunctions
- Also seeing bigger drops in some wikimedia projects (not in Commons)
We anticipate seeing some new trends in the Commons and Wikidata split which will need additional investigation so we're targeting to draft the monthly report by Monday Feb 23rd.
Feb 13 2026
Thanks @brennen, that worked!
@brennen, can you please check if my GitLab account Mayakpwiki is part of the movement-insights group? I am part of the team, and I'm currently not able to access the private repo.
Feb 12 2026
- Code needs to be updated to use the Iceberg unique devices table, since the Hive tables are not backfilled beyond August 2025
- Will be publishing output to GitLab going forward
- I will re-run the Feb-July 2025 report as well, since the data was backfilled
Feb 11 2026
Edited description to add the data modeling guidelines. Since we are in the process of cleaning up the code I can try and implement some of the recommendations there.
Feb 6 2026
Thank you @JMonton-WMF for all this info! This is a very nice and clear proposal, and all of it makes sense to me. Sharing a few thoughts:
- I like the proposed folder structure. Happy to think through the naming convention.
- <team>_<project>_<modelname>.sql would translate to something like movementinsights_contributors_retained_editors.sql or productanalytics_consumers_reader_retention_weekly.sql, which could become unreasonably long or would need a lot of acronyms.
- My 2c on organizing models: while packages are complex, I feel they would facilitate re-orgs better than groups. If a team ceases to exist or changes, another team could inherit that team's projects instead of having to copy its existing projects and models and discard the older versions (this is how I understood the group vs. package concept; I may be wrong).
- Anyway, the likelihood of this happening very often is low, so let's go with what's sustainable and less complex.
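To make the length concern concrete, here is a minimal sketch of the <team>_<project>_<modelname>.sql convention. The helper name `model_filename` and the lowercase snake_case check are my own assumptions for illustration, not part of the proposal.

```python
import re

# Assumption for illustration: each part is lowercase snake_case.
PART_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def model_filename(team: str, project: str, modelname: str) -> str:
    """Build a file name following <team>_<project>_<modelname>.sql."""
    for part in (team, project, modelname):
        if not PART_PATTERN.match(part):
            raise ValueError(f"not lowercase snake_case: {part!r}")
    return f"{team}_{project}_{modelname}.sql"


if __name__ == "__main__":
    name = model_filename("movementinsights", "contributors", "retained_editors")
    # Print the resulting name together with its length in characters.
    print(name, len(name))
```

Running it on the first example above shows how quickly the full name grows once the team and project prefixes are spelled out.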
Feb 5 2026
Feb 2 2026
Jan 27 2026
Jan 20 2026
Jan 17 2026
Provided the data. Waiting for review and approval to close.
Jan 14 2026
Re-ran the numbers today and provided the CSV in the Asana task.
Jan 13 2026
also update T384088#11087378
hey @Ahoelzl, yes it was breaking due to the link. I removed it and that seems to have fixed the annotation.
Jan 9 2026
Reviewed this iteration with Sonja yesterday in our 1:1. I'll go ahead and close this task.
Jan 8 2026
All the work described here is completed so far -
Is this task paused because it's dependent on the success of T413977?
Dec 18 2025
T413027 opened for investigating reader trends observed in Nov 2025
Dec 12 2025
Takeaway and next step from @KStoller-WMF that I agree with:
Dec 4 2025
Dec 2 2025
Nov 26 2025
Ok, I added an entry to the annotation for Edits and Editors.
fingers crossed it shows up for November 2025 in a few days :) will check back.
Nov 24 2025
Hey @GGoncalves-WMF, thanks for that info! I wasn't aware that temp edits and temp editors are being counted under Anonymous. In that case, I'd say there is no direct impact from the temp accounts project or from misclassification, other than the fact that this metric could get extra attention because of the rollout.
I looked at Edits hourly, which also shows a gradual decrease in anonymous edits since June 2021 (the drop is sharper in the last 2 months because mwh and edits_hourly have separate fields for temp and anonymous). When you add the anonymous and temp edits together, the total matches Wikistats.
Nov 21 2025
This is all I can think of for the scope of this task right now, but it needs further discussion with DPE and possibly changes to AQS - https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Wikistats_2
Nov 20 2025
Nov 19 2025
Nov 17 2025
Going forward we will get updates from DPE Team on this task about progress on dbt - the new tool we are using to develop a standard for storing metric calculation logic. I will add these updates to my Asana updates every Friday.
Nov 6 2025
Oct 30 2025
thank you @JAllemandou !
Oct 29 2025
@JAllemandou , do we have webrequest data retained from the pageviews backfill? it would be great if we could backfill this field.
Oct 27 2025
Hey everyone, we discussed this a bit more and have decided to go with a new category called external (ai_chatbot) to include some of the top chatbot referrers. In addition, I’ve proposed a few changes and additions to existing categories.
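For illustration only, a rough sketch of how the new external (ai_chatbot) branch could sit alongside the plain external class. The function name and the host list are hypothetical placeholders (the actual list of top chatbot referrers is not given here), and the search engine, internal and unknown cases are left out.

```python
from urllib.parse import urlparse

# Hypothetical hosts for illustration only; not the real referrer list.
AI_CHATBOT_HOSTS = {"chatbot.example", "assistant.example"}


def classify_external_referer(referer: str) -> str:
    """Return 'external (ai_chatbot)' for listed chatbot hosts, else 'external'.

    Only the new branch is sketched; other referer classes are out of scope.
    """
    host = (urlparse(referer).hostname or "").lower()
    # Treat www.<host> the same as <host>.
    if host.startswith("www."):
        host = host[len("www."):]
    if host in AI_CHATBOT_HOSTS:
        return "external (ai_chatbot)"
    return "external"
```

In practice the matching would presumably follow whatever rules the existing referer classifier already uses for search engines.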
Oct 25 2025
Hey @JAllemandou / @Ahoelzl, I've chimed in at T383088#11308163 about the modification to referer. Since most of it is being tagged as automated, there is currently low impact on user pageviews.
Oct 24 2025
+1 to @Isaac. I re-ran the query that @Krinkle ran, using pageview_actor for a day in Oct and Sep 2025. Most of the requests are now tagged as 'automated'. I think this could be a result of the new heuristic we put into place in T395934.
Based on my observations:
- the proportion of non-HTTPS requests is much smaller than that of requests correctly tagged as search_engine referer
- of the ones tagged as external or unknown, the majority are automated, i.e. user pageviews aren't impacted
- also, looking at browser_family, the automated requests are mostly coming from the 'Chrome' or 'Other' browser family. Firefox isn't dominating this anymore.
Oct 23 2025
Iteration 1 of Draft Dashboard - https://superset.wikimedia.org/superset/dashboard/p/eNLvg06Bgl5/
@JAllemandou , +1 on changing the IP referer to unknown.
Hey @Isaac, more and more search engines are integrating conversational AI features into the search experience, and yes, I agree that the boundaries are getting blurry. The way I see it, this means we may not need a new classifier for chat agents. Traditional chat agents like ChatGPT would fit into the search engine category, even more so since the launch of Atlas.
Maybe at some point in the future we will just have to rename or modify the definition of search engine referer to accommodate this nuance.
Oct 21 2025
Hi everyone, thank you for your input! This is helpful as we begin to think about upgrading the referer column. However, this should be done in phases, as there are many improvements we could make and I see quite a few suggestions in this task. I'll try to curate them in a document and share it soon.

