See https://github.com/wikimedia-research/canonical-data/issues/1 for more details
Description
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Add Doteli Wikipedia and Punjabi Wikisource | mediawiki/extensions/WikimediaMessages | master | +4 -0 |
Status | Subtype | Assigned | Task |
---|---|---|---|
Resolved | | nettrom_WMF | T306813 Fix omission of wikis in canonical_data.wikis |
Resolved | | JAllemandou | T307749 Make analytics-product the owner of canonical_data |
Event Timeline
Per discussion in the PA team's planning meeting, I'm assigning this to myself and moving it to our Kanban board. I'll work on this later this week.
Change 788429 had a related patch set uploaded (by Nettrom; author: Nettrom):
[mediawiki/extensions/WikimediaMessages@master] Add Doteli Wikipedia and Punjabi Wikisource
I reviewed whether we could use the SiteMatrix API and left this comment suggesting we don't as the data doesn't really support our use case. I'll investigate whether there's a better source for canonical data, and I also created this pull request that identifies the wikis we lack English names for and adds those manually (similarly to how the notebook handles missing language names).
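For readers unfamiliar with the approach, here is a minimal sketch of the idea behind that pull request, not the code it actually contains. It assumes the English names have already been loaded into a plain dict keyed by database code; `fill_missing_names`, `project_names`, and `manual_names` are hypothetical names used only for illustration.

```python
def fill_missing_names(wikis, project_names, manual_names):
    """Resolve an English name for each wiki.

    wikis: iterable of database codes (e.g. "enwiki").
    project_names: dict of database code -> English name, as loaded from
        the upstream source (assumed to be pre-parsed into this shape).
    manual_names: dict of database code -> English name, maintained by hand
        for wikis the upstream source lacks.

    Returns (names, still_missing): the resolved names, plus the list of
    database codes that have no name in either source.
    """
    names = {}
    still_missing = []
    for db in wikis:
        # Prefer the upstream name; fall back to the manual override.
        name = project_names.get(db) or manual_names.get(db)
        if name is None:
            still_missing.append(db)
        else:
            names[db] = name
    return names, still_missing


# Illustrative data only; the real lists live in the repositories above.
upstream = {"enwiki": "English Wikipedia"}
manual = {
    "dtywiki": "Doteli Wikipedia",
    "pawikisource": "Punjabi Wikisource",
}
names, still_missing = fill_missing_names(
    ["enwiki", "dtywiki", "pawikisource"], upstream, manual
)
```

Returning the unresolved codes separately keeps any remaining gaps visible, so they can be added manually or upstream rather than silently dropped.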
Moving this to Needs Review so @jwang can review.
I've also updated the patch for WikimediaMessages in Gerrit so it also modifies qqq.json, which should make it pass build tests.
Have done review. The changes at https://github.com/wikimedia-research/canonical-data/pull/2 look good to me.
@tstarling @Jdforrester-WMF Product Analytics has been using Extension:WikimediaMessages's i18n/wikimediaprojectnames/en.json for a canonical dataset we use in reporting (to get the English names of various wikis). Since we discovered that some current wikis were missing, we're wondering whether this extension is maintained and reliable.
I saw your names listed as authors; do you know who or which teams are currently responsible for maintaining Extension:WikimediaMessages and WikimediaMessages?
It's not really maintained by any one team. Historically it's been me and others helping on an as-needed basis. It's the responsibility of people creating new wikis to add the new entries to that file, but it looks like although that is documented, it hasn't always been done.
No team is responsible for that extension, unfortunately. See the table at https://www.mediawiki.org/wiki/Developers/Maintainers#MediaWiki_extensions_deployed_at_Wikimedia_Foundation for details of which teams admit responsibility for what. This particular data set was originally created for cross-wiki notification messages, as part of the Collaboration Team's work (now nominally owned by Growth-Team, but I think they've had > 100% turnover of staff since they last worked on this).
I've merged the patch in question. Happy to help with further changes as needed!
Change 788429 merged by jenkins-bot:
[mediawiki/extensions/WikimediaMessages@master] Add Doteli Wikipedia and Punjabi Wikisource
The updated wikis.csv has now been loaded into canonical_data.wikis and I've confirmed that all the new/updated wikis are correctly present in the dataset.
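A check of this kind can be sketched as follows. This is an assumption-laden illustration rather than the query that was actually run: it reads a CSV export instead of the canonical_data.wikis table itself, `missing_wikis` is a hypothetical helper, and the column name `database_code` is assumed.

```python
import csv
import io


def missing_wikis(csv_text, expected, column="database_code"):
    """Return the expected database codes that are absent from the CSV.

    csv_text: contents of a CSV file with a header row.
    expected: iterable of database codes that should be present.
    column: header of the column holding database codes (assumed name).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    present = {row[column] for row in reader}
    return sorted(set(expected) - present)


# Illustrative sample, not the real file contents.
sample = (
    "database_code,english_name\n"
    "dtywiki,Doteli Wikipedia\n"
    "pawikisource,Punjabi Wikisource\n"
)
absent = missing_wikis(sample, ["dtywiki", "pawikisource"])
```

An empty result means every expected wiki was found in the sample.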