Cleaning up after EducationProgram
Open, Needs Triage · Public

Description

EducationProgram puts a lot of data into logging

There are also talk pages

What are we going to do with this after we undeploy the extension? They become un-reachable...

Delete the lot after dumping?

Event Timeline

Reedy created this task. · Jul 26 2018, 1:08 AM
Restricted Application added a subscriber: Aklapper. · Jul 26 2018, 1:08 AM
Reedy updated the task description. · Jul 26 2018, 1:49 AM
Reedy updated the task description. · Jul 29 2018, 1:51 AM

Are there plans to remove all related data from the databases? Right now, the MediaWiki API says the Education (talk) namespaces don't exist, but such pages are still in the database. This causes errors for some analytical tools like XTools. We have the page_title, but we can't display it because we don't know what the namespace is.

Delete the lot after dumping?

This is my feeling. I see from T188407 that there are concerns about accessing historical data. So I would dump it somewhere everyone can access it, then remove all of the data from the database. As it stands now, I consider the lingering data to be corrupt.
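The display problem described in this comment (a page_title whose namespace ID no longer resolves) could be handled defensively on the tool side. The following is only a minimal sketch under stated assumptions: the namespace map, the function name `display_title`, and the ID 446 used in the usage note are illustrative and are not taken from this task or from XTools.

```python
from typing import Mapping

# Namespace map as a tool might load it from the wiki's site info.
# These IDs and names are illustrative assumptions, not data from this task.
KNOWN_NAMESPACES = {0: "", 1: "Talk", 4: "Wikipedia", 5: "Wikipedia talk"}


def display_title(ns_id: int, page_title: str,
                  namespaces: Mapping[int, str] = KNOWN_NAMESPACES) -> str:
    """Return a printable title, using a placeholder prefix when the
    namespace ID is not registered on the wiki."""
    title = page_title.replace("_", " ")
    if ns_id not in namespaces:
        # Orphaned namespace: keep the raw ID visible instead of failing.
        return f"{{ns:{ns_id}}}:{title}"
    prefix = namespaces[ns_id]
    return f"{prefix}:{title}" if prefix else title
```

With this fallback, a row whose namespace ID is unknown (446 is used here purely as a hypothetical example) would still render as `{ns:446}:Some title` rather than raising an error.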

Reedy added a comment. · Oct 12 2018, 6:22 PM

Are there plans to remove all related data from the databases? Right now, the MediaWiki API says the Education (talk) namespaces don't exist, but such pages are still in the database. This causes errors for some analytical tools like XTools. We have the page_title, but we can't display it because we don't know what the namespace is.

https://en.wikipedia.org/w/api.php?action=query&pageids=38654465&prop=info|revisions&rvprop=content

Can a batch fix of moving them all to Project/Project_talk just be done?

Reedy added a subscriber: Urbanecm. · Dec 8 2018, 7:30 PM

Might be best to discuss this on wiki as any mass solution will likely require a bot, needing approval at WP:BRFA

No, a bot wouldn't need approval. It's not like it's doing mass unattended changes to pages. It would be one that just dumps the contents to some predetermined pages.

I have created a simple single-purpose script that publishes the courses as wiki pages. It's at https://gist.github.com/urbanecm/8a090da58429b121067bf491d1e9a510, along with some docs. I can generate wiki pages containing course data for any wiki that wants me to. If you want me to do it, a dedicated Phabricator task assigned to me would probably be warranted, so that I am sure to notice it.

Reedy added a comment. · Dec 8 2018, 7:32 PM

Can a batch fix of moving them all to Project/Project_talk just be done?

I wonder if we should just re-add the EducationProgram NS in mw-config to the Wikis that had it enabled...

https://en.wikipedia.org/w/api.php?action=query&pageids=38654465&prop=info%7Crevisions&rvprop=content confirms the content type isn't anything that depends on the extension, it's just wikitext
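The check described above can be sketched in a few lines. This is a rough illustration rather than anything run for this task: the helper names are hypothetical, `format=json` is an added assumption (the URL above does not specify a format), and the code assumes the default JSON layout in which `pages` is keyed by page ID and `prop=info` reports a `contentmodel` field.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(page_id: int) -> str:
    """Build the same query as the URL above, plus format=json."""
    params = {
        "action": "query",
        "pageids": page_id,
        "prop": "info|revisions",
        "rvprop": "content",
        "format": "json",  # assumption: not part of the original URL
    }
    return f"{API_URL}?{urlencode(params)}"


def content_model(response: dict, page_id: int) -> str:
    """Read the content model that prop=info reports for one page."""
    return response["query"]["pages"][str(page_id)]["contentmodel"]


def is_plain_wikitext(response: dict, page_id: int) -> bool:
    """True when the page does not depend on an extension-specific model."""
    return content_model(response, page_id) == "wikitext"
```

Fetching the URL and passing the decoded JSON to `is_plain_wikitext` would reproduce the manual check; the fetch itself is left out so the sketch stays free of network access.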

Can a batch fix of moving them all to Project/Project_talk just be done?

I wonder if we should just re-add the EducationProgram NS in mw-config to the Wikis that had it enabled...
https://en.wikipedia.org/w/api.php?action=query&pageids=38654465&prop=info%7Crevisions&rvprop=content confirms the content type isn't anything that depends on the extension, it's just wikitext

That would solve the problem with what to do about https://en.wikipedia.org/wiki/Category:Pages_linking_to_the_Education_Program_namespace

Otherwise a bot may be needed to edit all of the pages in that category, if a workaround can't be implemented in the template that populates that category.

Can a batch fix of moving them all to Project/Project_talk just be done?

I wonder if we should just re-add the EducationProgram NS in mw-config to the Wikis that had it enabled...
https://en.wikipedia.org/w/api.php?action=query&pageids=38654465&prop=info%7Crevisions&rvprop=content confirms the content type isn't anything that depends on the extension, it's just wikitext

That would solve the problem with what to do about https://en.wikipedia.org/wiki/Category:Pages_linking_to_the_Education_Program_namespace
Otherwise a bot may be needed to edit all of the pages in that category, if a workaround can't be implemented in the template that populates that category.

Done FWIW

Krinkle added subscribers: MaxBioHazard, Krinkle, Base.

Note that the namespaces were re-created in prod in December per T211494. As such, the talk pages are now accessible again, e.g. at https://en.wikipedia.org/wiki/Education_Program_talk:Cornell_University/Online_Communities_(Fall_2013).

If this is only about talk pages and not the subject space, then I suppose they could perhaps be mass renamed to be in the project namespace? E.g. Wikipedia_talk:Education_Program/.. or some such.
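For illustration, the title mapping behind such a mass rename might look like the sketch below. It only computes the target title and performs no moves; the function name is hypothetical, and the target prefix simply follows the `Wikipedia_talk:Education_Program/` pattern floated above, which would differ on wikis whose project namespace has another name.

```python
OLD_PREFIX = "Education_Program_talk:"
# Assumption: the pattern suggested in the comment above, for a wiki whose
# project talk namespace is "Wikipedia_talk".
NEW_PREFIX = "Wikipedia_talk:Education_Program/"


def project_talk_title(old_title: str) -> str:
    """Map an Education Program talk title to a project-talk subpage title."""
    if not old_title.startswith(OLD_PREFIX):
        raise ValueError(f"not an Education Program talk page: {old_title!r}")
    return NEW_PREFIX + old_title[len(OLD_PREFIX):]
```

Rejecting titles outside the old namespace keeps a batch run from silently renaming unrelated pages.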

Change 488720 had a related patch set uploaded (by Reedy; owner: Reedy):
[operations/mediawiki-config@master] Don't add EP NS where the wiki has no pages in that NS

https://gerrit.wikimedia.org/r/488720

Change 488720 merged by jenkins-bot:
[operations/mediawiki-config@master] Don't add EP NS where the wiki has no pages in that NS

https://gerrit.wikimedia.org/r/488720

Izno added a subscriber: Izno. · Sat, Jul 6, 1:28 AM