As demanded by @Legoktm.
Description
Details
| Status | Assigned | Task |
|---|---|---|
| Open | None | T106123 Extensions needing to be removed from Wikimedia wikis |
| Declined | None | T218079 CodeRevisionListView::getRevCount is creating slow queries on mediawiki.org |
| Resolved | Jdforrester-WMF | T116948 Undeploy CodeReview |
| Resolved | None | T205361 Make an HTML dump of the output of the CodeReview extension on MediaWiki.org |
| Resolved | Dzahn | T243056 Set up static-codereview.wikimedia.org to host static HTML dump of CodeReview |
Event Timeline
I have everything dumped locally, it's about 4GB. I'll rsync it to people.wm.o so people can review it before we place it in its final location. I mostly did what @Krinkle suggested but with a few regex tweaks to fix URLs.
- https://people.wikimedia.org/~legoktm/CodeReview/MediaWiki/rev/1.html
- https://people.wikimedia.org/~legoktm/CodeReview/pywikipedia/rev/2.html
All of the r### links should point to the archive.
Mentioned in SAL (#wikimedia-operations) [2020-01-16T02:35:18Z] <Krinkle> krinkle@mwmaint1002 Change code_repo.repo_viewvc from 'https://svn.wikimedia.org/viewvc/mediawiki' to '' for 'MediaWiki' repo_name. Ref 2162cf2fc46cfe, T205361.
@Legoktm Awesome. I do have a few small nit picks:
- Per T205361#5080437, I've applied the repo_viewvc change. As a result, some of the broken interface links are now omitted. If you re-run the script now, those links will be gone, and the paths will remain as plain text.
- The archive pages could do with a basic <h1> heading. Maybe make them a simple copy of the <title> that you have already?
- I noticed the internal links are currently absolute, e.g. <a href="/~legoktm/CodeReview/MediaWiki/rev/2.html">. If these were relative, like ./2.html, the archive would be more portable (without needing string replacements or regeneration).
- I have a few minor CSS tweaks (e.g. hide the no-op "purge" link), but I'll stuff that in a patch later.
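To illustrate the point about relative links, here is a minimal sketch of rewriting the absolute hrefs into `./N.html` form. This is not the archiver's actual code: the function name `relativize_links`, the `current_repo` parameter, and the regex are hypothetical, and the path prefix is just the one from the example link above.

```python
import re

# Assumption: absolute links look like the example quoted in the thread,
# /~legoktm/CodeReview/<repo>/rev/<number>.html
ABSOLUTE_LINK = re.compile(
    r'href="/~legoktm/CodeReview/(?P<repo>[^/"]+)/rev/(?P<rev>\d+)\.html"'
)


def relativize_links(html: str, current_repo: str) -> str:
    """Rewrite same-repo absolute links to ./<rev>.html; leave others alone."""

    def replace(match: re.Match) -> str:
        if match.group("repo") != current_repo:
            # A link into another repo cannot be expressed as ./N.html.
            return match.group(0)
        return f'href="./{match.group("rev")}.html"'

    return ABSOLUTE_LINK.sub(replace, html)
```

The replacement keeps the `href="` prefix and the closing quote in the output, so the attribute stays well-formed after the substitution.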
Change 565805 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/tools/codereview-archiver@master] Initial commit
Change 565805 merged by Legoktm:
[mediawiki/tools/codereview-archiver@master] Initial commit
Marking as Resolved as it is in the Done column. Feel free to reopen if there is remaining work.
Looks like it is at https://people.wikimedia.org/~legoktm/CodeReview/MediaWiki/rev/
Should it be on https://dumps.wikimedia.org/ in the long-term ?
Thanks @Dzahn. I looked at https://people.wikimedia.org/~legoktm/CodeReview/MediaWiki/rev/35.html (example) and found that some URLs still point to MediaWiki.org, such as history, "purge"...
@Legoktm Is the dump available somewhere more public or documented somewhere? Could you please add a link to the final location somewhere and re-resolve this task?
Maybe talk to @ArielGlenn about getting it on the official dumps servers (dumps.wikimedia.org) under "misc". That would be more stable than the people VM.
This task depends on T243056: Set up static-codereview.wikimedia.org to host static HTML dump of CodeReview , which is about setting up a domain to host the dump.
We could host a tarball of the HTML pages, but that's different from a static copy that people can browse online.
Since the SQL dumps for codereview are also on dumps servers (T243055) doesn't it fit to also have the HTML together with it?
The HTML dump can be in a tarball for download, sure. But that is separate from what was requested in T243056 i.e. actually serving a static copy for browsing. I don't think the labstore boxes should be doing that.
In that case I think this should have a dedicated Ganeti VM just for this.
This has happened in T243056 (sites have been added to the miscweb* VMs shared with other static sites)
https://static-codereview.wikimedia.org/MediaWiki/1.html
I think it's (basically?) done.
Some HTML corruption occurred in the post-processing step. This has caused "follow-up" links to become broken:
https://static-codereview.wikimedia.org/MediaWiki/75446.html?
<a ./75429.html" title="Special:Code/MediaWiki/75429">r75429</a> <a ./75446.html" title="Special:Code/MediaWiki/75446">r75446</a>
Original from https://www.mediawiki.org/wiki/Special:Code/MediaWiki/75446
<a href="/wiki/Special:Code/MediaWiki/75466" title="Special:Code/MediaWiki/75466">r75466</a> <a href="/wiki/Special:Code/MediaWiki/75467" title="Special:Code/MediaWiki/75467">r75467</a>
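A quick way to find pages affected by this kind of corruption is to look for `<a>` tags that have lost their `href="` prefix. The following is only a rough sketch with a hypothetical helper name (`anchors_missing_href`); it uses a regex rather than a real HTML parser, which is an assumption that the dump's markup is simple enough for that.

```python
import re

# Matches an opening <a ...> tag (but not <abbr>, <article>, etc.).
A_TAG = re.compile(r"<a\s[^>]*>")


def anchors_missing_href(html: str) -> list:
    """Return every opening <a> tag that does not contain an href attribute."""
    return [tag for tag in A_TAG.findall(html) if 'href="' not in tag]
```

Run against the broken snippet above, this returns both anchors; run against the original MediaWiki.org markup, it returns an empty list.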
Also, is it MediaWiki/1.html or MediaWiki/rev/1.html? I've seen both versions. It seems we're back to the former?
It's https://static-codereview.wikimedia.org/MediaWiki/1.html. The other version was MediaWiki/r1.html, not /rev/1.html.
So, when is the Apache rewrite being put in place? That's blocking undeploying the extension.
Ping! :)
(I remembered this task after this comment on the GitLab consultation: https://www.mediawiki.org/wiki/Topic:Vu63x95by4od74uc )
Change 724049 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] mediawiki: Redirect Special:CodeReview to static archives
Change 744080 had a related patch set uploaded (by Legoktm; author: Legoktm):
[mediawiki/tools/codereview-archiver@master] Fix re1 replacement
Change 744080 merged by Legoktm:
[mediawiki/tools/codereview-archiver@master] Fix re1 replacement
Mentioned in SAL (#wikimedia-operations) [2021-12-06T19:58:36Z] <legoktm> trying new dump of Special:CodeReview on mwmaint1002 (T205361)
OK, I copied over the new dump to miscweb, the issues in T205361#6150835 are fixed now. I *think* we're ready to finally do this.
No, the Apache config change is non-trivial (disable puppet everywhere, roll out to one or a few servers, use httpbb for verification plus manual testing, then slowly enable everywhere) and it was too close to the freeze. I had asked in this week's ServiceOps meeting if anyone wanted to pick it up, but I don't think anyone volunteered (or at least I didn't see it in the notes). Maybe we can do it during one of the puppet request windows next week.
Change 754088 had a related patch set uploaded (by Krinkle; author: Krinkle):
[mediawiki/tools/codereview-archiver@master] Set HTML doctype and lang, strip purge link, add basic styles
Change 754088 merged by Legoktm:
[mediawiki/tools/codereview-archiver@master] Set HTML doctype and lang, strip purge link, add basic styles
Mentioned in SAL (#wikimedia-operations) [2022-03-28T22:31:27Z] <rzl> rzl@cumin2002:~$ sudo cumin A:mw 'disable-puppet T205361'
Change 724049 merged by RLazarus:
[operations/puppet@production] mediawiki: Redirect Special:CodeReview to static archives
Mentioned in SAL (#wikimedia-operations) [2022-03-28T22:39:14Z] <rzl> rzl@cumin2002:~$ sudo cumin A:mw 'enable-puppet T205361'
Change 774821 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] httpbb: fix status code checks for CodeReview redirects
Change 774821 merged by RLazarus:
[operations/puppet@production] httpbb: fix status code checks for CodeReview redirects
It seems the redirect isn't working for the r URLs:
https://www.mediawiki.org/wiki/Special:Code/MediaWiki/r113071
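For reference, the intended mapping can be sketched as below. This is not the Apache rule from the puppet change; it is only a Python illustration of the behaviour being asked for, assuming both `N` and `rN` forms should land on `<repo>/N.html` on the static host. The names `static_url_for` and `CODE_PATH` are made up for the example.

```python
import re
from typing import Optional

STATIC_BASE = "https://static-codereview.wikimedia.org"

# Assumption: the revision may be written with or without a leading "r".
CODE_PATH = re.compile(r"^/wiki/Special:Code/(?P<repo>[^/]+)/r?(?P<rev>\d+)$")


def static_url_for(path: str) -> Optional[str]:
    """Map a Special:Code revision path to its static archive URL, if any."""
    match = CODE_PATH.match(path)
    if match is None:
        return None  # Not a CodeReview revision URL; no redirect.
    return f"{STATIC_BASE}/{match.group('repo')}/{match.group('rev')}.html"
```

The optional `r?` in the pattern is what covers the URL form reported here.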
It seems the result of this change was not deployed, the pages display:
This page is in Quirks Mode. Page layout may be impacted.
And the layout is impacted as a result (odd sizes, broken font, etc.)
Change 774943 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] mediawiki: fix r123 syntax for special:codereview redirects
Browser console. But view-source also shows the page is not in sync with the repo; note the lack of a doctype, the missing font styles, etc.:
<html> <head> <title>r113071 MediaWiki - Code Review archive</title>
Change 774981 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] httpbb: follow-up to 'fix status code checks for CodeReview redirects'
Change 774981 merged by Dzahn:
[operations/puppet@production] httpbb: follow-up to 'fix status code checks for CodeReview redirects'
Mentioned in SAL (#wikimedia-operations) [2022-03-29T22:50:25Z] <mutante> cumin1001 - systemctl start httpbb_hourly_appserver fixed Icinga alert after gerrit:774981 T205361
A month later, it'd be really nice to get this finally done so that we can undeploy the code. How feasible is this?
@Jdforrester-WMF I recall this issue from, at least, more than a year ago. I don't think, but am willing to be wrong, that Platform Engineering owns completing this but who does own this right now? (Apologies if that is already evident and I just missed it, curious where it fits in priorities and if it should move up/down)
I don't know how we messed this up, but given the redirects are in place we should write a small script to update the existing HTML instead of re-scraping it.
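Something along these lines might be enough for the in-place update. It is a sketch only: the function names are hypothetical, `lang="en"` is an assumed value, UTF-8 encoding is assumed, and it covers just the doctype and lang parts of the earlier patch, not the purge link or styles.

```python
import re
from pathlib import Path


def patch_page(html: str) -> str:
    """Add a doctype and a lang attribute if the page lacks them."""
    if not html.lstrip().lower().startswith("<!doctype"):
        html = "<!DOCTYPE html>\n" + html
    # Only a bare <html> tag is touched, so re-running is a no-op.
    return re.sub(r"<html>", '<html lang="en">', html, count=1)


def patch_tree(root: Path) -> int:
    """Patch every .html file under root in place; return how many changed."""
    changed = 0
    for page in root.rglob("*.html"):
        original = page.read_text(encoding="utf-8")
        patched = patch_page(original)
        if patched != original:
            page.write_text(patched, encoding="utf-8")
            changed += 1
    return changed
```

Because `patch_page` is idempotent, the script could be run over the existing dump more than once without stacking doctypes.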
Sorry for missing this at the time; this work was pioneered by @Legoktm whilst as staff in Platform, then as a volunteer, then as staff in SRE, and now as a volunteer again. To the extent that CodeReview the extension is 'owned', it was nominally owned by Platform but I appreciate that commitments made in the past aren't necessarily reflected in current resourcing. Given Legoktm's e-mail to wikitech-l this weekend, I'll undeploy the extension later today.
Change 774943 merged by Giuseppe Lavagetto:
[operations/puppet@production] mediawiki: fix r123 syntax for special:codereview redirects
@Legoktm: Removing task assignee as this open task has been assigned for more than two years - See the email sent to task assignee on February 22nd, 2023.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome! :)
If this task has been resolved in the meantime, or should not be worked on by anybody ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator. Thanks!
Setting to stalled as I don't think there is anyone working on this nor is it obvious what the actual fix is after skimming the ticket history.
Status:
- "Follow up" links to revisions in both locations work correctly for me thanks to last fix
- "Author" and "history" links don't work and I'd say that's fine.
- "This page is in Quirks Mode" in the web browser console and I propose not to care.
- https://static-codereview.wikimedia.org/favicon.ico link is a 404 and I propose not to care.
- CSS says font-family: monospace, 'Courer New'; so there's a typo for Courer and I propose not to care.
I propose to close this task as resolved. Perfect is the enemy of good enough.

