Set up static-codereview.wikimedia.org to host static HTML dump of CodeReview
Closed, ResolvedPublic

Description

Similar to static-bugzilla.wikimedia.org, we need a place to host the static HTML dump of CodeReview. The dump is around 4.6GB of HTML files that can be served by any basic webserver. I expect a Ganeti VM would work.

The plan is to have Apache rewrite rules direct traffic from mediawiki.org/wiki/Special:Code(Review)?/.... to this static website.
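
As a rough illustration of the mapping such rewrite rules would need to express, here is a minimal Python sketch. It is not the Apache configuration itself. It assumes that a revision page lives at `/wiki/Special:Code/<repo>/<rev>` (or `Special:CodeReview`) and maps to `<repo>/<rev>.html` on the static host, which matches the `MediaWiki/1.html` URL mentioned later in this task; the function name `static_url` is hypothetical.

```python
import re
from typing import Optional

# Assumed base URL of the static archive (the hostname requested in this task).
STATIC_BASE = "https://static-codereview.wikimedia.org"

# Matches /wiki/Special:Code/<repo>/<rev> and /wiki/Special:CodeReview/<repo>/<rev>,
# with an optional trailing slash. Only numeric revision pages are handled here.
_CODE_PATH = re.compile(
    r"^/wiki/Special:Code(?:Review)?/(?P<repo>[^/]+)/(?P<rev>\d+)/?$"
)


def static_url(path: str) -> Optional[str]:
    """Return the assumed static archive URL for a Special:Code path, or None."""
    match = _CODE_PATH.match(path)
    if match is None:
        return None
    return f"{STATIC_BASE}/{match['repo']}/{match['rev']}.html"
```

Paths that do not look like a revision page return `None`, so a caller can fall through to the normal wiki handling instead of redirecting.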

Event Timeline

MoritzMuehlenhoff subscribed.

We have role::webserver_misc_static (bromine/vega) for this.

Change 567407 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[operations/puppet@production] Add profile and module for for static HTML dump of CodeReview

https://gerrit.wikimedia.org/r/567407

Mhm, maybe put it in a subdomain of mediawiki.org?

Mhm, maybe put it in a subdomain of mediawiki.org?

That's not unreasonable :) On the other hand, we might want to avoid adding non-wiki content to primary wiki domains. *.wikimedia.org is a difficult enough exception to maintain as it is.

Why not just put it in Labs where other toy projects go?

Why not just put it in Labs where other toy projects go?

Can't/shouldn't redirect tier 1 trusted urls to wmcloud servers.

Has it been considered to put it on https://dumps.wikimedia.org/ along with other dumps? Maybe https://dumps.wikimedia.org/other/ ?

That was my first thought because it's a "dump" of some kind.

If you don't want that it's not a huge deal to put it on miscweb1002/miscweb2002 though either. These servers just replaced bromine and vega. And I could do that.

We currently have about 9.4GB left on those servers. So while 4GB kind of works for now, it will not for long if that keeps growing. The dumps servers have much more storage.

Agreed with what Daniel said: miscweb is for small static webservices. If the dump is already 4.6G, dumps.wikimedia.org is the more appropriate place to provide it.

Why not just put it in Labs where other toy projects go?

Can't/shouldn't redirect tier 1 trusted urls to wmcloud servers.

Totally agreed. However, not doing the redirect was implicit in my question.

We currently have about 9.4GB left on those servers. So while 4GB kind of works for now, it will not for long if that keeps growing. The dumps servers have much more storage.

This dump is static/an archive and will never increase.

Agreed with what Daniel said: miscweb is for small static webservices. If the dump is already 4.6G, dumps.wikimedia.org is the more appropriate place to provide it.

Do we currently host viewable HTML files on the dumps domain (besides the index listings)? I thought it was just stuff to be downloaded.

This dump is static/an archive and will never increase.

Alright, so it would be possible on these VMs but maybe we should do dedicated VMs instead.

Do we currently host viewable HTML files on the dumps domain (besides the index listings)? I thought it was just stuff to be downloaded.

I guess not since this matches what was said at T205361#6099622.

This dump is static/an archive and will never increase.

Alright, so it would be possible on these VMs but maybe we should do dedicated VMs instead.

Why that? We can simply increase the disk space on the VMs: https://wikitech.wikimedia.org/wiki/Ganeti#Adding_disk_space

We can add a second virtual disk and mount that but growing the existing disk is not advisable.

Change 594501 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add static-codereview.wikimedia.org

https://gerrit.wikimedia.org/r/594501

Change 594501 merged by Dzahn:
[operations/dns@master] add static-codereview.wikimedia.org

https://gerrit.wikimedia.org/r/594501

Change 594669 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] add profile to setup static-codereview.wikimedia.org

https://gerrit.wikimedia.org/r/594669

@Legoktm I added the new name to DNS and started with the puppet patch to create the httpd config.

Some questions:

  • Should we add Bacula backups for the files? For now i assume yes and created a backup set that would include /srv/org/wikimedia/static-codereview
  • Should we add Icinga monitoring for the site? For now i assume yes but commented it out and would check just "index.html" ?
  • For the above, we should add a "notes_url". This usually means a Wikitech page with instructions on what to do if the alert triggers, for example who to contact. Is there a URL with more information on CodeReview or could we create https://wikitech.wikimedia.org/wiki/CodeReview with some basic info?
  • How are we going to get the files on the webserver? In most cases there is a content repo on Gerrit and then we let puppet git::clone from that but that is mostly for much smaller sites. Since you said this will never change i assume we can just upload it manually once and be done with it? Or would you want to be able to push updates in some way?

Sorry, i forgot that you had already started https://gerrit.wikimedia.org/r/c/operations/puppet/+/567407 doing the same thing. I took the liberty to just amend to that and will abandon my newer change instead as a duplicate.

Change 594669 abandoned by Dzahn:
add profile to setup static-codereview.wikimedia.org

Reason:
duplicate of https://gerrit.wikimedia.org/r/c/operations/puppet/+/567407

https://gerrit.wikimedia.org/r/594669

Change 567407 merged by Dzahn:
[operations/puppet@production] Add profile and module for for static HTML dump of CodeReview

https://gerrit.wikimedia.org/r/567407

Change 594674 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] add static-codereview profile to miscweb, comment out monitoring

https://gerrit.wikimedia.org/r/594674

Change 594674 merged by Dzahn:
[operations/puppet@production] add static-codereview profile to miscweb, comment out monitoring

https://gerrit.wikimedia.org/r/594674

Change 594678 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ATS: add backend for static-codereview on miscweb

https://gerrit.wikimedia.org/r/594678

Change 594707 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] httpbb: add test for static-codereview.wikimedia.org

https://gerrit.wikimedia.org/r/594707

Mentioned in SAL (#wikimedia-operations) [2020-05-06T13:36:09Z] <mutante> puppetmaster - revoking cert for webserver-misc-apps , recreating it with static-codereview.wikimedia.org as addiitonal SAN (T243056)

Change 594712 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] ssl: update webserver-misc-apps, delete webserver-misc static cert

https://gerrit.wikimedia.org/r/594712

Change 594712 merged by Dzahn:
[operations/puppet@production] ssl: update webserver-misc-apps, delete webserver-misc static cert

https://gerrit.wikimedia.org/r/594712

Change 594678 merged by Dzahn:
[operations/puppet@production] ATS: add backend for static-codereview on miscweb

https://gerrit.wikimedia.org/r/594678

Change 594707 merged by Dzahn:
[operations/puppet@production] httpbb: add test for static-codereview.wikimedia.org

https://gerrit.wikimedia.org/r/594707

Yea. My message should be read as "the site has been created, the cert updated, puppet created the config, now i just need the dump to upload".

P.S. Greetings Max :)

@Legoktm Where can i find the dump file please?

@Legoktm I added the new name to DNS and started with the puppet patch to create the httpd config.

Some questions:

  • Should we add Bacula backups for the files? For now i assume yes and created a backup set that would include /srv/org/wikimedia/static-codereview

Yes please.

  • Should we add Icinga monitoring for the site? For now i assume yes but commented it out and would check just "index.html" ?

Yeah, and maybe one of the r1.html links?

  • For the above, we should add a "notes_url". This usually means a Wikitech page with instructions on what to do if the alert triggers, for example who to contact. Is there a URL with more information on CodeReview or could we create https://wikitech.wikimedia.org/wiki/CodeReview with some basic info?

I will create a documentation page.

  • How are we going to get the files on the webserver? In most cases there is a content repo on Gerrit and then we let puppet git::clone from that but that is mostly for much smaller sites. Since you said this will never change i assume we can just upload it manually once and be done with it? Or would you want to be able to push updates in some way?

Yeah, I think if we can just rsync it once that's fine. We may want to adjust the CSS in the future, but that's editing one file.

@Legoktm Where can i find the dump file please?

I'm regenerating the dump right now, it'll take a few hours. But I'll ping you again with the name of a tarball on mwmaint1002.

Change 596019 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[operations/puppet@production] static-codereview: Update notes URL to new wikitech page

https://gerrit.wikimedia.org/r/596019

@Dzahn the dump is at mwmaint1002:/home/legoktm/codereview.tar.gz
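
Given the earlier disk-space discussion, one way to sanity-check a tarball before unpacking it is to sum the member sizes and compare them with the free space on the target filesystem. The following is a minimal sketch under stated assumptions: the helper names `unpacked_size` and `fits` and the 1 GiB default reserve are hypothetical, and nothing in this task says such a check was actually run.

```python
import shutil
import tarfile


def unpacked_size(tarball: str) -> int:
    """Sum the sizes of all regular files in the archive, in bytes."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz or none).
    with tarfile.open(tarball, "r:*") as archive:
        return sum(member.size for member in archive if member.isfile())


def fits(tarball: str, target_dir: str, reserve: int = 1024 ** 3) -> bool:
    """True if unpacking would still leave at least `reserve` bytes free."""
    free = shutil.disk_usage(target_dir).free
    return unpacked_size(tarball) + reserve <= free
```

For example, `fits("codereview.tar.gz", "/srv")` would report whether the archive can be unpacked under `/srv` while keeping the reserve free. It ignores filesystem overhead for many small files, so it is only a lower bound on the space needed.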

Change 596019 merged by Dzahn:
[operations/puppet@production] static-codereview: Update notes URL to new wikitech page

https://gerrit.wikimedia.org/r/596019

Change 596193 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] static-codereview: activate Icinga monitoring

https://gerrit.wikimedia.org/r/596193

Change 596228 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] maintenance: temp allow rsyncing home dir to miscweb

https://gerrit.wikimedia.org/r/596228

Change 596228 merged by Dzahn:
[operations/puppet@production] maintenance: temp allow rsyncing home dir to miscweb

https://gerrit.wikimedia.org/r/596228

Change 596235 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] maintenance: also rsync codereview files to codfw miscweb

https://gerrit.wikimedia.org/r/596235

Change 596240 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] static-codereview: do not allow directory listing for subdirs

https://gerrit.wikimedia.org/r/596240

Change 596280 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[operations/puppet@production] static-codereview: Fix links on index.html

https://gerrit.wikimedia.org/r/596280

Change 596235 merged by Dzahn:
[operations/puppet@production] maintenance: also rsync codereview files to codfw miscweb

https://gerrit.wikimedia.org/r/596235

Change 596280 merged by Dzahn:
[operations/puppet@production] static-codereview: Fix links on index.html

https://gerrit.wikimedia.org/r/596280

Change 596240 merged by Dzahn:
[operations/puppet@production] static-codereview: do not allow directory listing for subdirs

https://gerrit.wikimedia.org/r/596240

Change 596193 merged by Dzahn:
[operations/puppet@production] static-codereview: activate Icinga monitoring

https://gerrit.wikimedia.org/r/596193

  • Should we add Bacula backups for the files?

Yes please.

Confirmed on backup1001 (bacula) with bconsole. Files are there, just with the "r" prefix but that will change soon.

  • Should we add Icinga monitoring for the site? For now i assume yes but commented it out and would check just "index.html" ?

Yeah, and maybe one of the r1.html links?

Added on Icinga. Adjusted the link from r1.html to just 1.html.

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=miscweb1002&service=Static+CodeReview+archive+HTTP
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=miscweb2002&service=Static+CodeReview+archive+HTTP

Though I should follow up to remove one of them and only keep it on the host that is actually serving production; otherwise it's kind of a duplicate.
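
For illustration only, the kind of check described here (the index page plus one revision page) can be approximated in a few lines of Python. This is a sketch, not the Icinga check that was deployed: the `/index.html` path, the function names and the injectable `fetch` parameter are assumptions made for the example.

```python
from typing import Callable, Iterable, List
from urllib.request import urlopen

BASE = "https://static-codereview.wikimedia.org"
# Assumed paths: the index page and the first revision page.
PATHS = ("/index.html", "/MediaWiki/1.html")


def fetch_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url."""
    with urlopen(url, timeout=10) as response:
        return response.status


def failing_paths(
    paths: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> List[str]:
    """Return the paths that did not answer with HTTP 200."""
    failures = []
    for path in paths:
        try:
            ok = fetch(BASE + path) == 200
        except OSError:
            # urlopen raises URLError/HTTPError (both OSError subclasses)
            # on connection problems and on 4xx/5xx responses.
            ok = False
        if not ok:
            failures.append(path)
    return failures
```

Calling `failing_paths(PATHS)` performs real requests; passing a stub as `fetch` keeps the logic testable without network access.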

  • For the above, we should add a "notes_url". This usually means a Wikitech page with instructions on what to do if the alert triggers, for example who to contact. Is there a URL with more information on CodeReview or could we create https://wikitech.wikimedia.org/wiki/CodeReview with some basic info?

I will create a documentation page.

Thank you! I will edit that a bit more to add which VM it is on etc.

  • How are we going to get the files on the webserver?

Yeah, I think if we can just rsync it once that's fine. We may want to adjust the CSS in the future, but that's editing one file.

Yep, done!

https://gerrit.wikimedia.org/r/c/operations/puppet/+/596228

Also copied the files to codfw https://gerrit.wikimedia.org/r/c/operations/puppet/+/596235

See https://static-codereview.wikimedia.org/MediaWiki/1.html; the content is there now.

Presumably this is now done and the put-the-redirect-in is T205361?

Boldly calling it resolved. Unless I'm missing anything @Legoktm