Recreate a wiki for Wikimedia Portugal
Closed, ResolvedPublic

Description

Language: pt - Português
Sitename: Wikimedia Portugal

Our current wiki will be shut down, and this one is to replace it. Once it is created, we'll select the pages that need to be imported, and shut down the old one.

Roadmap

Event Timeline

Change 314539 merged by jenkins-bot:
Logo for pt.wikimedia

https://gerrit.wikimedia.org/r/314539

Mentioned in SAL (#wikimedia-operations) [2016-10-06T23:31:34Z] <dereckson@tin> Synchronized static/images/project-logos/: Logo for pt.wikimedia (T126832, 1/2) (duration: 00m 50s)

Mentioned in SAL (#wikimedia-operations) [2016-10-06T23:34:38Z] <dereckson@tin> Synchronized wmf-config/InitialiseSettings.php: Logo for pt.wikimedia (T126832, 2/2, no-op for the moment) (duration: 00m 50s)

Dereckson updated the task description. Oct 8 2016, 12:07 AM
Dereckson renamed this task from create a wiki for Wikimedia Portugal to Recreate a wiki for Wikimedia Portugal. Oct 8 2016, 12:26 AM

Change 314792 had a related patch set uploaded (by Dereckson):
Respawn ptwikimedia configuration

https://gerrit.wikimedia.org/r/314792

Dereckson updated the task description. Oct 8 2016, 12:27 AM
Dereckson added a comment. Edited Oct 8 2016, 12:30 AM

Current status: configuration ready, next step is to clean old ptwikimedia database.

[…]
Do you want me to delete "everything I can find", or just the s3 contents? Please tell me in db terms, and I will do it.

As the old wiki contained pre-SUL user accounts, and the wiki on their own hosting uses different usernames for those accounts, artefacts from the 2012 version won't be really useful, so yes, delete "everything you can find" seems cleaner.

I checked the 52cec03e kill commit: the wiki was indeed in the s3 shard.

jcrespo reassigned this task from demon to Dereckson. Oct 9 2016, 9:33 AM

yes delete "everything you can find" seems cleaner.

Ok, please give me some time, as it involves several different services to check. Also, it is not as immediate as dropping whole databases; to prevent service problems or too much locking for other wikis, I have to backup everything I find and drop it slowly table by table (there could be some buffer pool contention on old servers).

This will also help document dropping other wikis in the future.
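As an illustration only, the "drop it slowly table by table" idea could be sketched as below. This is not the procedure that was actually run on the production cluster. It assumes the backup has already been taken, and the function name, the `execute` callback and the pause length are all hypothetical.

```python
import time
from typing import Callable, Iterable, List


def drop_tables_slowly(
    database: str,
    tables: Iterable[str],
    execute: Callable[[str], None],
    pause_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Drop tables one at a time, pausing between statements.

    `execute` is a hypothetical callback that runs a single SQL statement
    (for example a cursor's execute method). It is injected so that this
    sketch does not assume any particular database driver.
    """
    issued = []
    for table in tables:
        # Identifiers are backtick-quoted; names containing a backtick are
        # rejected rather than escaped.
        if "`" in database or "`" in table:
            raise ValueError(f"unsafe identifier: {database}.{table}")
        statement = f"DROP TABLE IF EXISTS `{database}`.`{table}`"
        execute(statement)
        issued.append(statement)
        # Pause so that each drop can settle before the next one starts.
        sleep(pause_seconds)
    return issued
```

Passing `print` as `execute` gives a dry run that only shows the statements, which may be useful for review before anything is dropped.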

Krenair added a comment. Edited Oct 28 2016, 1:16 AM

SQL:

  • s3 - to backup+drop
  • external storage - no rows in here to backup AFAICT, just drop
  • CentralAuth DB - no rows for this wiki that I found
  • commonswiki.globalimagelinks - no rows for this wiki that I found

Swift - godog verified no container ever got created for this wiki (I guess the Swift migration happened after the wiki deletion)
Memc, redis - ori said no chance anything would have survived this long (did we even have redis at the time?)

Obviously there will be references in metawiki.logging but those are harmless and historical record.
I looked through pretty much every other service host I could find with grep in mediawiki-config, it turns out a lot of stuff has been developed or created since 2012.
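The inventory above could be kept as a small checklist so that the remaining work is easy to see. The sketch below is only a restatement of that list: the class name, field names and the wording of the findings are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CleanupItem:
    location: str
    finding: str
    action: str  # one of "backup+drop", "drop", "none"


# Entries paraphrase the inventory in the comment above.
CHECKLIST = [
    CleanupItem("s3", "data to back up", "backup+drop"),
    CleanupItem("external storage", "no rows to back up", "drop"),
    CleanupItem("CentralAuth DB", "no rows found for this wiki", "none"),
    CleanupItem("commonswiki.globalimagelinks", "no rows found for this wiki", "none"),
    CleanupItem("Swift", "no container was ever created", "none"),
    CleanupItem("Memc, redis", "nothing expected to have survived", "none"),
]


def pending(items: List[CleanupItem]) -> List[str]:
    """Return the locations that still need a DBA action."""
    return [item.location for item in items if item.action != "none"]


print(pending(CHECKLIST))
```

Only the two SQL locations remain once the other services are ruled out, which matches the summary above.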

jcrespo moved this task from Triage to Next on the DBA board.

It was decided at the yearly operations meeting that the project Blocked-on-Operations will no longer be used, except for clearing the still-there tickets:

https://phabricator.wikimedia.org/T142734#2669843

As you can see, the project is archived to mark it should no longer be used.

jcrespo moved this task from Next to In progress on the DBA board. Dec 15 2016, 10:26 AM
jcrespo claimed this task.
Dereckson added a comment. Edited Dec 15 2016, 11:59 AM

Thanks for taking care of that, Jaime.

@Alchimista So you can plan a little bit: today is the last deployment day of the year before the code freeze, so it is too short notice to plan this correctly, even if there is an add wiki window this evening. Once the database is purged, I'll plan a deployment window to create the wiki in the first week of January, i.e. probably Thursday 5.

@jcrespo Could you prioritize this or give a target date estimate? All other changes are ready to create this wiki. Only the DBA cleaning is remaining.

This will allow Wikimedia Portugal to make decisions about their current hosting renewal.

@Dereckson This is already prioritized: it is "Normal", and it is in the DBA column "In progress". As I have said multiple times before on this ticket, dropping tables is a dangerous process and not a fast one (I warned about that multiple times in advance, which is why I had offered easier/faster alternatives). As the decision was to go with the deletion route, it will be done when it is done (it is now in progress).

This had to be delayed many times due to the deployment freeze (https://wikitech.wikimedia.org/wiki/Deployments/Archive/2016/12), the Christmas period and my own vacation.

"Only the DBA cleaning is remaining." is a strange way to put it, when it is a multi-stage, multi-server, highly critical process that has never, ever, been tried on the production cluster before and that can bring down multiple wikis if there is the simplest of mistakes, or even leak users' private data, so it is pretty stressful. Of the 2 other people who offered help to ease it, only @Krenair had the time (thanks!) to give a hand. It cannot be done in a rush because it is far from a standard procedure.

This will NOT get done before the 17th of January.

Dereckson added a comment. Edited Jan 13 2017, 3:35 PM

@jcrespo Thanks for the update. My statement was to be read as "Apache, MediaWiki configuration and other regular tasks to create a wiki are ready." and not as "DBA operation will be trivial". Your remarks are wise, and indeed this will be a complicated process for the reasons you mentioned.

@Dereckson I also didn't mean to be hard on you. I just wanted to convey that this is not forgotten and that I am not slacking off, and to explain why it needs extra time by giving extra context. Hopefully this will not take too long anyway.

This worries me: T157636#3012112

@Dereckson: let's schedule the deletion and the recreation at the same time to avoid issues like this one.

hashar removed a subscriber: hashar. Feb 10 2017, 2:52 PM

A good moment to create a wiki is generally Thursday after the MediaWiki train, i.e. 22:00 UTC, as the site gets the current MediaWiki version and won't be version bumped, offering some days of stability at launch time to be sure any issue is purely config related, and not train related.

As this week is probably too early to plan correctly, perhaps the week after: Thursday 23 after the MediaWiki train, i.e. at 22:00?

Another possibility if you prefer earlier in the day could be Monday 27 in the EU morning.

Let's aim for the 23, if that is ok for you, but with the possibility of delaying it if we need more time for preparation/something goes wrong before that date.

Ok, works for me.

I had some things in the way, and we have some important maintenance this afternoon (T153768) that could take longer than expected. Is it ok to delay until the morning of the 27th, as I anticipated might happen? If you are ok with it, let's add it to the Deployments page when possible.

We should also warn Rel-eng in advance of this potential breakage.

demon added a comment. Feb 23 2017, 5:52 PM

We should also warn Rel-eng in advance of this potential breakage.

Releng knows :)

I am prepared for this (e.g. backups taken, and ready to recover ASAP), but I will not drop anything until just before Dereckson or anyone else starts the recreation of the actual database on the db.

jcrespo removed jcrespo as the assignee of this task.

Next step is to plan a deployment window, something we initially wanted to do this morning.

@jcrespo Shall I ask greg for a green light for a window on Tuesday at 11am CET?

Shall I ask greg for a green light for a window on Tuesday at 11am CET?

Any new deployment window on the horizon? :)

@greg-g @jcrespo I suggest the following approach: we note it at the top of
the calendar for the 27th week, so it's announced, and we can try to
schedule it Monday/Tuesday/Wednesday morning when both Jaime and I are
available.

As I said, I am ready. Send me a calendar invite or ping me well in advance so I can do the DB stuff.

Just pinging to make sure this hasn't fallen off the radar :) is there anything blocking the scheduling?

From the DBA side, we are just waiting for @Dereckson to let us know when he wants this to be done, as Jaime expressed here: T126832#3117923

Thanks for confirming your interest @waldyrious. We currently have several wikis to create and are waiting for logos for them, so my first intent was to schedule one slot for all of them. But I'm scheduling a window for pt.wikimedia right now, so we can move forward on your wiki, independently of the status of the other new projects.

Just so you know, on the 19th we have the DC switchover, so the days just before and after might be hard to get, in case we have unexpected issues with the DC.
The switch back is scheduled for the 3rd of May, so the same probably applies: getting it done the day before or after might be hard.

Cheers!

And 17 is still Easter, so next week seems difficult. Monday 24 is then the
first available deployment day afterwards. I tentatively scheduled a window
that morning.

Mentioned in SAL (#wikimedia-operations) [2017-04-24T08:55:08Z] <jynus> dropping ptwikimedia from s3 T126832

Mentioned in SAL (#wikimedia-operations) [2017-04-24T09:04:42Z] <jynus> dropping ptwikimedia from x1 T126832

Mentioned in SAL (#wikimedia-operations) [2017-04-24T09:08:34Z] <jynus> dropping ptwikimedia from es2 T126832

Mentioned in SAL (#wikimedia-operations) [2017-04-24T09:11:01Z] <jynus> dropping ptwikimedia from es3 T126832

Mentioned in SAL (#wikimedia-operations) [2017-04-24T09:14:16Z] <jynus> dropping ptwikimedia from es1012,es1016,es1018,es2011,es2012,es2013, T126832

Mentioned in SAL (#wikimedia-operations) [2017-04-24T10:00:47Z] <jynus> disabling puppet on app servers for apache config deploy T126832

Change 270479 merged by Jcrespo:
[operations/puppet@production] Apache configuration for pt.wikimedia.org

https://gerrit.wikimedia.org/r/270479

Dereckson updated the task description. Apr 24 2017, 10:32 AM

In a few minutes, pt.wikimedia.org will temporarily redirect to pt.wikipedia.org (the request is again handled by our application servers, whose entry point redirects <known language>.wikimedia.org to <known language>.wikipedia.org).

Then, around 12:00 UTC, I'll recreate the database and deploy the new pt.wikimedia configuration, and a fresh wiki hosted on the WMF cluster will be available.
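To illustrate the temporary redirect described above, here is a minimal sketch of such a fallback rule. It is not the actual entry point code. The language set and the set of configured wikis are small hypothetical samples, and the function name is invented.

```python
from typing import Optional

# Hypothetical sample of known language codes; the real list is much longer.
KNOWN_LANGUAGES = {"pt", "fr", "de", "es"}

# Hypothetical sample of hosts that already have their own wiki configured.
CONFIGURED_WIKIS = {"be.wikimedia.org"}


def fallback_redirect(host: str) -> Optional[str]:
    """Return a redirect target for <known language>.wikimedia.org, else None."""
    host = host.lower()
    if host in CONFIGURED_WIKIS:
        # A real wiki answers on this host, so no fallback applies.
        return None
    parts = host.split(".")
    if len(parts) != 3 or parts[1:] != ["wikimedia", "org"]:
        return None
    lang = parts[0]
    if lang not in KNOWN_LANGUAGES:
        return None
    return f"https://{lang}.wikipedia.org/"


print(fallback_redirect("pt.wikimedia.org"))
```

Under these assumptions, the redirect stops applying as soon as the host is added to the configured set, which corresponds to the moment the new wiki configuration is deployed.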

This comment was removed by Dereckson.

Change 314792 merged by jenkins-bot:
[operations/mediawiki-config@master] Respawn ptwikimedia configuration

https://gerrit.wikimedia.org/r/314792

Mentioned in SAL (#wikimedia-operations) [2017-04-24T11:41:02Z] <Dereckson> Recreate database for ptwikimedia (T126832)

Mentioned in SAL (#wikimedia-operations) [2017-04-24T11:42:55Z] <dereckson@naos> rebuilt wikiversions.php and synchronized wikiversions files: +pt.wikimedia (T126832)

Mentioned in SAL (#wikimedia-operations) [2017-04-24T11:48:32Z] <dereckson@naos> Synchronized wmf-config/InitialiseSettings.php: Initial configuration for pt.wikimedia (T126832)

Mentioned in SAL (#wikimedia-operations) [2017-04-24T11:50:10Z] <Dereckson> mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php ptwikimedia --backend=local-multiwrite (T126832)

@Alchimista Wiki has been created and seems to work fine, you can import.

Mentioned in SAL (#wikimedia-operations) [2017-04-24T11:55:41Z] <dereckson@naos> Synchronized multiversion/MWMultiVersion.php: Entry point for pt.wikimedia.org (T126832) (duration: 00m 44s)

Dereckson updated the task description. Apr 25 2017, 10:35 AM

Thanks everyone, much appreciated!

@Dereckson- is this a public wiki, should it be replicated to labs?

Yes, this is a public-facing wiki, to be replicated, yes (if we replicate .wikimedia.org chapter wikis too — check if bewikimedia is replicated).

We do it normally as long as it is not private (the text written there is open to the internet).

Okay, so yes, we can replicate ptwikimedia.

done, views will be created at: T164103

We do it normally as long as it is not private (the text written there is open to the internet).

Note that we did have non-public pages on our previous wiki (whose content is planned for migration between today and tomorrow), in a "Confidencial:" namespace, using the Lockdown extension.

However, that shouldn't be a problem since we won't be importing those pages, and will move them to Google Docs instead. I'm writing this here both for historical reference, and because we're open to any suggestion for alternatives to Google Docs.

waldyrious- no data from the old wiki exists live anymore (only temporarily on backups), not even on labs. Note that while content exposure follows wiki authorization rules, most content metadata is freely exposed on labs (who edited what and where). If that is not ok for the new wiki, you should file a new bug before any import is done, asking to disable access from labs (and warning at T164103).

waldyrious- no data from the old wiki exists live anymore (only temporarily on backups), not even on labs.

That's not a problem -- by "old wiki" I meant the one currently live at wikimedia.pt and whose content is being migrated to the one that was created now. The non-public parts of that content won't be migrated to the new wiki, as I mentioned above (unless there's a new way to mix public and private content on a wiki that we're unaware of). As such, there are no caveats for T164103 (from the part of the ptwikimedia wiki, at least).

The MediaWiki developers' position not to implement ACL in namespaces,
pages, groups of pages, etc. is still the current doctrine.

You took a risk with the Lockdown extension: the MediaWiki code base
exposes metadata several ways, and isn't written to play nice with ACL
extensions.

Those weren't really sensitive (in e.g. the legal sense) documents, just internal organizational stuff that didn't make sense to publish, so there's no significant risk for us (in fact, most of those documents aren't even current anymore). We already have plans to deal with that content, and were just making sure there weren't new recommendations for that use case. Thanks all for confirming.

On a slight tangent: during the import, we're taking care to avoid importing all pages indiscriminately, to reduce some of the cruft (templates, redirects, images from commons, etc.) that accumulated over the years. It would be very helpful to this effect if we could run maintenance scripts on the wiki during the import process. Would it be possible to install Extension:Maintenance? If so, let us know if you'd prefer a separate issue to track that.

No, we can't. But you can arrange a maintenance window with someone who
has Terbium access to run the scripts together.

Which scripts exactly do you want to run? Are we talking about existing scripts on the mediawiki maintenance hosts (terbium/wasat)?

@Dzahn According to
https://www.mediawiki.org/wiki/Extension:Maintenance#List_of_supported_maintenance_scripts
and the section afterwards, this is a wrapper to run the CLI MediaWiki
maintenance scripts, the same kind we run on Terbium, yes.

@waldyrious by the way, I'm not sure the scripts you want are supported by
this extension.

The scripts are the ones that feed the Special:SpecialPages maintenance reports: Dead-end pages, Orphaned pages, Uncategorized [categories | pages | files], Wanted [categories | pages | files], and some others like What links here. Also, categorized pages aren't shown in their categories. I can write a bot script to make null edits, but since it's almost 300 pages, a server-side action seems a much saner solution.

Alchimista- unless there is a configuration error, those are set up automatically for you when you are part of our production cluster. Wait 1 or 2 weeks, and if they have not been run, report it. The frequency of execution varies; the last update on other wikis was on 23 April and it hasn't been run again since.

Dereckson closed this task as Resolved. May 20 2017, 10:03 AM