
Publish recurring Flow dumps at http://dumps.wikimedia.org/
Closed, Resolved (Public)

Description

Event Timeline

Mattflaschen-WMF renamed this task from Publish Flow dumps at http://dumps.wikimedia.org/ to Publish recurring Flow dumps at http://dumps.wikimedia.org/. Nov 24 2015, 3:48 PM
Mattflaschen-WMF set Security to None.

Change 264129 had a related patch set uploaded (by Reedy):
Don't try and run dumpBackup.php if not enabled on the wiki

https://gerrit.wikimedia.org/r/264129

From discussion on IRC with Reedy, Mattflaschen:

Let's get the change https://gerrit.wikimedia.org/r/#/c/264129/ merged in.

It looks like the easiest way for me to decide whether to add the Flow dump job to the regular dump runs is to check, for any given wiki, whether wmgUseFlow is set. Otherwise I have to hardcode some logic about private wikis etc.
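A minimal sketch of that check, assuming the per-wiki settings are already available to the dump scheduler as a plain dict. The function names, the job names and the way the settings are loaded are all hypothetical; only wmgUseFlow comes from the discussion above.

```python
def should_dump_flow(wiki_settings):
    """Return True if the Flow dump job should be added for this wiki.

    wiki_settings is assumed to be a dict of that wiki's configuration
    values; a missing wmgUseFlow key is treated as "Flow not enabled".
    """
    return bool(wiki_settings.get("wmgUseFlow", False))


def jobs_for_wiki(wiki_settings, regular_jobs):
    """Return the regular dump jobs, plus a Flow job when Flow is enabled."""
    jobs = list(regular_jobs)  # copy so the caller's list is not modified
    if should_dump_flow(wiki_settings):
        jobs.append("flow")  # hypothetical job name
    return jobs
```

Keying on the setting means private wikis and wikis without Flow need no special-casing in the scheduler: they simply never get the extra job.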

Change 264129 merged by jenkins-bot:
Don't try and run dumpBackup.php if not enabled on the wiki

https://gerrit.wikimedia.org/r/264129

Nemo_bis raised the priority of this task from Medium to High. Jan 19 2016, 7:55 AM
Nemo_bis subscribed.

Fix priority per blocked task.

I'm trying to run the flow maintenance script from the command line on an actual snapshot host, thusly:

php5 /srv/mediawiki/multiversion/MWScript.php extensions/Flow/maintenance/dumpBackup.php --wiki=elwiki --full | more

and I'm getting the usage message "This script dumps the Flow discussion database" etc. Can you folks help me out here?

> I'm trying to run the flow maintenance script from the command line on an actual snapshot host, thusly:
>
> php5 /srv/mediawiki/multiversion/MWScript.php extensions/Flow/maintenance/dumpBackup.php --wiki=elwiki --full | more
>
> and I'm getting the usage message "This script dumps the Flow discussion database" etc. Can you folks help me out here?

This was broken by rMW0a0b02b56c2c: Add support for specifying options multiple times in Maintenance scripts.

I started reviewing https://gerrit.wikimedia.org/r/#/c/275519/ ("Turn dumpBackup into proper Maintenance script") which will fix this. Also, I'm going to review https://gerrit.wikimedia.org/r/#/c/276404/ which is a general fix.

Can you retry when https://gerrit.wikimedia.org/r/#/c/275519/ is merged?

That looks better. I need to see if all the options necessary for dumps are there (output files etc.) but I can get working on the Python wrapper now.
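For illustration only, a rough sketch of what such a wrapper could look like. It reuses the exact command line from the earlier comment; the function names are hypothetical, and capturing the script's standard output into a file is an assumption, since the thread does not say which output options the script offers.

```python
import subprocess

# Paths and flags copied from the command quoted earlier in this task.
PHP = "php5"
MWSCRIPT = "/srv/mediawiki/multiversion/MWScript.php"
FLOW_DUMP_SCRIPT = "extensions/Flow/maintenance/dumpBackup.php"


def build_flow_dump_command(wiki):
    """Build the argument list for a full Flow dump of one wiki."""
    return [PHP, MWSCRIPT, FLOW_DUMP_SCRIPT, "--wiki=" + wiki, "--full"]


def run_flow_dump(wiki, output_path):
    """Run the dump and write its stdout to output_path.

    Assumption: the XML is written to stdout (the quoted command pipes
    it to a pager). Returns the exit code of the PHP process.
    """
    with open(output_path, "wb") as out:
        return subprocess.call(build_flow_dump_command(wiki), stdout=out)
```

Passing the command as a list rather than a shell string avoids quoting problems with wiki names and paths.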

Uh, this is done, insofar as they show up on dumps.wm.org as files to be downloaded just like everything else, for every dump run. Can we close this out?

> Uh, this is done, insofar as they show up on dumps.wm.org as files to be downloaded just like everything else, for every dump run. Can we close this out?

https://dumps.wikimedia.org/wikidatawiki/20160601/ has no Flow data.

https://dumps.wikimedia.org/rswikimedia/20160601/ has "2016-06-14 23:22:52 failed content of flow pages in xml format" (perhaps because there are no Flow pages yet https://rs.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=2600 )

It's marked as in progress:

"This dump is in progress; see also the previous dump from 2016-05-01".

But I'm not sure if it should show the Flow entry as pending, etc. T89398: Add Flow to database dumps mentions, "I push out a fix to get the right list of flow-enabled wikis" which should take care of this if it's missing.

> https://dumps.wikimedia.org/rswikimedia/20160601/ has "2016-06-14 23:22:52 failed content of flow pages in xml format" (perhaps because there are no Flow pages yet https://rs.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=2600 )

Flow is not enabled there at all. This is also discussed at T89398: Add Flow to database dumps.

> Uh, this is done, insofar as they show up on dumps.wm.org as files to be downloaded just like everything else, for every dump run. Can we close this out?

Yeah, it looks like these remaining small issues are tracked at T89398.