Publish recurring Flow dumps at http://dumps.wikimedia.org/
Closed, ResolvedPublic

Description

Event Timeline

Restricted Application added subscribers: StudiesWorld, Aklapper. Nov 24 2015, 3:47 PM
Mattflaschen-WMF renamed this task from Publish Flow dumps at http://dumps.wikimedia.org/ to Publish recurring Flow dumps at http://dumps.wikimedia.org/. Nov 24 2015, 3:48 PM
Mattflaschen-WMF set Security to None.
Catrope triaged this task as Normal priority. Dec 19 2015, 1:32 AM

Change 264129 had a related patch set uploaded (by Reedy):
Don't try and run dumpBackup.php if not enabled on the wiki

https://gerrit.wikimedia.org/r/264129

From discussion on IRC with Reedy, Mattflaschen:

Let's get the change https://gerrit.wikimedia.org/r/#/c/264129/ merged in.

It looks like the easiest way for me to decide whether to add the Flow dump job to a wiki's regular dump run is to check whether wmgUseFlow is set for that wiki; otherwise I have to hardcode some logic about private wikis etc.
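A minimal sketch of that per-wiki check, assuming the settings have already been loaded into a plain mapping. The function name, the mapping shape, and the wiki names are hypothetical, and how wmgUseFlow is actually read from the configuration is not covered here:

```python
from typing import List, Mapping


def wikis_needing_flow_dump(
    settings_by_wiki: Mapping[str, Mapping[str, object]]
) -> List[str]:
    """Return the wikis whose regular dump run should include the Flow job.

    A wiki qualifies only if wmgUseFlow is present and truthy, so no
    separate rule for private wikis or other special cases is needed.
    """
    return sorted(
        wiki
        for wiki, settings in settings_by_wiki.items()
        if bool(settings.get("wmgUseFlow"))
    )


# Illustrative values only; these are not real wikis or real configuration.
example_settings = {
    "wikiA": {"wmgUseFlow": True},
    "wikiB": {"wmgUseFlow": False},
    "wikiC": {},  # setting absent, treated as disabled
}

print(wikis_needing_flow_dump(example_settings))
```

Treating a missing setting the same as a false one keeps the Flow job off any wiki that never mentions Flow in its configuration.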

Change 264129 merged by jenkins-bot:
Don't try and run dumpBackup.php if not enabled on the wiki

https://gerrit.wikimedia.org/r/264129

Nemo_bis raised the priority of this task from Normal to High.
Nemo_bis added a subscriber: Nemo_bis.

Fix priority per blocked task.

Hydriz added a subscriber: Hydriz.
ArielGlenn moved this task from Backlog to Up Next on the Dumps-Generation board. Jan 20 2016, 2:15 PM

I'm trying to run the flow maintenance script from the command line on an actual snapshot host, thusly:

php5 /srv/mediawiki/multiversion/MWScript.php extensions/Flow/maintenance/dumpBackup.php --wiki=elwiki --full | more

and I'm getting the usage message "This script dumps the Flow discussion database" etc. Can you folks help me out here?

This was broken by rMW0a0b02b56c2c ("Add support for specifying options multiple times in Maintenance scripts").

I started reviewing https://gerrit.wikimedia.org/r/#/c/275519/ ("Turn dumpBackup into proper Maintenance script") which will fix this. Also, I'm going to review https://gerrit.wikimedia.org/r/#/c/276404/ which is a general fix.

Can you retry when https://gerrit.wikimedia.org/r/#/c/275519/ is merged?

That looks better. I need to see if all the options necessary for dumps are there (output files etc.) but I can get working on the Python wrapper now.
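For illustration, a rough sketch of what such a wrapper could look like. It reuses only the command line quoted earlier in this task; the function names and the output path handling are hypothetical, and since the script's output-file options are still unconfirmed, the sketch simply redirects stdout to a file:

```python
import subprocess
from typing import List

# Paths and flags taken from the command quoted earlier in this task.
MWSCRIPT = "/srv/mediawiki/multiversion/MWScript.php"
FLOW_DUMP_SCRIPT = "extensions/Flow/maintenance/dumpBackup.php"


def build_flow_dump_command(wiki: str, php: str = "php5") -> List[str]:
    """Build the argument list for a full Flow dump of one wiki."""
    return [php, MWSCRIPT, FLOW_DUMP_SCRIPT, f"--wiki={wiki}", "--full"]


def run_flow_dump(wiki: str, output_path: str) -> int:
    """Run the dump, writing its stdout to output_path (hypothetical helper).

    Returns the process exit code so the caller can mark the job as
    failed or done.
    """
    with open(output_path, "wb") as out:
        result = subprocess.run(
            build_flow_dump_command(wiki), stdout=out, check=False
        )
    return result.returncode


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(build_flow_dump_command("elwiki")))
```

Passing the command as a list avoids shell quoting issues, and returning the exit code rather than raising leaves it to the dump runner to decide how a failure is recorded.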

ArielGlenn moved this task from Up Next to Active on the Dumps-Generation board. Mar 18 2016, 11:14 PM
ArielGlenn moved this task from Active to Up Next on the Dumps-Generation board. Mar 29 2016, 1:10 PM

Uh, this is done, insofar as they show up on dumps.wm.org as files to be downloaded just like everything else, for every dump run. Can we close this out?

https://dumps.wikimedia.org/wikidatawiki/20160601/ has no Flow data.

https://dumps.wikimedia.org/rswikimedia/20160601/ has "2016-06-14 23:22:52 failed content of flow pages in xml format" (perhaps because there are no Flow pages yet https://rs.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=2600 )

Mattflaschen-WMF closed this task as Resolved. Jun 16 2016, 5:44 PM

The wikidatawiki dump is marked as in progress:

"This dump is in progress; see also the previous dump from 2016-05-01".

But I'm not sure if it should show the Flow entry as pending, etc. T89398: Add Flow to database dumps mentions, "I push out a fix to get the right list of flow-enabled wikis" which should take care of this if it's missing.

Flow is not enabled on rswikimedia at all. This is also discussed at T89398: Add Flow to database dumps.

Yeah, it looks like these remaining small issues are tracked at T89398.

Hydriz moved this task from Incoming to Done on the Datasets-Archiving board. Jul 3 2016, 10:27 AM
ArielGlenn moved this task from Active to Done on the Dumps-Generation board. Jul 12 2016, 6:36 AM