ArielGlenn (ariel)
User

User Details

User Since
Oct 8 2014, 7:09 PM (189 w, 4 d)
Availability
Available
IRC Nick
apergos
LDAP User
ArielGlenn
MediaWiki User
ArielGlenn

Recent Activity

Sat, May 26

ArielGlenn committed R1885:1f410185a500: add snapshot1008 to dumps scap targets (authored by ArielGlenn).
add snapshot1008 to dumps scap targets
Sat, May 26, 6:04 PM
ArielGlenn added a comment to T195560: Request For Ownership for repo labs-tools-Commons-twitter-bot.

@ArielGlenn: Sorry, that was not intentional. We ran into T96464 here. :(

Sat, May 26, 3:27 PM · Repository-Ownership-Requests
ArielGlenn added a comment to T181936: Give misc dump crons their own host.

@hoo: wikidata weeklies now run on snapshot1008. Do not try to run them on snapshot1007!

Sat, May 26, 3:19 PM · Patch-For-Review, hardware-requests, Operations, Datasets-General-or-Unknown, Dumps-Generation
ArielGlenn added a comment to T181936: Give misc dump crons their own host.

To do:

Sat, May 26, 3:16 PM · Patch-For-Review, hardware-requests, Operations, Datasets-General-or-Unknown, Dumps-Generation
ArielGlenn closed T195385: Rack and setup snapshot1008, a subtask of T181936: Give misc dump crons their own host, as Resolved.
Sat, May 26, 3:13 PM · Patch-For-Review, hardware-requests, Operations, Datasets-General-or-Unknown, Dumps-Generation
ArielGlenn closed T195385: Rack and setup snapshot1008 as Resolved.
Sat, May 26, 3:13 PM · Operations, Datasets-General-or-Unknown, Dumps-Generation
ArielGlenn added a comment to T195385: Rack and setup snapshot1008.

Puppetization is done, closing this in favor of T181936 if there are any issues.

Sat, May 26, 3:12 PM · Operations, Datasets-General-or-Unknown, Dumps-Generation

Fri, May 25

ArielGlenn added a comment to T195560: Request For Ownership for repo labs-tools-Commons-twitter-bot.

Hey @Aklapper, please don't use that plain ariel user; it was made for the CoC NDA only (required non-WMF email). Even as of today it says in parens DO NOT USE. The right account is this one. Thanks!

Fri, May 25, 10:50 PM · Repository-Ownership-Requests
ArielGlenn updated subscribers of T195358: Add WMDE-leszek to the ldap/nda group.

Hi Leszek, I think there's a separate NDA you sign for this access, not the L2 one.

Fri, May 25, 2:07 PM · LDAP-Access-Requests

Thu, May 24

ArielGlenn added a comment to T194027: Creating a Facebook Messenger Bot for Wikipedia.

Welp. You're using the restbase stuff so the above isn't valid for it, my bad.

Thu, May 24, 8:25 AM · Possible-Tech-Projects
ArielGlenn added a comment to T194027: Creating a Facebook Messenger Bot for Wikipedia.

I like the app so far, though I haven't tested it out yet. One thing that might be nice is to allow users to configure it for Wikipedias in other languages; I suppose for now you would want to stick to the top ten or so, if they have similar featured feeds.

Thu, May 24, 7:53 AM · Possible-Tech-Projects

Wed, May 23

ArielGlenn moved T181936: Give misc dump crons their own host from Blocked/Stalled/Waiting for event to Active on the Dumps-Generation board.
Wed, May 23, 3:12 PM · Patch-For-Review, hardware-requests, Operations, Datasets-General-or-Unknown, Dumps-Generation

Tue, May 22

ArielGlenn added a comment to T195289: Add Addshore & possibly other WMDE devs/deployers to the wikidata icinga contact list.

Before I add @aude and @Ladsgroup let's make sure they want to be added, I suppose.

Tue, May 22, 11:24 AM · Patch-For-Review, User-Addshore, Wikidata, Operations, monitoring
ArielGlenn moved T184446: Configure Toolforge replica views and dumps for the new MCR tables from Backlog to Done on the Dumps-Generation board.
Tue, May 22, 10:39 AM · Patch-For-Review, Dumps-Generation, Data-Services, DBA, MediaWiki-Platform-Team
ArielGlenn added a comment to T172165: Require either PHP 7.0+ or HHVM in MW 1.31.

As mentioned a couple of days ago, some jobs on terbium are still running with php5; those are the jobs that run via foreachwiki as www-data, which calls foreachwikiindblist.
A list of those jobs is here: P7139

Tue, May 22, 9:48 AM · MW-1.32-release-notes (WMF-deploy-2018-05-29 (1.32.0-wmf.6)), MW-1.31-release-notes, MW-1.31-release, TechCom-RFC (TechCom-Approved), MediaWiki-General-or-Unknown
ArielGlenn added a comment to T195224: Add Tonina Zhelyazkova (Tonina Zhelyazkova) to the ldap/nda group.

@RStallman-legalteam Great! Thanks for that.
@Tonina_Zhelyazkova_WMDE you have been added to the nda group. You should be all set.

Tue, May 22, 8:03 AM · LDAP-Access-Requests

Mon, May 21

ArielGlenn reopened T173050: Investigate icinga (einsteinium) load as "Open".

I'm going to re-open this; https://grafana.wikimedia.org/dashboard/db/prometheus-machine-stats?panelId=9&fullscreen&orgId=1&var-server=einsteinium&var-datasource=eqiad%20prometheus%2Fops&from=1524151895118&to=1526903931086 shows the load has gone up significantly since May 2nd. Maybe folks still think this is well in the bounds of ok, in which case, feel free to close again.

Mon, May 21, 12:04 PM · Patch-For-Review, monitoring
ArielGlenn added a comment to T179059: Consider skipping or modifying recombine step for page content dumps for wikidata.

That's good news about lbzip2. Should I set you up with a test directory so you can run dump steps using it and see how that is compared to bzip2?

Mon, May 21, 11:38 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T195223: Add Christoph Jauera (WMDE-Fisch) to the ldap/nda group from Backlog to Awaiting User Input on the LDAP-Access-Requests board.
Mon, May 21, 10:27 AM · LDAP-Access-Requests
ArielGlenn moved T195224: Add Tonina Zhelyazkova (Tonina Zhelyazkova) to the ldap/nda group from Backlog to Awaiting User Input on the LDAP-Access-Requests board.
Mon, May 21, 10:26 AM · LDAP-Access-Requests
ArielGlenn updated subscribers of T195223: Add Christoph Jauera (WMDE-Fisch) to the ldap/nda group.

@RStallman-legalteam Can you get them squared away with signing the appropriate NDA? Thanks!
@WMDE-Fisch Please let Rachel know your email address so she can invite you to sign the NDA.

Mon, May 21, 10:26 AM · LDAP-Access-Requests
ArielGlenn updated subscribers of T195224: Add Tonina Zhelyazkova (Tonina Zhelyazkova) to the ldap/nda group.

@RStallman-legalteam Can you get them squared away with signing the appropriate NDA? Thanks!
@Tonina_Zhelyazkova_WMDE Please let Rachel know your email address so she can invite you to sign the NDA.

Mon, May 21, 10:26 AM · LDAP-Access-Requests
ArielGlenn triaged T194669: Provide a mean to mass discard/reject subscription requests on Wikimedia mailing lists as Normal priority.
Mon, May 21, 10:17 AM · Wikimedia-Mailing-lists, Operations
ArielGlenn triaged T194997: Track more detailed disk usage on maps servers as Normal priority.
Mon, May 21, 10:14 AM · Operations, Discovery, Maps
ArielGlenn triaged T195059: Cannot add or update records under DNS zones in Horizon as Normal priority.
Mon, May 21, 10:13 AM · Operations, Cloud-VPS

Sun, May 20

ArielGlenn created P7139 cron jobs with foreachwiki on terbium (www-data).
Sun, May 20, 11:19 AM
ArielGlenn moved T194124: Allow use of separate directory for storing/reading from prefetch files from Active to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Sun, May 20, 9:45 AM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T194124: Allow use of separate directory for storing/reading from prefetch files.

Keeping this open until tomorrow morning, when the first job that should preserve partial dumps will run.

Sun, May 20, 9:45 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T181029: Upgrade dump hosts to stretch with php7 from Active to Done on the Dumps-Generation board.
Sun, May 20, 9:17 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown
ArielGlenn closed T181029: Upgrade dump hosts to stretch with php7, a subtask of T172165: Require either PHP 7.0+ or HHVM in MW 1.31, as Resolved.
Sun, May 20, 9:16 AM · MW-1.32-release-notes (WMF-deploy-2018-05-29 (1.32.0-wmf.6)), MW-1.31-release-notes, MW-1.31-release, TechCom-RFC (TechCom-Approved), MediaWiki-General-or-Unknown
ArielGlenn closed T181029: Upgrade dump hosts to stretch with php7 as Resolved.

This task is finally complete!

Sun, May 20, 9:16 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown

Sat, May 19

ArielGlenn added a comment to T195028: Add goat import and export capability.

We'll need to settle on the xml schema update. Current schema here: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/master/docs/export-0.10.xsd

Sat, May 19, 7:04 PM · Goatification, Wikimedia-Hackathon-2018
Sjoerddebruin awarded T195028: Add goat import and export capability a Goat token.
Sat, May 19, 2:01 PM · Goatification, Wikimedia-Hackathon-2018
Bmueller awarded T195028: Add goat import and export capability a Goat token.
Sat, May 19, 5:24 AM · Goatification, Wikimedia-Hackathon-2018
ArielGlenn triaged T195028: Add goat import and export capability as Normal priority.
Sat, May 19, 1:21 AM · Goatification, Wikimedia-Hackathon-2018

Fri, May 18

ArielGlenn added a comment to T164262: Make flow dumps run faster.

Yesterday's run of 3 hours went even better today: 1.5 hours for batches of 200 workflows each. I tried that same batchsize on snapshot1007 (php5) and it slowed to a crawl on the 6th batch, the behavior we've seen all this time in the flow history dumps. I'm trying a full run with no batching on the php7/stretch host now to see if that's magically better also.

Fri, May 18, 3:14 PM · MW-1.32-release-notes (WMF-deploy-2018-05-22 (1.32.0-wmf.5)), User-ArielGlenn, MW-1.30-release-notes (WMF-deploy-2017-07-11_(1.30.0-wmf.9)), Patch-For-Review, StructuredDiscussions, Dumps-Generation, Collaboration-Team-Triage
ArielGlenn added a comment to T172025: Flow isAllowed gets actual revision text before it is needed.

Going to move the above discussion to the parent task ("make flow run faster").

Fri, May 18, 3:11 PM · Patch-For-Review, Collaboration-Team-Triage (Collab-Team-This-Quarter), StructuredDiscussions

Thu, May 17

ArielGlenn added a comment to T172025: Flow isAllowed gets actual revision text before it is needed.

I've hacked together a script to do this month's flow history run in small batches of 50 boards at a time; it ran in three hours. Admittedly, I ran this on a php7/stretch host; I'll try tomorrow without the batching and see if there's the same slowdown I saw on the other box with php5 on it.

Thu, May 17, 10:37 PM · Patch-For-Review, Collaboration-Team-Triage (Collab-Team-This-Quarter), StructuredDiscussions
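
The batching idea in the comment above can be illustrated with a minimal sketch. The actual script and dump invocation are not shown in the source, so everything here is an assumption: `run_in_batches` is a hypothetical helper, the input is assumed to be a file with one board id per line, and the command to run per batch is supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: run a command once per batch of ids.
# Assumptions (not from the source): ids are listed one per line in a
# file, and the per-batch command accepts the ids as trailing arguments.

run_in_batches() {
    # $1: batch size, $2: file with one id per line, remaining: command
    local size=$1 list=$2
    shift 2
    # xargs -n passes at most $size ids to each invocation of the command
    xargs -n "$size" "$@" < "$list"
}
```

For example, `run_in_batches 50 boards.txt echo` would print the ids 50 per line; replacing `echo` with a real per-batch dump command is left to the reader, since that command is not given here.
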
ArielGlenn added a comment to T172025: Flow isAllowed gets actual revision text before it is needed.

This month's run is even slower than usual. To be specific, it's taken over 8 days and is still not complete. While monkeying around looking for ways to get it done quicker, I found the following interesting thing, from the flow current dump (not the history dump):

<board id="rl7iby6wgksbpfno" title="Project_talk:Sandbox/Structured_Discussions_test">
<board id="u0kgwe453ib4ons4" title="Project_talk:Sandbox/Structured_Discussions_test">

Doesn't the title correspond to a page title? How then can this be listed with two board ids?

Thu, May 17, 10:33 PM · Patch-For-Review, Collaboration-Team-Triage (Collab-Team-This-Quarter), StructuredDiscussions
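
A check for the situation described above (one title listed under two board ids) might look like the following sketch. It assumes the dump has been decompressed and that each `<board ...>` element carries `id` before `title`, as in the two lines quoted in the comment; `dup_board_titles` is a hypothetical helper name, and this is pattern matching, not real XML parsing.

```shell
#!/usr/bin/env bash
# Sketch only: print board titles that occur with more than one board id.
# Assumes attributes appear as <board id="..." title="..."> like the
# excerpt in the comment; not a general XML parser.

dup_board_titles() {
    # $1: path to an uncompressed flow current dump file
    grep -o '<board id="[^"]*" title="[^"]*"' "$1" \
        | awk -F'"' '{ print $4 "\t" $2 }' \
        | LC_ALL=C sort -u \
        | cut -f1 \
        | uniq -d
}
```

The `sort -u` step drops repeated (title, id) pairs first, so a title is reported only when it is paired with at least two distinct ids.
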

Tue, May 15

ArielGlenn moved T193626: Alerts (subscription) when database dumps are completed from Backlog to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Tue, May 15, 1:43 PM · Dumps-Generation

Mon, May 14

ArielGlenn added a comment to T181029: Upgrade dump hosts to stretch with php7.

Currently waiting for the following dumps to complete: commonswiki, dewiki, frwiki, ruwiki, and the dreaded mediawikiwiki flow history dumps. I estimate three days for all of these.

Mon, May 14, 8:45 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown
ArielGlenn moved T181029: Upgrade dump hosts to stretch with php7 from Blocked/Stalled/Waiting for event to Active on the Dumps-Generation board.
Mon, May 14, 8:42 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown

Thu, May 10

ArielGlenn added a comment to T181936: Give misc dump crons their own host.

In theory the new host is arriving today, and if all goes well it should be available for getting its puppet role by early next week. We can probably use raid1-lvm-ext4-srv.cfg for it (can't use the one that snapshot1005-7 have because they are HP with hw raid).

Thu, May 10, 12:22 PM · Patch-For-Review, hardware-requests, Operations, Datasets-General-or-Unknown, Dumps-Generation
ArielGlenn moved T161509: Test php7 on snapshot1001 for dumps from Active to Done on the Dumps-Generation board.
Thu, May 10, 6:09 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch from Active to Done on the Dumps-Generation board.
Thu, May 10, 6:09 AM · Dumps-Generation
ArielGlenn added a comment to T181029: Upgrade dump hosts to stretch with php7.

The last host to upgrade, snapshot1007, can be done when the current dump run completes, or at least the jobs on that host complete. The misc cron jobs host, see T190112, is due to arrive today and with any luck will be available for install early next week, at which point cron jobs could be moved there.

Thu, May 10, 6:08 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown
ArielGlenn closed T161509: Test php7 on snapshot1001 for dumps, a subtask of T181029: Upgrade dump hosts to stretch with php7, as Resolved.
Thu, May 10, 6:05 AM · Patch-For-Review, Dumps-Generation, MediaWiki-General-or-Unknown
ArielGlenn closed T161509: Test php7 on snapshot1001 for dumps as Resolved.

Welp. This is done. How interesting.

Thu, May 10, 6:05 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch as Resolved.

Looks like this task is done. Huh!

Thu, May 10, 6:05 AM · Dumps-Generation
ArielGlenn closed T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch, a subtask of T161509: Test php7 on snapshot1001 for dumps, as Resolved.
Thu, May 10, 6:05 AM · Patch-For-Review, Dumps-Generation

Wed, May 9

ArielGlenn added a comment to T194124: Allow use of separate directory for storing/reading from prefetch files.

I'm getting ready to abandon those above patchsets; what we really want is a nice way to keep just the prefetch files from some old dumps (plus the status file), instead of having to have a separate directory and copy/move them in and go look them up there etc.

Wed, May 9, 6:09 PM · Patch-For-Review, Dumps-Generation
ArielGlenn updated subscribers of T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

@dcausse has kindly agreed to try importing the cirrus search indices resulting from the test dumps, or some of them, as another test that they are good.

Wed, May 9, 11:35 AM · Dumps-Generation

Tue, May 8

ArielGlenn added a comment to T193626: Alerts (subscription) when database dumps are completed.

Great! You have the link for them? They are at e.g. https://dumps.wikimedia.org/enwiktionary/latest/ (change the wiki name accordingly).

Tue, May 8, 7:14 PM · Dumps-Generation
ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

All outputs look good. In case we get more expert eyes to look at these, they are available on snapshot1001 in /mnt/dumpsdata2/crontests in various subdirectories.

Tue, May 8, 9:10 AM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Tue, May 8, 9:07 AM · Dumps-Generation
ArielGlenn moved T194124: Allow use of separate directory for storing/reading from prefetch files from Backlog to Active on the Dumps-Generation board.

There are a couple of very drafty changesets on this already, see https://gerrit.wikimedia.org/r/#/c/423241/ and https://gerrit.wikimedia.org/r/#/c/423242/ The approach outlined in them may change completely.

Tue, May 8, 7:58 AM · Patch-For-Review, Dumps-Generation
ArielGlenn triaged T194124: Allow use of separate directory for storing/reading from prefetch files as Normal priority.
Tue, May 8, 7:57 AM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T193626: Alerts (subscription) when database dumps are completed.

There are rss feeds per dump output file; do you want notification when all dump output files are done, in addition?

Tue, May 8, 7:50 AM · Dumps-Generation
ArielGlenn moved T182348: dcatap.rdf in dumps contains invalid data from Backlog to Done on the Dumps-Generation board.
Tue, May 8, 7:49 AM · Dumps-Generation, Wikidata
ArielGlenn moved T190457: Include checksums in https://dumps.wikimedia.org/wikidatawiki/entities/ from Backlog to Done on the Dumps-Generation board.
Tue, May 8, 7:49 AM · Wikidata-Ministry-Of-Magic, Dumps-Generation, Wikidata
ArielGlenn moved T184258: get a snapshot instance running in beta with stretch, php7 from Active to Done on the Dumps-Generation board.
Tue, May 8, 7:49 AM · Dumps-Generation, MediaWiki-General-or-Unknown
ArielGlenn moved T182540: get datset1001, ms1001 ready for decommission from Active to Done on the Dumps-Generation board.
Tue, May 8, 7:49 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T191177: data retrieval/write issues via NFS on dumpsdata1001, impacting some dump jobs from Active to Done on the Dumps-Generation board.
Tue, May 8, 7:41 AM · Patch-For-Review, Dumps-Generation, Operations
ArielGlenn closed T191177: data retrieval/write issues via NFS on dumpsdata1001, impacting some dump jobs as Resolved.

This month's run looks good, no nulls in stub files, no other weirdness either so I'm going to close this.

Tue, May 8, 7:41 AM · Patch-For-Review, Dumps-Generation, Operations
ArielGlenn moved T178047: Investigate why wikidata abstracts dumps are so large, see if we can reduce the size somehow. from Active to Done on the Dumps-Generation board.
Tue, May 8, 7:31 AM · MW-1.32-release-notes (WMF-deploy-2018-05-01 (1.32.0-wmf.2)), MediaWiki-extensions-WikibaseRepository, Wikidata, Dumps-Generation
ArielGlenn renamed T178047: Investigate why wikidata abstracts dumps are so large, see if we can reduce the size somehow. from Investigate why wikidata abstracts dumps are so large to Investigate why wikidata abstracts dumps are so large, see if we can reduce the size somehow..
Tue, May 8, 7:31 AM · MW-1.32-release-notes (WMF-deploy-2018-05-01 (1.32.0-wmf.2)), MediaWiki-extensions-WikibaseRepository, Wikidata, Dumps-Generation
ArielGlenn closed T178047: Investigate why wikidata abstracts dumps are so large, see if we can reduce the size somehow. as Resolved.

I've checked some output files from this month's run and they look good! Closing.

Tue, May 8, 7:31 AM · MW-1.32-release-notes (WMF-deploy-2018-05-01 (1.32.0-wmf.2)), MediaWiki-extensions-WikibaseRepository, Wikidata, Dumps-Generation
ArielGlenn moved T186099: Rethink 12± hour lag of incremental dumps for Wikidata from Blocked/Stalled/Waiting for event to Done on the Dumps-Generation board.
Tue, May 8, 7:29 AM · Dumps-Generation, Wikidata.org, Wikidata
ArielGlenn closed T186099: Rethink 12± hour lag of incremental dumps for Wikidata as Declined.

I'm going to go ahead and decline this. If there is new information to take into account at a later date, it can be re-opened or a new ticket created.

Tue, May 8, 7:29 AM · Dumps-Generation, Wikidata.org, Wikidata
ArielGlenn moved T189527: dumps.wikimedia.org/enwiki/latest/ out of date files from Blocked/Stalled/Waiting for event to Done on the Dumps-Generation board.
Tue, May 8, 7:28 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T189527: dumps.wikimedia.org/enwiki/latest/ out of date files as Resolved.

I'm going to go ahead and close this for now.

Tue, May 8, 7:27 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T29112: Select of revisions for stub history files does not explicitly order revisions from Active to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Tue, May 8, 7:26 AM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown

Mon, May 7

ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

Truthy nt dumps look good; next up is to look closely at the ttl ones.

Mon, May 7, 8:07 PM · Dumps-Generation
ArielGlenn created T194060: decommission dataset1001, ms1001.
Mon, May 7, 4:59 PM · User-ArielGlenn, decommission
ArielGlenn closed T182540: get datset1001, ms1001 ready for decommission as Resolved.
Mon, May 7, 4:53 PM · Patch-For-Review, Dumps-Generation
ArielGlenn updated the task description for T182540: get datset1001, ms1001 ready for decommission.
Mon, May 7, 4:53 PM · Patch-For-Review, Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Mon, May 7, 1:20 PM · Dumps-Generation
ArielGlenn updated the task description for T182540: get datset1001, ms1001 ready for decommission.
Mon, May 7, 10:52 AM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T29112: Select of revisions for stub history files does not explicitly order revisions.

Tested:

for wikiname in $list; do
    /usr/bin/php7.0 /srv/mediawiki/multiversion/MWScript.php dumpBackup.php --wiki=$wikiname --full --stub --report=1000 --output=file:"/mnt/dumpsdata2/crontest/temp/${wikiname}-stubs-stuff" --orderrevs --start=1 --end=10
done
Mon, May 7, 10:13 AM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown
ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

I have run a subset of the wikidata json and rdf dumps; the validator checks out for the json dumps. I'll do some inspection of the content as compared to production dumps shortly.

Mon, May 7, 8:09 AM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Mon, May 7, 8:06 AM · Dumps-Generation

Fri, May 4

ArielGlenn added a comment to T29112: Select of revisions for stub history files does not explicitly order revisions.

Made a custom dblist file with the above wikis, ran the below and grepped the output files:

Fri, May 4, 12:44 PM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown
ArielGlenn added a comment to T29112: Select of revisions for stub history files does not explicitly order revisions.

List of wikis with more than 10 million revs and low prefetch rates below. Need to make sure the indices on these wikis support the ordered query before changing the config.

Fri, May 4, 11:36 AM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown
ArielGlenn moved T29112: Select of revisions for stub history files does not explicitly order revisions from Backlog to Active on the Dumps-Generation board.
Fri, May 4, 10:48 AM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown
ArielGlenn added a project to T29112: Select of revisions for stub history files does not explicitly order revisions: Dumps-Generation.

I need to go through the logs for the page content dumps for the last run and see which projects have low prefetch ratios; these would benefit from applying this change, if possible.

Fri, May 4, 10:47 AM · Dumps-Generation, User-ArielGlenn, MW-1.28-release-notes, MW-1.28-release (WMF-deploy-2016-06-21_(1.28.0-wmf.7)), Patch-For-Review, DBA, Datasets-General-or-Unknown
ArielGlenn added a comment to T47646: Create -latest alias for dumps.

This was about the copies of dumps rsynced to labs; now those are made available directly from the web server, which should have the appropriate -latest links there too. Are folks still seeing a problem?

Fri, May 4, 10:42 AM · Dumps-Generation, Cloud-Services, Cloud-VPS
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Fri, May 4, 10:19 AM · Dumps-Generation

Thu, May 3

ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

A locally hacked version of the content translation dump script is running now, writing to a scratch area; it should complete in less than 12 hours, a few hours before the real (production) run starts.

Thu, May 3, 5:44 PM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Thu, May 3, 5:30 PM · Dumps-Generation

Wed, May 2

ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

For anyone following along, I have a locally hacked copy of the cirrus dumps script which is running on the wikis mentioned above, writing to a scratch area; it will probably take a couple days to finish but I should be able to examine the first outputs tomorrow.

Wed, May 2, 5:07 PM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Wed, May 2, 2:16 PM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Wed, May 2, 1:08 PM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Wed, May 2, 9:13 AM · Dumps-Generation
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Wed, May 2, 7:38 AM · Dumps-Generation
ArielGlenn added a comment to T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.

My testing generally consists of running the job under php7 with output to a scratch directory and comparing the output against the most recent dump run under php5. In some cases I run the job only on selected wikis (commons, elwiki, enwiki, wikidatawiki, zhwiki) or in other ways restrict the job to smaller subsets so that we don't wait days and days doing it. I won't list specific results on this ticket unless there are issues, but rather just tick the checkbox as each test is completed.

Wed, May 2, 6:46 AM · Dumps-Generation
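
A first-pass version of the comparison described above could be sketched as follows. The directory layout is an assumption (one flat directory of output files per run, with matching file names), and `compare_dump_dirs` is a hypothetical helper; since production and test runs see different data, it only checks that each reference file has a non-empty counterpart and prints both sizes for manual inspection, rather than comparing bytes.

```shell
#!/usr/bin/env bash
# Sketch only: for every file in a reference (php5) output directory,
# check that the scratch (php7) directory has a non-empty file with the
# same name, and print both sizes in bytes for a human to eyeball.
# Flat directories with matching file names are an assumption.

compare_dump_dirs() {
    # $1: reference directory, $2: scratch/test directory
    local ref_dir=$1 new_dir=$2 path name status=0
    for path in "$ref_dir"/*; do
        [ -f "$path" ] || continue
        name=${path##*/}
        if [ ! -e "$new_dir/$name" ]; then
            echo "MISSING $name"
            status=1
        elif [ ! -s "$new_dir/$name" ]; then
            echo "EMPTY $name"
            status=1
        else
            # $(( ... )) strips any padding that wc adds on some systems
            echo "OK $name $(( $(wc -c < "$path") )) $(( $(wc -c < "$new_dir/$name") ))"
        fi
    done
    return "$status"
}
```

The function returns non-zero if any file is missing or empty, so it can gate a checklist step; closer inspection of content (as mentioned in the comment) would still be done by hand.
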
ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Wed, May 2, 6:42 AM · Dumps-Generation

Tue, May 1

ArielGlenn moved T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch from Backlog to Active on the Dumps-Generation board.
Tue, May 1, 4:27 PM · Dumps-Generation

Mon, Apr 30

ArielGlenn updated the task description for T193399: Test all misc dumps cron jobs on snapshot1001 with php7/stretch.
Mon, Apr 30, 5:27 PM · Dumps-Generation
ArielGlenn added a comment to T161509: Test php7 on snapshot1001 for dumps.

We see SET NAMES utf8mb4 in headers for mysql table dumps, rather than SET NAMES utf8. That's a side effect of going to stretch. I guess there could be some discussion in the future of what we want (binary, utf8mb4, something else); in any case, I don't see this as a stopper.

Mon, Apr 30, 5:26 PM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T178047: Investigate why wikidata abstracts dumps are so large, see if we can reduce the size somehow..

I'd like to wait for the first run. I'll retitle the task then too :-)

Mon, Apr 30, 3:31 PM · MW-1.32-release-notes (WMF-deploy-2018-05-01 (1.32.0-wmf.2)), MediaWiki-extensions-WikibaseRepository, Wikidata, Dumps-Generation