ArielGlenn (ariel)
User

User Details

User Since
Oct 8 2014, 7:09 PM (200 w, 6 d)
Availability
Available
IRC Nick
apergos
LDAP User
ArielGlenn
MediaWiki User
ArielGlenn [ Global Accounts ]

Recent Activity

Yesterday

ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

I've updated the draft RFC to remove the 'final' schema, leaving the 'transitional' schema as the new schema proposal; I've munged the 'header changes' section leaving my question about possible changes to slot role names in there for comment. Still thinking about @daniel's proposal (a series of text tags with attributes).

Tue, Aug 14, 2:17 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T195289: Add Addshore & possibly other WMDE devs/deployers to the wikidata icinga contact list.

@Addshore and @Ladsgroup should be on the contact list (patch merged at the end of May!). @hoo was already on it. Is no one getting notifications?

Tue, Aug 14, 9:33 AM · Patch-For-Review, User-Addshore, Wikidata, Operations, monitoring

Mon, Aug 13

ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Thanks for the input so far!

Mon, Aug 13, 9:25 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn closed T201803: wikidata dumps broken with 'unexpected option: --full!' for revision history content dumps as Resolved.

The job is rerunning and output is being produced, closing this ticket.

Mon, Aug 13, 5:02 PM · MW-1.32-release-notes (WMF-deploy-2018-08-21 (1.32.0-wmf.18)), MediaWiki-Maintenance-scripts, Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T147169: Make sure Wikibase dump maintenance scripts solely use the "dump" db group.

does not try to refresh it periodically

But db1101 was never a dump host; it was an rc host, and it used to have load 1. Also, I checked and it didn't have any connected clients.

Mon, Aug 13, 1:27 PM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Wikidata-Campsite, Wikidata-Ministry-Of-Magic, Wikidata, MediaWiki-extensions-WikibaseRepository
ArielGlenn added a comment to T147169: Make sure Wikibase dump maintenance scripts solely use the "dump" db group.

I will hazard a guess that the script initially retrieves the correct db for the dumps group but does not try to refresh it periodically (or on every failure). We could consider doing so after some number of consecutive failures; @hoo what do you think?
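
A minimal sketch of the "refresh after some number of consecutive failures" idea from the comment above. This is not the Wikibase script's code: the class name, the `resolve` callback, the threshold of 3, and the use of `ConnectionError` are all assumptions made for illustration.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class RefreshingConnection:
    """Hypothetical helper: re-resolve the db host after repeated failures."""

    def __init__(self, resolve: Callable[[], str], max_failures: int = 3):
        # resolve() stands in for "ask the load balancer for the dump group host".
        self._resolve = resolve
        self._max_failures = max_failures
        self._failures = 0
        self.host = resolve()

    def run(self, query: Callable[[str], T]) -> T:
        """Run query(host); refresh the host after max_failures failures in a row."""
        try:
            result = query(self.host)
        except ConnectionError:
            self._failures += 1
            if self._failures >= self._max_failures:
                # Too many consecutive failures: look the host up again.
                self.host = self._resolve()
                self._failures = 0
            raise
        # Any success resets the consecutive-failure counter.
        self._failures = 0
        return result
```

The caller still sees each failure and decides whether to retry; the helper only makes sure a stale host is not kept forever.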

Mon, Aug 13, 10:51 AM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Wikidata-Campsite, Wikidata-Ministry-Of-Magic, Wikidata, MediaWiki-extensions-WikibaseRepository
ArielGlenn added a comment to T147169: Make sure Wikibase dump maintenance scripts solely use the "dump" db group.

I didn't get a chance to check things then but all open connections to dbs now from snapshot1008 are to db1087, as they should be. This will take some digging.

Mon, Aug 13, 10:32 AM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Wikidata-Campsite, Wikidata-Ministry-Of-Magic, Wikidata, MediaWiki-extensions-WikibaseRepository
ArielGlenn moved T201803: wikidata dumps broken with 'unexpected option: --full!' for revision history content dumps from Backlog to Active on the Dumps-Generation board.
Mon, Aug 13, 10:11 AM · MW-1.32-release-notes (WMF-deploy-2018-08-21 (1.32.0-wmf.18)), MediaWiki-Maintenance-scripts, Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T201803: wikidata dumps broken with 'unexpected option: --full!' for revision history content dumps.

2bd7259a2c88a2bcc30a9217770a64268e161305 is the commit which changed the behavior of maintenance scripts, merged on Aug 7th.

Mon, Aug 13, 6:40 AM · MW-1.32-release-notes (WMF-deploy-2018-08-21 (1.32.0-wmf.18)), MediaWiki-Maintenance-scripts, Patch-For-Review, Dumps-Generation
ArielGlenn created T201803: wikidata dumps broken with 'unexpected option: --full!' for revision history content dumps.
Mon, Aug 13, 6:34 AM · MW-1.32-release-notes (WMF-deploy-2018-08-21 (1.32.0-wmf.18)), MediaWiki-Maintenance-scripts, Patch-For-Review, Dumps-Generation

Fri, Aug 10

ArielGlenn added a comment to T201653: Missing documentation for pageviews dataset.

Ah ha! I hope whoever added the file will update it with the information you need. (I wonder who that kind person was?)

Fri, Aug 10, 9:44 AM · Patch-For-Review, Analytics-Kanban, Datasets-General-or-Unknown, Documentation, Analytics
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

The draft at https://www.mediawiki.org/wiki/Requests_for_comment/Schema_update_for_multiple_content_objects_per_revision_(MCR)_in_XML_dumps is ready for a first round of comments by people on this ticket (or people just following along). Anything from 'that schema is wrong' to 'why did you use that color for diffs' to 'this wording is confusing', whatever you think is useful.

Fri, Aug 10, 9:21 AM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T201653: Missing documentation for pageviews dataset.

It's there for me. I just checked by clicking on the above link. (I also tried going directly from the analytics index page to see if the link is different and broken there, but it worked for me from there too.)

Fri, Aug 10, 9:20 AM · Patch-For-Review, Analytics-Kanban, Datasets-General-or-Unknown, Documentation, Analytics

Thu, Aug 9

ArielGlenn added a comment to T199252: Search engines continue to link to JS-redirect destination after Wikipedia copyright protest.

So a few things:

<snip>

  1. Sorry, I should've been watching closer and caught this sooner, but "dumps.wikimedia.org" is one of the handful of domains on the audited shortlist which don't use our standard cache cluster termination. This means the cache clusters don't service its traffic, so they can't really rewrite into it easily. We can perhaps add a backend just for this purpose, though (not move dumps to standard termination, just also have the dumps backend available for rewriting these particular requests into). I'll have to double-check there's no snags with that idea...
Thu, Aug 9, 12:59 PM · Patch-For-Review, SEO, Performance-Team, Operations, Traffic, Wikimedia-General-or-Unknown

Wed, Aug 8

ArielGlenn added a comment to T199252: Search engines continue to link to JS-redirect destination after Wikipedia copyright protest.

If you are going to store things on the dump web servers:

Wed, Aug 8, 6:50 PM · Patch-For-Review, SEO, Performance-Team, Operations, Traffic, Wikimedia-General-or-Unknown
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

@daniel I am about to steal a bunch of your preliminary work and comments in order to craft a workable proposal (see the link in the task description now); for this reason you are listed as a co-author of the RFC. If you would rather not, please say so and I'll just give a ton of credit in the document itself.

Wed, Aug 8, 3:40 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn updated the task description for T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.
Wed, Aug 8, 3:38 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata

Tue, Aug 7

ArielGlenn added a comment to T29653: Provide dumps using bittorrent.

This shouldn't run on the snapshot (dumps-generating) hosts; if it were to run anywhere it would run on the web server.

Hmm, why wouldn't those hosts be the right place to call mktorrent? It can be CPU intensive, so I don't think running it on a web server is a good idea.
(I have very little understanding of how the actual dumps generating process works fwiw)

Tue, Aug 7, 6:52 PM · Datasets-General-or-Unknown
ArielGlenn added a comment to T201119: For some categories, cat_pages is less than cat_subcats.

A bug indeed. Here you go: T18036

Tue, Aug 7, 10:15 AM · MediaWiki-Categories
ArielGlenn moved T184485: Stop logging autopatrol actions from Watching now to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · User-Ladsgroup, MW-1.32-release-notes (WMF-deploy-2018-05-29 (1.32.0-wmf.6)), MW-1.31-release-notes (WMF-deploy-2018-04-17 (1.31.0-wmf.30)), Wikidata-Ministry-Of-Magic-Tech-Debt, Wikidata-Ministry-Of-Magic, User-notice, TechCom-RFC (TechCom-Approved), User-ArielGlenn, MediaWiki-Patrolling, Patch-For-Review, MediaWiki-Logging
ArielGlenn moved T184854: hhvm memcached and php7 memcached extensions do not play well together from Watching now to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · PHP 7.0 support, Performance-Team (Radar), User-ArielGlenn, MediaWiki-Platform-Team
ArielGlenn moved T171541: Setup periodic rsync jobs from dumps generation hosts to labstore1006|7 from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
ArielGlenn moved T179942: Move misc dump cron jobs from dataset1001 to dumpsgen1001 from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · Patch-For-Review, User-ArielGlenn, Dumps-Generation
ArielGlenn moved T180127: Reboot of dumps hosts from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · Datasets-General-or-Unknown, User-ArielGlenn, Operations
ArielGlenn moved T180102: Clean up temp files from dump of revision content from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:01 AM · Patch-For-Review, Dumps-Generation, User-ArielGlenn
ArielGlenn moved T188242: Reboots of dumps/snapshot hosts from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Datasets-General-or-Unknown, User-ArielGlenn, Operations
ArielGlenn moved T180934: Wikidata json dumps filling /var/log from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · User-ArielGlenn, Dumps-Generation, Wikidata
ArielGlenn moved T188647: Announce/Communicate dumps migration to labstore1006|7 to stakeholders from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · cloud-services-team (Kanban), Data-Services, User-ArielGlenn, Datasets-General-or-Unknown
ArielGlenn moved T188726: make sure all datasets in xmldatadumps/public/other on dataset1001 are accounted for on new labs boxes from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
ArielGlenn moved T188915: Move wikitech images to swift from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Patch-For-Review, User-ArielGlenn, Goal, cloud-services-team (FY2017-18), Cloud-Services
ArielGlenn moved T186756: Move labstore1006 and 1007 to 10G enabled racks in row A & D from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Patch-For-Review, ops-eqiad, Data-Services, DC-Ops, User-ArielGlenn, Operations
ArielGlenn moved T184359: SentryHooks::onLogException sometimes gets passed Error instead of Exception, it should handle this from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Patch-For-Review, User-ArielGlenn, Beta-Cluster-Infrastructure, Sentry
ArielGlenn moved T189283: Replace cron jobs from EZachte's home directory on stat1005 with rsync fetches from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
ArielGlenn moved T189284: Stop serving slowparse logs from dumps distribution servers from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 10:00 AM · Performance-Team (Radar), Patch-For-Review, User-ArielGlenn, Data-Services, Datasets-General-or-Unknown
ArielGlenn moved T189657: Make all dumps mirrors manifests use a single hiera dict with all the info in it from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · Patch-For-Review, User-ArielGlenn, Datasets-General-or-Unknown
ArielGlenn moved T189295: ICU 57 migration for wikis using non-default collation from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · User-ArielGlenn, Patch-For-Review, User-notice, User-Elukey, HHVM, Operations
ArielGlenn moved T195393: Run all jobs on PHP7 or HHVM from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · User-ArielGlenn, HHVM, MediaWiki-Platform-Team, Operations
ArielGlenn moved T196189: rack/setup/install snapshot1009 from Short-term backlog to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · User-ArielGlenn, Patch-For-Review, ops-eqiad, Datasets-General-or-Unknown, Operations
ArielGlenn moved T178046: Consider compressing uncompressed dump files (abstracts, siteinfo-namespaces) from This week to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · Patch-For-Review, User-ArielGlenn, Dumps-Generation
ArielGlenn moved T179857: Make sure rsynced dump status/html files don't contain links to files not yet copied over from This week to Done on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · Patch-For-Review, User-ArielGlenn, Dumps-Generation
ArielGlenn moved T176370: Migrate to PHP 7 in WMF production from Short-term backlog to Watching now on the User-ArielGlenn board.
Tue, Aug 7, 9:59 AM · Core-Platform-Team, TechCom-RFC (TechCom-Approved), User-ArielGlenn, HHVM, Operations
ArielGlenn moved T199890: missed pages from kafka outage on July 11 2018 from Short-term backlog to Watching now on the User-ArielGlenn board.
Tue, Aug 7, 9:58 AM · User-ArielGlenn, Operations
ArielGlenn moved T194060: decommission dataset1001, ms1001 from Short-term backlog to Watching now on the User-ArielGlenn board.
Tue, Aug 7, 9:58 AM · Patch-For-Review, Operations, ops-eqiad, User-ArielGlenn, decommission
ArielGlenn moved T195392: Run all jobs on PHP7 from Short-term backlog to Watching now on the User-ArielGlenn board.
Tue, Aug 7, 9:58 AM · Core-Platform-Team, User-ArielGlenn, HHVM, Operations
ArielGlenn moved T197021: decommission snapshot1001 from Short-term backlog to Watching now on the User-ArielGlenn board.
Tue, Aug 7, 9:58 AM · Patch-For-Review, User-ArielGlenn, decommission, Operations, ops-eqiad

Mon, Aug 6

ArielGlenn added a comment to T201350: Access to dumps servers.

It's the labstore boxes you want, either 1006 or 1007 depending, and maybe you just want to make the file available and ask someone to drop it into the right location? And that would likely be someone on the WMCS team.

Mon, Aug 6, 7:15 PM · Patch-For-Review, Data-Services, SRE-Access-Requests, Operations
ArielGlenn added a comment to T196812: Make PolyGerrit the default ui.

Does 2.16 have the 'a new changeset has been uploaded by so-and-so' feature for PolyGerrit, like the current UI does, or is that a later release?

Mon, Aug 6, 6:34 PM · User-notice, Release-Engineering-Team, Patch-For-Review, Gerrit
ArielGlenn updated subscribers of T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Adding @awight as an interested party (who works on eg the mw vagrant dumps role).

Mon, Aug 6, 5:07 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Because MCR content on Commons, and specifically the metadata storage piece, is set to go live on October 1st, and we will likely barely have an RFC out by that time if we are lucky, we will not be giving adopters much time to convert their existing utilities to use the new schema. So I am initially leaning towards something like this:

Mon, Aug 6, 3:20 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Making clear here the correspondence between revisions, slots, content, text, and comparing that to the previous setup with just revisions and text.

Mon, Aug 6, 2:57 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn updated subscribers of T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

@hoo I am adding you, guessing that you will want to weigh in on the new schema. If this is outside your interest, go ahead and take yourself off.

Mon, Aug 6, 1:14 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Some initial comments/questions on slot-roles, content_models tables:

Mon, Aug 6, 1:12 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

I'm adding here the tables and fields that need to be part of the dumps, both for export and for import, so everyone is on the same page.

Mon, Aug 6, 1:01 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn moved T29653: Provide dumps using bittorrent from Done to Backlog on the Datasets-General-or-Unknown board.
Mon, Aug 6, 8:27 AM · Datasets-General-or-Unknown
ArielGlenn updated subscribers of T29653: Provide dumps using bittorrent.

This shouldn't run on the snapshot (dumps-generating) hosts; if it were to run anywhere it would run on the web server. Looping in @Bstorm who is the point person for the labstore boxes (which handle web service) now.

Mon, Aug 6, 8:26 AM · Datasets-General-or-Unknown

Sat, Aug 4

ArielGlenn added a comment to T199252: Search engines continue to link to JS-redirect destination after Wikipedia copyright protest.

When I chatted with @Imarlier via Hangout, we talked about running these as one-offs as needed to fix specific issues. I don't know if that is still the plan; perhaps he can weigh in.

Sat, Aug 4, 7:19 AM · Patch-For-Review, SEO, Performance-Team, Operations, Traffic, Wikimedia-General-or-Unknown
Krinkle awarded T29653: Provide dumps using bittorrent an Orange Medal token.
Sat, Aug 4, 12:32 AM · Datasets-General-or-Unknown

Thu, Aug 2

ArielGlenn updated subscribers of T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Adding @brion as someone who knows these schemas well (thanks in advance!)

Thu, Aug 2, 6:29 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T201034: Colours on wikimediafoundation.org present brand recognition and accessibility issues.

the foundation defines no colors at https://meta.wikimedia.org/wiki/Brand .

"they are used as part of the Wikimedia Foundation business card design."

Thu, Aug 2, 6:19 PM · wikimediafoundation.org
ArielGlenn moved T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps from Backlog to Active on the Dumps-Generation board.
Thu, Aug 2, 3:12 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Notes on timing: it is expected that Commons MCR writes (metadata for media) will be happening by Oct 1, so it would be really, really nice to have an RFC approved and code written by then. That's a pretty short time frame given that it's summer vacation right now, but otherwise data about some media won't show up in the dumps.

Thu, Aug 2, 2:31 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn claimed T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.
Thu, Aug 2, 2:22 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps.

Existing proposals (which were the occasion for my comments at the link above): https://www.mediawiki.org/wiki/Multi-Content_Revisions/Dumps

Thu, Aug 2, 2:19 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added projects to T199121: RFC: Spec for representing multiple content objects per revision (MCR) in XML dumps: User-ArielGlenn, Dumps-Generation.
Thu, Aug 2, 2:17 PM · Dumps-Generation, User-ArielGlenn, User-Daniel, TechCom-RFC, Multi-Content-Revisions, Structured-Data-Commons, Wikidata
ArielGlenn added a comment to T198356: Generate daily diffs for recently changed categories.

I've run the script manually with the above change applied; results are available in the expected location.

Thu, Aug 2, 9:58 AM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Discovery, Wikidata-Query-Service, Wikidata

Wed, Aug 1

ArielGlenn added a comment to T198356: Generate daily diffs for recently changed categories.

This is now deployed; I'll check tomorrow that the dailies ran ok, and we'll know about the fulls over the weekend.

Wed, Aug 1, 6:11 PM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Discovery, Wikidata-Query-Service, Wikidata

Mon, Jul 30

ArielGlenn added a comment to T199252: Search engines continue to link to JS-redirect destination after Wikipedia copyright protest.

Google gets updates from us more than once a day; I don't know how their update pipeline works, but they certainly have or could have access to the data more or less live. We should talk to them and find out where the problem is.

Mon, Jul 30, 8:44 AM · Patch-For-Review, SEO, Performance-Team, Operations, Traffic, Wikimedia-General-or-Unknown
ArielGlenn edited projects for T174802: Archive and drop education program (ep_*) tables on all wikis, added: Datasets-General-or-Unknown; removed Dumps-Generation.
Mon, Jul 30, 8:37 AM · Datasets-General-or-Unknown, Data-Services, DBA
ArielGlenn moved T200146: add hewiki to 'big wikis' for xml/sql dumps from Blocked/Stalled/Waiting for event to Done on the Dumps-Generation board.
Mon, Jul 30, 8:36 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T200146: add hewiki to 'big wikis' for xml/sql dumps as Resolved.

This is deployed.

Mon, Jul 30, 8:36 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T200146: add hewiki to 'big wikis' for xml/sql dumps, a subtask of T199204: Check for slow meta-history runs for small wikis and see about speedups, as Resolved.
Mon, Jul 30, 8:36 AM · Dumps-Generation
ArielGlenn moved T142435: generate and email monthly stats to xmldumps mailing list from Blocked/Stalled/Waiting for event to Done on the Dumps-Generation board.
Mon, Jul 30, 8:31 AM · Dumps-Generation
ArielGlenn closed T142435: generate and email monthly stats to xmldumps mailing list as Resolved.

The mail arrived and looks fine, closing this.

Mon, Jul 30, 8:31 AM · Dumps-Generation
ArielGlenn moved T200180: Move all misc dump cron jobs from primary nfs dumpsdata server to secondary from Backlog to Active on the Dumps-Generation board.
Mon, Jul 30, 8:30 AM · Patch-For-Review, Dumps-Generation

Wed, Jul 25

ArielGlenn updated subscribers of T174802: Archive and drop education program (ep_*) tables on all wikis.

We've never had requests for specific tables like these; the ep tables aren't dumped as part of the regular dumps either.

Wed, Jul 25, 10:35 AM · Datasets-General-or-Unknown, Data-Services, DBA

Mon, Jul 23

ArielGlenn triaged T200180: Move all misc dump cron jobs from primary nfs dumpsdata server to secondary as Normal priority.
Mon, Jul 23, 8:36 AM · Patch-For-Review, Dumps-Generation

Sun, Jul 22

ArielGlenn added a subtask for T199204: Check for slow meta-history runs for small wikis and see about speedups: T200146: add hewiki to 'big wikis' for xml/sql dumps.
Sun, Jul 22, 6:54 AM · Dumps-Generation
ArielGlenn added a parent task for T200146: add hewiki to 'big wikis' for xml/sql dumps: T199204: Check for slow meta-history runs for small wikis and see about speedups.
Sun, Jul 22, 6:54 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T200146: add hewiki to 'big wikis' for xml/sql dumps from Active to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Sun, Jul 22, 6:52 AM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T200146: add hewiki to 'big wikis' for xml/sql dumps.

This should happen after the current run is completed.

Sun, Jul 22, 6:52 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T200146: add hewiki to 'big wikis' for xml/sql dumps from Backlog to Active on the Dumps-Generation board.
Sun, Jul 22, 6:52 AM · Patch-For-Review, Dumps-Generation
ArielGlenn triaged T200146: add hewiki to 'big wikis' for xml/sql dumps as Normal priority.
Sun, Jul 22, 6:48 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T142435: generate and email monthly stats to xmldumps mailing list from Up Next to Blocked/Stalled/Waiting for event on the Dumps-Generation board.

This is deployed but I'll wait to close it until we see the first email arrive in a few days.

Sun, Jul 22, 6:29 AM · Dumps-Generation

Fri, Jul 20

ArielGlenn added a comment to T199204: Check for slow meta-history runs for small wikis and see about speedups.

I have sent mail about adding hewiki to the 'big wikis' list for processing; this is set to happen for the August 1st run.

Fri, Jul 20, 5:54 AM · Dumps-Generation
ArielGlenn moved T154914: Add .nt to DCAT-AP for Wikidata dumps from Backlog to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Fri, Jul 20, 5:45 AM · Dumps-Generation, User-LokalProfil, Patch-For-Review, Wikidata
ArielGlenn moved T198676: Add versioning to DCAT-AP config from Backlog to Blocked/Stalled/Waiting for event on the Dumps-Generation board.
Fri, Jul 20, 5:45 AM · Patch-For-Review, Dumps-Generation, User-LokalProfil
ArielGlenn added a comment to T198676: Add versioning to DCAT-AP config.

What's the status on this? Anything needed to get it moving?

Fri, Jul 20, 5:45 AM · Patch-For-Review, Dumps-Generation, User-LokalProfil
ArielGlenn added a comment to T154914: Add .nt to DCAT-AP for Wikidata dumps.

What's the status on this? Anything needed to get it moving?

Fri, Jul 20, 5:44 AM · Dumps-Generation, User-LokalProfil, Patch-For-Review, Wikidata
ArielGlenn moved T197460: Download link for (Hebrew) Wikipedia 'List of all page titles' results in 503 error from Backlog to Done on the Dumps-Generation board.
Fri, Jul 20, 5:41 AM · Dumps-Generation
ArielGlenn closed T197460: Download link for (Hebrew) Wikipedia 'List of all page titles' results in 503 error as Invalid.

I'm going to go ahead and close this; if it's observed again and it's not the result of too many connections, it can be re-opened.

Fri, Jul 20, 5:41 AM · Dumps-Generation
ArielGlenn moved T196063: Be smart about creation of temp stub files for the corresponding page output content from Blocked/Stalled/Waiting for event to Done on the Dumps-Generation board.
Fri, Jul 20, 5:39 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T196063: Be smart about creation of temp stub files for the corresponding page output content as Resolved.

A dump run has completed since this was deployed, and it works fine. Closing.

Fri, Jul 20, 5:39 AM · Patch-For-Review, Dumps-Generation
ArielGlenn moved T198792: snapshot1005 does not power back up from Active to Done on the Dumps-Generation board.
Fri, Jul 20, 5:38 AM · Dumps-Generation, Patch-For-Review, DC-Ops, ops-eqiad, Operations
ArielGlenn moved T199117: track certain dump job runtimes over time from Active to Done on the Dumps-Generation board.
Fri, Jul 20, 5:37 AM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T199117: track certain dump job runtimes over time as Resolved.

Emailed report showed up with all the info I need. Closing.

Fri, Jul 20, 5:37 AM · Patch-For-Review, Dumps-Generation

Thu, Jul 19

ArielGlenn added a comment to T199117: track certain dump job runtimes over time.

Well, 'next time' turned out to be 5 minutes later, too twitchy to leave it for tomorrow. Run should happen tomorrow morning so I'll check the results then.

Thu, Jul 19, 9:15 PM · Patch-For-Review, Dumps-Generation
ArielGlenn added a comment to T199117: track certain dump job runtimes over time.

I'll get a report on the longer running jobs for enwiki, wikidatawiki and the 'big wikis' for now. This should run tomorrow before the 20th dump jobs kick off.
I should add one more job that gives me the slowest 40, say, page-meta-history bz2 content dumps on all wikis, so I can track those. Next time.
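
A rough sketch of what a "slowest 40" report could look like, assuming runtimes have already been collected as (wiki, job, seconds) records. The `JobRuntime` record, the `slowest_jobs` function, and the job name `meta-history-bz2` are placeholders for illustration, not names taken from the dumps code.

```python
import heapq
from typing import Iterable, List, NamedTuple


class JobRuntime(NamedTuple):
    """Hypothetical record of one dump job's runtime on one wiki."""
    wiki: str
    job: str
    seconds: float


def slowest_jobs(runtimes: Iterable[JobRuntime], job_name: str,
                 n: int = 40) -> List[JobRuntime]:
    """Return the n longest-running records for job_name, slowest first."""
    matching = (r for r in runtimes if r.job == job_name)
    return heapq.nlargest(n, matching, key=lambda r: r.seconds)


if __name__ == "__main__":
    # Made-up sample values, only to show the call.
    sample = [
        JobRuntime("wiki-a", "meta-history-bz2", 500.0),
        JobRuntime("wiki-b", "meta-history-bz2", 900.0),
        JobRuntime("wiki-b", "stubs", 9999.0),
    ]
    for record in slowest_jobs(sample, "meta-history-bz2", n=2):
        print(record.wiki, record.seconds)
```

`heapq.nlargest` avoids sorting every record when only the top few are wanted, which matters little at this scale but keeps the intent obvious.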

Thu, Jul 19, 8:36 PM · Patch-For-Review, Dumps-Generation
ArielGlenn closed T45647: Sometimes (at peak usage?), dumps.wikimedia.org becomes very slow for users (sometimes unresponsive) as Declined.

Given that the hosting setup for this service is different now, this might as well be closed. If folks notice problems in the future they can create a new task.

Thu, Jul 19, 3:15 PM · Operations, Datasets-General-or-Unknown
ArielGlenn closed T45647: Sometimes (at peak usage?), dumps.wikimedia.org becomes very slow for users (sometimes unresponsive), a subtask of T122917: Provide a good download service of dumps from Wikimedia, as Declined.
Thu, Jul 19, 3:15 PM · Operations, Datasets-General-or-Unknown
ArielGlenn added a comment to T198356: Generate daily diffs for recently changed categories.

The dblist fix has been deployed, off to test the actual bash script now. Until now it's all been manual runs across the dblist with direct calls to the maintenance script.

Thu, Jul 19, 11:50 AM · MW-1.32-release-notes (WMF-deploy-2018-07-10 (1.32.0-wmf.12)), Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Discovery, Wikidata-Query-Service, Wikidata