User Details
- User Since
- Oct 8 2014, 7:09 PM (498 w, 2 d)
- Availability
- Available
- IRC Nick
- apergos
- LDAP User
- ArielGlenn
- MediaWiki User
- ArielGlenn [ Global Accounts ]
Thu, Apr 25
Wed, Apr 24
Reviving this old patch, let's see what happens.
Tue, Apr 23
Tue, Apr 9
The above patch fails CI for three tests not in the Database group. All three call User::isAllowed() at some point, which now would require database access. Two tests are in ConfirmEdit and one is in core (DumpableObjectsTest). I could add all three to @group Database or I could mock out CentralAuthHooks and have the onUserGetRights method always return true in those tests, or maybe there's some better approach. Poking @Tgr for advice or a pointer.
Mar 14 2024
Note that for any of those issue links in the task description, I get "Access is denied to this issue". But the issue linked to in T359957#9625065 is visible at least.
Mar 12 2024
Mar 6 2024
Feb 27 2024
Feb 26 2024
The cloning procedure is done for db14 but we are currently hunting around for the replication password; it is not where the docs ( https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/Databases#Starting_Replication ) say it should be, and apparently it was never anywhere in that repo.
In the end @TheresNoTime figured it out: puppet was starting mariadb automatically when we didn't want it running and hence creating that file complained of in the error above. The cloning process looks like it's going ok at the moment.
Just to explain to folks who might be following along, what's happening: the primary server (db13) will be cloned (via mariabackup --innobackupex) to the new replica; a new instance is being created now for that. While that is happening, replication will be stopped and the primary will remain read-only. I am guessing that this will be a matter of some hours rather than a day. More updates as available.
Feb 22 2024
1 and 2 are both role dumps::generation::server::spare and have been so since at least last July. See https://gerrit.wikimedia.org/r/c/operations/puppet/+/936379 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/893265
While any nfs spare could in theory be swapped in for any production host, we have other newer spares with decent size filesystems for that; these are idle and can go at any time.
Feb 21 2024
I'd prefer the beta be kept in the name, making it clear that these are wikis on the deployment cluster.
Feb 15 2024
This is in progress still, notes so I don't forget:
- we run 7zs after the bz2 history files, that job will remain untouched
- need to adjust the script that does history backfill to not move files into place if there is a temp or bz2 file that appeared in the meantime (protection in case the fill-in script and the main worker both reach the same files)
- could do md5s of these files as we go but let's see how much we gain without, it would be safer as far as avoiding overlaps
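The guard described in the second note could be sketched roughly as follows. This is a minimal Python sketch, not the actual dumps code: the function names `safe_to_move` and `move_into_place`, the `.inprog` temp suffix, and the directory layout are all assumptions made for illustration.

```python
import os
import shutil

TEMP_SUFFIX = ".inprog"  # hypothetical marker for a file still being written


def safe_to_move(final_dir, basename):
    """Return True if neither the finished file nor a temp file for it
    has appeared in final_dir in the meantime."""
    final_path = os.path.join(final_dir, basename)
    temp_path = final_path + TEMP_SUFFIX
    return not (os.path.exists(final_path) or os.path.exists(temp_path))


def move_into_place(src_path, final_dir):
    """Move a backfilled bz2 file into final_dir unless another job got there first.

    Returns True if the file was moved, False if it was skipped.
    """
    basename = os.path.basename(src_path)
    if not safe_to_move(final_dir, basename):
        return False
    shutil.move(src_path, os.path.join(final_dir, basename))
    return True
```

The check and the move are not atomic, so a small window remains in which both jobs could still write the same file; checksums would narrow that further.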
Feb 13 2024
We may want to test the behaviour when going from logged in on a wiki on beta.wmflabs.org (let's say en.wikipedia) and then visiting some-language.some-wiki.beta.wmcloud.org which is not designated as the "representative wiki" for that wiki family, and see if the behaviour is different from visiting the "representative wiki" immediately after login on en.wp.beta.wmflabs.org. These scenarios behave differently for me in production.
Feb 5 2024
As I try to help plan this out, I have accumulated some questions.
Jan 29 2024
Listing more extensions not named above that appear in Special:Version and seem to me to not be needed at loginwiki:
- 3D
- CodeEditor
- CodeMirror
- ElectronPdfService
- Global Usage
- Kartographer
- MediaModeration
- RevisionSlider
- TemplateStyles
- TextExtracts
- TwoColConflict
and maybe others.
Jan 23 2024
Jan 17 2024
Verified for Firefox: I logged out, deleted all *wik*org cookies except for non-SUL sites (phab, etherpad and so on), set Enhanced Tracking Protection to "custom" and chose "All cross-site cookies" (see image below). After login at en.wikipedia, after the end of the redirect/session creation chain for commons.wikimedia.org, I have session cookies for commons with session id, UserId, UserName stored locally. When I go to visit commons.wm.o, these cookies are sent to the web server and I am logged in as a result.
Firefox version: 121.0, linux.
Jan 10 2024
While T17294 seems stalled, is there anything to be done for CentralAuth in the meantime?
Jan 4 2024
Sounds good to me, I'll lurk until input is called for :-)
Dec 18 2023
The one thing I didn't think to check. Of course it works fine on other accounts. Closing!
The above patch went out with last week's train, and is on all wikis (1.42.0-wmf.9) but the behaviour is unchanged so I'll need to look into this further.
Dec 14 2023
I wonder why I thought he was in mine? Wishful thinking maybe. Hope it all goes well!
We missed you, I'm guessing that you got notice of this too late to make the training? We can reschedule in any case.
Dec 11 2023
I get confused by the name 'domain' every time. 'database' or 'dbname' is much clearer imo.
Nov 14 2023
We ought to decide about the rest of the name too, i.e. "virtual-whatgoeshere?" I liked Tim's use of the extension name with case preserved (see https://gerrit.wikimedia.org/r/c/mediawiki/extensions/LoginNotify/+/968800) but maybe you all have other preferences.
Nov 13 2023
Native speaker chiming in to say I agree: dependency means literally that A depends on B, relation is vague and can mean all kinds of things. Not going to weigh in on other aspects of the name though, carry on :-)
What do folks think of something like the above? (Untested)
Nov 7 2023
I note that User::getRights() was deprecated in 1.34 and removed in 1.38. Apparently we are intended to use PermissionManager::getUserPermissions() instead.
Nov 6 2023
Took a stab at the GlobalBlocking change first, as it's smaller and simpler to my eyes. Not tested whatsoever.
Oct 5 2023
Um there is no Thurs Oct 8. There is Thurs Oct 5 (today) and Thurs Oct 12, 19, 26... wonder if you meant any of these?
Oct 2 2023
If this is partly for ease of installation by first-time patch submitters, we should bear in mind that the new developer also has to jump through the gerrit setup and wikitech account creation hoops, and setting up an ssh key etc. If one already has these things, then we can indeed get the install time down to something very short.
Sep 25 2023
Sep 19 2023
Just a note that sometimes size/page count/visible rev count might go down, if a large batch of pages are deleted for e.g. copyvio (more likely to occur on a small wiki).
Sep 13 2023
We could make sure that for commonswiki, the config setting "sevenzipprefetch" is 0. I'll need to check that this is one of the settings that can be overridden, and that the code will recognize 0 as a 'false' value. This should get done before next month's full run.
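One way to check the second point, whether 0 is read as false, is sketched below. This assumes an INI-style config read with Python's `configparser`, with a per-wiki section overriding a global one; the section names and the helper `get_flag` are hypothetical, and the real dumps config handling may differ.

```python
import configparser

# Hypothetical config: a global section plus a per-wiki override.
SAMPLE = """
[misc]
sevenzipprefetch = 1

[commonswiki]
sevenzipprefetch = 0
"""


def get_flag(conf, wiki, name, global_section="misc"):
    """Return a boolean setting, preferring the per-wiki section if it has one."""
    if conf.has_section(wiki) and conf.has_option(wiki, name):
        return conf.getboolean(wiki, name)
    return conf.getboolean(global_section, name)


conf = configparser.ConfigParser()
conf.read_string(SAMPLE)
print(get_flag(conf, "commonswiki", "sevenzipprefetch"))
print(get_flag(conf, "enwiki", "sevenzipprefetch"))
```

`getboolean` treats 0, no, false and off as false. A plain `get` would instead return the string "0", which is truthy in Python, and that is the pitfall worth ruling out.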
There's one item in the checklist left before this task can be closed. And basically the holdup is just about getting the signoff from Tyler that the deployment trainings were completed; then we can get the rest of that item done.
Sep 12 2023
To expand on this a bit more: we saw the same error and stack trace on a slightly different page range, but with the identical symptoms. Logstash link here: https://logstash.wikimedia.org/goto/62b164dd91e2763a0a402d02087be836 Running the job hangs at the same point every time, even if nothing else is happening on the host; there aren't a particularly large number of revisions for the problem page, and their size isn't very large either. As before, using bz2 prefetch files permits the job to run to completion.
So the patches went around and I checked that they are on snapshot03, but unfortunately I still see the error:
2023-09-12 05:20:33: enwiki (ID 14793) 683 pages (694.3|694.3/sec all|curr), 1000 revs (1016.6|1016.6/sec all|curr), ETA 2023-09-12 05:30:22 [max 600437]
MWUnknownContentModelException from line 192 of /srv/mediawiki/php-master/includes/content/ContentHandlerFactory.php: The content model 'JadeJudgment' is not registered on this wiki.
See https://www.mediawiki.org/wiki/Content_handlers to find out which extensions handle this content model.
#0 /srv/mediawiki/php-master/includes/content/ContentHandlerFactory.php(247): MediaWiki\Content\ContentHandlerFactory->validateContentHandler('JadeJudgment', NULL)
#1 /srv/mediawiki/php-master/includes/content/ContentHandlerFactory.php(181): MediaWiki\Content\ContentHandlerFactory->createContentHandlerFromHook('JadeJudgment')
#2 /srv/mediawiki/php-master/includes/content/ContentHandlerFactory.php(93): MediaWiki\Content\ContentHandlerFactory->createForModelID('JadeJudgment')
#3 /srv/mediawiki/php-master/includes/export/XmlDumpWriter.php(474): MediaWiki\Content\ContentHandlerFactory->getContentHandler('JadeJudgment')
#4 /srv/mediawiki/php-master/includes/export/XmlDumpWriter.php(402): XmlDumpWriter->writeSlot(Object(MediaWiki\Revision\SlotRecord), 1)
#5 /srv/mediawiki/php-master/includes/export/WikiExporter.php(554): XmlDumpWriter->writeRevision(Object(stdClass), Array)
#6 /srv/mediawiki/php-master/includes/export/WikiExporter.php(492): WikiExporter->outputPageStreamBatch(Object(Wikimedia\Rdbms\MysqliResultWrapper), Object(stdClass))
#7 /srv/mediawiki/php-master/includes/export/WikiExporter.php(316): WikiExporter->dumpPages('page_id >= 1900...', false)
#8 /srv/mediawiki/php-master/includes/export/WikiExporter.php(208): WikiExporter->dumpFrom('page_id >= 1900...', false)
#9 /srv/mediawiki/php-master/maintenance/includes/BackupDumper.php(355): WikiExporter->pagesByRange(190001, 195001, false)
#10 /srv/mediawiki/php-master/maintenance/dumpBackup.php(82): BackupDumper->dump(1, 1)
#11 /srv/mediawiki/php-master/maintenance/includes/MaintenanceRunner.php(685): DumpBackup->execute()
#12 /srv/mediawiki/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#13 /srv/mediawiki/multiversion/MWScript.php(159): require_once('/srv/mediawiki/...')
#14 {main}
Perhaps the override isn't being respected, or the usage isn't quite right?
Sep 8 2023
the ops-dumps email alias ought to get notified about things like this; that way all the right people will see it.
Sep 7 2023
Just a quick note that this breaks testing of dumps for enwiki in deployment-prep. We can work around it by testing only on other wikis, but it would be nice for this to be cleaned up.
We missed you today for the training. I'm guessing that something came up? Go ahead and reschedule, if you are still interested!
Aug 31 2023
Aug 29 2023
Thanks for the fix(es), everything is working as expected now.
Aug 28 2023
Verified that with those same files from the above command the error is still present; whatever the underlying issue is, nothing in the MW codebase has changed.
Related: T324463
Who runs the findBadBlobs.php script in cases like this? It would be nice to get that done.
Not doing this, since we now have WME (Enterprise) dumps in HTML format available for public download.
Going ahead and closing this.
Aug 24 2023
grep on mwmaint1002 for php, looking for long running stuff, gives me only
Jul11 0:00 /bin/bash /usr/local/bin/mwscript eval.php --wiki=commonswiki
The others are all Aug 22 or 23rd just fyi.
I'm presuming you didn't see any instances of this in the meantime, @awight? Can we close this?
Hey @Sgs we missed you this morning at the deployment window for training. Or were you going to do the UTC late window this time?
Aug 22 2023
I can try to dust off and restructure the troubleshooting guide on wikitech for the sql/xml dumps, if that would be helpful. This would by no means be a replacement for the runbook, but more of a minimal guide if people get stuck. Having a document specifically for dumps newcomers is great and I hope it will be expanded over time!
Aug 17 2023
Counterpoint: knowing the config settings doesn't mean understanding the code activated by those changes or its possible impacts. At least, not for me. Some areas I know, and some I don't.
Aug 16 2023
Aug 14 2023
Note that since dumps snapshot instances are sorta-kinda mediawiki instances, this affects them too.
Aug 9 2023
Aug 7 2023
Aug 5 2023
Jul 27 2023
This training happened, though it was a lot less interactive and useful than it could have been because no patches were scheduled and no one showed up with a patch last minute, in spite of me begging :-D But we went through a description of all the steps, looked at all the relevant dashboards and hosts and commands, so there's that :-)
Jul 19 2023
Dumpsdata1007, running bullseye, is now the fallback host for sql/xml and misc dumps. This means all hosts in production (not spares) are on bullseye now and this task can be closed after a day or so just to make sure things are stable.
Jul 17 2023
Sounds great to me, thanks!
Jul 13 2023
Just for my understanding, it looks like the new patch would exception out in the case where there is a failure with the last connection of whatever sort. Am I reading that right? And if so, how does that help us in the current situation? Sorry for whatever I am missing here. Thanks!
Hey @elukey (or anyone else watching who wants to chime in), I've got a recipe that might maybe possibly could work. (See patch above.) But I have questions. Some recipes in the repo deal with lvm partitions by "unknown ignore" instead of "lvmpv keep", and I wonder which is better. Some recipes without swap specify "d-i partman-basicfilesystems/no_swap boolean false" and some do not, and I wonder which is right. And last but not least, is it still the procedure for testing before merge to announce "hey I'm testing on installX00Y and disabling puppet for a while" in the channel and hope no one speaks up? Thanks in advance!
Jul 12 2023
A note that I did a test run of sql/xml dumps on deployment-prep with the new icu version and it looks fine to me, though I didn't check for any weird details of category sorting or whatever.
Jul 11 2023
See also https://phabricator.wikimedia.org/T341045 for the context. @WDoranWMF please sign off just in case that's needed. Thanks!
Jul 10 2023
It sounds like the reuse-parts.cfg script is the way to go. Let me poke around and see how that's used elsewhere, and I'll come back if I get stuck. Thanks!
Jul 9 2023
One more to add:
Jul 4 09:16:50 dumpsgen: extensions/CirrusSearch/maintenance/DumpIndex.php failed for /mnt/dumpsdata/otherdumps/cirrussearch/20230703/commonswiki-20230703-cirrussearch-content.json.gz
@JEbe-WMF you will need to follow the instructions here https://wikitech.wikimedia.org/wiki/SRE/Clinic_Duty/Access_requests#Checklist and create a task; feel free to add me as a subscriber and link this one to it. Make sure you ask for membership in the wmf ldap group. That will give you icinga/grafana/logstash access.
Dan and Xabriel already are members of the wmf group, giving access to grafana and icinga (though contact info might need to be added for executing commands on icinga). Jennifer is not yet in the group.
Swapped dumpsdata1003 in as the live nfs share for misc dumps.
Jul 6 2023
@JEbe-WMF and @xcollazo you should both sign up for MediaWiki deployment training here: https://phabricator.wikimedia.org/project/board/5265/ and get scheduled for that. Once that's done, we can add you to the deployers list in puppet. (Dan you are already a deployer so you're off the hook ;-) )