User Details
- User Since
- Aug 2 2021, 1:52 PM
- Availability
- Available
- IRC Nick
- Emperor
- LDAP User
- MVernon
- MediaWiki User
- MVernon (WMF)
Yesterday
Same failure mode after the BIOS upgrade - post-installer boot is fine, after puppet it gets to:
Booting from Hard drive C: GRUB
and there it remains forever.
Tried a BIOS upgrade from 2.12.2 to 2.24.0. That didn't make the system bootable, but I'm trying yet another reimage.
@WMDE-leszek analytics_privatedata_users isn't an LDAP group, it's a shell group, so it wouldn't appear in the ldap listing (for instance - https://ldap.toolforge.org/user/mvernon is me, and you'll see it's not listed there against me either). I think you probably want data engineering for help debugging superset dashboard access.
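If it helps, a rough way to see the distinction from a shell on one of the analytics client hosts (a sketch only - "someuser" is a placeholder username, not a real account):
# Shell groups are plain POSIX groups on the hosts themselves, so they show
# up via id/getent rather than in the LDAP group listing.
id someuser | tr ',' '\n' | grep -i privatedata
getent group analytics-privatedata-users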
As before, post-installer boot was fine, but after puppet it gets as far as:
Booting from Hard drive C: GRUB
At the suggestion of @elukey on IRC, I am trying a firmware downgrade to 6.10.30.20 (the version 5.0.20.0 that this system started with wasn't obviously available).
My volunteer account has acquired the PersonalDashboard, and my volunteer-hat would appreciate a way to note that I've looked at an edit and it was OK (and I was a bit surprised there wasn't a UI element to do so).
Mon, Apr 13
Hi @Jclark-ctr could you take another look at the disks on these two systems, please? There should be 24 JBOD spinning disks visible to the OS, but neither host has that:
apus-be1005 has 23 (i.e. one missing)
mvernon@apus-be1005:~$ grep -c ' sd' /proc/partitions
23
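For comparison purposes, a rough way to see exactly which sd devices the kernel has (a sketch only - diffing this output against a healthy host should show which disk is absent):
# Whole-disk sd devices only (sda, sdb, ...), ignoring partitions;
# expect the 24 JBOD spinning disks plus any OS/SSD devices.
awk '$4 ~ /^sd[a-z]+$/ {print $4}' /proc/partitions | sort
awk '$4 ~ /^sd[a-z]+$/ {print $4}' /proc/partitions | wc -l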
Fri, Apr 10
Thanks @Jhancock.wm :)
I've eyeballed the discussion here - AFAICT apus is behaving as expected? I haven't seen persistent lag between the two clusters, but during bursts of activity replication between the DCs is asynchronous (by design). The problem is the registry (due to caching connections) writing to both clusters at the same time but assuming it's only writing to one, and thus being thrown by asynchronous replication between the clusters.
Thu, Apr 9
@Jeff_G please open new tickets when reporting new issues, unless you're really 100% sure you've got a recurrence of exactly the same issue again - it's really easy to merge tickets that turn out to be duplicates, but very hard to un-merge when you've got two different issues on the same ticket (as has happened here).
Wed, Apr 8
A wrinkle here is that ferm doesn't get reloaded on the other swift nodes (presumably because the config for ferm hasn't actually changed, because the hostname of the node is unchanged), so you have to do that by hand via cumin before the reimaged node works again.
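For the record, what I mean by doing it by hand is something along these lines (a sketch only - the A:swift-be alias and restart-vs-reload are assumptions, adjust for the actual host selection):
# From a cumin master: poke ferm on the remaining swift backends so their
# rules get re-evaluated and pick up the reimaged node again.
sudo cumin 'A:swift-be' 'systemctl restart ferm'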
Tue, Apr 7
@Jhancock.wm that should be fine, thanks!
Thu, Apr 2
Thanks for the quick fixes @Jclark-ctr :-)
Wed, Apr 1
Mon, Mar 30
Wed, Mar 25
Tue, Mar 24
A quick back-of-the-envelope is about 73TB for commons transcoded buckets.
Mon, Mar 23
Thanks, I'm going to optimistically close this ticket then :)
I'm guessing you don't have an exact timestamp for the error? I'm afraid it's going to be almost impossible to say anything useful about this, because there's likely nothing in the logs that will be findable (since it's a stash issue, I can't even search for the object path in the logs). Sorry.
Fri, Mar 20
Right, then the existing thanos-swift infrastructure has nowhere near the SSD capacity to support that use case.
What sort of storage volume are we talking about here?
The thanos-swift cluster has some lowlatency storage, which is largely unused; each server has 2x200G available for the "lowlatency" storage policy, which equates to about 1TB of usable capacity (given x3 replication). Currently only the chartmuseum account is using any of that capacity.
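As a back-of-the-envelope for where that ~1TB comes from (the server count here is a made-up example; the per-server figure is from above):
# Usable capacity = raw SSD capacity / replica count.
servers=8                               # hypothetical backend count
raw_gb=$((servers * 2 * 200))           # 2x200G lowlatency partitions each
usable_gb=$((raw_gb / 3))               # x3 replication
echo "raw=${raw_gb}G usable=~${usable_gb}G"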
Thu, Mar 19
@hnowlan can I push this up your stack, please? Willy wants all procurement requests for next FY done by end of next week (i.e. 27 March).
Wed, Mar 18
I remain grateful that we have spare disks available, so thanks again :)
Mar 13 2026
I got into the host via the serial console. Some notes:
@Reedy you did the 1.43 backports (at least according to gerrit), can you have a look at this, please? I can open a new subtask for tracking if that's easier.
Mar 12 2026
Thanks! New disk is configured and backfilling fine.
Mar 11 2026
Hi, sorry this got dropped - do feel free to poke.
Thanks :)
@ayounsi I re-imaged with the --move-vlan argument 3 codfw nodes today, and everything went well, so I think this is done now, thanks!
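For anyone repeating this, the invocations were roughly of this shape (a sketch from memory - the hostname, OS and any task arguments are placeholders, not the exact commands I ran):
# Run from a cumin master; --move-vlan moves the host onto the per-rack
# VLAN as part of the reimage.
sudo cookbook sre.hosts.reimage --os bookworm --move-vlan examplehost2001   # placeholder hostname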
Imaging of both systems was OK once the relevant disk got wiped.
Hi @Jhancock.wm I'm afraid this is the problem we've seen with Dell before (but that I hoped they were going to correct), where they send us systems with a Windows EFI partition on one of the spinning disks.
Puppet says (amongst other things):
Notice: /Stage[main]/Profile::Swift::Storage::Configure_disks/Exec[mkfs-pci-0000:50:00.0-scsi-0:2:17:0]/returns: mkfs.xfs: /dev/disk/by-path/pci-0000:50:00.0-scsi-0:2:17:0-part1 appears to contain an existing filesystem (vfat).
Notice: /Stage[main]/Profile::Swift::Storage::Configure_disks/Exec[mkfs-pci-0000:50:00.0-scsi-0:2:17:0]/returns: mkfs.xfs: Use the -f option to force overwrite.
Error: '/usr/sbin/mkfs -t xfs -m crc=1 -m finobt=0 -i size=512 /dev/disk/by-path/pci-0000:50:00.0-scsi-0:2:17:0-part1' returned 1 instead of one of [0]
Error: /Stage[main]/Profile::Swift::Storage::Configure_disks/Exec[mkfs-pci-0000:50:00.0-scsi-0:2:17:0]/returns: change from 'notrun' to ['0'] failed: '/usr/sbin/mkfs -t xfs -m crc=1 -m finobt=0 -i size=512 /dev/disk/by-path/pci-0000:50:00.0-scsi-0:2:17:0-part1' returned 1 instead of one of [0] (corrective)
If I mount that partition and have a look:
mvernon@ms-be2095:~$ sudo mount /dev/disk/by-path/pci-0000\:50\:00.0-scsi-0\:2\:17\:0-part1 /mnt/
mvernon@ms-be2095:~$ ls /mnt/
EFI  EFI.BAK
mvernon@ms-be2095:~$ ls /mnt/EFI
Boot  Microsoft  PEBoot
Which is the pattern we've seen before. I've wiped the offending partition and disk (in this case sudo wipefs -a /dev/sdr1 && sudo wipefs -a /dev/sdr, it was sdd on ms-be2096), and will now reimage.
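For next time, roughly how I spot the offending disk before wiping (a sketch only - lsblk/wipefs are just the generic tools, device names will differ per host):
# Any vfat filesystem on the spinning disks is almost certainly the stray
# Windows EFI partition left behind by Dell.
lsblk -o NAME,FSTYPE,SIZE,MOUNTPOINT | grep -i vfat
# Wipe the partition signature first, then the disk's partition table
# (sdX is a placeholder - triple-check the device before running wipefs!)
sudo wipefs -a /dev/sdX1 && sudo wipefs -a /dev/sdX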
Ah, I just put 10:00 EST into date. You're probably right, but a confirmation would be helpful :)
Can I check this is 15:00 UTC (particularly given daylight confusion...), please? Once it's done I'll check ms-be1091 [the frontends can just be repooled again afterwards]
Mar 10 2026
I'm currently trying to get some quotes to put an expansion request in for next FY for apus, primarily to enable us to have a small solid-state-only storage pool to use for bucket indexes and similar, which should give us better performance and reliability for buckets containing larger numbers of objects.
Mar 9 2026
Brilliant, thanks! Replacement is back in service and refilling now.
I see from T414407 you're thinking about moving gerrit to k8s. I do wonder if this is the sort of thing that k8s persistent volume claims are intended for?
The Apus cluster does not currently support CephFS, I'm afraid. It wouldn't be straightforward to add support either - Apus does multi-site replication at the RGW/S3 level, the underlying Ceph clusters (1 in eqiad, 1 in codfw) don't talk to each other except via https/S3 communication between the RGWs. So even if we added MDS (metadata servers, the things you need to run CephFS on top of Ceph), you'd have two separate filesystems, one per DC. There is snapshot mirroring, but I don't think it's what you'd want here. Data Platform Engineering's cluster has CephFS (per wikitech), but I don't know if they do any sort of cross-site stuff with it.
Mar 6 2026
@Ladsgroup there are only a tiny number of files, but XCF will probably likewise need addressing?
FWIW, I have no objection to your doing so.
Mar 4 2026
Hi @Raymond_Ndibe - I've removed your old key now (so it'll be removed from production systems in the next 20 minutes or so).
{{done}}
{{done}}
Is this maintenance happening at 15:00 UTC today?
Yes, they look good now, thank you!
Mar 3 2026
Looking at 1095, the drives appear in the web-iDRAC as "NonRAID Disk 0" and the Storage Overview says 26 "Non-RAID Disks".
@Jclark-ctr sorry, I was wrong, the disks are now set up incorrectly - it looks like you've set them up as a set of RAID-0 arrays, but these systems are meant to be JBOD - so no virtual disks at all, all non-RAID. Can you re-do both of these systems thus, please? We've moved to JBOD-only for swift (and Ceph) backends entirely.
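Once the controller has been redone, a quick heuristic from the OS side (a sketch only - the iDRAC storage page remains the authoritative view):
# With true JBOD/non-RAID the MODEL column shows the actual drive models;
# RAID-0 virtual disks tend to show the PERC/controller as the model instead.
lsblk -d -o NAME,MODEL,SIZE,ROTA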
@Jclark-ctr yep, both look good now, thanks!
Perhaps instead for the odd non-web format (which seems to include XCF and TIFF currently, setting aside the WebP question for a moment) we should only generate standard-size thumbnails, and the commons file interface should drop the "original-size-converted-to-PNG" option entirely; mediaviewer would need adjusting to request the largest-standard-size-smaller-than-original too. These aren't formats intended for images-for-display on commons (per https://commons.wikimedia.org/wiki/Commons:File_types#Images).
@Jclark-ctr can you take another look at these, please? In neither system can the OS see any of the spinning disks, which should be available as JBOD devices - at a guess that still needs setting up in the storage controller.
Mar 2 2026
So that's a "thumbnail the same size as original, rather than original" issue (the original image is 1074px wide) - you should be being shown the original image, but are instead getting a non-standard size thumbnail. I had thought mediaviewer had fixed this.
Feb 27 2026
@Atieno Whilst SRE is driving WE 5.4.10, we do need support from other teams in P&T as appropriate to get this work done - is the MW interfaces team not best placed to address this issue, please?
Feb 26 2026
Feb 25 2026
I think he is not - the former is now self-service via IDM.
OK, I've tagged Data-Engineering, since I think this is their ballpark now. Hopefully they can help :)
Feb 24 2026
At least so far, no issues with sync getting far behind either.
I spent quite a bit of time with codesearch last quarter trying to track down thumbnail size (ab)use, but we can't possibly hope to find (or fix) every single externally-written bit of software.
Feb 23 2026
codfw cluster done, too.
eqiad cluster done.
Yep, setting preseed to expect UEFI booting fixed things.
Looking at the access groups documentation, analytics-privatedata-users should be sufficient for dashboards with private data.
Feb 21 2026
@Jhancock.wm these nodes are swift frontends in the ms cluster, so should be ms-fe* not moss-fe* (moss* is a legacy name that should never apply to new nodes).