Page MenuHomePhabricator
Paste P13012

(An Untitled Masterwork)
ActivePublic

Authored by Gehel on Oct 16 2020, 8:26 AM.
Tags
None
Referenced Files
F32386336: raw-paste-data.txt
Oct 16 2020, 8:26 AM
Subscribers
None
<•ebernhardson> not sure readahead from puppet took appropriately, `blockdev --getra /dev/sd?` reports readahead as 128 512-byte sectors (64kb), but the puppet patch specified read_ahead_kb=16 in the udev rule. not sure why that's the result :S
11:59 PM volans → volans|off
Friday, October 16th, 2020
12:23 AM ⇐ chippy quit (~chippy@unaffiliated/chippy) Ping timeout: 246 seconds
12:41 AM — •ebernhardson finally got around to checking the readahead....really i dunno what exactly udev is doing. Tempted to replace with a plain call to `blockdev --setra N`
2:07 AM ⇐ ottomata quit (sid347637@gateway/web/irccloud.com/x-jtzdhsmfedszcysf)
2:08 AM — •ebernhardson updated 'full reindex' docs on wikitech Search to clarify it's mostly a historical process we no longer use
2:09 AM <•ryankemper> thanks!
2:14 AM <•ryankemper> ebernhardson: So I think we should try to rule out whether the udev rule is not working correctly or if it's just the actual command that it's doing that isn't working
2:14 AM <•ryankemper> Having a surprisingly hard time finding where udev log output would be though
2:14 AM <•ebernhardson> hmm, i'm not sure where that is either :) looking
2:15 AM <•ryankemper> Anyway I see the rule file at `/etc/udev/rules.d/40-elasticsearch-readahead.rules`, but yeah log-wise `journalctl` isn't turning up much and there's no `/var/log/udev` or anything
2:15 AM <•ryankemper> the conf lives at `/etc/udev/udev.conf` but all that file does is set log level to info, so it must be writing to whatever the default location is
2:16 AM <•ebernhardson> in theory `udevadm test ...` should tell us, but it's not telling me anything related :S
2:17 AM <•ebernhardson> hmm, actually it does, from `udevadm test /sys/block/md1` (md1 is specified in udev rules.d): ATTR '/sys/devices/virtual/block/md1/bdi/read_ahead_kb' writing '16' /etc/udev/rules.d/40-elasticsearch-readahead.rules:1
2:18 AM <•ebernhardson> reading that path clearly says 128 though, hmm...could try forcing a udev reload like modules/udev/manifests/init.pp does
2:20 AM <•ryankemper> Trying that on `elastic2037`
2:22 AM <•ryankemper> `/sbin/udevadm control --reload && /sbin/udevadm trigger` didn't seem to have any effect on `/sys/devices/virtual/block/md1/bdi/read_ahead_kb`, might want to try restarting the udev systemd unit itself...lemme see if `udevadm test /sys/block/md1` has any new output first
2:23 AM <•ryankemper> Oh
2:23 AM <•ryankemper> https://www.irccloud.com/pastebin/rqO7QU99/
Plain Text • 2 lines raw | line numbers



2:23 AM <•ebernhardson> heh
2:23 AM <•ebernhardson> yea that'll do it :)
2:23 AM <•ryankemper> https://www.irccloud.com/pastebin/bW81pxFk/
Plain Text • 2 lines raw | line numbers



2:24 AM <•ryankemper> That file is owned by root...but I'd have expected udev to be running with root privs (perhaps naively)
2:25 AM <•ebernhardson> hmm, sytemd-udevd claims to be root
2:25 AM <•ebernhardson> (i don't know thats the right one, but the name lines up :)
2:26 AM <•ebernhardson> could try `echo 16 |sudo tee .../read_ahead_kb` to make sure root can write to it. /sys/ is special and can reject access for things it doesn't want to allow writes to for other reasons
2:26 AM <•ryankemper> >
/sys/ is special and can reject access for things it doesn't want to allow writes to for other reasons
2:26 AM <•ryankemper> TIL
2:27 AM <•ryankemper> Well with sudo it took just fine
2:27 AM <•ryankemper> https://www.irccloud.com/pastebin/bnMgWZyh/
Plain Text • 6 lines raw | line numbers



2:27 AM <•ebernhardson> hmm, so the problem is certainly that udev wants to change it but doesn't seem to have root perms then
2:27 AM <•ebernhardson> well, kinda-sorta certainly :)
2:28 AM <•ryankemper> Yup
2:28 AM <•ebernhardson> as to sys, it and /proc are both special not-really filesystems but exposed as filesystems because on unix everything is a file. Because of that the file attributes are more like a "best representation" as opposed to the rules for access
2:29 AM <•ryankemper> I see, I'm pretty familiar with `procsfs` with respect to cgroups and stuff but don't know a lot about `sys`
2:29 AM <•ryankemper> Anyway so now the question is what actual process is modifying the file
2:29 AM <•ebernhardson> indeed
2:30 AM <•ebernhardson> maybe we can log rejected file access somehow?
2:31 AM <•ebernhardson> (not sure how, but that would hopefully include a pid)
2:32 AM <•ebernhardson> hmm, no doesn't look like anytihng thats not either fully external, or requires kernel tracing (which i dont think we ship kernels with)
2:33 AM <•ryankemper> Yeah, I've used `falco` for that but it requires a kernel module to be configured
2:33 AM <•ryankemper> https://stackoverflow.com/a/49695117 kind of describes the issue we're seeing
2:34 AM <•ryankemper> If that comment is to be believed then a full `udev` restart instead of just the reload would unstick things
2:34 AM <•ryankemper> I wish it was more clear why that'd be the case tho
2:34 AM <•ebernhardson> huh, doesn't generally seem very dangerous to try
2:36 AM <•ryankemper> Tried it, didn't seem to make a difference
2:36 AM <•ryankemper> https://www.irccloud.com/pastebin/MhisLtdK/
Plain Text • 15 lines raw | line numbers



2:38 AM <•ryankemper> Hmm actually the rules file is wrong on this instance
2:38 AM <•ryankemper> https://www.irccloud.com/pastebin/uvkTQYzQ/
Plain Text • 2 lines raw | line numbers



2:38 AM <•ebernhardson> also, there are two files there
2:39 AM <•ebernhardson> 40-elasticsearch-readahead.rules and elasticsearch-readahead.rules
2:39 AM <•ryankemper> Ah
2:39 AM <•ebernhardson> i suspect at some point priorities were added, and the old version not cleaned up
2:40 AM — •ebernhardson doesn't check scrollback to see how many times i looked over that
2:40 AM <•ryankemper> Yeah because on this instance we didn't get permission denied, it successfully wrote 16 and then wrote 128 right after
2:41 AM <•ebernhardson> so really should have been looking at `udevadm test /sys/block/md1 2>&1 | grep read_ahead` instead of paging through the output, would have been obvious. Next time :)
2:43 AM — •ebernhardson thinks again puppet needs an everygreen-directory kind of thing :)
2:44 AM <•ryankemper> I don't quite understand why `/etc/udev/rules.d/elasticsearch-readahead.rules` is 128 but `/etc/udev/rules.d/40-elasticsearch-readahead.rules` is 16
2:45 AM <•ryankemper> Since in puppet it's `udev::rule { 'elasticsearch-readahead':` that sets it to 16
2:45 AM <•ryankemper> In other words I'd expect that to be reversed
2:45 AM <•ebernhardson> ryankemper: sec i can probably find a culprit patch
2:46 AM <•ebernhardson> https://gerrit.wikimedia.org/r/c/operations/puppet/+/447566
2:47 AM <•ebernhardson> ryankemper: not exactly the culprit, but in that patch i have a file { '/etc/udev/rules.d/elasticsearch-readahead.rules': ...}, so that created the unprioritized file
2:47 AM <•ebernhardson> at some point that was replaced with the current udev::rule { ... } which included the priority, but the old one created in linked patch wasn't cleaned up
2:48 AM <•ryankemper> But what makes the `40-elastic...` one be there
2:48 AM <•ebernhardson> ryankemper: the current udev::rule
2:49 AM <•ryankemper> Oh, what's the significance of the -40? I might be missing some understanding of how udev rules work
2:49 AM <•ebernhardson> it's a convention of the foo.d approach to daemon configuration, the daemon promises to load in sorted order, and you give files a leading NN to control load order
2:49 AM <•ryankemper> Ah, I see
2:50 AM <•ryankemper> And I just found `modules/udev/manifests/rule.pp` where priorioty of 40 is set
2:50 AM <•ebernhardson> yup that's the one
2:50 AM <•ryankemper> Okay, so in summary I think there were two related problems:
2:50 AM <•ryankemper> (1) The dangling priority-less file
2:50 AM <•ryankemper> (2) - The permission denied occurring due to systemd-udevd not restarting fully
2:51 AM <•ryankemper> That line up with your thoughts?
2:51 AM <•ebernhardson> yea. I suspect 1 might fix it, but partially because 2 is odd and doesn't seem to make sense yet :)
2:52 AM <•ryankemper> ebernhardson: So should be as simple as `sudo rm -fv /etc/udev/rules.d/elasticsearch-readahead.rules` as far as (1) is concerned yeah?
2:53 AM <•ebernhardson> yea, on all the different hosts with cirrus profile
2:55 AM <•ryankemper> So, we no longer have the colliding rules but permission denied is still happening actually...even with the restart
2:55 AM <•ebernhardson> hmm,
2:56 AM <•ryankemper> Nevermind
2:56 AM <•ryankemper> So the command as written in puppet to reload is
2:56 AM <•ryankemper> `sudo /sbin/udevadm control --reload && /sbin/udevadm trigger`, which fails
2:56 AM <•ryankemper> `sudo /sbin/udevadm control --reload && sudo /sbin/udevadm trigger` succeeds
2:57 AM <•ebernhardson> ahh, good call
2:57 AM <•ryankemper> So I assume I'll want to open up a patch for that
2:58 AM <•ebernhardson> makes me wonder, where have udev rules worked?
2:58 AM <•ebernhardson> i guess on restart they would have still loaded
2:59 AM <•ebernhardson> i guess i would add people from trafficserver and swift (have udev::rule defs) to the patch just so they know it might get applied on their instances if they dont restart basically ever
3:00 AM <•ebernhardson> actually, it wont actually apply, because it's only on notify
3:03 AM <•ryankemper> Another small bit of cleanup stuff to do
3:03 AM <•ryankemper> https://www.irccloud.com/pastebin/Mr23FTKl/
Plain Text • 12 lines raw | line numbers



3:04 AM <•ryankemper> `cloudelastic100[5,6]` were created more recently and thus lack `read_ahead_kb` as intended but `cloudelastic100[1-4]` have it, so I think I just want to nuke the file from those instances
3:04 AM <•ebernhardson> sure
3:06 AM <•ryankemper> https://www.irccloud.com/pastebin/9SgqiVOx/
Plain Text • 4 lines raw | line numbers



3:06 AM <•ryankemper> sys being funky is relevant now :P
3:07 AM <•ebernhardson> ryankemper: you can't delete things from /sys
3:07 AM <•ryankemper> Anyway, can figure that bit out later, but all the important nodes (codfw/eqiad) are looking correct now
3:07 AM <•ebernhardson> ryankemper: did you mean to delete from /etc/udevd/rules.d/ ?
3:08 AM <•ryankemper> Well the rule has already been nuked yet the readahead is still there I believe
3:09 AM <•ebernhardson> yup, reading everything it all looks sane there
3:09 AM <•ryankemper> Actually I was assuming cloudelastic shouldnt have them at all since 5 and 6 don't have them
3:09 AM <•ryankemper> i wonder if there's something misconfigured with `cloudelastic100[5,6]` actually
3:09 AM <•ebernhardson> looking, i only checked 1001
3:10 AM <•ryankemper> Actually they all have the rule, weird
3:11 AM <•ryankemper> Very bizarre
3:12 AM <•ebernhardson> in which way?
3:12 AM <•ebernhardson> i'm thinking it is setup wrong, dm-0 looks like wrong device
3:16 AM <•ryankemper> Yeah `lsblk` looks way different
3:16 AM <•ebernhardson> ahh, yea the disks are setup completely differently. I don't know why but for some reason partitioning disks is a major problem and we always end up with things different
3:17 AM <•ebernhardson> (not really major, but i mean i remember it constantly being brought up and changed and having custom fixes for this or that)
3:18 AM <•ryankemper> Yeah it seems to be a major pain point
3:18 AM <•ryankemper> Creating a quick phab ticket
3:19 AM <•ebernhardson> i'd avoid thinking about the structure and assume that part still works :) I'm not sure which level the readahead has to be set at though ...
3:19 AM <•ebernhardson> sure, a phab ticket makes sense
3:25 AM <•ryankemper> https://phabricator.wikimedia.org/T265699
3:25 AM <•ryankemper> Will try to circle back and make the ticket writeup less overwhelming at some point :P
3:25 AM ⇐ Nettrom quit (~nettrom@wikipedia/Nettrom) Quit: bye
3:25 AM <•ryankemper> Anyway so for codfw and eqiad the readahead is solved, so we'll want to do another round of restarts
3:26 AM <•ebernhardson> to change the udev rule there is a hiera variable, i'd try changing it to vg-2, or maybe md0
3:26 AM <•ryankemper> Okay so basically have a different hiera variable for 1005-1006 as compared with 1001-1004?
3:26 AM <•ebernhardson> err, it's not vg ..umm
3:27 AM <•ebernhardson> dm-2?
3:27 AM <•ryankemper> Finally I opened up https://gerrit.wikimedia.org/r/c/operations/puppet/+/634390 to add `sudo` to trigger
3:27 AM <•ebernhardson> in hieradata/role/common/elasticsearch/cloudelastic.yaml , would have to change to a regex rule i guess to target 1-4 vs 5-6
3:32 AM <•ebernhardson> i don't know if puppet natively supports regex hiera, but hieradata/regex.yaml is where we centralize all the regex definitions
3:36 AM <•ryankemper> Okay I reorganized https://phabricator.wikimedia.org/T265699 to make it emphasize the need to change hiera
3:36 AM <•ryankemper> Pretty vague but that will be part of the fun for the eventual ticket assignee :P
3:36 AM <•ebernhardson> :)