User Details
- User Since
- Oct 15 2019, 4:02 PM (140 w, 5 d)
- Availability
- Available
- IRC Nick
- rzl
- LDAP User
- RLazarus
- MediaWiki User
- Unknown
Mon, Jun 13
Fri, Jun 10
Mon, Jun 6
Wed, Jun 1
@EChetty Hi from the SLO project! Thanks for this -- Will asked me to have a look at the schema, from the POV of where we're going next with the edit save SLO. Broadly I think this is going to have everything we need; I just have a couple of clarifying questions.
Sat, May 28
May 16 2022
Great, thanks!
May 13 2022
Hmm, also: As a group access change, this should be reviewed and approved in the Infrastructure-Foundations team meeting.
We're past the European work day, so I don't expect a response from Lukasz (who's OOO) or Alex before Monday. I'm sure next week's SRE clinician will pick this up once it's signed off.
Hi @Dmantena, you should be all set now:
May 11 2022
This should be all set!
May 10 2022
Added to ldap/nda:
rzl@mwmaint1002:~$ ldapsearch -x cn=wmf | grep esther-akinloose
member: uid=esther-akinloose,ou=people,dc=wikimedia,dc=org
Thanks both! Proceeding.
After consulting with SRE colleagues, I stand corrected -- the email address on the account is fine, and we'll just use the wikimedia.org address in our own records. Going ahead!
@RoccoMo Hi from the SRE team! Thanks for the request, we'll get you sorted out shortly.
I think she already has a wikitech account: https://wikitech.wikimedia.org/wiki/User:Esther_Akinloose
Thanks both @Joe and @thcipriani for the ping -- agreed that clinic duty is as good a route for this as any.
May 9 2022
You're both in the wmf group already, so nothing to do there:
Just picking up SRE clinic duty for the week -- I'm so sorry this has been sitting for so long! I'll ask around and try to find out what happened here.
Grabbing this from @jhathaway as I've taken over SRE clinic duty for this week. This is actually the right template for the use case, since Superset access is via LDAP, not SSH (even though PII access is controlled with a posix group). We can go right ahead with it.
Apr 26 2022
Apr 18 2022
Apr 14 2022
Yeah -- I can do the implementation but I'm not sure if we've settled on what we want it to look like.
Apr 12 2022
Apr 8 2022
@MoritzMuehlenhoff Checking in -- have you had any time to take a look at this?
Apr 5 2022
Mar 31 2022
Draft incident report: https://wikitech.wikimedia.org/wiki/Incident_documentation/2022-03-31_api_errors
Mar 29 2022
Another way I'd like to improve this is to deal with Puppet skew on the two hosts.
Mar 28 2022
Found another example of this, in case the extra data helps -- thanks @MSantos and @ECohen_WMDE for pointing me to the right task. (Moved here from T228612.)
Mar 27 2022
Mar 24 2022
Mar 23 2022
Hmm, the 1.21.1 build didn't work out of the box. Running build-envoy-deb buster future got me this:
Mar 22 2022
As in T300324#7752134, I've rolled out all the k8s services where Envoy version was the only diff. We're now up to 1.18 everywhere, except for k8s services with other undeployed changes, and I'll follow up with those at the end.
Mar 20 2022
Mar 17 2022
Mar 16 2022
From the time sliders it looks like the issue is that all or part of the pad gets deleted and replaced by a character, at these revisions respectively:
Mar 14 2022
Mar 11 2022
Mar 10 2022
Oh, yep, it's strip_matching_host_port in the HTTP connection manager: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto
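For readers unfamiliar with the option: as documented, strip_matching_host_port removes the port from the Host/authority header only when that port equals the listener's local port. The following is a minimal Python sketch of that rule for illustration only. It is not Envoy's implementation; the function name simply reuses the option name, and the hostnames and port numbers in the usage lines are hypothetical.

```python
def strip_matching_host_port(host: str, listener_port: int) -> str:
    """Return `host` without its port if the port equals `listener_port`.

    Illustrative sketch of the documented rule, not Envoy code.
    """
    name, sep, port = host.rpartition(":")
    # No colon at all, or the text after the last colon is not a number
    # (e.g. a bare bracketed IPv6 literal such as "[::1]"): leave it alone.
    if not sep or not port.isdigit():
        return host
    # Strip only when the port matches the listener's own port.
    if int(port) == listener_port:
        return name
    return host


# Hypothetical values, chosen only to show the three cases.
print(strip_matching_host_port("thanos.example:8080", 8080))
print(strip_matching_host_port("thanos.example:9090", 8080))
print(strip_matching_host_port("thanos.example", 8080))
```

Only the first call changes its input, because that is the only case where the header's port matches the listener port.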
I just upgraded thanos-fe to envoy 1.18.3, but out of the box I see the same behavior:
Mar 9 2022
Mar 7 2022
Mar 4 2022
1.15.4 is still running in a few places on k8s -- after bumping the default version, I rolled out all services where that was the only diff. Some services had some undeployed changes from who knows how long ago, so I left them untouched (T265979 for that problem in general).
Mar 3 2022
To take a step back, the Varnish SLO dashboard linked in the description didn't actually originate from a template. Presumably it was forked by hand from the original etcd SLO example dashboard and then adjusted manually.
Mar 1 2022
Feb 25 2022
Feb 24 2022
Thanks for letting us know! We did indeed have this issue again for a few minutes earlier (intermittently between 02:36 and 03:00 UTC) but things are back to normal now. Sorry for the inconvenience, and more permanent solutions are in progress to keep this from happening again.
Feb 23 2022
This came up again in T301507.
Feb 17 2022
Feb 14 2022
Done, thanks!
Feb 12 2022
Feb 9 2022
Found another example of this, in case the extra data helps -- thanks @MSantos for pointing me to the right task.