Jan 28 2021

wkandek added a project to T273139: decommission rdb100[56].eqiad.wmnet: serviceops.
Jan 28 2021, 1:05 AM · SRE, ops-eqiad, decommission-hardware
wkandek added a project to T273140: decommission rdb200[3456].codfw.wmnet: serviceops.
Jan 28 2021, 1:05 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, serviceops, decommission-hardware
wkandek created T273140: decommission rdb200[3456].codfw.wmnet.
Jan 28 2021, 1:04 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, serviceops, decommission-hardware
wkandek created T273139: decommission rdb100[56].eqiad.wmnet.
Jan 28 2021, 1:02 AM · SRE, ops-eqiad, decommission-hardware
wkandek renamed T273138: decommission scb200[1234].codfw.wmnet from decommission rdb100[56].eqiad.wmnet to decommission scb200[1234].eqiad.wmnet.
Jan 28 2021, 12:57 AM · serviceops, decommission-hardware
wkandek updated the task description for T273138: decommission scb200[1234].codfw.wmnet.
Jan 28 2021, 12:53 AM · serviceops, decommission-hardware
wkandek created T273138: decommission scb200[1234].codfw.wmnet.
Jan 28 2021, 12:51 AM · serviceops, decommission-hardware
wkandek created T273137: decommission thumbor100[34].eqiad.wmnet.
Jan 28 2021, 12:47 AM · serviceops, decommission-hardware
wkandek created T273136: decommission scb100[1234].eqiad.wmnet.
Jan 28 2021, 12:44 AM · serviceops, decommission-hardware

Jan 14 2021

wkandek edited Description on SRE.
Jan 14 2021, 1:58 PM

Jan 11 2021

wkandek closed T271554: k8splay project has broken puppet because of incorrect FQDNs as Resolved.
Jan 11 2021, 5:55 PM · Cloud-VPS
wkandek added a comment to T271554: k8splay project has broken puppet because of incorrect FQDNs.

Instances deleted.

Jan 11 2021, 5:51 PM · Cloud-VPS

Dec 16 2020

wkandek awarded T269960: Schema properties in client code loads the whole item in every page view a Mountain of Wealth token.
Dec 16 2020, 6:27 PM · MW-1.37-notes (1.37.0-wmf.9; 2021-06-07), User-Ladsgroup, MW-1.36-notes (1.36.0-wmf.21; 2020-12-08), Performance-Team (Radar), Wikidata

Nov 29 2020

wkandek added a member for Service-deployment-requests: wkandek.
Nov 29 2020, 7:24 AM

Aug 18 2020

wkandek closed Restricted Task, a subtask of T257066: Extension:Score / Lilypond is disabled on all wikis, as Resolved.
Aug 18 2020, 1:55 PM · User-notice-archive, Patch-For-Review, MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Security-Team, Security, WMF-General-or-Unknown, MediaWiki-extensions-Score, SRE

Aug 7 2020

wkandek added a comment to T256863: restbase2009 down.

Will, is the unexpected and unknown state due to the Cassandra database state? I remember this being discussed a couple of days ago.

Aug 7 2020, 4:03 PM · RESTBase, SRE, ops-codfw

Jul 24 2020

wkandek added a comment to T258775: All wtp and parse servers have a bad partition scheme..

Back to 76%: php-1.36.0-wmf.1 and .41 are now way smaller.

Jul 24 2020, 5:31 PM · SRE, serviceops
wkandek closed T254530: Evaluate Locust Stress Test Tool as Resolved.
Jul 24 2020, 1:39 AM · serviceops

Jul 23 2020

wkandek triaged T257906: Move testreduce away from scandium to a separate Buster Ganeti VM as Medium priority.
Jul 23 2020, 10:57 PM · Parsoid (Tracking), Patch-For-Review, Parsoid-Tests, serviceops, SRE
wkandek moved T257906: Move testreduce away from scandium to a separate Buster Ganeti VM from Incoming 🐫 to Doing 😎 on the serviceops board.
Jul 23 2020, 10:56 PM · Parsoid (Tracking), Patch-For-Review, Parsoid-Tests, serviceops, SRE

Jul 19 2020

wkandek added a comment to T258336: db1082 crashed.
PST    UTC    Lag (h:mm)  Elapsed (h:mm)  Lag reduction (min)  Lag decay rate (min/min)
12:30  19:30  17:43
12:56  19:56  12:44       0:26            299                  11.50
13:42  20:42   5:38       0:46            426                   9.26
13:52  20:52   4:31       0:10             67                   6.70
14:08  21:08   1:20       0:16            191                  11.94
Jul 19 2020, 9:57 PM · SRE, DBA
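
The decay rates in the db1082 lag table above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the lag column is in hours:minutes (consistent with the "LagMinutes" figures in the table); the names `to_minutes`, `samples`, and `decay_rates` are illustrative and do not come from the task.

```python
def to_minutes(hmm: str) -> int:
    """Convert an 'h:mm' string to whole minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


# (UTC time, replication lag) pairs copied from the table in the comment.
# Assumption: lag is expressed as hours:minutes.
samples = [
    ("19:30", "17:43"),
    ("19:56", "12:44"),
    ("20:42", "5:38"),
    ("20:52", "4:31"),
    ("21:08", "1:20"),
]


def decay_rates(samples):
    """Return (utc, elapsed_min, lag_reduction_min, rate) per interval."""
    rows = []
    for (t0, lag0), (t1, lag1) in zip(samples, samples[1:]):
        elapsed = to_minutes(t1) - to_minutes(t0)        # wall-clock minutes
        reduction = to_minutes(lag0) - to_minutes(lag1)  # minutes of lag recovered
        rows.append((t1, elapsed, reduction, round(reduction / elapsed, 2)))
    return rows


for row in decay_rates(samples):
    print(row)
```

Each rate is the number of minutes of replication lag recovered per wall-clock minute, so a value near 10 means the replica was catching up roughly ten times faster than real time during that interval.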
wkandek added a comment to T258336: db1082 crashed.

Should dbtree reflect the replication lag on db1124?

Jul 19 2020, 6:27 PM · SRE, DBA

Jul 14 2020

wkandek updated subscribers of T257948: https://blog.wikimedia.org/ returning blank page?.

Christopher Koerner:e-mail: 35 minutes ago
Once I fix what is broken all the posts from blog.wikimedia.org will be at diff.wikimedia.org with redirects.

Jul 14 2020, 6:51 PM · Diff-blog
wkandek added a comment to T257948: https://blog.wikimedia.org/ returning blank page?.

A Slack conversation in OCG-General indicates this is part of the blog migration and will be addressed soon.

Jul 14 2020, 6:50 PM · Diff-blog

Jun 23 2020

wkandek added a comment to T255927: db1088 crashed.

HP says the server should not reboot due to a battery failure: https://support.hpe.com/hpesc/public/docDisplay?docId=mmr_kc-0126260#:~:text=POST%20Error%3A%20313%20%2D%20HPE%20Smart,other%20reasons%20for%20a%20reboot.

Jun 23 2020, 10:09 PM · DBA, SRE
wkandek added a comment to T255927: db1088 crashed.

Should a BBU failure cause a reboot?

Jun 23 2020, 5:28 PM · DBA, SRE

Jun 16 2020

wkandek created T255511: mcrouter memcached flapping in gutter pool.
Jun 16 2020, 12:21 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), serviceops

Jun 11 2020

wkandek added a comment to T255179: Session failures ("invalid CSRF token") preventing edits, login, logout, etc due to kask outage.

https://grafana.wikimedia.org/d/000001590/sessionstore?orgId=1&from=1591900083237&to=1591903561732

Jun 11 2020, 9:10 PM · Wikimedia-Incident, Platform Engineering, MediaWiki-Core-AuthManager, MediaWiki-User-login-and-signup, User-DannyS712

Jun 4 2020

wkandek triaged T254530: Evaluate Locust Stress Test Tool as Medium priority.
Jun 4 2020, 10:52 PM · serviceops
wkandek moved T254530: Evaluate Locust Stress Test Tool from Incoming 🐫 to Doing 😎 on the serviceops board.
Jun 4 2020, 10:51 PM · serviceops
wkandek added a comment to T254530: Evaluate Locust Stress Test Tool.

Locust was first set up locally in VirtualBox VMs and functionally tested. Vagrant was used to generate the VMs, and Puppet was used for software installation and configuration. Puppet is also used to propagate changes to the test specification file.
Vagrant was also used in the port to AWS, and functional testing works well there. However, it is cumbersome to generate hundreds of VMs with Vagrant, as API throttling seems to kick in. It is easier and far faster to clone one of the Locust worker machines and use the AWS API to generate additional workers. That way, 1000 machines can be brought up in under 1 hour. The cost of such a test is fairly low: under 10 USD/hour.

Jun 4 2020, 10:47 PM · serviceops
wkandek added a parent task for T254530: Evaluate Locust Stress Test Tool: Unknown Object (Task).
Jun 4 2020, 10:34 PM · serviceops
wkandek created T254530: Evaluate Locust Stress Test Tool.
Jun 4 2020, 10:32 PM · serviceops

May 21 2020

wkandek added a comment to T249352: Onboarding Wolfgang Kandek.

pwstore installed; access is pending regeneration of the PGP key. Task can be closed.

May 21 2020, 6:31 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE

May 15 2020

wkandek added a comment to T252577: Maxmind data update issues for DNS (and others?).

Just FYI: my machine is being served from esams again.

May 15 2020, 3:04 AM · SRE, Traffic

May 14 2020

wkandek added a comment to P11193 SRE onboarding chat demo.

from the slide: curl -H "Content-Type: application/yaml" --data-binary @data https://blubberoid.wikimedia.org/v1/hello

May 14 2020, 1:29 PM

Apr 27 2020

wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 27 2020, 10:41 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 27 2020, 9:46 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 27 2020, 8:09 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek added a comment to T249352: Onboarding Wolfgang Kandek.

RSA Key from yubikey:

Apr 27 2020, 12:14 AM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek added a comment to T249352: Onboarding Wolfgang Kandek.
gpg --fingerprint 9B51CE0772203719B26C8ED3EEABB9556398421F
pub   rsa4096 2020-04-23 [SC]
      9B51 CE07 7220 3719 B26C  8ED3 EEAB B955 6398 421F
uid           [ultimate] Wolfgang Kandek <wkandek@wikimedia.org>
sub   rsa4096 2020-04-23 [E]
sub   rsa4096 2020-04-23 [A]
Apr 27 2020, 12:13 AM · LDAP-Access-Requests, SRE-Access-Requests, SRE

Apr 20 2020

wkandek added a watcher for serviceops: wkandek.
Apr 20 2020, 3:56 PM

Apr 13 2020

wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 13 2020, 8:23 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 13 2020, 8:23 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE
wkandek updated the task description for T249352: Onboarding Wolfgang Kandek.
Apr 13 2020, 6:58 PM · LDAP-Access-Requests, SRE-Access-Requests, SRE