Dzahn (Daniel Zahn)
Operations Engineer · Administrator

User Details

User Since
Sep 30 2014, 4:39 PM (280 w, 5 d)
Roles
Administrator
Availability
Available
IRC Nick
mutante
LDAP User
Dzahn
MediaWiki User
Unknown

Recent Activity

Yesterday

Dzahn added a comment to T210411: Applayer services without TLS.

meanwhile there is another one in ATS backend.yaml.

Sat, Feb 15, 3:26 AM · Patch-For-Review, serviceops, Operations, Traffic
Dzahn updated the task description for T210411: Applayer services without TLS.
Sat, Feb 15, 3:25 AM · Patch-For-Review, serviceops, Operations, Traffic

Fri, Feb 14

Dzahn added a project to T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue): Analytics.
Fri, Feb 14, 9:59 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn reassigned T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue) from jijiki to elukey.
Fri, Feb 14, 9:58 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn renamed T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue) from LDAP access to the wmf group for CherRaye Glenn to LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue).
Fri, Feb 14, 9:58 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn reopened T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue) as "Open".

@Ottomata or @elukey Please sync users for Hue access ^.

Fri, Feb 14, 9:58 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn updated subscribers of T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue).

@CGlenn You can, but there is an extra step involved where @Ottomata or @elukey need to sync users (per https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hue#Access).

Fri, Feb 14, 9:57 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn added a comment to T240715: Configure prometheus monitoring for Ceph.

new checks have been added to Icinga:

Fri, Feb 14, 9:51 PM · Patch-For-Review, Epic, cloud-services-team (Kanban)
Dzahn updated the task description for T244626: vm requests for APT repo / webserver.
Fri, Feb 14, 7:10 PM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn placed T245279: decommission kraz.wikimedia.org up for grabs.
Fri, Feb 14, 4:28 PM · decommission, serviceops, Operations, Analytics
Dzahn changed the status of T245279: decommission kraz.wikimedia.org, a subtask of T244719: Create a replacement for kraz.wikimedia.org, from Open to Stalled.
Fri, Feb 14, 4:28 PM · serviceops, Operations, vm-requests, User-Elukey, Analytics
Dzahn changed the status of T245279: decommission kraz.wikimedia.org from Open to Stalled.

Setting to stalled to reflect that. We should change status to Open when it's ready to go.

Fri, Feb 14, 4:28 PM · decommission, serviceops, Operations, Analytics
Dzahn added a comment to T244719: Create a replacement for kraz.wikimedia.org.

@MoritzMuehlenhoff wrote:
Given that Luca also had an error during initial setup related to name resolution, this sounds like some error related to the DNS records for the new host?

Fri, Feb 14, 4:23 PM · serviceops, Operations, vm-requests, User-Elukey, Analytics
Dzahn claimed T245279: decommission kraz.wikimedia.org.
Fri, Feb 14, 4:21 PM · decommission, serviceops, Operations, Analytics
Dzahn renamed T245279: decommission kraz.wikimedia.org from decom kraz.wikimedia.org to decommission kraz.wikimedia.org.
Fri, Feb 14, 4:20 PM · decommission, serviceops, Operations, Analytics
Dzahn created T245279: decommission kraz.wikimedia.org.
Fri, Feb 14, 4:19 PM · decommission, serviceops, Operations, Analytics

Wed, Feb 12

Dzahn added a comment to T244719: Create a replacement for kraz.wikimedia.org.

The primary network interface is missing from /etc/network/interfaces. There is only loopback in there. Why that is the case is another question.

Wed, Feb 12, 6:46 PM · serviceops, Operations, vm-requests, User-Elukey, Analytics
Dzahn added a comment to T244719: Create a replacement for kraz.wikimedia.org.
Debug: Augeas[ens5_v6_token](provider=augeas): sending command 'set' with params ["/files/etc/network/interfaces/iface[. = 'ens5']/pre-up", "/sbin/ip token set ::208:80:153:62 dev ens5"]
Debug: Augeas[ens5_v6_token](provider=augeas): Put failed on one or more files, output from /augeas//error:
Debug: Augeas[ens5_v6_token](provider=augeas): /augeas/files/etc/network/interfaces/error = put_failed
Debug: Augeas[ens5_v6_token](provider=augeas): /augeas/files/etc/network/interfaces/error/path = /files/etc/network/interfaces/
Debug: Augeas[ens5_v6_token](provider=augeas): /augeas/files/etc/network/interfaces/error/lens = /usr/share/augeas/lenses/dist/interfaces.aug:125.13-.63:
Debug: Augeas[ens5_v6_token](provider=augeas): /augeas/files/etc/network/interfaces/error/message = Failed to match tree under /
Debug: Augeas[ens5_v6_token](provider=augeas): Closed the augeas connection
Error: /Stage[main]/Profile::Standard/Interface::Add_ip6_mapped[main]/Augeas[ens5_v6_token]: Could not evaluate: Saving failed, see debug
Wed, Feb 12, 6:42 PM · serviceops, Operations, vm-requests, User-Elukey, Analytics
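
The Augeas failure above appears consistent with the 6:46 PM comment that /etc/network/interfaces contained only a loopback entry, so there was no complete `iface ens5` stanza for the `pre-up` line to attach to. As a minimal sketch (not the tooling used on the task), the Python below lists which interfaces have an `iface` stanza in ifupdown-style text. The function name `declared_ifaces` and the sample file contents are hypothetical, chosen only to mirror the situation described.

```python
def declared_ifaces(interfaces_text):
    """Return the set of interface names that have an 'iface' stanza.

    Assumes ifupdown syntax: 'iface <name> <family> <method>'.
    Comments and blank lines are ignored.
    """
    names = set()
    for line in interfaces_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if fields[0] == "iface" and len(fields) >= 2:
            names.add(fields[1])
    return names


# Hypothetical file contents with only a loopback entry, as described above.
sample = """\
# The loopback network interface
auto lo
iface lo inet loopback
"""

if "ens5" not in declared_ifaces(sample):
    print("ens5 has no iface stanza; a pre-up line cannot be attached to it")
```

On a real host the text would be read from /etc/network/interfaces instead of the inline sample.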
Dzahn awarded T241109: wikibugs needs restart almost everyday a Barnstar token.
Wed, Feb 12, 6:10 PM · Operations, Wikibugs
Dzahn added a comment to T244792: Determine any impacts to SRE from OIT's planned move to JumpCloud for LDAP.

the mail team is already offloading aliases from the mx servers to Google per T122144

Wed, Feb 12, 6:02 PM · User-jbond, Security-Team, Operations

Tue, Feb 11

Dzahn added a comment to T243800: gerritro user getting access denied from gerrit1002.

@Marostegui I made some changes to make the db_user and db_pass configurable for Gerrit. The only problem is that I don't know the clear-text version of the hashed password for 'gerritro'. I took a look at the relevant m2-master behind the dbproxies, db1132, and I see

Tue, Feb 11, 10:45 PM · Patch-For-Review, Operations, Gerrit
Dzahn closed T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue) as Invalid.

This should already work: https://phabricator.wikimedia.org/T236209#5597161 Please confirm the credentials work on Wikitech wiki and that you are using the exact same ones on superset. Let us know if there are any remaining issues.

Tue, Feb 11, 10:09 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn added a comment to T244410: LDAP access to the wmf group for CherRaye Glenn (superset, turnilo, hue).

@CGlenn This has already been done as part of T236209.

Tue, Feb 11, 10:04 PM · Analytics, LDAP-Access-Requests, Operations
Dzahn updated the task description for T244148: Add Itamar Givon to the ldap/wmde group.
Tue, Feb 11, 10:02 PM · WMF-Legal, LDAP-Access-Requests, Operations
Dzahn closed T244148: Add Itamar Givon to the ldap/wmde group as Resolved.

@ItamarWMDE This is done. You are a member of both "wmde" and "nda" LDAP groups like other WMDE employees before.

Tue, Feb 11, 10:02 PM · WMF-Legal, LDAP-Access-Requests, Operations
Dzahn renamed T244490: access to Superset for Alex Hollender from Get access to Superset to access to Superset for Alex Hollender.
Tue, Feb 11, 9:59 PM · Operations, LDAP-Access-Requests
Dzahn committed rLPRI4995e183e007: remove gerrit db_pass from passwords module, moved to private hiera (authored by Dzahn).
Tue, Feb 11, 9:04 PM
Dzahn committed rLPRI6b140607b134: add fake db_pass for Gerrit (authored by Dzahn).
Tue, Feb 11, 9:03 PM
Dzahn added a comment to T244802: evaluate deploying the MediaWiki ArchiveLeaf extension in production.

Thank you very much @Peachey88

Tue, Feb 11, 8:40 PM · Wikimedia-Extension-setup, Wikimedia-extension-review-queue, serviceops-radar, Internet-Archive
Dzahn added a comment to T240771: Create a wiki for Wikimedia User Group Nigeria.
Tue, Feb 11, 6:58 PM · MW-1.35-notes (1.35.0-wmf.14; 2020-01-07), Google-Code-in-2019, User-Urbanecm, Wiki-Setup (Create)

Mon, Feb 10

Dzahn created T244802: evaluate deploying the MediaWiki ArchiveLeaf extension in production.
Mon, Feb 10, 10:51 PM · Wikimedia-Extension-setup, Wikimedia-extension-review-queue, serviceops-radar, Internet-Archive
Dzahn added a comment to T242606: No mw canary servers in codfw.

done now?

Mon, Feb 10, 10:40 PM · Operations, serviceops
Dzahn added a comment to T242606: No mw canary servers in codfw.

I added 2 more canary appservers. Now we have:

Mon, Feb 10, 10:39 PM · Operations, serviceops
Dzahn updated the task description for T242606: No mw canary servers in codfw.
Mon, Feb 10, 10:39 PM · Operations, serviceops
Dzahn updated the task description for T242606: No mw canary servers in codfw.
Mon, Feb 10, 10:37 PM · Operations, serviceops
Dzahn added a comment to T218412: Define a mediawiki "version".

The problem I see with that is: how do you define what a "small" change is? It feels like when we call things "trivial" but then there are unexpected changes anyway. The very nature of them being unexpected means they would all be called small or trivial before the fact.

Mon, Feb 10, 8:03 PM · Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, serviceops, Scap
Dzahn added a comment to T244719: Create a replacement for kraz.wikimedia.org.

Just to confirm: it should still have a public IP in wikimedia.org?

Mon, Feb 10, 7:51 PM · serviceops, Operations, vm-requests, User-Elukey, Analytics
Dzahn added a comment to T244407: Requesting access to sites from Google Search Console.

@kzimmerman Gotcha, I created a request for them at https://wikimedia.zendesk.com/hc/en-us/requests/new

Mon, Feb 10, 7:43 PM · Operations, SRE-Access-Requests
Eevans awarded T244508: Request for +2 access to mediawiki-config a Party Time token.
Mon, Feb 10, 7:37 PM · Release-Engineering-Team, Operations, SRE-Access-Requests, Gerrit-Privilege-Requests
Dzahn closed T244508: Request for +2 access to mediawiki-config as Resolved.

@Eevans You should have +2 on the mw-config repo now. Probably after logging out and back in.

Mon, Feb 10, 7:30 PM · Release-Engineering-Team, Operations, SRE-Access-Requests, Gerrit-Privilege-Requests
Dzahn added a comment to T244508: Request for +2 access to mediawiki-config.

@jijiki Eric is already in the "deployment" shell users group, and membership in the relevant Gerrit group normally goes with that, since it makes little sense to trust people with deploying whatever they want but not let them merge config changes in Gerrit. Gerrit access requests can be handled by any Gerrit admin or Gerrit manager, or, for this specific group, also by anyone in the ops LDAP group, as Urbanecm points out.

Mon, Feb 10, 7:27 PM · Release-Engineering-Team, Operations, SRE-Access-Requests, Gerrit-Privilege-Requests
Dzahn added a comment to T244407: Requesting access to sites from Google Search Console.

@kzimmerman Hi, in OIT's LDAP record the manager is neither of them but is listed as "dzierten". That's the only place we can check though. Should this be fixed in OIT?

Mon, Feb 10, 7:21 PM · Operations, SRE-Access-Requests
Dzahn updated the task description for T244766: SAL on wikitech missing data.
Mon, Feb 10, 5:46 PM · Stashbot, Operations
Dzahn created T244766: SAL on wikitech missing data.
Mon, Feb 10, 5:46 PM · Stashbot, Operations

Sat, Feb 8

Dzahn updated the task description for T244381: Requesting access to Deployment for Clarakosi.
Sat, Feb 8, 12:55 AM · SRE-Access-Requests, Operations
Dzahn updated the task description for T244381: Requesting access to Deployment for Clarakosi.
Sat, Feb 8, 12:52 AM · SRE-Access-Requests, Operations
Dzahn added a comment to T244626: vm requests for APT repo / webserver.

regarding disk requirements:

Sat, Feb 8, 12:27 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn updated the task description for T244626: vm requests for APT repo / webserver.
Sat, Feb 8, 12:26 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn added a subtask for T242602: Sort out plan for install* servers in edge sites: T244626: vm requests for APT repo / webserver.
Sat, Feb 8, 12:24 AM · Patch-For-Review, Operations
Dzahn added a parent task for T244626: vm requests for APT repo / webserver: T242602: Sort out plan for install* servers in edge sites.
Sat, Feb 8, 12:24 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn added a parent task for T224576: Upgrade install servers to Buster: T242602: Sort out plan for install* servers in edge sites.
Sat, Feb 8, 12:24 AM · Patch-For-Review, Operations
Dzahn added a subtask for T242602: Sort out plan for install* servers in edge sites: T224576: Upgrade install servers to Buster.
Sat, Feb 8, 12:23 AM · Patch-For-Review, Operations
Dzahn edited projects for T228924: rack/setup/install ganeti10([09]|1[0-8]).eqiad.wmnet, added: serviceops; removed vm-requests.
Sat, Feb 8, 12:20 AM · serviceops, Operations
Dzahn moved T210582: New node request: oresrdb[12]003 from Backlog to stalled on the vm-requests board.
Sat, Feb 8, 12:19 AM · Scoring-platform-team, vm-requests, Operations, ORES
Dzahn moved T215421: Site: 1 VM request for recommender-systems from Backlog to stalled on the vm-requests board.
Sat, Feb 8, 12:19 AM · vm-requests, Operations
Dzahn moved T244357: Provision grafana VM in codfw from Backlog to VM created on the vm-requests board.
Sat, Feb 8, 12:19 AM · serviceops, vm-requests, observability, Operations
Dzahn added projects to T244626: vm requests for APT repo / webserver: vm-requests, serviceops-radar, Operations.
Sat, Feb 8, 12:18 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn updated subscribers of T244626: vm requests for APT repo / webserver.

@Muehlenhoff This was the idea, right? Or was it to share with an existing webserver?

Sat, Feb 8, 12:17 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn created T244626: vm requests for APT repo / webserver.
Sat, Feb 8, 12:17 AM · Patch-For-Review, Operations, serviceops-radar, vm-requests
Dzahn added a comment to T244357: Provision grafana VM in codfw.

VM looks up and running, all green in Icinga.

Sat, Feb 8, 12:11 AM · serviceops, vm-requests, observability, Operations

Fri, Feb 7

Dzahn updated the task description for T224576: Upgrade install servers to Buster.
Fri, Feb 7, 11:21 PM · Patch-For-Review, Operations
Dzahn closed T244390: VM requests for install_server replacements, a subtask of T224576: Upgrade install servers to Buster, as Resolved.
Fri, Feb 7, 8:43 PM · Patch-For-Review, Operations
Dzahn closed T244390: VM requests for install_server replacements as Resolved.

VMs have been created. The OS install worked on the second attempt.

Fri, Feb 7, 8:43 PM · Operations, vm-requests
Dzahn updated subscribers of T243847: Add pcov PHP extension to wikimedia apt so it can be used in Wikimedia CI.
Fri, Feb 7, 8:28 PM · serviceops, Release-Engineering-Team-TODO, Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Test-Coverage

Thu, Feb 6

Dzahn created T244545: Add x-request-id to httpd (apache) logs.
Thu, Feb 6, 10:56 PM · Operations, Traffic, serviceops
Dzahn moved T243009: Make scap skip restarting php-fpm when using --force from On-going to Follow-up on the Wikimedia-Incident board.
Thu, Feb 6, 10:46 PM · Wikimedia-Incident, Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), Release-Engineering-Team (Deployment services), Scap
Dzahn added a project to T243009: Make scap skip restarting php-fpm when using --force: Wikimedia-Incident.
Thu, Feb 6, 10:46 PM · Wikimedia-Incident, Release-Engineering-Team-TODO (2020-01 to 2020-03 (Q3)), Release-Engineering-Team (Deployment services), Scap
Dzahn updated subscribers of T224576: Upgrade install servers to Buster.

@Muehlenhoff Added them with private IPs, created VMs with the cookbook, then attempted OS install but on both of them it failed at the very end with GRUB install. I don't think this happened to me before on a ganeti VM.

Thu, Feb 6, 10:27 PM · Patch-For-Review, Operations
Dzahn added a comment to T242606: No mw canary servers in codfw.

@jijiki What do you think? Is this good now? 4 of each type and in different rows/racks.

Thu, Feb 6, 10:24 PM · Operations, serviceops
Dzahn added a comment to T242606: No mw canary servers in codfw.

mw2163 and mw2271 have been turned into canary appservers now. As opposed to canary API appservers, this means actual puppet changes, which are:

Thu, Feb 6, 10:23 PM · Operations, serviceops
Dzahn updated the task description for T242606: No mw canary servers in codfw.
Thu, Feb 6, 10:20 PM · Operations, serviceops
Dzahn moved T244535: wikifeeds - fix the CPU limits so that it doesn't get starved from On-going to Follow-up on the Wikimedia-Incident board.
Thu, Feb 6, 9:21 PM · Wikimedia-Incident, serviceops, Wikifeeds
Dzahn added a project to T244535: wikifeeds - fix the CPU limits so that it doesn't get starved: Wikimedia-Incident.
Thu, Feb 6, 9:20 PM · Wikimedia-Incident, serviceops, Wikifeeds
Dzahn updated the task description for T244535: wikifeeds - fix the CPU limits so that it doesn't get starved.
Thu, Feb 6, 9:10 PM · Wikimedia-Incident, serviceops, Wikifeeds
Dzahn created T244535: wikifeeds - fix the CPU limits so that it doesn't get starved.
Thu, Feb 6, 9:09 PM · Wikimedia-Incident, serviceops, Wikifeeds
Dzahn added a comment to T244530: upgrade memory in ganeti100[5-8].eqiad.wmnet.

As of today these hosts are in site.pp with the spare::system role and not in production yet. So while standard Icinga alerts should be downtimed, no actual Ganeti service would be affected and the upgrade could happen anytime. Before doing that, just check in site.pp that they still have the spare role and not the ganeti role.

Thu, Feb 6, 7:56 PM · ops-eqiad, Operations
Dzahn added a comment to T242309: Onboarding Hugh Nowlan.

Aww, thanks for confirming and thanks Moritz for fixing it. This is exactly why I wanted to test it. The capitalization caught us a couple of times before.

Thu, Feb 6, 4:25 PM · serviceops-radar, Core Platform Team Workboards (Clinic Duty Team), Operations, SRE-Access-Requests
Dzahn updated the task description for T244390: VM requests for install_server replacements.
Thu, Feb 6, 2:44 AM · Operations, vm-requests
Dzahn added a comment to T244390: VM requests for install_server replacements.

reverted / removed public IPs, added private IPs

Thu, Feb 6, 2:19 AM · Operations, vm-requests
Dzahn added a comment to T244390: VM requests for install_server replacements.

https://gerrit.wikimedia.org/r/c/operations/dns/+/570468

Thu, Feb 6, 2:18 AM · Operations, vm-requests
Dzahn changed the status of T241852: rack/setup/install new codfw mw systems from Open to Stalled.

currently blocked on T244438, an installer issue that only happens on stretch; buster would not have this problem

Thu, Feb 6, 2:15 AM · ops-codfw, serviceops, Operations
Dzahn added a project to T244438: codfw: new mw servers not getting an IP when default to Stretch: serviceops-radar.
Thu, Feb 6, 2:13 AM · serviceops-radar, Operations, ops-codfw
Dzahn updated the task description for T224576: Upgrade install servers to Buster.
Thu, Feb 6, 2:13 AM · Patch-For-Review, Operations

Wed, Feb 5

Dzahn closed T244389: Request for +2 access to mediawiki-config as Resolved.

@Pchelolo This should work now.

Wed, Feb 5, 10:10 PM · SRE-Access-Requests, Operations, Gerrit-Privilege-Requests
Dzahn added a comment to T244389: Request for +2 access to mediawiki-config.

As @MarcoAurelio points out this normally goes together with getting the deployment admin group. Petr is already a member of that (and various other shell admin groups).

Wed, Feb 5, 10:10 PM · SRE-Access-Requests, Operations, Gerrit-Privilege-Requests
Dzahn claimed T244389: Request for +2 access to mediawiki-config.
Wed, Feb 5, 10:04 PM · SRE-Access-Requests, Operations, Gerrit-Privilege-Requests
Dzahn added a comment to T113785: Make the Shinken IRC alert and icinga-wm bots use colors.

test update

Wed, Feb 5, 9:38 PM · Operations, Shinken
Dzahn added a comment to T237109: Exempt 'wikibugs' IRC bot from flood rate limits.

test update

Wed, Feb 5, 9:36 PM · Wikibugs, wikimedia-irc-freenode
Dzahn added a comment to T149287: Heating alerts for mw servers in eqiad.

mw1267 was showing temperature issues today:

Wed, Feb 5, 8:30 PM · Operations, ops-eqiad
Dzahn added a comment to T243808: gerrit1002 running out of space.

See T243983. I added a second disk to this VM, it's an additional 10GB and mounted on /srv/dbdump. Hope that does it.

Wed, Feb 5, 7:59 PM · Operations, Gerrit
Dzahn added a parent task for T243808: gerrit1002 running out of space: T239151: Gerrit VM to test data migration.
Wed, Feb 5, 7:58 PM · Operations, Gerrit
Dzahn added a subtask for T239151: Gerrit VM to test data migration: T243808: gerrit1002 running out of space.
Wed, Feb 5, 7:58 PM · Gerrit, Operations, vm-requests
Dzahn added a comment to T244146: Remove mobrovac@wikimedia.org from techcom@wikimedia.org.

@jijiki On our end in private repo. puppetmaster1001:/srv/private/modules/privateexim/files (already done though)

Wed, Feb 5, 7:49 PM · Operations
Dzahn removed a project from T242309: Onboarding Hugh Nowlan: LDAP-Access-Requests.
Wed, Feb 5, 7:48 PM · serviceops-radar, Core Platform Team Workboards (Clinic Duty Team), Operations, SRE-Access-Requests
Dzahn closed T243802: Request for LDAP access to the WMF group for Sakti Pramudya as Resolved.

@SpramudyaDev You have been added to the "wmf" group. You should now be able to login with the same credentials used on Wikitech wiki.

Wed, Feb 5, 7:46 PM · Operations, LDAP-Access-Requests
Dzahn updated the task description for T244390: VM requests for install_server replacements.
Wed, Feb 5, 7:41 PM · Operations, vm-requests
Dzahn added a comment to T244390: VM requests for install_server replacements.

Hmm.. fair enough. That means my DNS change was not correct though, it defined public IPs as before.

Wed, Feb 5, 7:39 PM · Operations, vm-requests
Dzahn updated the task description for T242606: No mw canary servers in codfw.
Wed, Feb 5, 7:27 PM · Operations, serviceops
Dzahn added a comment to T242606: No mw canary servers in codfw.

The following are now declared canary API appservers in site.pp:

Wed, Feb 5, 7:06 PM · Operations, serviceops
Dzahn claimed T244381: Requesting access to Deployment for Clarakosi.
Wed, Feb 5, 6:53 PM · Operations, SRE-Access-Requests