
Dzahn (Daniel Zahn)
SRE for collaborative services

User Details

User Since
Sep 30 2014, 4:39 PM (491 w, 5 d)
Availability
Available
IRC Nick
mutante
LDAP User
Dzahn
MediaWiki User
Mutante [ Global Accounts ]

Recent Activity

Fri, Mar 1

Dzahn changed the status of T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group, a subtask of T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA, from Open to In Progress.
Fri, Mar 1, 8:43 PM · Infrastructure-Foundations, WMF-NDA-Requests
Dzahn changed the status of T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group from Open to In Progress.
Fri, Mar 1, 8:43 PM · WMF-NDA-Requests
Dzahn updated the task description for T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.
Fri, Mar 1, 8:42 PM · WMF-NDA-Requests
Dzahn added a member for WMF-NDA: Lena_WMDE.
Fri, Mar 1, 8:40 PM
Dzahn added a member for WMF-NDA: Kris_Litson_WMDE.
Fri, Mar 1, 8:39 PM
Dzahn added a member for WMF-NDA: jdfraine.
Fri, Mar 1, 8:38 PM
Dzahn added a member for WMF-NDA: lojo_wmde.
Fri, Mar 1, 8:38 PM
Dzahn updated the task description for T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.
Fri, Mar 1, 8:37 PM · WMF-NDA-Requests
Dzahn added a member for WMF-NDA: WMDE-Fisch.
Fri, Mar 1, 8:33 PM
Dzahn added a member for WMF-NDA: kai.nissen.
Fri, Mar 1, 8:32 PM
Dzahn added a member for WMF-NDA: CorinnaHillebrand_WMDE.
Fri, Mar 1, 8:31 PM
Dzahn added a member for WMF-NDA: Aline_Bruenger_WMDE.
Fri, Mar 1, 8:30 PM
Dzahn closed T358237: Ganeti VM for contint migration as Resolved.
Fri, Mar 1, 7:21 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn closed T358237: Ganeti VM for contint migration, a subtask of T334517: upgrade contint servers to bullseye, as Resolved.
Fri, Mar 1, 7:20 PM · Release-Engineering-Team (Radar), collaboration-services
Dzahn added a comment to T358237: Ganeti VM for contint migration.

This is done now. Please use contint1003.eqiad.wmnet with private IP.

Fri, Mar 1, 7:19 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn renamed T358832: ProbeDown - vrts1002 from ProbeDown to ProbeDown - vrst1002.
Fri, Mar 1, 8:48 AM · collaboration-services

Thu, Feb 29

Dzahn committed rLPRI80478165d464: delete passwords for wikimania_scholarships, tor, private_static_site (authored by Dzahn).
delete passwords for wikimania_scholarships, tor, private_static_site
Thu, Feb 29, 10:11 PM
Dzahn closed T355388: Request to add STran to WMF-NDA group as Resolved.

Done. @STran has been added to the WMF-NDA group here in Phabricator (https://phabricator.wikimedia.org/project/manage/61/). Confirmed existing LDAP membership in the "wmf" group and that the Phabricator user is linked to LDAP.

Thu, Feb 29, 9:56 PM · WMF-NDA-Requests
Dzahn added a comment to T358091: Grant Access to Superset for ifeatu_nnaobi_wmde.

@Ifeatu_Nnaobi_WMDE You have also been added to the WMF-NDA group in Phabricator and can now see restricted tickets.

Thu, Feb 29, 9:53 PM · SRE, LDAP-Access-Requests
Dzahn added a member for WMF-NDA: STran.
Thu, Feb 29, 9:52 PM
Dzahn added a subtask for T349595: Clarify if NDAs (to access #WMF-NDA protected Phab tasks) are on paper or in Legalpad's L2 or both: T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA.
Thu, Feb 29, 9:41 PM · WMF-Legal, Legalpad, User-AKlapper, Phabricator, WMF-NDA-Requests
Dzahn added a parent task for T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA: T349595: Clarify if NDAs (to access #WMF-NDA protected Phab tasks) are on paper or in Legalpad's L2 or both.
Thu, Feb 29, 9:41 PM · Infrastructure-Foundations, WMF-NDA-Requests
Dzahn added a comment to T349595: Clarify if NDAs (to access #WMF-NDA protected Phab tasks) are on paper or in Legalpad's L2 or both.

This would also help to solve T358578

Thu, Feb 29, 9:41 PM · WMF-Legal, Legalpad, User-AKlapper, Phabricator, WMF-NDA-Requests
Dzahn added a comment to T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA.

This is basically the reason we got T358578.

Thu, Feb 29, 9:40 PM · Infrastructure-Foundations, WMF-NDA-Requests
Dzahn added a comment to T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.

@WMDE-leszek By the way this is also T299839 and T349595

Thu, Feb 29, 9:39 PM · WMF-NDA-Requests
Dzahn added a parent task for T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group: T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA.
Thu, Feb 29, 9:39 PM · WMF-NDA-Requests
Dzahn added a subtask for T299839: Clarify whether members of ldap/nda should be added to #WMF-NDA: T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.
Thu, Feb 29, 9:39 PM · Infrastructure-Foundations, WMF-NDA-Requests
Dzahn closed T358562: Add Fring to WMF-NDA group as Resolved.

Done over at T358578#9589085

Thu, Feb 29, 9:37 PM · WMF-NDA-Requests
Dzahn added a member for WMF-NDA: Fring.
Thu, Feb 29, 9:31 PM
Dzahn added a comment to T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.

@WMDE-leszek T358091 is done and I added Ifeatu to WMF-NDA in Phabricator.

Thu, Feb 29, 9:28 PM · WMF-NDA-Requests
Dzahn added a member for WMF-NDA: Ifeatu_Nnaobi_WMDE.
Thu, Feb 29, 9:28 PM
Dzahn closed T358091: Grant Access to Superset for ifeatu_nnaobi_wmde as Resolved.
[mwmaint1002:~] $ ldapsearch -x mail=ifeatu*wikimedia.de
Thu, Feb 29, 9:25 PM · SRE, LDAP-Access-Requests
Dzahn closed T358584: Grant Access to nda, wmde for Frederik Ring as Resolved.
Thu, Feb 29, 8:56 PM · SRE, LDAP-Access-Requests
Dzahn added a comment to T358584: Grant Access to nda, wmde for Frederik Ring.

Hi all. This is done. Frederik (frri in LDAP) has been added to the groups nda and wmde. Everything should work as it does for other WMDE employees.

Thu, Feb 29, 8:54 PM · SRE, LDAP-Access-Requests
Dzahn added a comment to T358584: Grant Access to nda, wmde for Frederik Ring.
[mwmaint1002:~] $ ldapsearch -x mail=fred*wikimedia.de
..
uidNumber: 43019
..
Thu, Feb 29, 8:42 PM · SRE, LDAP-Access-Requests
Dzahn added a comment to T358578: Add WMDE staff who have signed the NDA with the WMF to the WMF-NDA phabricator policy group.

@Aklapper I'm about to handle this. This is the equivalent to what you added for WMF users but for WMDE/NDA. I think we have a ticket?

Thu, Feb 29, 8:39 PM · WMF-NDA-Requests
Dzahn closed T353298: migrate Icinga checks for planet as Resolved.
Thu, Feb 29, 7:18 PM · collaboration-services
Dzahn added a comment to T353298: migrate Icinga checks for planet.

This nicely shows the actual content updates as a graph. As pointed out by godog, you can see the actual data by removing the "> 86400" part from the end of the expression and running the query:

Thu, Feb 29, 7:15 PM · collaboration-services
Dzahn renamed T358787: PowerSupplyFailure - an-coord1003 from PowerSupplyFailure to PowerSupplyFailure - an-coord1003.
Thu, Feb 29, 7:10 PM · SRE, ops-eqiad
Dzahn renamed T358785: ProbeDown - vrst1002 from ProbeDown to ProbeDown - vrst1002.
Thu, Feb 29, 6:10 PM · collaboration-services
Dzahn closed T358785: ProbeDown - vrst1002 as Resolved.

A silence was forgotten during the test upgrade of Znuny.

Thu, Feb 29, 6:09 PM · collaboration-services
Dzahn moved T358091: Grant Access to Superset for ifeatu_nnaobi_wmde from NDA Pending to Code Review Pending on the LDAP-Access-Requests board.
Thu, Feb 29, 4:12 PM · SRE, LDAP-Access-Requests
Dzahn moved T358584: Grant Access to nda, wmde for Frederik Ring from NDA Pending to Code Review Pending on the LDAP-Access-Requests board.
Thu, Feb 29, 4:12 PM · SRE, LDAP-Access-Requests

Wed, Feb 28

Dzahn added a project to T358610: Update wmf/stable to Phorge upstream's 2023.49 stable release: collaboration-services.
Wed, Feb 28, 7:44 PM · collaboration-services, Patch-For-Review, User-brennen, Release-Engineering-Team, Phabricator
Dzahn added a comment to T358237: Ganeti VM for contint migration.

The VM has been created and releng-roots already have shell access.

Wed, Feb 28, 7:18 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn committed rLPRI07dfbddf842c: delete passwords::tendril and passwords::bugzilla (authored by Dzahn).
delete passwords::tendril and passwords::bugzilla
Wed, Feb 28, 7:12 PM
Dzahn committed rLPRIeb1c69e8723b: delete passwords::racktables (authored by Dzahn).
delete passwords::racktables
Wed, Feb 28, 5:09 PM
Dzahn committed rLPRIc67f3524e76f: delete passwords::etherpad (authored by Dzahn).
delete passwords::etherpad
Wed, Feb 28, 1:45 PM

Tue, Feb 27

Dzahn added a comment to T323073: Make https://git.wikimedia.org not redirect to Phabricator Diffusion.

Fair enough, though as an example, just by subtracting "User:MarkAHershberger/sandbox" and "MediaWiki 1.24/Extension branchpoints" it's under 500 instead of 1500 right away.

Tue, Feb 27, 10:31 PM · Patch-For-Review, Diffusion, Release-Engineering-Team, collaboration-services
Dzahn claimed T101522: Update outdated information in top banner on static-bugzilla subpages?.
Tue, Feb 27, 8:47 PM · collaboration-services, Wikimedia-Bugzilla
Dzahn moved T358237: Ganeti VM for contint migration from Backlog to Work in Progress (Tracking tasks) on the collaboration-services board.
Tue, Feb 27, 8:45 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn moved T354688: Request a spare host for Collab services (Phorge, Gerrit, Contint) from Backlog to Work in Progress (Tracking tasks) on the collaboration-services board.
Tue, Feb 27, 8:41 PM · collaboration-services
Dzahn added a comment to T354688: Request a spare host for Collab services (Phorge, Gerrit, Contint).

We are adding 2 hosts to next year's FY24-25 CapEx document by March 8th.

Tue, Feb 27, 8:41 PM · collaboration-services
Dzahn added a comment to T323073: Make https://git.wikimedia.org not redirect to Phabricator Diffusion.

We should be using descriptors, not software names, for sites (so tasks.wikimedia.org, not phabricator.… / phorge.…), but I fear no one else agrees.

Tue, Feb 27, 8:09 PM · Patch-For-Review, Diffusion, Release-Engineering-Team, collaboration-services
Dzahn moved T358091: Grant Access to Superset for ifeatu_nnaobi_wmde from Backlog to NDA Pending on the LDAP-Access-Requests board.
Tue, Feb 27, 7:13 PM · SRE, LDAP-Access-Requests
Dzahn moved T358584: Grant Access to nda, wmde for Frederik Ring from Backlog to NDA Pending on the LDAP-Access-Requests board.
Tue, Feb 27, 7:13 PM · SRE, LDAP-Access-Requests
Dzahn updated subscribers of T358091: Grant Access to Superset for ifeatu_nnaobi_wmde.

@Ifeatu_Nnaobi_WMDE please send an email to Katie Francis (@KFrancis) (https://meta.wikimedia.org/wiki/User:KFrancis_(WMF)) and she will get back to you about signing an NDA.

Tue, Feb 27, 7:12 PM · SRE, LDAP-Access-Requests
Dzahn placed T358091: Grant Access to Superset for ifeatu_nnaobi_wmde up for grabs.
Tue, Feb 27, 7:10 PM · SRE, LDAP-Access-Requests
Dzahn updated subscribers of T358584: Grant Access to nda, wmde for Frederik Ring.

@Fring please send an email to Katie Francis (@KFrancis) (https://meta.wikimedia.org/wiki/User:KFrancis_(WMF)) and she will get back to you about the NDA.

Tue, Feb 27, 7:09 PM · SRE, LDAP-Access-Requests
Dzahn added a comment to T323073: Make https://git.wikimedia.org not redirect to Phabricator Diffusion.

@hashar @Jelto Fair enough, I am not opposed to deleting it all. It would mean though that all of this goes away. See the rewrite rules here:

Tue, Feb 27, 6:59 PM · Patch-For-Review, Diffusion, Release-Engineering-Team, collaboration-services
Dzahn claimed T323073: Make https://git.wikimedia.org not redirect to Phabricator Diffusion.
Tue, Feb 27, 6:53 PM · Patch-For-Review, Diffusion, Release-Engineering-Team, collaboration-services
Dzahn added a comment to T340788: allow mwmaint/cumin hosts to connect to http on contint.

On https://gerrit.wikimedia.org/r/c/operations/puppet/+/964881 we said that we would discuss this again in a couple weeks. Let's do that.

Tue, Feb 27, 6:27 PM · Continuous-Integration-Infrastructure, collaboration-services
Dzahn moved T357572: scap install fails on new Phabricator/Phorge host due to missing user from Backlog to Work in Progress (Tracking tasks) on the collaboration-services board.
Tue, Feb 27, 6:26 PM · Patch-For-Review, User-brennen, Release-Engineering-Team (Now this 🫠), collaboration-services, Scap
Dzahn claimed T357572: scap install fails on new Phabricator/Phorge host due to missing user.
Tue, Feb 27, 5:50 PM · Patch-For-Review, User-brennen, Release-Engineering-Team (Now this 🫠), collaboration-services, Scap
Dzahn changed the status of T358237: Ganeti VM for contint migration, a subtask of T334517: upgrade contint servers to bullseye, from Stalled to Open.
Tue, Feb 27, 5:38 PM · Release-Engineering-Team (Radar), collaboration-services
Dzahn changed the status of T358237: Ganeti VM for contint migration from Stalled to Open.
Tue, Feb 27, 5:38 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn changed the status of T358237: Ganeti VM for contint migration from In Progress to Stalled.
Tue, Feb 27, 3:58 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn changed the status of T358237: Ganeti VM for contint migration, a subtask of T334517: upgrade contint servers to bullseye, from In Progress to Stalled.
Tue, Feb 27, 3:57 PM · Release-Engineering-Team (Radar), collaboration-services

Mon, Feb 26

Dzahn added a comment to T356799: Cannot edit wikipedia from my work computer.

@Rijikk The footer would be right under the "If you report this error to the Wikimedia System Administrators, please include the details below." message you quoted in the error page itself.

Mon, Feb 26, 11:48 PM · SRE, Traffic
Dzahn reopened T353298: migrate Icinga checks for planet as "Open".
Mon, Feb 26, 11:43 PM · collaboration-services
Dzahn closed T353298: migrate Icinga checks for planet as Resolved.

https://grafana-rw.wikimedia.org/alerting/list?search=planet

Mon, Feb 26, 11:19 PM · collaboration-services
Dzahn changed the status of T358237: Ganeti VM for contint migration from Open to In Progress.
Mon, Feb 26, 10:00 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests
Dzahn changed the status of T358237: Ganeti VM for contint migration, a subtask of T334517: upgrade contint servers to bullseye, from Open to In Progress.
Mon, Feb 26, 10:00 PM · Release-Engineering-Team (Radar), collaboration-services
Dzahn added a comment to T73388: Adminship of Hindi Wikipedia Mailing List.

I don't really see the value in moving tickets on workboards that were resolved years ago. For me personally this created hundreds of notifications but no benefit.

Mon, Feb 26, 6:42 PM · SRE, Hindi-Sites, Wikimedia-Mailing-lists
Dzahn added a comment to T321790: Allow tools to use phabricator webhooks.

Hmm. At the time that I filed this, netcat to google.com from a phabricator server didn't work. Not sure what changed since then.

Mon, Feb 26, 6:30 PM · collaboration-services, User-brennen, Release-Engineering-Team, Phabricator
Dzahn added a comment to T358237: Ganeti VM for contint migration.

I'll go with private IP but cloud VPS doesn't really seem feasible to me.

Mon, Feb 26, 5:47 PM · Patch-For-Review, collaboration-services, SRE, Continuous-Integration-Infrastructure, vm-requests

Thu, Feb 15

Dzahn assigned T357679: PuppetDisabled - vrts1002 to Arnoldokoth.
Thu, Feb 15, 4:11 PM · collaboration-services

Wed, Feb 14

Dzahn added a comment to T353298: migrate Icinga checks for planet.
9:01 < mutante> I have one Icinga check that uses "check_lastmod.py". A script I once wrote myself to check if a website has been updated recently enough. Just checks the last modified header of any website. 
                 How could I replace this to move it out of Icinga? Can i check any header with blackbox::http ?
19:05 < denisse> I think that Blackbox Exporter does not directly expose a way to check or alert based on the 'Last-Modified' HTTP header of a website.
19:07 < mutante> thanks denisse. it seems like there is no replacement for it
19:07 < denisse> But you can write a custom exporter that acts similarly to your "check_lastmod.py" script.
19:08 < denisse> Your custom exporter can send requests to your target website(s), check the 'Last-Modified' header, and then expose that information for Prometheus to scrape.
19:10 < mutante> hmm. yea, I don't think I want to build an entire package and all that for one check
19:12 < moritzm> mutante: check prometheus::node_textfile, it's a define I once added to simplify setting up such a check without the need for a package
19:12 < mutante> moritzm: ah! thank you, will check
19:13 < denisse> :o good idea!
19:55 < cwhite> mutante: is this http endpoint currently probed with blackbox exporter?
20:02 < cwhite> if so, there's a chance that the last modified timestamp is available in Prometheus
20:08 < mutante> cwhite: there is "prometheus::blackbox::check::http { 'en.planet.wikimedia.org':", yes.  but there is also still  @monitoring::host { 'en.planet.wikimedia.org': and a 
                 "check_ssl_http_letsencrypt" on it. 
20:14 < cwhite> mutante: we have the last modified header in Prometheus
20:14 < cwhite> `probe_http_last_modified_timestamp_seconds{module="http_en_planet_wikimedia_org_ip4"}`
20:17 < mutante> cwhite: cool, thank you!
20:18 < mutante> I dont know how to use that but it sounds like I just need to rtfm then or something
20:25 < cwhite> mutante: The goal is to alert if the delta between last-modified and now exceeds a threshold?
20:30 < mutante> cwhite: yes, it was set to warn after 24 hours and crit after 48 hours
20:35 < cwhite> mutante: you could try to warn on something like `probe_http_last_modified_timestamp_seconds{module="http_en_planet_wikimedia_org_ip4"} < time() - 86400`
20:35 < cwhite> then crit on 172800
20:36 < cwhite> see also: https://prometheus.io/docs/prometheus/latest/querying/functions/#time
20:40 < mutante> cwhite: thank you. so that would mean  https://wikitech.wikimedia.org/wiki/Alertmanager#Create_alerts  ?
20:41 < cwhite> Yep!  That should do the trick :)
20:42 < mutante> alright, thanks again
Wed, Feb 14, 6:35 PM · collaboration-services
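
The threshold logic discussed in the log above (warn when the Last-Modified timestamp is more than 24 hours old, crit after 48 hours) can be sketched in Python. This is only an illustration under stated assumptions: it is not the original check_lastmod.py, and the function name `classify_last_modified` and the status strings are hypothetical. It mirrors the comparison against 86400 and 172800 seconds mentioned in the conversation.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

WARN_SECONDS = 86400    # 24 hours, the warn threshold mentioned in the log
CRIT_SECONDS = 172800   # 48 hours, the crit threshold mentioned in the log


def classify_last_modified(last_modified: str, now: datetime) -> str:
    """Classify the age of an HTTP Last-Modified header value.

    Hypothetical helper: returns 'OK', 'WARNING' or 'CRITICAL'
    depending on how far `now` is past the header's timestamp.
    `now` must be timezone-aware.
    """
    modified = parsedate_to_datetime(last_modified)
    age = (now - modified).total_seconds()
    if age > CRIT_SECONDS:
        return "CRITICAL"
    if age > WARN_SECONDS:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Example header value; a real check would read it from an HTTP response.
    header = "Tue, 13 Feb 2024 06:00:00 GMT"
    print(classify_last_modified(header, datetime.now(timezone.utc)))
```

In the Prometheus setup described in the log, the same comparison is expressed directly in the alert expression, so no separate script would be needed there.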
Dzahn awarded T357413: miscweb wikikube staging namespace exceeding quota limits a Doubloon token.
Wed, Feb 14, 6:07 PM · collaboration-services
Dzahn added a comment to T316421: Upgrade etherpad.wikimedia.org to v1.9.7.

There is no puppet flag to enable or disable the process.

Wed, Feb 14, 5:51 PM · User-notice, Patch-For-Review, collaboration-services, SRE, Wikimedia-Etherpad
Dzahn added a comment to T316421: Upgrade etherpad.wikimedia.org to v1.9.7.

and prometheus-etherpad-exporter 0.7 as well. The etherpad-lite package also installed nodejs 18.19 automatically.

Wed, Feb 14, 5:19 PM · User-notice, Patch-For-Review, collaboration-services, SRE, Wikimedia-Etherpad
Dzahn added a comment to T355574: Many (all?) of the phabricator/tools scripts are in Python 2.

I think the next step is that we can delete the "trello" scripts.

Wed, Feb 14, 12:37 AM · Patch-For-Review, collaboration-services, Python3-Porting, Phabricator, User-brennen, Release-Engineering-Team (Priority Backlog 📥)
Dzahn closed T355502: phabricator_task_dump.service Failed on phab1004 as Resolved.
00:30 < jinxer-wm> (SystemdUnitFailed) firing: phabricator_task_dump.service on phab1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - 
                   https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
00:33 < mutante> ^ this is just because the system didnt forget it failed 3 weeks ago
00:33 < mutante> the unit was just added back by me 
00:33 < mutante> I could have prevented it by doing a "systemctl reset-failed" before merging.
00:34 < mutante> I can tell because:       Active: failed (Result: exit-code) since Mon 2024-01-22 17:55:12 UTC; 3 weeks 1 days ago
00:35 < mutante> since I started the unit manually now it should be resolved
00:35 < jinxer-wm> (SystemdUnitFailed) resolved: phabricator_task_dump.service on phab1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - 
                   https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
00:35 < mutante> there we go
Wed, Feb 14, 12:37 AM · Phabricator (2024-02-13), User-brennen, Patch-For-Review, Dumps-Generation, collaboration-services
Dzahn closed T357490: SystemdUnitFailed - phabricator_task_dump.service, a subtask of T355502: phabricator_task_dump.service Failed on phab1004, as Resolved.
Wed, Feb 14, 12:36 AM · Phabricator (2024-02-13), User-brennen, Patch-For-Review, Dumps-Generation, collaboration-services
Dzahn closed T357490: SystemdUnitFailed - phabricator_task_dump.service as Resolved.
00:30 < jinxer-wm> (SystemdUnitFailed) firing: phabricator_task_dump.service on phab1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - 
                   https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
00:33 < mutante> ^ this is just because the system didnt forget it failed 3 weeks ago
00:33 < mutante> the unit was just added back by me 
00:33 < mutante> I could have prevented it by doing a "systemctl reset-failed" before merging.
00:34 < mutante> I can tell because:       Active: failed (Result: exit-code) since Mon 2024-01-22 17:55:12 UTC; 3 weeks 1 days ago
00:35 < mutante> since I started the unit manually now it should be resolved
00:35 < jinxer-wm> (SystemdUnitFailed) resolved: phabricator_task_dump.service on phab1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - 
                   https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
00:35 < mutante> there we go
Wed, Feb 14, 12:36 AM · collaboration-services
Dzahn closed T355502: phabricator_task_dump.service Failed on phab1004, a subtask of T355574: Many (all?) of the phabricator/tools scripts are in Python 2, as Resolved.
Wed, Feb 14, 12:36 AM · Patch-For-Review, collaboration-services, Python3-Porting, Phabricator, User-brennen, Release-Engineering-Team (Priority Backlog 📥)
Dzahn added a subtask for T355502: phabricator_task_dump.service Failed on phab1004: T357490: SystemdUnitFailed - phabricator_task_dump.service.
Wed, Feb 14, 12:33 AM · Phabricator (2024-02-13), User-brennen, Patch-For-Review, Dumps-Generation, collaboration-services
Dzahn added a parent task for T357490: SystemdUnitFailed - phabricator_task_dump.service: T355502: phabricator_task_dump.service Failed on phab1004.
Wed, Feb 14, 12:33 AM · collaboration-services
Dzahn renamed T357490: SystemdUnitFailed - phabricator_task_dump.service from SystemdUnitFailed to SystemdUnitFailed - phabricator_task_dump.service.
Wed, Feb 14, 12:32 AM · collaboration-services
Dzahn added a comment to T357490: SystemdUnitFailed - phabricator_task_dump.service.

I just reactivated this timer because the underlying issue was fixed (T355502#9540649).

Wed, Feb 14, 12:32 AM · collaboration-services
Dzahn added a comment to T355502: phabricator_task_dump.service Failed on phab1004.

The dump script works again. It was successfully converted by Brennen.

Wed, Feb 14, 12:29 AM · Phabricator (2024-02-13), User-brennen, Patch-For-Review, Dumps-Generation, collaboration-services
Dzahn added a comment to T355574: Many (all?) of the phabricator/tools scripts are in Python 2.

The dump script works again. It was successfully converted by Brennen.

Wed, Feb 14, 12:27 AM · Patch-For-Review, collaboration-services, Python3-Porting, Phabricator, User-brennen, Release-Engineering-Team (Priority Backlog 📥)
Dzahn added a comment to T356710: eqiad: 1 VM request for ncmonitor.
23:55 <+logmsgbot> !log dzahn@cumin1002 START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
00:01 <+logmsgbot> !log dzahn@cumin1002 START - Cookbook sre.hosts.downtime for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
00:04 <+logmsgbot> !log dzahn@cumin1002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
Wed, Feb 14, 12:17 AM · SRE, Traffic, vm-requests, Infrastructure-Foundations
Dzahn added a comment to T357449: ncmonitor1001 install issues (Ganeti VM fails to reboot after "gnt-instance modify").

The host should be usable now:

Wed, Feb 14, 12:16 AM · SRE-tools, Infrastructure-Foundations, Ganeti
Dzahn changed the status of T357449: ncmonitor1001 install issues (Ganeti VM fails to reboot after "gnt-instance modify") from Open to In Progress.
Wed, Feb 14, 12:04 AM · SRE-tools, Infrastructure-Foundations, Ganeti
Dzahn renamed T357449: ncmonitor1001 install issues (Ganeti VM fails to reboot after "gnt-instance modify") from Ganeti VM fails to reboot after "gnt-instance modify" to ncmonitor1001 install issues (Ganeti VM fails to reboot after "gnt-instance modify").
Wed, Feb 14, 12:03 AM · SRE-tools, Infrastructure-Foundations, Ganeti
Dzahn added a comment to T357449: ncmonitor1001 install issues (Ganeti VM fails to reboot after "gnt-instance modify").

I did another wmf-reimage cookbook run on this host and the installation finished, including the grub install. I can't explain why it wouldn't work for Brett and Sukhbir earlier (after the partman change was already merged) but it worked now.

Wed, Feb 14, 12:03 AM · SRE-tools, Infrastructure-Foundations, Ganeti

Tue, Feb 13

Dzahn claimed T346607: GitLab email confirmation mail ends up in spam folder.
Tue, Feb 13, 11:54 PM · Release-Engineering-Team (Radar), GitLab (Infrastructure), collaboration-services
Dzahn moved T346607: GitLab email confirmation mail ends up in spam folder from Consultation to Work in Progress on the collaboration-services board.
Tue, Feb 13, 11:53 PM · Release-Engineering-Team (Radar), GitLab (Infrastructure), collaboration-services
Dzahn moved T346607: GitLab email confirmation mail ends up in spam folder from Backlog to Consultation on the collaboration-services board.
Tue, Feb 13, 11:53 PM · Release-Engineering-Team (Radar), GitLab (Infrastructure), collaboration-services