Jelto (jwodstrcil)
User

User Details

User Since
Jun 7 2021, 7:25 AM (188 w, 2 d)
Availability
Available
LDAP User
Jelto
MediaWiki User
JWodstrcil (WMF)

Recent Activity

Yesterday

Jelto created T383691: Relabel codfw kubernetes nodes.
Tue, Jan 14, 3:48 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T383595: Relabel codfw kubernetes nodes.
Tue, Jan 14, 9:52 AM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops

Mon, Jan 13

Jelto created T383595: Relabel codfw kubernetes nodes.
Mon, Jan 13, 5:47 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T383341: Relabel codfw kubernetes nodes.
Mon, Jan 13, 3:33 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T383341: Relabel codfw kubernetes nodes.
Mon, Jan 13, 12:00 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops

Fri, Jan 10

Jelto updated the task description for T383341: Relabel codfw kubernetes nodes.
Fri, Jan 10, 3:38 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto closed T383397: SystemdUnitFailed (partial backup gitlab2002) as Resolved.

expected due to the update in T383263

Fri, Jan 10, 12:14 PM · collaboration-services
Jelto renamed T383397: SystemdUnitFailed (partial backup gitlab2002) from SystemdUnitFailed to SystemdUnitFailed (partial backup gitlab2002).
Fri, Jan 10, 12:14 PM · collaboration-services
Jelto closed T383396: ProbeDown (gitlab2002) as Resolved.

expected due to the update in T383263

Fri, Jan 10, 12:14 PM · collaboration-services
Jelto renamed T383396: ProbeDown (gitlab2002) from ProbeDown to ProbeDown (gitlab2002).
Fri, Jan 10, 12:13 PM · collaboration-services
Jelto updated the task description for T383341: Relabel codfw kubernetes nodes.
Fri, Jan 10, 11:54 AM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto closed T383228: wikikube-worker2022 move vlan failed due to netbox timeout, a subtask of T377877: Migrate wikikube-codfw to containerd, as Resolved.
Fri, Jan 10, 9:23 AM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto closed T383228: wikikube-worker2022 move vlan failed due to netbox timeout as Resolved.

Great, thanks @elukey and Cathal for the netbox research and the reimage. wikikube-worker2022 looks good now: I updated the BGP setting, homered and pooled the node. So this issue is resolved and the host is on bookworm + containerd.

Fri, Jan 10, 9:23 AM · collaboration-services, Kubernetes, Infrastructure-Foundations
Jelto added a comment to T383051: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1243.eqiad.wmnet.

Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.

Fri, Jan 10, 8:39 AM · SRE, DC-Ops, ops-eqiad, collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T381878: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet.

Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.

Fri, Jan 10, 8:38 AM · SRE, DC-Ops, ops-eqiad, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T381789: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet.

Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.

Fri, Jan 10, 8:34 AM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T381770: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet.

Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.

Fri, Jan 10, 8:32 AM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T381676: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1057.eqiad.wmnet.

Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.

Fri, Jan 10, 8:29 AM · serviceops, SRE, ops-eqiad, DC-Ops

Thu, Jan 9

Jelto created T383341: Relabel codfw kubernetes nodes.
Thu, Jan 9, 3:28 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T383339: hw troubleshooting: Comm Error: backplane 0 for wikikube-worker2192.codfw.wmnet.

The following commands have to be executed when the host is back (just noting it down so I don't forget it):

Thu, Jan 9, 3:23 PM · SRE, ops-codfw, Kubernetes, DC-Ops
Jelto added a comment to T383339: hw troubleshooting: Comm Error: backplane 0 for wikikube-worker2192.codfw.wmnet.

Similar to issues in eqiad, like T381878

Thu, Jan 9, 3:11 PM · SRE, ops-codfw, Kubernetes, DC-Ops
Jelto added a subtask for T377877: Migrate wikikube-codfw to containerd: T383339: hw troubleshooting: Comm Error: backplane 0 for wikikube-worker2192.codfw.wmnet.
Thu, Jan 9, 3:10 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a parent task for T383339: hw troubleshooting: Comm Error: backplane 0 for wikikube-worker2192.codfw.wmnet: T377877: Migrate wikikube-codfw to containerd.
Thu, Jan 9, 3:10 PM · SRE, ops-codfw, Kubernetes, DC-Ops
Jelto created T383339: hw troubleshooting: Comm Error: backplane 0 for wikikube-worker2192.codfw.wmnet.
Thu, Jan 9, 3:10 PM · SRE, ops-codfw, Kubernetes, DC-Ops
Jelto added a comment to T381878: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet.

@Jelto I performed a flea power drain and the host looks to image properly. The critical status has cleared. I will update Dell, but it looks good for now.

Thu, Jan 9, 2:39 PM · SRE, DC-Ops, ops-eqiad, Prod-Kubernetes, Kubernetes, serviceops
Jelto closed T383293: SystemdUnitFailed (backup-restore.service) as Resolved.

related to T383263

Thu, Jan 9, 7:50 AM · collaboration-services
Jelto renamed T383293: SystemdUnitFailed (backup-restore.service) from SystemdUnitFailed to SystemdUnitFailed (backup-restore.service).
Thu, Jan 9, 7:50 AM · collaboration-services

Wed, Jan 8

Jelto added projects to T383228: wikikube-worker2022 move vlan failed due to netbox timeout: Kubernetes, collaboration-services.
Wed, Jan 8, 3:57 PM · collaboration-services, Kubernetes, Infrastructure-Foundations
Jelto added a comment to T383228: wikikube-worker2022 move vlan failed due to netbox timeout.

Thanks @elukey for double-checking the PXE settings in the BIOS. I tried the reimage with --use-http-for-dhcp, but this resulted in the same behavior: the DHCP request timed out and the system booted from disk.

Wed, Jan 8, 3:56 PM · collaboration-services, Kubernetes, Infrastructure-Foundations
Jelto added a comment to T379119: [Spike] Fetch Topics for Articles in History.

Thanks for digging into the cdincludes parameter option! Bundling the API calls into one indeed seems like a good idea for better performance, especially if the result is cached for the most frequently accessed sites.

Wed, Jan 8, 3:49 PM · collaboration-services, Wikipedia-iOS-App-Backlog (iOS Release FY2024-25)
Jelto added a comment to T383228: wikikube-worker2022 move vlan failed due to netbox timeout.

Thanks for checking the changelog!

Wed, Jan 8, 2:19 PM · collaboration-services, Kubernetes, Infrastructure-Foundations
Jelto added a subtask for T377877: Migrate wikikube-codfw to containerd: T383228: wikikube-worker2022 move vlan failed due to netbox timeout.
Wed, Jan 8, 2:06 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a parent task for T383228: wikikube-worker2022 move vlan failed due to netbox timeout: T377877: Migrate wikikube-codfw to containerd.
Wed, Jan 8, 2:06 PM · collaboration-services, Kubernetes, Infrastructure-Foundations
Jelto created T383228: wikikube-worker2022 move vlan failed due to netbox timeout.
Wed, Jan 8, 2:06 PM · collaboration-services, Kubernetes, Infrastructure-Foundations

Mon, Dec 23

Jelto added a comment to T382610: Low disk space: doc1003 / doc2002.

I removed some log files and the apt cache. Both hosts now have 7.5G of free space, which should be enough for the holiday break.

Mon, Dec 23, 8:09 AM · Release-Engineering-Team, collaboration-services

Thu, Dec 19

Jelto added a comment to T382420: Comm Error: backplane 0 when reimaging wikikube-worker2190.

The host responds normally and a reimage worked. Thanks @Jhancock.wm for the quick help!

Thu, Dec 19, 8:02 AM · SRE, ops-codfw, DC-Ops, collaboration-services, Prod-Kubernetes, Kubernetes, serviceops

Wed, Dec 18

Jelto created T382422: Relabel codfw kubernetes nodes.
Wed, Dec 18, 3:39 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto created T382420: Comm Error: backplane 0 when reimaging wikikube-worker2190.
Wed, Dec 18, 3:20 PM · SRE, ops-codfw, DC-Ops, collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto closed T382230: CI token credentials exposed by public git config as Resolved.

The .git folder was removed from all design miscweb sites and container images. For the other miscweb sites a similar .dockerignore was added as a precaution.

Wed, Dec 18, 3:00 PM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto updated subscribers of T382230: CI token credentials exposed by public git config.
Wed, Dec 18, 8:33 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team

Tue, Dec 17

Jelto added a comment to T382230: CI token credentials exposed by public git config.

I added .dockerignore files to all miscweb projects.

Tue, Dec 17, 3:40 PM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto added a comment to T382230: CI token credentials exposed by public git config.
Tue, Dec 17, 9:39 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team

Mon, Dec 16

Jelto moved T374448: Make sure GitLab scales with more usage from Work in Progress to Backlog on the collaboration-services board.
Mon, Dec 16, 4:42 PM · GitLab (Infrastructure), collaboration-services
Jelto added a project to T377877: Migrate wikikube-codfw to containerd: collaboration-services.
Mon, Dec 16, 4:39 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto updated the task description for T382230: CI token credentials exposed by public git config.
Mon, Dec 16, 2:54 PM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto added a comment to T382230: CI token credentials exposed by public git config.

Fixes for the remaining two design miscweb sites:

Mon, Dec 16, 2:53 PM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto lowered the priority of T382230: CI token credentials exposed by public git config from Unbreak Now! to High.

A fix was deployed which removed the .git folder from design.wikimedia.org.

Mon, Dec 16, 1:32 PM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto added a comment to T382230: CI token credentials exposed by public git config.

I opened https://gitlab.wikimedia.org/repos/sre/miscweb/design-landing-page/-/merge_requests/4 which uses a dedicated builder variant to remove the .git folder and to fix the apache config.

Mon, Dec 16, 11:54 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto updated subscribers of T382230: CI token credentials exposed by public git config.

I was not able to find the linked API key in GitLab. Unfortunately the Token information API is not available in 17.4 (our version) and needs at least 17.5.

Mon, Dec 16, 10:53 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto added a comment to T382230: CI token credentials exposed by public git config.

I tried to use the token but I get Access denied:

remote: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password. See https://gitlab.wikimedia.org/help/topics/git/troubleshooting_git#error-on-git-fetch-http-basic-access-denied
fatal: Authentication failed for 'https://gitlab.wikimedia.org/repos/sre/miscweb/design-landing-page.git/'
Mon, Dec 16, 9:52 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto claimed T382230: CI token credentials exposed by public git config.
Mon, Dec 16, 9:31 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team
Jelto added a comment to T382230: CI token credentials exposed by public git config.

Thanks for raising this issue. I think it makes sense to block that folder in Apache. However, we should avoid putting this file in the container image at all by fixing the Blubber config, because even if we block the path in Apache, anyone could just pull the image and extract the token locally.

Mon, Dec 16, 9:31 AM · SecTeam-Processed, Vuln-Infoleak, collaboration-services, Release-Engineering-Team, Security, Security-Team

Dec 12 2024

Jelto updated the task description for T377877: Migrate wikikube-codfw to containerd.
Dec 12 2024, 1:33 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto updated the task description for T377876: Migrate wikikube-eqiad to containerd.
Dec 12 2024, 1:32 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops

Dec 11 2024

Jelto updated the task description for T381967: Relabel codfw kubernetes nodes.
Dec 11 2024, 3:03 PM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto closed T381983: SystemdUnitFailed (GitLab replicas backup-restore.service) as Resolved.

Related to the upgrade in T381969. This will resolve on Friday when the upgrade is finished.

Dec 11 2024, 2:46 PM · collaboration-services
Jelto renamed T381983: SystemdUnitFailed (GitLab replicas backup-restore.service) from SystemdUnitFailed to SystemdUnitFailed (GitLab replicas backup-restore.service).
Dec 11 2024, 2:46 PM · collaboration-services
Jelto added a comment to T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes.

Thanks for finding and fixing the issue with commons-query, @bking and @Dzahn. I'm glad this is unrelated to the GUI migration and not a blocker for the migration of the remaining services.

Dec 11 2024, 2:43 PM · Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review, User-ItamarWMDE, Wikidata, wmde-wikidata-tech, Wikidata Query UI, GitLab (Pipeline Services Migration🐤), collaboration-services
Jelto created T381967: Relabel codfw kubernetes nodes.
Dec 11 2024, 11:59 AM · SRE, ops-codfw, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T377876: Migrate wikikube-eqiad to containerd.

^ I used the wrong task ID. This depool was for codfw.

Dec 11 2024, 9:08 AM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops

Dec 10 2024

Jelto added a comment to T381504: Relabel eqiad kubernetes nodes.

@Jelto heads up, these are showing up in a netbox report.

Device is Active in Netbox but is missing from PuppetDB (should be ('decommissioning', 'inventory', 'offline', 'planned', 'staged', 'failed'))

wikikube-worker1057
wikikube-worker1069
wikikube-worker1073
wikikube-worker1081

Dec 10 2024, 3:52 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 10 2024, 2:33 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T381878: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet.

The following commands have to be executed when the host is back (just noting it down so I don't forget it):

Dec 10 2024, 2:04 PM · SRE, DC-Ops, ops-eqiad, Prod-Kubernetes, Kubernetes, serviceops
Jelto created T381878: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet.
Dec 10 2024, 2:03 PM · SRE, DC-Ops, ops-eqiad, Prod-Kubernetes, Kubernetes, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 10 2024, 10:01 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops

Dec 9 2024

Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 9 2024, 5:23 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T381789: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet.

The following commands have to be executed when the host is back (just noting it down so I don't forget it):

Dec 9 2024, 5:19 PM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto created T381789: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet.
Dec 9 2024, 5:18 PM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T381770: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet.

The following commands have to be executed when the host is back (just noting it down so I don't forget it):

Dec 9 2024, 1:35 PM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 9 2024, 1:32 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto created T381770: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet.
Dec 9 2024, 1:15 PM · SRE, ops-eqiad, DC-Ops, Prod-Kubernetes, Kubernetes, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 9 2024, 10:21 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 9 2024, 7:35 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T381676: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1057.eqiad.wmnet.

The following commands have to be executed when the host is back (just noting it down so I don't forget it):

Dec 9 2024, 7:21 AM · serviceops, SRE, ops-eqiad, DC-Ops

Dec 6 2024

Jelto added a comment to T377876: Migrate wikikube-eqiad to containerd.

wikikube-worker1057 is stuck because of Comm Error: backplane 0, see T381676.

Dec 6 2024, 4:51 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto created T381676: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1057.eqiad.wmnet.
Dec 6 2024, 4:50 PM · serviceops, SRE, ops-eqiad, DC-Ops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 6 2024, 12:15 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 6 2024, 10:28 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops

Dec 5 2024

Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 5 2024, 6:12 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T381591: Add dbrant to wmf-deployment.

Thanks for the quick help, @taavi and @thcipriani!

Dec 5 2024, 4:18 PM · Gerrit-Privilege-Requests
Jelto updated the task description for T381591: Add dbrant to wmf-deployment.
Dec 5 2024, 4:12 PM · Gerrit-Privilege-Requests
Dbrant awarded T381591: Add dbrant to wmf-deployment a Love token.
Dec 5 2024, 4:12 PM · Gerrit-Privilege-Requests
Jelto created T381591: Add dbrant to wmf-deployment.
Dec 5 2024, 4:10 PM · Gerrit-Privilege-Requests
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 5 2024, 2:37 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T381504: Relabel eqiad kubernetes nodes.
Dec 5 2024, 10:33 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops

Dec 4 2024

Jelto created T381504: Relabel eqiad kubernetes nodes.
Dec 4 2024, 3:47 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes.

Yesterday, there were reports of commons-query.wikimedia.org being unavailable, so we rolled back the query-scholarly switch. However, the issue with commons-query persists: it continues to return "upstream request timeout".

Dec 4 2024, 7:42 AM · Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review, User-ItamarWMDE, Wikidata, wmde-wikidata-tech, Wikidata Query UI, GitLab (Pipeline Services Migration🐤), collaboration-services

Dec 3 2024

Jelto updated the task description for T381268: Relabel eqiad kubernetes nodes.
Dec 3 2024, 6:24 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes.
Dec 3 2024, 3:23 PM · Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review, User-ItamarWMDE, Wikidata, wmde-wikidata-tech, Wikidata Query UI, GitLab (Pipeline Services Migration🐤), collaboration-services
Jelto added a comment to T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes.

The migration of query-scholarly.wikidata.org to Wikikube has been successfully completed. Basic test queries, headers, and the custom configuration are functioning as expected. A big thanks to @ItamarWMDE and @Lucas_Werkmeister_WMDE for their support!

Dec 3 2024, 3:23 PM · Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review, User-ItamarWMDE, Wikidata, wmde-wikidata-tech, Wikidata Query UI, GitLab (Pipeline Services Migration🐤), collaboration-services
Jelto updated the task description for T381268: Relabel eqiad kubernetes nodes.
Dec 3 2024, 2:14 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto updated the task description for T381268: Relabel eqiad kubernetes nodes.
Dec 3 2024, 9:53 AM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a comment to T317341: Findings in Security Readiness Reviews of Trusted GitLab Runners.

It should be fine to make the task public.

Dec 3 2024, 7:30 AM · SecTeam-Processed, Vuln-Misconfiguration, Security-Team, Security, collaboration-services, GitLab (CI & Job Runners)

Dec 2 2024

Jelto added a comment to T377876: Migrate wikikube-eqiad to containerd.

Reimage of wikikube-worker1006 failed because the node is not in icinga/alertmanager. I'll try to find out why tomorrow

Dec 2 2024, 4:49 PM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto added a comment to T350793: move commons-query.wikimedia.org and query.wikidata.org to kubernetes.

Great, thanks. I sent you an invite for tomorrow at 14:00 CET.

Dec 2 2024, 1:57 PM · Data-Platform-SRE (2025.01.11 - 2025.01.31), Patch-For-Review, User-ItamarWMDE, Wikidata, wmde-wikidata-tech, Wikidata Query UI, GitLab (Pipeline Services Migration🐤), collaboration-services
Jelto created T381268: Relabel eqiad kubernetes nodes.
Dec 2 2024, 1:22 PM · SRE, collaboration-services, ops-eqiad, Kubernetes, Prod-Kubernetes, DC-Ops, serviceops
Jelto added a project to T377876: Migrate wikikube-eqiad to containerd: collaboration-services.
Dec 2 2024, 10:37 AM · collaboration-services, Prod-Kubernetes, Kubernetes, serviceops
Jelto merged T381176: SystemdUnitFailed into T381156: SystemdUnitFailed (partial-backup.service gitlab2002).
Dec 2 2024, 7:29 AM · collaboration-services
Jelto merged task T381176: SystemdUnitFailed into T381156: SystemdUnitFailed (partial-backup.service gitlab2002).
Dec 2 2024, 7:29 AM · collaboration-services
Jelto closed T381156: SystemdUnitFailed (partial-backup.service gitlab2002) as Resolved.

this is resolved

Dec 2 2024, 7:29 AM · collaboration-services