User Details
- User Since
- Jun 7 2021, 7:25 AM (188 w, 2 d)
- Availability
- Available
- LDAP User
- Jelto
- MediaWiki User
- JWodstrcil (WMF) [ Global Accounts ]
Yesterday
Mon, Jan 13
Fri, Jan 10
expected due to the update in T383263
Great, thanks @elukey and Cathal for the Netbox research and the reimage. wikikube-worker2022 looks good now. I updated the BGP setting, ran homer, and pooled the node. This issue is resolved and the host is on bookworm + containerd.
Thanks @Jclark-ctr for the quick help and running the reimage one more time. The host looks good to me now.
Thu, Jan 9
The following commands have to be executed when the host is back (just noting it down so I don't forget it):
Similar to issues in eqiad, like T381878
related to T383263
Wed, Jan 8
Thanks @elukey for double-checking the PXE settings in the BIOS. I tried the reimage with --use-http-for-dhcp, but this resulted in the same behavior: the DHCP request timed out and the system booted from disk.
Thanks for digging into the cdincludes parameter option! Bundling the API calls into one indeed seems like a good idea for better performance, especially if the result is cached for the most frequently accessed sites.
Thanks for checking the changelog!
Mon, Dec 23
I removed some log files and the apt cache. Both hosts now have 7.5G of free space, which should be enough for the holiday break.
Thu, Dec 19
The host responds normally and a reimage worked. Thanks @Jhancock.wm for the quick help!
Wed, Dec 18
The .git folder was removed from all design miscweb sites and container images. For the other miscweb sites a similar .dockerignore was added as a precaution.
Tue, Dec 17
I added .dockerignore files to all miscweb projects.
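As a rough illustration of that step, the following Python sketch adds a `.git` entry to a project's `.dockerignore` if it is not already there. The function name `ensure_dockerignore` and the pattern list are assumptions made for this example; the actual contents of the miscweb `.dockerignore` files are not shown in this comment.

```python
from pathlib import Path

# Assumed pattern list for illustration; the real files may contain more entries.
IGNORE_PATTERNS = (".git",)


def ensure_dockerignore(project_dir, patterns=IGNORE_PATTERNS):
    """Append any missing patterns to project_dir/.dockerignore.

    Returns the list of patterns that were added (empty if nothing changed).
    """
    target = Path(project_dir) / ".dockerignore"
    existing = target.read_text().splitlines() if target.exists() else []
    missing = [p for p in patterns if p not in existing]
    if missing:
        # Keep existing entries and add the missing ones at the end.
        target.write_text("\n".join(existing + missing) + "\n")
    return missing
```

Running it a second time on the same directory changes nothing, so it could be looped over several project checkouts safely.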
Mon, Dec 16
Fixes for the remaining two design miscweb sites:
A fix was deployed which removed the .git folder from design.wikimedia.org.
I opened https://gitlab.wikimedia.org/repos/sre/miscweb/design-landing-page/-/merge_requests/4 which uses a dedicated builder variant to remove the .git folder and to fix the apache config.
I was not able to find the linked API key in GitLab. Unfortunately the Token information API is not available in 17.4 (our version) and needs at least 17.5.
I tried to use the token but I get Access denied:
remote: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password. See https://gitlab.wikimedia.org/help/topics/git/troubleshooting_git#error-on-git-fetch-http-basic-access-denied
fatal: Authentication failed for 'https://gitlab.wikimedia.org/repos/sre/miscweb/design-landing-page.git/'
Thanks for raising this issue. I think it makes sense to block that folder in Apache. However, we should avoid putting this file in the container image at all by fixing the blubber config. Even if we block the path in Apache, anyone could still pull the image and extract the token locally.
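To make the last point concrete, here is a minimal Python sketch that lists anything under a `.git` directory inside a tar archive of an image's filesystem (for example, one produced with `docker export` from a container started from that image). The function name `find_git_entries` and the archive path are hypothetical; this is only one way to check, not the procedure that was actually used.

```python
import tarfile


def find_git_entries(tar_path):
    """Return the names of archive members that live inside a .git directory."""
    hits = []
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            # Match ".git" only as a full path component, so ".gitignore" is not flagged.
            if ".git" in member.name.split("/"):
                hits.append(member.name)
    return hits
```

An empty result for a rebuilt image would suggest that the `.git` folder no longer ends up in the image, regardless of what Apache blocks.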
Dec 12 2024
Dec 11 2024
Related to the upgrade in T381969. This will resolve on Friday when the upgrade is finished.
^ I used the wrong task ID. This depool was for codfw.
Dec 10 2024
The following commands have to be executed when the host is back (just noting it down so I don't forget it):
Dec 9 2024
The following commands have to be executed when the host is back (just noting it down so I don't forget it):
Dec 6 2024
wikikube-worker1057 is stuck because of Comm Error: backplane 0, see T381676.
Dec 5 2024
Thanks for the quick help @taavi and @thcipriani !
Dec 4 2024
Yesterday, there were reports of commons-query.wikimedia.org being unavailable, so we rolled back the query-scholarly switch. However, the issue with commons-query persists and continues to return upstream request timeout.
Dec 3 2024
The migration of query-scholarly.wikidata.org to Wikikube has been successfully completed. Basic test queries, headers, and the custom configuration are functioning as expected. A big thanks to @ItamarWMDE and @Lucas_Werkmeister_WMDE for their support!
It should be fine to make the task public.
Dec 2 2024
The reimage of wikikube-worker1006 failed because the node is not in icinga/alertmanager. I'll try to find out why tomorrow.
Great, thanks. I sent you an invite for tomorrow 14:00 CET.
This is resolved.