Lack of SRE resources on Wikimedia websites
Closed, Invalid, Public

Description

This Sunday, 26/1/2020, access to Wikimedia-affiliated projects was heavily disrupted, seemingly due to a reliability issue with the infrastructure supporting the Wikimedia websites.

Is the SRE team out of office every Sunday? How can the 10th most visited website in the world be disrupted for so many hours without anyone there to fix the issue?

I assume there were people working on this today, but it seems on the other hand that there is a shortage of people available for urgent needs in the site reliability engineering team.

Event Timeline

Everyone en-route to Wikimedia All Hands starting Monday 27?

Jdforrester-WMF subscribed.

Is the SRE team out of office for all Sundays?

No.

How can the 10th most visited website in the world be disrupted for so many hours without anyone there to fix the issue?

Dozens of people helped fix it. Thanks for your support.

Despite this task, on behalf of everyone else, I'd like to thank everyone in SRE and beyond who helped fix this, including the many who sent in information to help. Outages happen. Well done everyone, and an even bigger thanks to those who keep us up in your volunteer roles without batting an eyelid!

Dozens of people helped fix it. Thanks for your support.

You're welcome. I'm not paid personally when I fix bugs on fr.wikt gadgets and js (sometimes due to rough updates from the DevOps team at Wikimedia).

I do of course heartily thank the volunteers who helped on this.
That is first of all the responsibility of the staff, though.

Dozens of people helped fix it. Thanks for your support.

You're welcome. I'm not paid personally when I fix bugs on fr.wikt gadgets and js (sometimes due to rough updates from the DevOps team at Wikimedia).

Then you should understand the power of volunteers.

I do of course heartily thank the volunteers who helped on this.
That is first of all the responsibility of the staff, though.

Yes, they will have led this, but those out of office and volunteering will step up as always.

How can the 10th most visited website in the world be disrupted for so many hours

@Automatik: Please see https://www.mediawiki.org/wiki/Bug_management/Phabricator_etiquette for better venues to bring up and discuss general meta issues. Thanks a lot! :)

Dinoguy1000 renamed this task from Lack of SRE resources on WIkimedia websites to Lack of SRE resources on Wikimedia websites. Jan 27 2020, 9:14 PM