
Monitor availability of Wikimedia websites that are not hosted by the WMF
Closed, Declined (Public)

Description

It would be great for the Wikimedia movement if the WMF monitored the availability of Wikimedia websites that the WMF doesn't host directly, such as:

  • websites of some Wikimedia chapters that use their own infrastructure, servers or hosting services (mainly, those which use a specific domain name for their chapter, such as wikimedia.de, wikimedia.es or wikimedia.org.uk),
  • websites of user groups and thematic organizations,
  • websites of contests like wikilovesmonuments.org or wikilovesearth.org.

A ping and a simple curl --head could be enough. At the same time, some Wikimedia chapters could be encouraged to monitor the availability of websites hosted by the WMF from their own countries by running some code provided by the WMF.
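
To make the proposal concrete, here is a minimal sketch of the two checks using only the Python standard library. It is not taken from any existing WMF tooling. The function names tcp_reachable and head_status are hypothetical, and a TCP connection attempt stands in for ping, on the assumption that ICMP is often unavailable to unprivileged processes.

```python
import socket
import urllib.error
import urllib.request


def tcp_reachable(host, port=443, timeout=5.0):
    """Rough stand-in for ping: can a TCP connection to host:port be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def head_status(url, timeout=10.0):
    """Send a HEAD request (similar to `curl --head`) and return the status code.

    Returns None when no HTTP response was received at all (DNS failure,
    refused connection, timeout). urlopen follows redirects, so the value
    is the status of the final response.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions but still carry a code.
        return error.code
    except (urllib.error.URLError, OSError):
        return None
```

For example, head_status("https://wikimedia.de") would return an integer status code if the site answers, or None if it cannot be reached.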

Event Timeline

Why should the SRE team put energy into monitoring sites that the foundation doesn't operate?

Gehel subscribed.

The main monitoring that we have is not public, so there is not much use in monitoring sites not managed by WMF. https://status.wikimedia.org/ is public, but is hosted by CA (if I'm right).

Lowering priority as this does not seem to match our main goals. This does not preclude having that conversation and raising the priority again if it makes sense.

I'm not talking about a sophisticated system, but about a simple curl --head that gets the status code for every site. That's something I can run on Tool Labs or on any personal hosting, but I thought it would be better if the Operations team took on and centralized this (effortless) task on the WMF side.
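
As an illustration of "a status code for every site", the following sketch checks a list of sites in parallel and prints one line per site. It is only an assumption about how such a script could look, not the code of any existing tool. The helper names (fetch_status, is_up, check_all, format_report) are hypothetical, the https:// scheme for each domain is assumed, and treating any response below 500 as "up" is an arbitrary choice.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Domains named in the task description; the https:// scheme is an assumption.
SITES = [
    "https://wikimedia.de",
    "https://wikimedia.es",
    "https://wikimedia.org.uk",
    "https://wikilovesmonuments.org",
    "https://wikilovesearth.org",
]


def fetch_status(url, timeout=10.0):
    """HEAD request; returns the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError):
        return None


def is_up(status):
    """Assumed rule: a site is up if it answered with a non-5xx status."""
    return status is not None and status < 500


def check_all(urls, fetch=fetch_status, max_workers=8):
    """Check every URL concurrently and return {url: status}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))


def format_report(statuses):
    """Render one 'CODE  STATE  URL' line per site."""
    lines = []
    for url, status in statuses.items():
        code = "---" if status is None else str(status)
        state = "UP" if is_up(status) else "DOWN"
        lines.append(f"{code}  {state:<4}  {url}")
    return "\n".join(lines)
```

Running print(format_report(check_all(SITES))) from a cron job would produce a plain-text status list. The fetch parameter exists so the network call can be swapped out, for instance when testing.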

Nothing is effortless - our cognitive and alerting load is high as it is.

The problem with Operations monitoring sites we don't control is that there isn't anything we can actually do to fix them. Monitoring a service is part of the task of administering that service. I would presume the administrators of these sites do their own monitoring, which alerts those who can actually do something about any problems.

"websites of some Wikimedia chapters that use their own infrastructure, servers or hosting services"

I do not speak on behalf of WMF, but I would like to add my own personal thoughts:

WMF offers hosting and support for all chapters and wikis/projects that are related to, or in any way benefit, the movement. We host more than 800 projects, so adding a few more adds no real overhead. It also integrates with existing monitoring and security procedures, and a team of professional staff provides support 24/7 (aside from the invaluable help of numerous volunteers), with exactly the same level of support as any main project.

If you do not want to use our production services, we even allow you to self-host your projects in places like labs and tool labs (where we only provide "hosting"/"VM" support), but we still provide the infrastructure.

Despite that, some members still prefer to host their own servers, for many 100% valid and totally legitimate reasons (data protection, international laws, they just do not want to depend on the Foundation's infrastructure, they want to practice sysadmin skills). Absolutely no problem with that, but in that case I think support from Operations should not be expected (we cannot monitor something we have no say in, and there may even be legal considerations regarding that). We have also already had numerous requests from other websites we have nothing to do with, just because they use MediaWiki.

Again, this is only my own opinion: the Foundation does not want to control what the users do, but it is undoubtedly *way* more efficient to have 15 people supporting 900 wikis than to have each of them spending resources separately.

I would not be against someone asking for a grant to implement this if it helps, but if you want server management, the first thing Operations will probably ask is to host it. What happens when monitoring fails? How can Operations fix anything it has no control over? What if someone has privacy concerns about the checks (the operators of these sites are parties with whom, unlike labs or other wikis, we have not agreed explicit privacy policies)? What if adding them makes it look like we endorse certain projects that do not follow the strict data and user privacy standards required for all WMF projects? Should this have the same priority as properly fixing WMF's own infrastructure (which is in no way short of needing extra help)?

"That's something I can run on Tool Labs or on any personal hosting"

Agreed, which is why I would ask for this task to be declined. You can even reuse our own monitoring system which of course is publicly available.

I've created a simple tool for this purpose that is already running on Tool Labs.

I will try to continue developing it in a few days. However, any contribution is welcome.

https://tools.wmflabs.org/status/

https://github.com/davidabian/wm-status

Marking this task as declined.