Request creation of community-labs-monitoring labs project
Closed, Resolved (Public)

Description

Project Name: community-labs-monitoring
Purpose: T53434: Implement a system to monitor tools on tool-labs
Wikitech Username of requestor: Matthewrbowker

Restricted Application added a subscriber: Aklapper. Oct 18 2016, 6:36 PM

I'm unclear on what the plan is here. We are moving towards a model of Prometheus integration with k8s. Is this meant to pursue something separate from that?

Oh, no. This is specifically for Tool Labs and Labs instances. The intention here is to provide an easy way for volunteers to set up monitoring for their tools and Labs instances.

My plan is to allow volunteers to log in via OAuth and "register" their tools and labs instances.

For the tools, I'm going to run HTTP checks - looking for a 200, 301, or 302. If I don't receive those, I'm going to generate an email sent to the tool email address.
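A minimal sketch of what such an HTTP check could look like, assuming Python and only the standard library. Every name here (`fetch_status`, `tool_is_up`, `build_alert`, the sender address) is hypothetical and for illustration only; nothing in this task specifies an implementation. Redirects are deliberately not followed, so a 301 or 302 is reported as-is rather than as the status of the redirect target.

```python
import urllib.error
import urllib.request
from email.message import EmailMessage

# Status codes treated as "up", per the plan described above.
HEALTHY_STATUSES = {200, 301, 302}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 301/302 are seen directly."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for unfollowed redirects and for 4xx/5xx responses.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def tool_is_up(status):
    """True only for the statuses the plan counts as healthy."""
    return status in HEALTHY_STATUSES


def build_alert(tool, url, status, recipient):
    """Build (but do not send) the notification email for a failed check."""
    msg = EmailMessage()
    msg["From"] = "monitoring@example.org"  # hypothetical sender address
    msg["To"] = recipient
    msg["Subject"] = f"[monitoring] {tool} failed its HTTP check"
    seen = "no response" if status is None else f"HTTP {status}"
    msg.set_content(f"{url} returned {seen}; expected 200, 301 or 302.\n")
    return msg
```

A caller would run `fetch_status` on each registered tool URL and, when `tool_is_up` returns false, hand the message from `build_alert` to whatever mail transport the project ends up using (for example `smtplib`).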

For the Labs instances, I'm going to run ping and ssh checks. I'm looking for OK responses - if I don't receive them I'm going to generate an email also.
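The instance checks could be sketched along these lines, again assuming Python's standard library. The function names are hypothetical, the `ping` flags assume the Linux iputils version of the command, and the SSH check only confirms that the first bytes sent by the server begin with `SSH-`; it does not attempt to log in.

```python
import socket
import subprocess


def ping_ok(host, timeout_s=2):
    """Send one ICMP echo via the system ping (Linux iputils flags assumed)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping binary missing, not permitted, or it hung.
        return False
    return result.returncode == 0


def ssh_ok(host, port=22, timeout_s=5.0):
    """Treat the host as reachable over SSH if it greets with an 'SSH-' banner."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            conn.settimeout(timeout_s)
            banner = conn.recv(64)
    except OSError:
        # Refused, unreachable, or timed out.
        return False
    return banner.startswith(b"SSH-")
```

An instance would count as "OK" only when both `ping_ok` and `ssh_ok` return true; either failure would trigger the same kind of email as the tool checks.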

This is independent of any current checking system. icinga.wikimedia.org is closed to volunteers (it requires an NDA, as far as I know), and shinken doesn't monitor anything aside from the "tools" project instances (as far as I can tell; the interface for that tool is very convoluted).

Shinken can be used for other projects; extdist.wmflabs.org uses it, for example.

Are there plans to expand shinken to Tool Labs, including running HTTP checks?

Are there plans to allow volunteers to request shinken to monitor their instances?

I'm not being patronizing, I'm legitimately curious. Right now, I use Uptime Robot to monitor my tools on Tool Labs. I have no way of telling if a Labs instance is "down"; see the comments on T148420.

I'm attempting to fill a hole here - the ability for volunteers to set up in-house monitoring of their tools and labs instances. If there are plans to provide that functionality in the future, I will gladly close this task.

Andrew added a subscriber: Andrew. Oct 24 2016, 4:03 PM

@Matthewrbowker the Labs operators discussed this a bit today. In general we /try/ to provide Labs monitoring tools but they're admittedly not so great, so you are welcome (and encouraged) to work on alternatives.

That said, a more specific project name would be better -- 'status' is painfully vague. Thanks!

Matthewrbowker renamed this task from Request creation of status labs project to Request creation of community-labs-monitoring labs project. Oct 24 2016, 4:40 PM
Matthewrbowker updated the task description.

@Andrew Re IRC, I've changed the requested name. Thank you :)

Andrew closed this task as Resolved. Oct 31 2016, 3:08 PM
Andrew claimed this task.

Sorry for the delay!

This is done now. @Matthewrbowker, you are set as a projectadmin in the new project; you can add new users or projectadmins on wikitech as you see fit.