
Add a jobrunner server to the Scap canary pool
Closed, ResolvedPublic

Description

Follows-up:

  1. Ensure that at least one job runner is included (if not already) in the list of canary servers that Scap uses for deploying MediaWiki code. This alone would already be an improvement, as any hits in the mediawiki/exception, mediawiki/error, or hhvm channels that occur only in the job runner context would then be caught early.
  2. Include ERROR (and higher) severity messages from the mediawiki/runJobs channel in the Logstash query for canary monitoring.
  3. Once the jobrunner and jobchron service logs are indexed by Logstash, include ERROR (and higher) severity messages in the Logstash query.

Note that the jobrunner and jobchron services are independent PHP CLI programs (not MediaWiki cli scripts) so their logs will have a different type, and are not presently included anywhere else.
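A canary query along the lines of steps 2 and 3 could be sketched as follows. This is an illustrative Elasticsearch-style bool query builder; the field names (`channel`, `level`), the severity list, and the function name are assumptions for the sketch, not the actual query Scap runs against Logstash.

```python
# Sketch of a Logstash (Elasticsearch) query for canary monitoring:
# match ERROR-or-higher messages in the given channels within a recent
# time window. Field names and severities are illustrative assumptions.

ERROR_AND_HIGHER = ["ERROR", "CRITICAL", "ALERT", "EMERGENCY"]

def canary_error_query(channels, since="now-5m"):
    """Build a bool query matching ERROR-or-higher log events in the
    given channels since the given relative time."""
    return {
        "bool": {
            "filter": [
                {"terms": {"channel": channels}},
                {"terms": {"level": ERROR_AND_HIGHER}},
                {"range": {"@timestamp": {"gte": since}}},
            ]
        }
    }

# Per step 2, mediawiki/runJobs is included alongside the existing channels.
query = canary_error_query(
    ["mediawiki/runJobs", "mediawiki/exception", "mediawiki/error"]
)
```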

Event Timeline

greg subscribed.

Adding our Release-Engineering-Team (Kanban) project as we would like to work on this in the coming quarter or two (no promises though, this is not a "goal" only "other hoped for work").

Krinkle renamed this task from "Add jobrunners to Scap canary process" to "Add jobrunner servers to Scap canary process". (Jul 12 2018, 3:59 AM)
Krinkle added a project: WMF-JobQueue.
Krinkle moved this task from Untriaged to Meta on the WMF-JobQueue board.
Krinkle renamed this task from "Add jobrunner servers to Scap canary process" to "Add a jobrunner server to the Scap canary pool". (May 7 2020, 10:28 PM)

Is this still relevant with the new K8s infrastructure?

I assume Scap still has the concept of applying the next image to a canary pool in mw-on-k8s first, waiting some time for a potential Logstash error rate increase, and then deciding whether to proceed.

Unless a canary pool was introduced for mw-jobrunner since then, this is presumably still limited to the mw-web and mw-api server groups, and thus still an issue.
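The gating process described above can be sketched as a small loop: deploy to the canary pool, wait, compare the Logstash error rate against a threshold, then either abort or proceed. The function names, parameters, and threshold here are illustrative assumptions, not Scap's actual implementation.

```python
# Sketch of the canary gating logic: apply the new image to canaries,
# wait for traffic (and jobs) to hit the new code, then check the error
# rate before rolling out to the full fleet. All names are hypothetical.
import time

def canary_gate(deploy, error_rate, wait_seconds=30, max_errors_per_min=10.0):
    deploy(group="canary")       # apply the next image to the canary pool only
    time.sleep(wait_seconds)     # allow a potential error-rate increase to show
    if error_rate() > max_errors_per_min:
        raise RuntimeError("canary check failed; aborting full deploy")
    deploy(group="production")   # proceed to the remaining server groups
```

The point of this task is that, without a jobrunner canary pool, `deploy(group="canary")` never exercises job-execution code paths, so job-only errors slip past the check.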

akosiaris claimed this task.
akosiaris subscribed.

> I assume Scap still has the concept of applying the next image to a canary pool in mw-on-k8s first, waiting some time for a potential Logstash error rate increase, and then deciding whether to proceed.
>
> Unless a canary pool was introduced for mw-jobrunner since then, this is presumably still limited to the mw-web and mw-api server groups, and thus still an issue.

There was. This is easy to confirm with a search in the deployment-charts repo; see https://github.com/wikimedia/operations-deployment-charts/blob/master/helmfile.d/services/mw-jobrunner/helmfile.yaml#L20

Note the canary release mentioned in lines 20 and 24 (for eqiad and codfw respectively).
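The relevant shape of that helmfile is roughly as follows. This is a paraphrased sketch of the pattern (a canary release listed alongside the main one for each datacenter); consult the linked helmfile.yaml for the authoritative structure and values.

```yaml
# Sketch of the mw-jobrunner helmfile pattern: each environment defines
# a "canary" release next to the main release. Key names and nesting are
# paraphrased for illustration; see the linked file for the real config.
environments:
  eqiad:
    values:
      - releases:
          - canary   # upgraded first, during Scap's canaries stage
          - main
  codfw:
    values:
      - releases:
          - canary
          - main
```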

The glue with Scap is defined in https://github.com/wikimedia/operations-puppet/blob/production/hieradata/role/common/deployment_server/kubernetes.yaml#L207, where the profile::kubernetes::deployment_server::mediawiki::release::mw_releases stanza defines kinds, flavours, and stages. At line 251, the stanza for mw-jobrunner names the canary release, which is upgraded during Scap's canaries stage. This is also how any future release in any service deployed by Scap could be used during the canaries stage.
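The hiera wiring could look roughly like this. The nested key names under the stanza are hypothetical, reconstructed from the description above; only the stanza name itself comes from the linked file, so check kubernetes.yaml for the real shape.

```yaml
# Sketch of how releases map to Scap stages in hiera. The keys beneath
# mw_releases are illustrative assumptions, not the actual schema.
profile::kubernetes::deployment_server::mediawiki::release::mw_releases:
  mw-jobrunner:
    canary:
      stage: canaries    # upgraded when Scap deploys to the canary pool
    main:
      stage: production  # upgraded during the full rollout
```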

Given the above, I'm gonna be bold and resolve this. I think the needs of this task have been met for at least a year now (probably more, but I'll avoid going down the successive git blame rabbit hole).