Disable production EventLogging analytics MySQL consumers
Closed, Resolved · Public · 5 Story Points

Description

  • mysql-eventbus
  • mysql-m4-master-00 (blocked on T223414)

Event Timeline

Ottomata created this task. Sep 9 2019, 2:53 PM
Restricted Application added a subscriber: Aklapper. Sep 9 2019, 2:53 PM

Change 535205 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Disable mysql-eventbus eventlogging consumer

https://gerrit.wikimedia.org/r/535205

Since the page-create MySQL-based dashboards are no longer needed, can we go ahead and just turn off the mysql-eventbus EventLogging consumer? Does anyone else use any MediaWiki/EventBus tables on the eventlogging log database?

Ottomata updated the task description. Sep 9 2019, 2:58 PM

Change 535205 merged by Ottomata:
[operations/puppet@production] Disable mysql-eventbus eventlogging consumer

https://gerrit.wikimedia.org/r/535205

Change 535225 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use $ensure for all resources in eventlogging::service::consumer

https://gerrit.wikimedia.org/r/535225

Change 535225 merged by Ottomata:
[operations/puppet@production] Use $ensure for all resources in eventlogging::service::consumer

https://gerrit.wikimedia.org/r/535225

Ottomata updated the task description. Sep 9 2019, 4:31 PM

Change 535226 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove unused eventlogging mysql consumer eventbus puppetization

https://gerrit.wikimedia.org/r/535226

Change 535226 merged by Ottomata:
[operations/puppet@production] Remove unused eventlogging mysql consumer eventbus puppetization

https://gerrit.wikimedia.org/r/535226

This looks related:
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=eventlog1002&service=Check+systemd+state
CRITICAL - degraded: The system is operational but one or more units failed.

eventlog1002:~$ sudo systemctl
[...]
● eventlogging-consumer@mysql-eventbus.service not-found failed failed    eventl

I ACK'ed the alert, with a pointer to that task.

Thanks, I just ran systemctl reset-failed etc. to clear the problem; everything should be OK now!
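For anyone hitting the same alert later, here is a rough sketch of that kind of cleanup. It is not the exact command sequence that was run on eventlog1002. It assumes the leftover unit shows up as not-found/failed, as in the output quoted above, and stale_failed_units is a hypothetical helper name, not an existing tool.

```shell
#!/usr/bin/env bash
# Sketch only: find failed units whose unit file no longer exists
# (LOAD = not-found), e.g. after their puppetization was removed.

# stale_failed_units (hypothetical helper) reads lines in the format printed by
#   systemctl list-units --state=failed --no-legend --plain
# i.e. "UNIT LOAD ACTIVE SUB DESCRIPTION", and prints the unit names
# whose LOAD column is "not-found" and whose ACTIVE column is "failed".
stale_failed_units() {
  awk '$2 == "not-found" && $3 == "failed" { print $1 }'
}

# Usage on the affected host (left commented out; resetting needs root):
#   systemctl list-units --state=failed --no-legend --plain \
#     | stale_failed_units \
#     | xargs -r sudo systemctl reset-failed
```

Filtering on not-found first keeps units that failed for a real reason visible in the Icinga check instead of clearing everything in one go.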

Something strange happened today: I noticed in Icinga that the eventlogging mysql insertion rate alarm was in UNKNOWN state, so I checked the graphs and found that the m4 consumer stopped sending data around the 21st:

https://grafana.wikimedia.org/d/000000505/eventlogging?panelId=12&fullscreen&orgId=1&from=now-7d&to=now-5m

I bounced the consumer on eventlog1002, and events are now flowing again. I didn't see anything in here or in the SAL mentioning that m4 was also stopped on purpose, so let me know whether I need to stop it again. Super strange that the alarm didn't fire!

Very strange! Joseph and I did end up bouncing eventloggingctl stuff last Wednesday for deployment of ua-parser, but that doesn't seem related.

Milimetric triaged this task as High priority. Oct 7 2019, 4:07 PM

Change 547231 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Disable eventlogging-consumer mysql in prod

https://gerrit.wikimedia.org/r/547231

Change 547231 merged by Ottomata:
[operations/puppet@production] Disable eventlogging-consumer mysql in prod

https://gerrit.wikimedia.org/r/547231

Ottomata updated the task description.
Ottomata moved this task from Next Up to Done on the Analytics-Kanban board.
Nuria closed this task as Resolved. Thu, Nov 7, 11:15 PM
Nuria set the point value for this task to 5.