Invalid wikidata graphite metrics received
Open, Needs Triage, Public

Description

I noticed this in the carbon logs on graphite1004. It looks like some Wikidata processes send metric lines without a value:

==> carbon-cache@b/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.freshest.lag 1634198220) received from client 127.0.0.1:40438, ignoring

==> carbon-cache@c/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.freshest.pending 1634198220) received from client 127.0.0.1:49942, ignoring
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.stalest.pending 1634198220) received from client 127.0.0.1:49942, ignoring

==> carbon-cache@d/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.median.pending 1634198220) received from client 127.0.0.1:38194, ignoring

==> carbon-cache@a/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.median.lag 1634198220) received from client 127.0.0.1:43602, ignoring

==> carbon-cache@b/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.average.lag 1634198220) received from client 127.0.0.1:40438, ignoring

==> carbon-cache@d/listener.log <==
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.stalest.lag 1634198220) received from client 127.0.0.1:38194, ignoring
14/10/2021 07:57:00 :: invalid line (wikidata.dispatch.average.pending 1634198220) received from client 127.0.0.1:38194, ignoring
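For context, Carbon's plaintext protocol expects each line to have three whitespace-separated fields: `<metric path> <value> <timestamp>`. The rejected lines above have only a path and a timestamp. The following is a minimal sketch of that kind of check, not Carbon's actual parsing code; the function name `parse_plaintext_line` is made up for illustration.

```python
from typing import Tuple


def parse_plaintext_line(line: str) -> Tuple[str, float, float]:
    """Split a '<path> <value> <timestamp>' line; raise ValueError otherwise.

    Hypothetical helper for illustration only, not part of Carbon.
    """
    fields = line.strip().split()
    if len(fields) != 3:
        raise ValueError(
            f"invalid line ({line.strip()}): expected 3 fields, got {len(fields)}"
        )
    path, value, timestamp = fields
    # float() raises ValueError as well if a field is not numeric.
    return path, float(value), float(timestamp)


# First candidate: a line as it appears in the log (path and timestamp only).
# Second candidate: the same metric with an illustrative value filled in.
for candidate in (
    "wikidata.dispatch.freshest.lag 1634198220",
    "wikidata.dispatch.freshest.lag 12 1634198220",
):
    try:
        print("ok:", parse_plaintext_line(candidate))
    except ValueError as error:
        print("rejected:", error)
```

The value `12` in the second candidate is invented for the example; the log does not show what the value should have been.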

cc @Addshore

Event Timeline

These metrics come from dispatch.php in analytics-wmde-scripts, which reads them from action=query&meta=siteinfo&siprop=statistics. RepoHooks::onAPIQuerySiteInfoStatisticsInfo() is supposed to add the data to those statistics, but it no longer has any statistics to add: since we migrated change dispatching to the job queue, the wb_changes table is empty most of the time, so the statistics are missing from the API response. As a result, dispatch.php probably calls WikimediaGraphite::sendNow() with a null $value (after triggering some warnings for accessing nonexistent array keys).
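One way to avoid sending such lines is to skip any metric whose value is missing before building the plaintext line. The sketch below shows the idea in Python rather than in the PHP of dispatch.php. The function name `build_metric_lines`, the flat mapping of metric path to value, and the sample values are all assumptions for illustration, not the actual script or API response structure.

```python
from typing import List, Mapping, Optional


def build_metric_lines(
    values: Mapping[str, Optional[float]], timestamp: int
) -> List[str]:
    """Build '<path> <value> <timestamp>' lines, skipping missing values.

    Hypothetical helper: 'values' maps a metric path to its value, or to
    None when the statistic was absent from the API response.
    """
    lines = []
    for path, value in values.items():
        if value is None:
            # Statistic not present: send nothing rather than a line
            # without a value, which the receiver would reject.
            continue
        lines.append(f"{path} {value} {timestamp}")
    return lines


# Illustrative input: one statistic missing, one present (value made up).
stats = {
    "wikidata.dispatch.freshest.lag": None,
    "wikidata.dispatch.median.lag": 3,
}
for line in build_metric_lines(stats, 1634198220):
    print(line)
```

Skipping a missing statistic leaves a gap in the graph instead of recording a misleading zero, which seems preferable when the underlying table is simply empty.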

We’re restoring a small amount of dispatch statistics in T291846: New Content for Special:DispatchStats, but I don’t know if we’ll add this to the API as well, and/or whether this will be worth tracking in Graphite at all.

Change 731351 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@master] Check that change dispatch statistics are present

https://gerrit.wikimedia.org/r/731351

Change 732277 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@master] Remove dispatch.php

https://gerrit.wikimedia.org/r/732277

Change 731351 merged by jenkins-bot:

[analytics/wmde/scripts@master] Check that change dispatch statistics are present

https://gerrit.wikimedia.org/r/731351

Change 732079 had a related patch set uploaded (by Awight; author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@production] Check that change dispatch statistics are present

https://gerrit.wikimedia.org/r/732079

Change 732079 merged by jenkins-bot:

[analytics/wmde/scripts@production] Check that change dispatch statistics are present

https://gerrit.wikimedia.org/r/732079

Change 732277 merged by jenkins-bot:

[analytics/wmde/scripts@master] Remove dispatch.php

https://gerrit.wikimedia.org/r/732277

Change 732329 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[analytics/wmde/scripts@production] Remove dispatch.php

https://gerrit.wikimedia.org/r/732329

Change 732329 merged by jenkins-bot:

[analytics/wmde/scripts@production] Remove dispatch.php

https://gerrit.wikimedia.org/r/732329