Grafana Datamodel References dashboard broken (daily data)
Closed, Resolved · Public · 3 Estimated Story Points

Description

Problem:
The Grafana graph "Reference Snak main properties" and other graphs on the same dashboard stopped displaying values after 2018-12-31.

The data is generated by https://github.com/wikimedia/analytics-wmde-toolkit-analyzer/blob/master/analyzer/src/main/java/org/wikidata/analyzer/Processor/MetricProcessor.java and sent to graphite using https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/wikidata/dumpScanProcessing.php

Acceptance criteria:

  • The Grafana graphs on the dashboard show current data again

Event Timeline

Restricted Application added a subscriber: Aklapper.
Addshore renamed this task from Grafana Reference Snak graph missing current data to Grafana Datamoel Referneces dashboard broken (daily data). Jul 9 2019, 4:49 PM
Addshore updated the task description.
Addshore moved this task from incoming to needs discussion or investigation on the Wikidata board.
Addshore moved this task from Incoming to Needs Work on the Wikidata-Campsite board.

From the logs:

addshore@stat1007:~$ sudo -u analytics-wmde cat /srv/analytics-wmde/graphite/log/toolkit-analyzer.log
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
****************************************************************************
***                       Wikidata Toolkit: ToolkitAnalyzer              ***
******************************* Data Directory Layout **********************
* Target storage directory : data/                                         *
* Downloaded dump locations: data/dumpfiles/json-<DATE>/<DATE>-all.json.gz *
* Processor output location: data/<DATE>/                                  *
****************************************************************************
Targeting latest dump: 20190704
Using data directory: /srv/analytics-wmde/graphite/data
MetricProcessor enabled
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Error getting data from https://query.wikidata.org/sparql
Connection timed out (Connection timed out)
java.net.ConnectException: Connection timed out (Connection timed out)
Addshore triaged this task as Medium priority. Jul 9 2019, 5:17 PM
Addshore moved this task from needs discussion or investigation to ready to go on the Wikidata board.
Addshore moved this task from Needs Work to Ready to estimate on the Wikidata-Campsite board.
matej_suchanek renamed this task from Grafana Datamoel Referneces dashboard broken (daily data) to Grafana Datamodel References dashboard broken (daily data). Jul 16 2019, 8:07 AM

Change 526471 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer@master] Use the internal WDQS endpoint instead

https://gerrit.wikimedia.org/r/526471

Change 526471 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer@master] Use the internal WDQS endpoint instead

https://gerrit.wikimedia.org/r/526471

Is there a way to test this on Beta, or just in production?

It should only be done in production, but I have no idea how to deploy this thing.

Change 528767 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer-build@master] New build

https://gerrit.wikimedia.org/r/528767

Change 528767 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer-build@master] New build

https://gerrit.wikimedia.org/r/528767

Change 528871 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer-build@production] New build

https://gerrit.wikimedia.org/r/528871

Change 528871 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer-build@production] New build

https://gerrit.wikimedia.org/r/528871

So the graphs don't show up (yet). I confirmed that the new build is in production, but the latest log still shows a connection timeout to query.wikidata.org. It seems it's run weekly (the previous run was exactly one week earlier). If my assessment is correct, the data will show up after the 12th of August.

It's actually run daily, but at noon, so it'll happen in a couple of hours.

It's still failing.
We have:
time java -Dhttp.proxyHost=\"http://webproxy.${::site}.wmnet\" -Dhttp.proxyPort=8080 -Xmx2g -jar ${dir}/src/toolkit-analyzer-build/toolkit-analyzer.jar --processors Metric --store ${dir}/data --latest >> ${log_dir}/toolkit-analyzer.log 2>&1
so it should not have an issue (it goes through the webproxy, though I'm not sure that is needed anymore).

BUT stat is firewalled: https://lists.wikimedia.org/pipermail/analytics/2019-July/006648.html cc: @elukey
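
For anyone debugging this from the stat host, a quick way to tell a firewalled or unreachable endpoint apart from an application-level problem is to attempt a plain TCP connection with a short timeout. The snippet below is only a minimal sketch; the helper name `can_connect` is made up for illustration and is not part of the toolkit analyzer.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical diagnostic helper, not part of toolkit-analyzer.
    """
    try:
        # create_connection raises OSError subclasses on refusal,
        # DNS failure, or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("query.wikidata.org", 443)` on the affected host would be expected to return `False` if outbound traffic is dropped by the firewall, which matches the "Connection timed out" in the log above.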

Change 529094 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer@master] Fix port when connecting to WDQS

https://gerrit.wikimedia.org/r/529094

Change 529094 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer@master] Fix port when connecting to WDQS

https://gerrit.wikimedia.org/r/529094

Change 529097 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer-build@master] New build

https://gerrit.wikimedia.org/r/529097

Change 529097 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer-build@master] New build

https://gerrit.wikimedia.org/r/529097

Change 529098 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[analytics/wmde/toolkit-analyzer-build@production] New build

https://gerrit.wikimedia.org/r/529098

Change 529098 merged by jenkins-bot:
[analytics/wmde/toolkit-analyzer-build@production] New build

https://gerrit.wikimedia.org/r/529098

The dashboard now seems to have data for the 5th of August but only that. Is this only tracked weekly?

It runs daily, but the date attached to the data it sends is based on the latest JSON dump, and it seems dumps are now run weekly.
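
To illustrate why a daily job can still produce only one point per week, here is a rough sketch of how a dump date could be turned into a Graphite data point. It uses Graphite's plaintext format (`<metric path> <value> <unix timestamp>`). The metric name, the value, and the choice of midnight UTC are assumptions for the example; the actual behaviour of dumpScanProcessing.php may differ.

```python
from datetime import datetime, timezone


def dump_date_to_epoch(dump_date: str) -> int:
    """Convert a dump date such as '20190704' to a Unix timestamp.

    Assumption: midnight UTC of the dump date is used as the timestamp.
    """
    parsed = datetime.strptime(dump_date, "%Y%m%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def graphite_line(metric: str, value: int, dump_date: str) -> str:
    """Format one data point in Graphite's plaintext protocol."""
    return f"{metric} {value} {dump_date_to_epoch(dump_date)}"


# Hypothetical metric name and value. Two runs on different days that both
# see the same latest dump produce the same timestamp, hence a single
# data point on the dashboard.
monday_run = graphite_line("daily.wikidata.example.references", 42, "20190704")
tuesday_run = graphite_line("daily.wikidata.example.references", 42, "20190704")
```

Under these assumptions, every run between two dumps writes to the same timestamp, so the graph only gains a new point when a new dump appears.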

Random question, maybe you'd be in the know @Ladsgroup: any chance we could get the past missing data back-filled?

It's possible and it's not that hard if you need the numbers.

@Lydia_Pintscher said it is not worth investing effort to restore the past numbers. Closing then.