wmde-toolkit-analyzer-build.service fails on stat1007
Open, Needs Triage, Public

Description

Mar 28 12:00:00 stat1007 java[21649]: Error: Invalid or corrupt jarfile /srv/analytics-wmde/graphite/src/toolkit-analyzer-build/toolkit-analyzer.jar

Event Timeline

I have re-created the repo via Puppet, initialized Git LFS, and checked out the jar, because for some reason only the Git LFS placeholder (the pointer file) was present instead of the real jar.
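For anyone who wants to check for this condition before starting the service, here is a minimal sketch (not part of the analyzer or the puppet code). It looks at the first bytes of a file and says whether it looks like a ZIP/JAR archive or a Git LFS pointer. The class name JarSanityCheck and its methods are made up for illustration, and it assumes Java 11 or newer (for readNBytes and Path.of).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of toolkit-analyzer.
public class JarSanityCheck {
    // Git LFS pointer files are small text files that begin with this line.
    static final byte[] LFS_PREFIX =
            "version https://git-lfs.github.com/spec/v1".getBytes(StandardCharsets.US_ASCII);
    // Local file header signature found at the start of a typical non-empty ZIP/JAR.
    static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // True if data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Classify a file from its leading bytes.
    static String classify(byte[] head) {
        if (startsWith(head, ZIP_MAGIC)) {
            return "zip";
        }
        if (startsWith(head, LFS_PREFIX)) {
            return "lfs-pointer";
        }
        return "unknown";
    }

    // Read just enough bytes from the file to classify it.
    static String classify(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return classify(in.readNBytes(LFS_PREFIX.length));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarSanityCheck <path-to-jar>");
            System.exit(2);
        }
        System.out.println(classify(Path.of(args[0])));
    }
}
```

Run against the toolkit-analyzer.jar path from the task description, a result of "lfs-pointer" would match the "Invalid or corrupt jarfile" error, since the JVM cannot open a text pointer as an archive.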

Now I get this:

elukey@stat1007:~$ sudo journalctl -u wmde-toolkit-analyzer-build.service -f
-- Logs begin at Thu 2021-03-18 06:48:51 UTC. --
Mar 30 09:29:17 stat1007 java[3403]: * Target storage directory : data/                                         *
Mar 30 09:29:17 stat1007 java[3403]: * Downloaded dump locations: data/dumpfiles/json-<DATE>/<DATE>-all.json.gz *
Mar 30 09:29:17 stat1007 java[3403]: * Processor output location: data/<DATE>/                                  *
Mar 30 09:29:17 stat1007 java[3403]: ****************************************************************************
Mar 30 09:29:18 stat1007 java[3403]: Targeting latest dump: 20210324
Mar 30 09:29:18 stat1007 java[3403]: Using data directory: /srv/analytics-wmde/graphite/data
Mar 30 09:29:18 stat1007 java[3403]: MetricProcessor enabled
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: Defaulting to no-operation (NOP) logger implementation
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Mar 30 09:29:19 stat1007 java[3403]: Fetching dump
Mar 30 09:29:19 stat1007 java[3403]: Getting dump with date 20210324
Mar 30 09:29:19 stat1007 java[3403]: Looking for dump files in: /srv/analytics-wmde/graphite/data/dumpfiles/json-20210324/
Mar 30 09:29:19 stat1007 java[3403]: Using dump file from: /srv/analytics-wmde/graphite/data/dumpfiles/json-20210324/20210324.json.gz
Mar 30 09:29:19 stat1007 java[3403]: Processing dump
Mar 30 09:29:19 stat1007 java[3403]: Processed!
Mar 30 09:29:19 stat1007 java[3403]: Memory Usage (MB): 960
Mar 30 09:29:19 stat1007 java[3403]: Exception in thread "main" java.lang.NullPointerException
Mar 30 09:29:19 stat1007 java[3403]:         at org.wikidata.analyzer.Processor.MetricProcessor.doPostProcessing(MetricProcessor.java:37)
Mar 30 09:29:19 stat1007 java[3403]:         at org.wikidata.analyzer.WikidataAnalyzer.scan(WikidataAnalyzer.java:207)
Mar 30 09:29:19 stat1007 java[3403]:         at org.wikidata.analyzer.WikidataAnalyzer.run(WikidataAnalyzer.java:145)
Mar 30 09:29:19 stat1007 java[3403]:         at org.wikidata.analyzer.WikidataAnalyzer.init(WikidataAnalyzer.java:76)
Mar 30 09:29:19 stat1007 java[3403]:         at org.wikidata.analyzer.WikidataAnalyzer.main(WikidataAnalyzer.java:38)
Mar 30 09:29:19 stat1007 systemd[1]: wmde-toolkit-analyzer-build.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 09:29:19 stat1007 systemd[1]: wmde-toolkit-analyzer-build.service: Failed with result 'exit-code'.

I used jdb to get more information:

main[1] print this.counters
 this.counters = "{property.statements.avg=0.0, item.statements.avg=0.0}"

The failing code is https://github.com/wikimedia/analytics-wmde-toolkit-analyzer/blob/master/analyzer/src/main/java/org/wikidata/analyzer/Processor/MetricProcessor.java#L37. The counters map only contains the two *.avg entries, so my guess is that for some reason the item.* counters are not incremented at all?
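To illustrate the kind of failure I suspect, here is a small standalone sketch. It is not the actual MetricProcessor code. It assumes the counters are kept in a Map<String, Double> and that the post-processing step divides one counter by another. The key names item.statements and item.count are hypothetical. If a key was never incremented, Map.get returns null and auto-unboxing throws a NullPointerException, which would match the stack trace above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the map type and key names are assumptions,
// not taken from the real MetricProcessor.
public class CounterAverageSketch {

    // Fragile variant: a missing key makes get() return null, and
    // unboxing that null to double throws NullPointerException.
    static double averageUnsafe(Map<String, Double> counters, String totalKey, String countKey) {
        return counters.get(totalKey) / counters.get(countKey);
    }

    // Defensive variant: missing counters are treated as 0, and an
    // average over zero entities is reported as 0.0.
    static double averageSafe(Map<String, Double> counters, String totalKey, String countKey) {
        double count = counters.getOrDefault(countKey, 0.0);
        if (count == 0.0) {
            return 0.0;
        }
        return counters.getOrDefault(totalKey, 0.0) / count;
    }

    public static void main(String[] args) {
        // Same contents as the jdb output: only the two *.avg entries exist.
        Map<String, Double> counters = new HashMap<>();
        counters.put("property.statements.avg", 0.0);
        counters.put("item.statements.avg", 0.0);

        System.out.println("safe average: "
                + averageSafe(counters, "item.statements", "item.count"));
        try {
            averageUnsafe(counters, "item.statements", "item.count");
        } catch (NullPointerException e) {
            System.out.println("unsafe average threw NullPointerException");
        }
    }
}
```

A guard like averageSafe would only hide the symptom. If the dump was processed without a single item or property being counted, the real question is still why nothing was read from the dump file.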

Change 677087 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] statistics::wmde::graphite: disable wmde-toolkit-analyzer-build timer

https://gerrit.wikimedia.org/r/677087

Change 677087 merged by Elukey:

[operations/puppet@production] statistics::wmde::graphite: disable wmde-toolkit-analyzer-build timer

https://gerrit.wikimedia.org/r/677087

I have removed the systemd timer for the time being, since it was constantly failing and causing alerts.