Mar 28 12:00:00 stat1007 java[21649]: Error: Invalid or corrupt jarfile /srv/analytics-wmde/graphite/src/toolkit-analyzer-build/toolkit-analyzer.jar
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
statistics::wmde::graphite: disable wmde-toolkit-analyzer-build timer | operations/puppet | production | +2 -1
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T348609 [EPIC] Clarify team ownership of WMDE cronjobs on stats1007
Open | | None | T278665 wmde-toolkit-analyzer-build.service fails on stat1007
Event Timeline
I have re-created the repo via Puppet, initialized Git LFS and checked out the JAR, since for some reason only the Git LFS placeholder was present.
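The "Invalid or corrupt jarfile" error is what one would expect when the path holds a Git LFS pointer (a small text file) instead of the real archive. Below is a minimal sketch of a check that tells the two apart by their leading bytes. It assumes the pointer starts with the version line from the Git LFS pointer format and that the JAR starts with the usual ZIP local-header magic. The class and method names are hypothetical and not part of the analyzer or the Puppet code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: classifies a file as a Git LFS pointer, a ZIP/JAR, or unknown.
final class JarFileCheck {
    // First line of a Git LFS pointer file.
    private static final byte[] LFS_PREFIX =
            "version https://git-lfs.github.com/spec/v1".getBytes(StandardCharsets.US_ASCII);
    // ZIP local file header magic ("PK\003\004"); JAR files are ZIP archives.
    private static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    static boolean startsWith(byte[] head, byte[] prefix) {
        return head.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(head, prefix.length), prefix);
    }

    // Heuristic only: inspects the leading bytes, it does not validate the archive.
    static String classify(byte[] head) {
        if (startsWith(head, LFS_PREFIX)) {
            return "lfs-pointer";
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return "zip";
        }
        return "unknown";
    }

    static String classify(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // 64 bytes is enough to cover both prefixes.
            return classify(in.readNBytes(64));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarFileCheck.java <path-to-jar>
        System.out.println(classify(Path.of(args[0])));
    }
}
```

If this reports `lfs-pointer` for the JAR path, running `git lfs pull` in the checkout should replace the placeholder with the real file.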
Now I get this:
elukey@stat1007:~$ sudo journalctl -u wmde-toolkit-analyzer-build.service -f
-- Logs begin at Thu 2021-03-18 06:48:51 UTC. --
Mar 30 09:29:17 stat1007 java[3403]: * Target storage directory : data/ *
Mar 30 09:29:17 stat1007 java[3403]: * Downloaded dump locations: data/dumpfiles/json-<DATE>/<DATE>-all.json.gz *
Mar 30 09:29:17 stat1007 java[3403]: * Processor output location: data/<DATE>/ *
Mar 30 09:29:17 stat1007 java[3403]: ****************************************************************************
Mar 30 09:29:18 stat1007 java[3403]: Targeting latest dump: 20210324
Mar 30 09:29:18 stat1007 java[3403]: Using data directory: /srv/analytics-wmde/graphite/data
Mar 30 09:29:18 stat1007 java[3403]: MetricProcessor enabled
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: Defaulting to no-operation (NOP) logger implementation
Mar 30 09:29:18 stat1007 java[3403]: SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Mar 30 09:29:19 stat1007 java[3403]: Fetching dump
Mar 30 09:29:19 stat1007 java[3403]: Getting dump with date 20210324
Mar 30 09:29:19 stat1007 java[3403]: Looking for dump files in: /srv/analytics-wmde/graphite/data/dumpfiles/json-20210324/
Mar 30 09:29:19 stat1007 java[3403]: Using dump file from: /srv/analytics-wmde/graphite/data/dumpfiles/json-20210324/20210324.json.gz
Mar 30 09:29:19 stat1007 java[3403]: Processing dump
Mar 30 09:29:19 stat1007 java[3403]: Processed!
Mar 30 09:29:19 stat1007 java[3403]: Memory Usage (MB): 960
Mar 30 09:29:19 stat1007 java[3403]: Exception in thread "main" java.lang.NullPointerException
Mar 30 09:29:19 stat1007 java[3403]: at org.wikidata.analyzer.Processor.MetricProcessor.doPostProcessing(MetricProcessor.java:37)
Mar 30 09:29:19 stat1007 java[3403]: at org.wikidata.analyzer.WikidataAnalyzer.scan(WikidataAnalyzer.java:207)
Mar 30 09:29:19 stat1007 java[3403]: at org.wikidata.analyzer.WikidataAnalyzer.run(WikidataAnalyzer.java:145)
Mar 30 09:29:19 stat1007 java[3403]: at org.wikidata.analyzer.WikidataAnalyzer.init(WikidataAnalyzer.java:76)
Mar 30 09:29:19 stat1007 java[3403]: at org.wikidata.analyzer.WikidataAnalyzer.main(WikidataAnalyzer.java:38)
Mar 30 09:29:19 stat1007 systemd[1]: wmde-toolkit-analyzer-build.service: Main process exited, code=exited, status=1/FAILURE
Mar 30 09:29:19 stat1007 systemd[1]: wmde-toolkit-analyzer-build.service: Failed with result 'exit-code'.
Used jdb to get more info:
main[1] print this.counters
 this.counters = "{property.statements.avg=0.0, item.statements.avg=0.0}"
The failing line is https://github.com/wikimedia/analytics-wmde-toolkit-analyzer/blob/master/analyzer/src/main/java/org/wikidata/analyzer/Processor/MetricProcessor.java#L37, so my guess is that, for some reason, the item.* counters are never incremented at all?
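To illustrate the suspected failure mode, here is a minimal sketch of how a counter map that holds only the two `.avg` entries could produce a NullPointerException. It assumes, without having confirmed it against MetricProcessor.java, that the post-processing step reads boxed values from a map and unboxes them. The class, the method names, and the `item.statements` and `item.count` keys are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a post-processing step; not the analyzer's real code.
final class CounterNpeSketch {
    // Map.get() returns null for a key that was never inserted;
    // auto-unboxing that null into a double throws NullPointerException.
    static double unsafeAverage(Map<String, Double> counters) {
        double total = counters.get("item.statements"); // hypothetical key
        double count = counters.get("item.count");      // hypothetical key
        return count == 0 ? 0.0 : total / count;
    }

    // Defensive variant: treat a missing counter as zero.
    static double safeAverage(Map<String, Double> counters) {
        double total = counters.getOrDefault("item.statements", 0.0);
        double count = counters.getOrDefault("item.count", 0.0);
        return count == 0 ? 0.0 : total / count;
    }

    public static void main(String[] args) {
        // Same contents as the jdb output: only the two ".avg" entries exist.
        Map<String, Double> counters = new HashMap<>();
        counters.put("property.statements.avg", 0.0);
        counters.put("item.statements.avg", 0.0);

        System.out.println("safe average: " + safeAverage(counters));
        try {
            unsafeAverage(counters);
        } catch (NullPointerException e) {
            System.out.println("unsafe average threw NullPointerException");
        }
    }
}
```

A defensive default would only hide the symptom. If no item.* counter is ever incremented, the more interesting question is why the dump scan produced no items in the first place.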
Change 677087 had a related patch set uploaded (by Elukey; author: Elukey):
[operations/puppet@production] statistics::wmde::graphite: disable wmde-toolkit-analyzer-build timer
Change 677087 merged by Elukey:
[operations/puppet@production] statistics::wmde::graphite: disable wmde-toolkit-analyzer-build timer
I have removed the systemd timer for the time being, since it was constantly failing and causing alerts.
The analyzer had apparently been broken even before these JAR / NPE issues; see T218711: Regular wikidata JSON dump scanning broken on analytics machine.