
Some toolinfo.json imports being treated as changed on every crawler run?
Closed, Resolved | Public | Bug Report

Description

A number of tools, in both my local dev instance and the other deployments, are given a new revision on every crawler run, yet the diffs between those revisions are empty. One example is http://localhost:8000/tools/mm_metamine/history.

Event Timeline

I think a None vs "" (empty string) mismatch is causing this: the stored value is None, the incoming value is an empty string, and the two are treated as different on every run.

{"@timestamp":"2021-10-13T21:36:34.865Z","log.level":"info","message":"mm_metamine: Updating repository to  (was None)","ecs":{"version":"1.7.0"},"log":{"logger":"toolhub.apps.toolinfo.models","origin":{"file":{"line":315,"name":"models.py"},"function":"from_toolinfo"}},"process":{"name":"MainProcess","pid":138,"thread":{"id":140610652100416,"name":"MainThread"}},"trace":{"id":"none"}}
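To illustrate the suspected failure mode, here is a minimal sketch of how a None vs "" comparison can produce a "change" with an empty diff on every run, and one way to guard against it. This is not the code from the Toolhub repository or from the Gerrit change; `normalize_empty` and `has_changed` are hypothetical helper names, and the `repository` key is borrowed from the log message above purely for illustration.

```python
from typing import Any


def normalize_empty(value: Any) -> Any:
    """Collapse None and "" into a single 'no value' marker (None).

    Hypothetical helper: assumes that, for comparison purposes,
    a missing value and an empty string mean the same thing.
    """
    if value is None or value == "":
        return None
    return value


def has_changed(old: Any, new: Any) -> bool:
    """Return True only when the values differ after normalization."""
    return normalize_empty(old) != normalize_empty(new)


# Stored record has no repository; the incoming toolinfo has an empty string.
stored = {"repository": None}
incoming = {"repository": ""}

# A naive comparison reports a change on every run.
naive = stored["repository"] != incoming["repository"]

# The guarded comparison treats None and "" as equivalent.
guarded = has_changed(stored["repository"], incoming["repository"])
```

With a guard of this kind, a record would only be saved (and a new revision created) when a field differs in a meaningful way, which is the behavior the empty diffs suggest is missing.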
bd808 changed the task status from Open to In Progress. Oct 13 2021, 9:52 PM
bd808 claimed this task.
bd808 triaged this task as Medium priority.
bd808 moved this task from Research needed to In Progress on the Toolhub board.

Change 730649 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[wikimedia/toolhub@main] crawler: Guard against empty string/null diff loops

https://gerrit.wikimedia.org/r/730649

Change 730649 merged by jenkins-bot:

[wikimedia/toolhub@main] crawler: Guard against empty string/null diff loops

https://gerrit.wikimedia.org/r/730649

Change 730662 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/deployment-charts@master] toolhub: Bump container version to 2021-10-13-231209-production

https://gerrit.wikimedia.org/r/730662

bd808 changed the subtype of this task from "Task" to "Bug Report". Oct 13 2021, 11:27 PM

Change 730662 merged by jenkins-bot:

[operations/deployment-charts@master] toolhub: Bump container version to 2021-10-13-231209-production

https://gerrit.wikimedia.org/r/730662