
Reject crawler updates from multiple urls in same run
Closed, Resolved, Public

Description

The name field of a toolinfo record is intended to be universally unique, and the Toolhub database enforces this constraint. We also enforce that the record for a given name is only changed by the crawler. However, it is currently possible for multiple crawled URLs to include records with the same name. This leads to undesired "edit warring" between the URLs, where each URL's record overwrites the other's.

It should be possible for a tool's toolinfo to migrate from one URL to another (e.g. from self-hosted to Striker, or vice versa), but it should only be present in one URL at a time.
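A minimal sketch of the rule described above, not the code from the actual patch. It assumes the crawler has already collected the records from every URL in one run, and it rejects every update for a name that shows up under more than one URL. The function name `partition_by_name`, the shape of the `crawled` mapping, and the example URLs are all hypothetical.

```python
from collections import defaultdict


def partition_by_name(crawled):
    """Split one crawler run's records into accepted and rejected names.

    `crawled` is a hypothetical mapping of source URL -> list of toolinfo
    records (dicts with a "name" key) found at that URL during this run.
    """
    # First pass: note every URL that claims each name in this run.
    urls_by_name = defaultdict(set)
    for url, records in crawled.items():
        for record in records:
            urls_by_name[record["name"]].add(url)

    # Second pass: accept names claimed by exactly one URL and reject
    # names claimed by several, so no URL can overwrite another's record.
    accepted = {}
    rejected = {}
    for url, records in crawled.items():
        for record in records:
            name = record["name"]
            if len(urls_by_name[name]) > 1:
                rejected[name] = sorted(urls_by_name[name])
            else:
                accepted[name] = (url, record)
    return accepted, rejected


run = {
    "https://example.org/a/toolinfo.json": [
        {"name": "tool-one"},
        {"name": "shared-tool"},
    ],
    "https://example.org/b/toolinfo.json": [
        {"name": "shared-tool"},
    ],
}
accepted, rejected = partition_by_name(run)
```

Under these assumptions, migration still works: a name that moves from one URL to another appears under only one URL in any given run, so it is accepted. Only names present in several URLs during the same run are rejected.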

Event Timeline

Change 674099 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[wikimedia/toolhub@main] crawler: reject updates from multiple urls in single run

https://gerrit.wikimedia.org/r/674099

Change 674099 merged by jenkins-bot:
[wikimedia/toolhub@main] crawler: reject updates from multiple urls in single run

https://gerrit.wikimedia.org/r/674099