
IABot sometimes corrupts data due to an undetected edit conflict
Open, Medium | Public | BUG REPORT

Description

The background to this is on this enwiki VPP thread

Steps to replicate the issue (include links if applicable):

  1. Set the moon to the required (but as of yet, uncertain) phase
  2. Run IABot via the interface at https://iabot.wmcloud.org/index.php?page=runbotsingle.
  3. Continue to edit the article you attempted to run IABot on
  4. Wait a few hours
  5. Observe that your edits in the previous step have been reverted

What happens?:
Under certain (unknown) circumstances, the web interface fails after a substantial delay, returning a 50x-series error. It appears, however, that a job has been queued which will run at some indeterminate time in the future. When it runs, it edits the page content it grabbed earlier, which is no longer the current revision. When it writes the updated page back to the wiki, it fails to notice the edit conflict and overwrites the intervening changes.

Example: https://en.wikipedia.org/w/index.php?diff=1179698559

What should have happened instead?:
It should have noticed the edit conflict and either aborted the write or grabbed a fresh copy of the page.

Note that there is a suggestion in the cited thread that IABot is passing the correct information to the wiki API to detect the edit conflict and that it's the MediaWiki software that's failing. I don't know if that's true or not, but either way, I would classify this as a high priority issue because it causes silent data corruption. In my case, I didn't notice the problem until I had made some additional edits after IABot's revision, so it was too late for undo. Fortunately, the volume of changes which needed to be replayed manually was fairly small, so it wasn't a huge problem. Depending on the timing, however, a far greater range of revisions could be overwritten and a far greater number of post-IABot changes could have been made, making recovery substantially more difficult.
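
To make the expected behavior concrete, here is a minimal Python sketch of "abort the write or grab a fresh copy". It is not IABot's code (IABot's actual save path is not shown in this report). The names `EditConflict`, `save_with_conflict_handling`, `fetch_page`, `transform` and `save_page` are all hypothetical, and the sketch assumes the save step can report a conflict as an exception.

```python
class EditConflict(Exception):
    """Hypothetical signal that the wiki rejected a save as an edit conflict."""


def save_with_conflict_handling(fetch_page, transform, save_page, max_attempts=3):
    """Apply `transform` to the current page text and save it.

    fetch_page() -> (revid, text)     # hypothetical: reads the latest revision
    transform(text) -> new_text       # e.g. the bot's link-rescue pass
    save_page(new_text, base_revid)   # hypothetical: raises EditConflict if the
                                      # page changed since base_revid

    Returns "saved", "unchanged", or "aborted".
    """
    for _ in range(max_attempts):
        # Always start from a fresh copy, never from text cached when the job was queued.
        base_revid, text = fetch_page()
        new_text = transform(text)
        if new_text == text:
            return "unchanged"
        try:
            save_page(new_text, base_revid)
            return "saved"
        except EditConflict:
            # Someone edited in the meantime: loop to refetch and redo the work.
            continue
    # Give up rather than overwrite intervening edits.
    return "aborted"
```

The point of the design is that the transformation is recomputed from the latest revision on every attempt, so a long-delayed job never writes back stale text.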

Software version

  • IABot Management Interface (v2.0.9.5; checkIfDead v1.8.3.7)

Other information (browser name/version, screenshots, etc.):

  • Google Chrome Version 117.0.5938.149 (Official Build) (x86_64)
  • macOS Monterey version 12.6.1 (21G217)

Event Timeline

This could be resolved by passing the baserevid param instead of basetimestamp. Self-conflicts are ignored only in case of basetimestamp.
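
As a rough illustration of that suggestion, the following Python sketch builds the parameters for a MediaWiki `action=edit` request using `baserevid` and checks the response for the `editconflict` error code. The helper names `build_edit_params` and `is_edit_conflict` are hypothetical, and this is not IABot's actual implementation; it only assumes the caller already has the revision ID it read and a CSRF token.

```python
def build_edit_params(title, new_text, base_revid, csrf_token):
    """Build POST parameters for a MediaWiki action=edit request.

    The revision the text was derived from is passed as `baserevid`;
    `basetimestamp` is deliberately left out.
    """
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": new_text,
        "baserevid": str(base_revid),
        "token": csrf_token,
    }


def is_edit_conflict(response):
    """Return True if a decoded JSON API response reports an edit conflict."""
    return response.get("error", {}).get("code") == "editconflict"


# Usage: the resulting dict would be sent as the POST body to the wiki's api.php.
params = build_edit_params("Example article", "updated wikitext", 1179698559, "token+\\")
```

On a conflict response, the caller would then either abort or refetch the page and try again, as requested in the task description.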

I don't recall this ever existing before. Was this recently introduced?

Yes, compared to the other API params. It was added in March 2020. Use of basetimestamp nowadays is almost-but-not-quite deprecated.

That's interesting. That may be simple enough to implement, I think.

Harej triaged this task as Medium priority. Oct 25 2023, 8:46 PM
Harej moved this task from Inbox to Backlog: Editing and Saving on the InternetArchiveBot board.

@Harej, I disagree with prioritizing this as "medium". This involves silent data loss. That should never be a medium priority. It also sounds like the fix has already been identified (use baserevid instead of basetimestamp) and that fix is probably trivial to implement. So why toss it on the medium heap, which basically means it'll never get fixed?

While it's a hazardous issue, it's a fairly uncommon one. Prioritization is based on frequency and level of service disruption.