
InternetArchiveBot fixes links that still work (when the link is to a PDF file)
Closed, Invalid
Public


I have received several reports from users on srwiki about InternetArchiveBot incorrectly fixing links.

The bot marks links to PDF files as dead (even though the URLs still work) and adds a link to the web archive. If a URL works, the bot must not mark it as dead just because it points to a PDF file.

Event Timeline

Restricted Application assigned this task to Cyberpower678. Sep 22 2018, 9:48 PM
Restricted Application added a subscriber: Petar.petkovic.
Zoranzoki21 triaged this task as High priority. Sep 27 2018, 9:26 PM

This is happening on srwiki. I am not sure about other projects. I keep getting more reports from users, so I have set the priority to High.

I guess it would be useful to provide specific cases where the problem has happened.

Zoranzoki21 added a comment. Edited Sep 27 2018, 9:31 PM

I guess it would be useful to provide specific cases where the problem has happened.

Sure, I will.

One example of such a change: Европско_првенство_у_атлетици_на_отвореном_2016_—_1.500_метара_за_жене&diff=20914916&oldid=20888954
I reported it as a false positive, but it looks like the report was ignored.

Cirdan lowered the priority of this task from High to Normal. Oct 14 2018, 6:52 AM
Cirdan added a subscriber: Cirdan.

I don't yet understand how this is specific to PDF files? (To me this seems like a case of a false positive. As with any automated system, the dead link check sometimes yields wrong results which need to be fixed manually.)

I see no malfunction here. You reported an edit where the bot converted the archive URLs, which were already there, into full and proper citations using citation templates.

Cyberpower678 closed this task as Invalid. Edited Mar 7 2019, 9:19 PM

So there is no malfunction here. But I will note a number of things I am seeing.

  1. The URLs the bot reformatted are alive, and the bot considers them as alive.
  2. The URLs were archive URLs of the original prior to the bot's edit. Not sure why this was the case in this instance, but IABot did not replace a live original link with an archive.
  3. The bot took the URLs and turned them into complete citation templates. The bot is currently instructed to convert plain URLs within references into full citations, and it is doing just that. It has left the archive URL in the archive-url parameter and placed the original one in the url parameter. It has also added a dead-url parameter that is set to no.
  4. If you don't want the bot to convert plain URLs to citation templates, administrators have the power to adjust the bot's configuration. This should only be done with consensus.
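
To illustrate the conversion described in point 3, here is a minimal Python sketch. It is not IABot's actual code. The function name `archive_url_to_citation`, the `cite web` template name, and the example URL are assumptions made for illustration, and srwiki may use different template names. The sketch only relies on the common Wayback Machine URL form `https://web.archive.org/web/<timestamp>/<original URL>` and ignores variants such as shorter timestamps.

```python
import re

# Common Wayback Machine form:
# https://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def archive_url_to_citation(archive_url, template="cite web"):
    """Build a citation template from a plain Wayback URL.

    Hypothetical helper: returns None when the input is not a
    Wayback URL in the form matched above.
    """
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        return None
    timestamp, original_url = match.groups()
    # The first eight digits of the timestamp are YYYYMMDD.
    archive_date = f"{timestamp[0:4]}-{timestamp[4:6]}-{timestamp[6:8]}"
    return (
        "{{" + template
        + f" |url={original_url}"          # original link goes in url
        + f" |archive-url={archive_url}"   # archive link is kept
        + f" |archive-date={archive_date}"
        + " |dead-url=no}}"                # the original is still alive
    )


print(archive_url_to_citation(
    "https://web.archive.org/web/20160710000000/http://example.org/results.pdf"
))
```

The point of the sketch is that the original link ends up in `url` and is marked as alive with `dead-url=no`, so the live link is not replaced by the archive copy.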