
IABot marking links dead when not dead
Closed, Resolved · Public

Description

Diff

https://en.wikipedia.org/w/index.php?title=Global_catastrophic_risk&type=revision&diff=891329064&oldid=891196083

Three links were saved. The first two are still live. The database had them in the Live state, but this diff switched them over to dead. I set the database back to Live. Something might not be right with the dead link checker. I will keep this ticket open for a bit while researching more diffs.

Event Timeline

Restricted Application added a subscriber: Cyberpower678.
Cirdan subscribed.

False positives and false negatives are expected to occur regularly. Unless these mistakes clearly point to an issue with the dead link checker, there is nothing that can be done here except correcting the links in the database.

(Feel free to re-open if you think that you found a bug in the dead link checker.)

Please do not close this ticket. The only way we can track this issue is by posting examples. I understand there will always be false positives, just not in high quantity.

I don't know how to emphasize this, but the bot is making a lot of mistakes: it is marking links dead that are not dead, and doing so frequently. One only needs to go through, click on the links, and look at them. It's a frequent problem. Each of these mistakes should be looked into.

Another example:

https://en.wikipedia.org/w/index.php?title=Aleksandra_Maltsevskaya&diff=prev&oldid=896737057

This link is not dead. A basic header check shows status 200:

HTTP/1.1 200 OK
Server: nginx
Date: Sun, 12 May 2019 15:20:49 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
X-Pingback: http://rostovoblchess.ru/xmlrpc.php
Link: <https://rostovoblchess.ru/wp-json/>; rel="https://api.w.org/"
Link: <https://rostovoblchess.ru/?p=44579>; rel=shortlink
Vary: Accept-Encoding
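
For anyone who wants to repeat a header check like the one above on other examples, here is a minimal sketch in Python. It is not how IABot or the CheckIfDead tool decide anything; the function names (`parse_status`, `looks_live`, `head_status`) are hypothetical, and treating any 2xx or 3xx status as "live" is an assumption made only for illustration.

```python
import urllib.error
import urllib.request


def parse_status(raw_headers: str) -> int:
    """Return the status code from the first line of a raw header dump."""
    lines = raw_headers.strip().splitlines()
    if not lines:
        raise ValueError("empty header dump")
    # A status line looks like "HTTP/1.1 200 OK": version, code, reason.
    parts = lines[0].split(maxsplit=2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {lines[0]!r}")
    return int(parts[1])


def looks_live(status: int) -> bool:
    """Assumption for this sketch: 2xx and 3xx responses count as live."""
    return 200 <= status < 400


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the final status code.

    urllib follows redirects, so this is the status of the final response.
    Connection-level failures (DNS, timeout) raise urllib.error.URLError
    and are deliberately not caught here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; the code is still useful.
        return err.code
```

Pasting the header dump above into `parse_status` gives 200, which `looks_live` accepts. A single 200 only shows that the server answered at that moment; it says nothing about what the bot saw when it made its own check.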

The URL was not marked dead in the database, and the domain has no global live state set.

If it were an occasional problem, who cares? But it is frequent.

Cirdan changed the task status from Open to Stalled. May 23 2019, 5:07 PM

What do you suggest?

If this is about detecting links as live/dead, this is probably a topic for the CheckIfDead tool rather than the bot?

Is this still happening? I'm thinking that whatever it was, it was just a transient issue. The URLs you reset haven't gone back to dead, and they've been checked regularly since they were reset.

I'm classifying this as a transient issue that no longer applies.