sul-swap-prod.stanford.edu
Closed, Resolved (Public)

Description

Diff (line #219):

https://en.wikipedia.org/w/index.php?title=Israel_lobby_in_the_United_States&action=history

It swapped a working archiveurl at the Stanford archive for a non-working archive at Wayback.

Is that because it doesn't recognize sul-swap-prod.stanford.edu as an archive? That archive was only recently discovered.

Event Timeline

Another question with the diff: It rescued 17 links, but IABot ran 9 days prior when it rescued 7 sources. Wouldn't it rescue all sources (17 + 7) at the same time, or is there a difference in how IABot runs in auto vs manual mode?

I'll add it to the list.

Do you remember the checks it runs on dead links? 3 checks in a row, spaced at least 3 days apart?

I thought maybe the dead link checker (DLC) was a separate process running ahead of IABot, with IABot following behind by about 30 days. But it sounds like IABot feeds links to the DLC, and when IABot processes a page a second time, whatever updates the DLC added to the database are applied. Is that kind of how it works?

Actually the DB IABot relies on uses states, stored as ints. All newly encountered URLs have a state of 4, meaning unknown. After the first check, it is either set to state 3, meaning alive, or to 2, meaning dying. If it fails the second check, the state drops by 1. If it hits 0, it's considered dead, and the bot no longer checks the URL.
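
To make the state transitions described above concrete, here is a minimal Python sketch. It is not IABot's actual code: the function and constant names are hypothetical, and two points are assumptions that the description does not spell out. First, every failed check after the first one is assumed to drop the state by 1. Second, a passing check on a URL that is not yet dead is assumed to set it back to 3 (alive).

```python
# Illustrative sketch only; names are hypothetical, not taken from IABot.
UNKNOWN = 4  # newly encountered URL
ALIVE = 3    # last check passed
DYING = 2    # first failed check
DEAD = 0     # considered dead; no longer checked


def next_state(state: int, check_passed: bool) -> int:
    """Return the new state of a URL after one liveness check."""
    if state == DEAD:
        # Dead URLs are never re-checked, so the state stays at 0.
        return DEAD
    if check_passed:
        # Assumption: a passing check resets the URL to alive.
        return ALIVE
    if state == UNKNOWN:
        # First check failed: unknown goes straight to dying.
        return DYING
    # Assumption: each further failed check drops the state by 1.
    return state - 1


def run_checks(results, state: int = UNKNOWN) -> int:
    """Apply a sequence of check results (True = passed) to a URL."""
    for passed in results:
        state = next_state(state, passed)
    return state
```

Under these assumptions a new URL needs three consecutive failed checks to be marked dead (4 to 2, 2 to 1, 1 to 0), and so does a URL previously marked alive (3 to 2, 2 to 1, 1 to 0). A page processed before the third failure would therefore not yet have that link rescued, which would be consistent with a later run rescuing more links than an earlier one.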

Fixed in final release.