Avoid adding duplicate archive links when rescuing dead links with IABot
Closed, Resolved · Public · 5 Story Points

Description

If a raw reference (that doesn't use a citation template) already includes an archive URL, IABot will sometimes add a duplicate copy of the archive link:
https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&type=revision&diff=729652760&oldid=723647767

We should figure out some way to avoid this.
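
One possible way to check for this, sketched in Python purely as an illustration: before adding an archive link, scan the raw reference text for an existing Wayback Machine snapshot of the same URL. This is not IABot's actual code. The names `WAYBACK_RE`, `normalize`, and `has_existing_archive` are hypothetical, only web.archive.org links are handled, and the URL comparison is deliberately crude.

```python
import re

# Matches Wayback Machine snapshot URLs and captures the archived target URL.
# Assumption: only web.archive.org is considered; other archive services
# would need their own patterns.
WAYBACK_RE = re.compile(
    r"https?://web\.archive\.org/web/\d{1,14}(?:[a-z]{2}_)?/(?P<target>[^\s\]|<>}]+)",
    re.IGNORECASE,
)


def normalize(url: str) -> str:
    """Crudely normalize a URL for comparison (hypothetical helper).

    Drops the scheme, a leading "www.", and a trailing slash, then lowercases.
    Lowercasing the path is a simplification and can produce false matches.
    """
    url = re.sub(r"^https?://", "", url.strip(), flags=re.IGNORECASE)
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)
    return url.rstrip("/").lower()


def has_existing_archive(ref_text: str, dead_url: str) -> bool:
    """Return True if ref_text already contains a Wayback link for dead_url."""
    wanted = normalize(dead_url)
    return any(
        normalize(match.group("target")) == wanted
        for match in WAYBACK_RE.finditer(ref_text)
    )


# Usage: skip adding a new archive link when one is already present.
ref = (
    "[http://example.com/page Title]. "
    "[https://web.archive.org/web/20150101000000/http://example.com/page Archived] "
    "from the original."
)
if not has_existing_archive(ref, "http://example.com/page"):
    print("would add archive link")
```

Because the check works on arbitrary text rather than on template parameters, the same idea could in principle be applied to bare links outside of `<ref>` tags as well.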

kaldari created this task. · Jul 13 2016, 8:38 PM
Restricted Application added subscribers: Zppix, Aklapper. · Jul 13 2016, 8:38 PM
kaldari edited the task description. · Jul 13 2016, 8:39 PM
kaldari moved this task from Untriaged to Sprint planning/estimation on the Community-Tech board.
Cyberpower678 triaged this task as "Normal" priority.
DannyH set the point value for this task to 5. · Jul 14 2016, 5:14 PM
DannyH moved this task from Sprint planning/estimation to Backlog on the Community-Tech board.

This one is going to be tricky, but I have an idea for how to approach it intelligently. Done right, this method will even work outside of references.

I've put together a subroutine that should help the bot detect this, so now it's on to the testing phase. Wish me luck.

I've got it partially working. I need to fine-tune it.

https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&diff=next&oldid=729924113

Per ^, I would say the upgrade works like a charm. If there are any other examples I can test this on before I deploy, that would be great.

Cyberpower678 closed this task as "Resolved". · Jul 16 2016, 12:25 PM
kaldari moved this task from Backlog to Archive on the Community-Tech board. · Jul 19 2016, 5:04 PM