Avoid adding duplicate archive links when rescuing dead links with IABot
Closed, Resolved · Public · 5 Estimated Story Points

Description

If a raw reference (that doesn't use a citation template) already includes an archive URL, IABot will sometimes add a duplicate copy of the archive link:
https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&type=revision&diff=729652760&oldid=723647767

We should figure out some way to avoid this.

Event Timeline

kaldari created this task. Jul 13 2016, 8:38 PM
Restricted Application added subscribers: Zppix, Aklapper. Jul 13 2016, 8:38 PM
kaldari updated the task description. Jul 13 2016, 8:39 PM
DannyH set the point value for this task to 5. Jul 14 2016, 5:14 PM
DannyH moved this task from To Be Estimated/Discussed to Estimated on the Community-Tech board.

This one is going to be tricky, but I have an idea for how to approach it intelligently. Done right, this method will even work outside of references.

I put together a subroutine that should help the bot detect this, so now it's on to the testing phase. Wish me luck.
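
For illustration only, here is a minimal sketch of the kind of check described above: scan the text around a dead link for an existing archive snapshot of the same URL before adding a new one. This is not IABot's actual code (IABot is written in PHP). The names `WAYBACK_RE`, `normalize`, and `already_archived` are hypothetical, and the sketch assumes Wayback Machine snapshot URLs of the form `https://web.archive.org/web/<timestamp>/<original URL>`; other archive services would need their own patterns.

```python
import re

# Assumption: only Wayback Machine snapshot URLs are recognized here.
# The target URL ends at whitespace or at common wikitext delimiters.
WAYBACK_RE = re.compile(
    r"https?://web\.archive\.org/web/\d{1,14}/(?P<target>[^\s\]|}<]+)"
)


def normalize(url: str) -> str:
    """Loosely normalize a URL: drop scheme, leading 'www.', trailing '/', and case.

    Lowercasing the path is a simplification that can conflate distinct URLs.
    """
    url = re.sub(r"^https?://", "", url.strip(), flags=re.IGNORECASE)
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)
    return url.rstrip("/").lower()


def already_archived(reference_text: str, dead_url: str) -> bool:
    """Return True if reference_text already links to a snapshot of dead_url."""
    wanted = normalize(dead_url)
    return any(
        normalize(match.group("target")) == wanted
        for match in WAYBACK_RE.finditer(reference_text)
    )
```

A bot following this sketch would call `already_archived(text, url)` on the raw reference (or on the text surrounding a bare link) and skip the rescue when it returns `True`. Because the check works on plain text, it does not depend on the link sitting inside a reference.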

I've got it partially working. I need to fine-tune it.

https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&diff=next&oldid=729924113

Per ^, I would say the upgrade works like a charm. If there are any other examples I can test this on before I deploy, that would be great.

Cyberpower678 closed this task as Resolved. Jul 16 2016, 12:25 PM
kaldari moved this task from Estimated to Archive on the Community-Tech board. Jul 19 2016, 5:04 PM