
Avoid adding duplicate archive links when rescuing dead links with IABot
Closed, Resolved · Public · 5 Estimated Story Points

Description

If a raw reference (that doesn't use a citation template) already includes an archive URL, IABot will sometimes add a duplicate copy of the archive link:
https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&type=revision&diff=729652760&oldid=723647767

We should figure out some way to avoid this.
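One conceivable check, sketched in Python purely as an illustration: before adding an archive link, scan the text around the dead URL for an existing Wayback Machine snapshot of that same URL. This is not IABot's actual code (the subroutine mentioned in this task is not shown here). The names `WAYBACK_RE`, `normalize`, and `already_archived` are hypothetical, and the sketch assumes only `web.archive.org` snapshot URLs need to be recognized.

```python
import re

# Hypothetical pattern: matches Wayback Machine snapshot URLs only.
# Group 1 is the snapshot timestamp, group 2 is the original (archived) URL.
WAYBACK_RE = re.compile(
    r"https?://web\.archive\.org/web/(\d{1,14})[a-z_]*/(\S+?)(?=[\s\]|}<]|$)",
    re.IGNORECASE,
)


def normalize(url):
    """Strip scheme, leading 'www.', and trailing slash so variants compare equal."""
    url = url.strip()
    url = re.sub(r"^https?://", "", url, flags=re.IGNORECASE)
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)
    return url.rstrip("/")


def already_archived(ref_text, dead_url):
    """Return True if ref_text already contains a snapshot link for dead_url."""
    target = normalize(dead_url)
    return any(
        normalize(match.group(2)) == target
        for match in WAYBACK_RE.finditer(ref_text)
    )


# Illustrative raw reference (made-up URLs) that already carries an archive link.
ref = (
    "[http://example.com/page Title] "
    "[https://web.archive.org/web/20160101000000/http://example.com/page Archived]"
)
print(already_archived(ref, "http://www.example.com/page/"))
print(already_archived(ref, "http://example.com/other"))
```

Because the check works on plain text rather than on parsed citation templates, the same idea could in principle apply to bare links outside `<ref>` tags, provided the bot decides how much surrounding text counts as "nearby".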

Event Timeline

DannyH set the point value for this task to 5. Jul 14 2016, 5:14 PM
DannyH moved this task from Needs Discussion to Up Next (June 3-21) on the Community-Tech board.

This one is going to be tricky, but I have an idea for how to approach it intelligently. Done right, this method will even work outside of references.

I fashioned up a subroutine that should help the bot to detect this, so now it's on to the testing phase. Wish me luck.

I've got it partially working. Need to fine tune it.

https://en.wikipedia.org/w/index.php?title=Tom_Monaghan&diff=next&oldid=729924113

Per ^, I would say the upgrade works like a charm. If there are any other examples I can test this on before I deploy, that would be great.