IABot replacing plus sign with space (and thus breaking archiveurls)
Closed, Invalid · Public

Description

IABot has started replacing the + character with %20 (a percent-encoded space) in archiveurls. These are not equivalent characters, and in some cases the replacement makes the archiveurl non-functional:

https://en.wikipedia.org/w/index.php?title=Yawkey_(MBTA_station)&curid=13271384&diff=780023300&oldid=757084321
https://en.wikipedia.org/w/index.php?title=Porter_(MBTA_station)&curid=966328&diff=780004334&oldid=766578010

Event Timeline

Restricted Application added subscribers: Elisfkc, Aklapper.

Actually, they are equivalent characters: both represent a space in query strings. This is an issue with the Wayback Machine, and the devs are looking into it.
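
The equivalence described here holds under form-encoding rules for query strings, but not under plain percent-decoding. The following is a minimal sketch using Python's standard `urllib.parse`. The example URLs and the `query_params` helper are hypothetical, chosen only for illustration, and the code does not reproduce what IABot or the Wayback Machine actually do internally.

```python
from urllib.parse import parse_qs, unquote, unquote_plus, urlsplit

# Hypothetical URLs for illustration; not taken from the affected articles.
plus_url = "http://example.com/search?q=red+line"
pct_url = "http://example.com/search?q=red%20line"


def query_params(url):
    """Decode a URL's query string using form-encoding rules ('+' means space)."""
    return parse_qs(urlsplit(url).query)


# Under form-encoding rules, both spellings decode to the same parameters.
same_under_form_rules = query_params(plus_url) == query_params(pct_url)

# Plain percent-decoding leaves '+' untouched, so a consumer that skips the
# form-encoding step treats the two spellings as different strings.
plain_plus = unquote("red+line")
plain_pct = unquote("red%20line")

# unquote_plus applies the form-encoding convention and maps '+' to a space.
form_plus = unquote_plus("red+line")
```

If a server matches stored URLs as raw strings, or decodes them without the form-encoding step, then swapping + for %20 can change which record is looked up even though the two are interchangeable to a form-aware parser. That would be consistent with the breakage reported in this task, though the actual cause on the Wayback side is not stated here.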

Okay, good to know. Could the + --> %20 swap be temporarily disabled until the Wayback issue is fixed?

If it's causing too many issues, https://en.wikipedia.org/wiki/User:InternetArchiveBot/Dead-links.js can be adjusted so the "convert_archives" parameter is set to 0, disabling archive conversion.

Otherwise, I'd say just leave it.