Migrate dead external links to archives
Closed, Resolved (Public)

Description

This card tracks a top 10 wish from the Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey

Original proposal: Most external links have an average lifespan of about 7 years before they go dead. As Wikipedia ages, the dead external links problem grows exponentially. Internet Archive has partnered with Wikipedia to ensure all new external links have a Wayback cache. However, there has been no formal process for adding the Wayback links to Wikipedia (via the cite web |archiveurl=.. feature, for example). There have been attempts to automate this with various bots (see en:WP:Link rot), but the coding is non-trivial and multiple volunteer efforts have stalled. What will likely be required is a team of programmers working full-time, something that is beyond the scope of a few volunteers working in their spare time. It's the sort of coding work that MediaWiki could sponsor, and it would make a big difference in the quality of content, impacting every article. -- Green Cardamom (talk) 19:27, 7 November 2015 (UTC)
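
For illustration only, the step the proposal describes (looking up a Wayback copy and recording it through the cite web |archiveurl= parameter) might look roughly like the sketch below. This is not how Cyberbot II or InternetArchiveBot is implemented. It assumes the Internet Archive's public Wayback Availability endpoint and its JSON response shape, and the function names, the |archivedate= parameter, and the naive string handling of the template are all hypothetical choices made for this example. The code makes no network request; it only builds the query URL and processes an already decoded response.

```python
from urllib.parse import urlencode

# Public Wayback Availability endpoint (assumed here; not named in the task).
WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the lookup URL for a possibly dead link (hypothetical helper)."""
    params = {"url": url}
    if timestamp:
        # Ask for the snapshot closest to this date, format YYYYMMDD[hhmmss].
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def archive_params(response):
    """Return (archive_url, archive_date) from a decoded response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    ts = closest["timestamp"]  # e.g. "20151107192700"
    return closest["url"], f"{ts[0:4]}-{ts[4:6]}-{ts[6:8]}"


def add_archive(cite, response):
    """Append |archiveurl= and |archivedate= to a {{cite web}} call.

    Naive on purpose: the text is left unchanged when no snapshot exists,
    when an archive link is already present, or when the text does not
    end with the closing braces of a template.
    """
    found = archive_params(response)
    stripped = cite.rstrip()
    if found is None or "archiveurl" in cite or not stripped.endswith("}}"):
        return cite
    archive_url, archive_date = found
    body = stripped[:-2].rstrip()
    return f"{body} |archiveurl={archive_url} |archivedate={archive_date}}}}}"
```

A real bot would also need to confirm that the original link is actually dead, parse templates properly rather than by string matching, and cope with each wiki's own citation templates, which is the integration risk noted in the assessment below.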

Community Tech preliminary assessment

Support: High. Dead reference links hurt our projects' reliability and verifiability, and connecting dead links with an archive supports the usefulness of our content. There were some dissents in the voting phase, pointing out that it's better when humans find the appropriate alternative links, rather than a bot that might not choose the right one.

Impact: High. Improving the quality of citations helps readers as well as contributors. There are some bots currently running on English, French and Spanish Wikipedias. We want to help build solutions that can be adapted to every language.

Feasibility: High. Cyberbot II is currently active on English Wikipedia, and Elvisor on Spanish Wikipedia. Cyberpower678's work on Cyberbot is being supported by The Wikipedia Library and the Internet Archive. There is obviously good work being done here, and we can figure out how to best support it, and help it to scale globally.

Risk: Low. Cyberbot II is running on English Wikipedia, with no major issues encountered. It may be challenging to integrate with other wikis’ citation templates.

Project page: https://meta.wikimedia.org/wiki/Community_Tech/Migrate_dead_external_links_to_archives

InternetArchiveBot code: https://github.com/cyberpower678/Cyberbot_II

Event Timeline

Wow, I can't help but reminisce about how crude and ugly the code was a year ago. Now IABot is really close to being completed for the English Wikipedia. The bot has definitely come very far in such a short time.

kaldari updated the task description.
Cyberpower678 changed the task status from Open to Stalled. Aug 20 2016, 12:04 PM
Cyberpower678 changed the task status from Stalled to Open. Sep 4 2016, 12:31 AM

Closing this ticket, because Community Tech has finished working on this project. Well done, everyone! :)

Okay, sorry about that -- yes, this is still open for your development.

There's a new ticket for Automatic links: T153354: Automatic archive for new external links

@Cyberpower678: I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!). Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task. Please claim this task again when you realistically plan to work on it (via Add Action...Assign / Claim in the dropdown menu). Thanks for your understanding!

@Aklapper This ticket should probably be closed. IABot is very advanced and works across a couple dozen languages, with more in the works. It basically has a team of people working on it, with its own Phab tickets and so on, so it is unclear what this ticket would be for now. (I wrote the original post quoted in the ticket in 2015.)

Cyberpower678 claimed this task.