
Attempt to get original source URL from archive.org URL or other metadata
Closed, Duplicate · Public · 8 Estimated Story Points

Description

When given an archive.org URL, it'd be nice if citoid attempted to extract the original source URL and put the archive.org URL in the archive URL parameter.

[Presumably total wishlist/long term.]
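A minimal sketch of the URL-splitting part of this request, assuming the common Wayback Machine layout `https://web.archive.org/web/<timestamp>/<original URL>`. The names `split_wayback_url` and `ArchivedCitation` are hypothetical and are not part of citoid; the field names only loosely mirror citation template parameters.

```python
import re
from typing import NamedTuple, Optional

# Assumed URL shape: http(s)://web.archive.org/web/<timestamp>[xx_]/<original URL>
# where <timestamp> is up to 14 digits (YYYYMMDDhhmmss) and the optional
# two-letter flag such as "id_" selects a rendering mode.
WAYBACK_RE = re.compile(
    r"^https?://(?:web\.)?archive\.org/web/"
    r"(?P<timestamp>\d{1,14})(?:[a-z]{2}_)?/"
    r"(?P<original>.+)$",
    re.IGNORECASE,
)

SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


class ArchivedCitation(NamedTuple):
    """Hypothetical container for the three values a citation would need."""
    url: str           # original source URL
    archive_url: str   # the archive.org URL that was supplied
    archive_date: str  # YYYY-MM-DD when the timestamp is precise enough


def split_wayback_url(archive_url: str) -> Optional[ArchivedCitation]:
    """Return the original URL and archive date, or None if the URL does not match."""
    archive_url = archive_url.strip()
    match = WAYBACK_RE.match(archive_url)
    if match is None:
        return None

    original = match.group("original")
    # Snapshot URLs may embed the original without a scheme; default to http.
    if not SCHEME_RE.match(original):
        original = "http://" + original

    ts = match.group("timestamp")
    # Only build a full date when year, month, and day are all present.
    date = f"{ts[0:4]}-{ts[4:6]}-{ts[6:8]}" if len(ts) >= 8 else ts
    return ArchivedCitation(original, archive_url, date)
```

Calling `split_wayback_url` on a non-archive URL returns `None`, so a caller can fall back to its normal handling of the URL.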

Event Timeline

LuisVilla created this task. · Apr 8 2015, 4:33 AM
LuisVilla raised the priority of this task to Needs Triage.
LuisVilla updated the task description.
LuisVilla added a project: Citoid.
LuisVilla added a subscriber: LuisVilla.
Restricted Application added a subscriber: Aklapper. · Apr 8 2015, 4:33 AM
Mvolz moved this task from Backlog to Site specific issues on the Citoid board. · Apr 8 2015, 8:49 AM
Mvolz moved this task from Site specific issues to IO Tasks on the Citoid board. · Apr 8 2015, 8:57 AM

Note that https://tools.wmflabs.org/refill/ has code to do the right thing here.

The relevant code is available here.

jayvdb added a subscriber: jayvdb. · Sep 3 2015, 1:56 AM

archive.org supports Memento (RFC 7089), so if Citoid can do a HEAD request on the URL, rich metadata can be obtained from the response headers.
I don't know if there is a Memento client for Node.js, but example code can be found in https://github.com/mementoweb/py-memento-client and probably https://github.com/mementoweb/mediawiki .
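As a rough illustration of the header-based approach, here is a sketch in Python (not Node.js, and not taken from either repository above). It assumes the archive answers a HEAD request with a `Memento-Datetime` header and a `Link` header containing a `rel="original"` entry, as RFC 7089 describes for mementos. The function names `parse_link_header`, `find_original`, and `fetch_memento_info` are hypothetical.

```python
import re
import urllib.request
from typing import Dict, List, Optional, Tuple

# One link-value: <URI> followed by zero or more ;name=value parameters.
# Quoted values may contain commas (e.g. datetime="Wed, 08 Apr 2015 ...").
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]*))*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')


def parse_link_header(value: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split a Link header value into (URI, parameters) pairs."""
    links = []
    for link in LINK_RE.finditer(value):
        params = {}
        for p in PARAM_RE.finditer(link.group(2)):
            quoted, bare = p.group(2), p.group(3)
            params[p.group(1).lower()] = quoted if quoted is not None else bare
        links.append((link.group(1), params))
    return links


def find_original(link_header: str) -> Optional[str]:
    """Return the URI whose rel values include "original", if any."""
    for uri, params in parse_link_header(link_header):
        # rel may hold several space-separated relation types.
        if "original" in params.get("rel", "").split():
            return uri
    return None


def fetch_memento_info(archive_url: str, timeout: float = 10.0) -> Dict[str, Optional[str]]:
    """Issue a HEAD request and read the Memento headers (requires network access)."""
    request = urllib.request.Request(archive_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        headers = response.headers
        # Several Link headers may be sent; join them into one value.
        link_value = ", ".join(headers.get_all("Link") or [])
        return {
            "original": find_original(link_value),
            "memento_datetime": headers.get("Memento-Datetime"),
        }
```

The parser uses a regular expression rather than splitting on commas because both URIs and the `datetime` parameter can contain commas.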

Mvolz added a subscriber: Mvolz. · Sep 3 2015, 7:13 AM

Thanks @jayvdb! If there's no suitable Node library, we can build one...
We've already done some of that with different types of metadata (see http://github.com/wikimedia/html-metadata ).

Ocaasi set Security to None. · Sep 3 2015, 6:52 PM
Ocaasi added a subscriber: Jdforrester-WMF.
Ocaasi added a subscriber: Ocaasi.
LuisVilla renamed this task from "Attempt to parse archive.org URLs?" to "Attempt to get original source URL from archive.org URL or other metadata". · Mar 27 2016, 6:15 AM
Jdforrester-WMF triaged this task as Low priority. · Apr 12 2016, 7:22 PM
Jdforrester-WMF set the point value for this task to 8.
Jdforrester-WMF moved this task from To Triage to Freezer on the VisualEditor board.
Mvolz moved this task from IO Tasks to Service on the Citoid board. · Oct 28 2016, 3:35 PM
Mvolz moved this task from Service to Zotero on the Citoid board. · Oct 28 2016, 3:41 PM