Attempt to get original source URL from archive.org URL or other metadata
Closed, Duplicate · Public · 8 Estimated Story Points

Assigned To

None

Authored By

	LuisVilla
	Apr 8 2015, 4:33 AM

Description

When given an archive.org URL, it'd be nice if citoid attempted to extract the original source URL and put the archive.org URL in the archive URL parameter.

[Presumably total wishlist/long term.]
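A minimal sketch of the URL-splitting part of this request, assuming Wayback Machine snapshot URLs follow the common `web.archive.org/web/<timestamp>/<original URL>` shape. The function name `parseWaybackUrl` and the `WaybackParts` type are hypothetical and are not part of citoid.

```typescript
// Hypothetical helper (not citoid code): split a Wayback Machine snapshot URL
// into the archive URL, the snapshot timestamp, and the original source URL.
interface WaybackParts {
  archiveUrl: string;  // the archive.org URL as given
  timestamp: string;   // snapshot timestamp digits, e.g. "20150408043300"
  originalUrl: string; // the URL that was archived
}

function parseWaybackUrl(url: string): WaybackParts | null {
  // Assumed pattern: http(s)://web.archive.org/web/<digits>[<modifier>_]/<original>
  // The optional two-letter modifier (such as "id_") is skipped.
  const pattern =
    /^https?:\/\/(?:web|wayback)\.archive\.org\/web\/(\d{1,14})(?:[a-z]{2}_)?\/(.+)$/i;
  const match = pattern.exec(url);
  if (match === null) {
    return null; // not a recognised snapshot URL
  }
  return { archiveUrl: url, timestamp: match[1], originalUrl: match[2] };
}
```

With a result like this, `originalUrl` would go in the citation's URL field and `archiveUrl` in the archive URL parameter. URLs that do not match the assumed pattern return `null` and would be handled as ordinary URLs.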

Event Timeline

LuisVilla created this task. · Apr 8 2015, 4:33 AM

LuisVilla raised the priority of this task from to Needs Triage.

LuisVilla updated the task description.

LuisVilla added a project: Citoid.

LuisVilla subscribed.

Restricted Application added a subscriber: Aklapper. · Apr 8 2015, 4:33 AM

Mvolz moved this task from Backlog to Site specific issues on the Citoid board. · Apr 8 2015, 8:49 AM

Mvolz moved this task from Site specific issues to IO Tasks on the Citoid board. · Apr 8 2015, 8:57 AM

Note that https://tools.wmflabs.org/refill/ has code to do the right thing here.

The relevant code is available here.

archive.org supports Memento (RFC 7089), so if Citoid can do a HEAD request on the URL, rich metadata can be obtained from the response headers.
I don't know if there is a Memento client for Node.js, but example code can be found in https://github.com/mementoweb/py-memento-client and probably https://github.com/mementoweb/mediawiki .
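As a rough illustration of the header-based approach: RFC 7089 has a memento response carry a `Link` header with an entry whose relation type is `original`. The sketch below only parses such a header string; the function name `findOriginalInLinkHeader` is hypothetical, the sample header is made up for illustration, and whether archive.org returns this header on a HEAD request for every snapshot is an assumption taken from the comment above.

```typescript
// Hypothetical helper (not an existing Memento client): return the target of
// the rel="original" entry in an HTTP Link header value, or null if absent.
function findOriginalInLinkHeader(linkHeader: string): string | null {
  // One link-value: <target> followed by zero or more ;name=value parameters.
  // Quoted values may contain commas (e.g. datetime="Wed, 08 Apr 2015 ...").
  const linkPattern =
    /<([^>]+)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)/g;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(linkHeader)) !== null) {
    const target = match[1];
    const params = match[2];
    const rel = /(?:^|;)\s*rel\s*=\s*(?:"([^"]*)"|([^;,\s]+))/i.exec(params);
    if (rel === null) {
      continue; // entry has no rel parameter
    }
    // rel may hold several space-separated relation types.
    const relTypes = (rel[1] || rel[2] || "").toLowerCase().split(/\s+/);
    if (relTypes.indexOf("original") !== -1) {
      return target;
    }
  }
  return null;
}

// Made-up sample header in the shape described by RFC 7089.
const sampleLink =
  '<http://example.com/>; rel="original", ' +
  '<http://web.archive.org/web/20150408043300/http://example.com/>; ' +
  'rel="memento"; datetime="Wed, 08 Apr 2015 04:33:00 GMT"';
const originalFromSample = findOriginalInLinkHeader(sampleLink);
```

In a service, the `linkHeader` string would come from the `Link` header of the HEAD response for the archive URL. The snapshot time is carried separately in the `Memento-Datetime` header.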

Thanks @jayvb! If there's no suitable Node library we can build one...
We've already done some of that with different types of metadata (see http://github.com/wikimedia/html-metadata).

Ocaasi set Security to None. · Sep 3 2015, 6:52 PM

Ocaasi added a subscriber: Jdforrester-WMF.

Ocaasi subscribed.

LuisVilla renamed this task from "Attempt to parse archive.org URLs?" to "Attempt to get original source URL from archive.org URL or other metadata". · Mar 27 2016, 6:15 AM

Restricted Application added projects: VisualEditor, Internet-Archive. · Mar 27 2016, 6:15 AM

Jdforrester-WMF triaged this task as Low priority. · Apr 12 2016, 7:22 PM

Jdforrester-WMF set the point value for this task to 8.

Jdforrester-WMF moved this task from To Triage to Freezer on the VisualEditor board.

Mvolz moved this task from IO Tasks to Service on the Citoid board. · Oct 28 2016, 3:35 PM

Mvolz moved this task from Service to Zotero on the Citoid board. · Oct 28 2016, 3:41 PM

Mvolz closed this task as a duplicate of T98680: Improve results for archive.org including wayback.archive.org. · Oct 28 2016, 3:44 PM
