
Global, better URL to citation conversion functionality
Closed, ResolvedPublic


Author: vladjohn2013


Suppose that all an editor needed to do to generate a perfect citation in Wikipedia was to provide a URL. That would be a tremendous step toward getting a much higher percentage of the text in Wikipedia articles supported by inline citations.

There are already expanders (for the English Wikipedia, at least) that will convert an ISBN, DOI, or PMID, supplied by an editor, into a full, correct citation (footnote). These are in the process of being incorporated into the reference dialog of the VisualEditor extension, making it almost trivial (two clicks, paste, two clicks) to insert a reference.

For web pages, however, the existing functionality seems to be limited to a Firefox add-on. Its limits, besides the obvious requirement to use that browser (and to install the add-on), include an inability to extract the author and date from even the most standard pages (e.g., New York Times), and the lack of integration with MediaWiki.

For a similar approach, using a different plug-in/program, see this Wikipedia page about Zotero.

A full URL-to-citation engine might build on the existing Cite4Wiki (Firefox add-on) code, plus (unless these exist elsewhere) source-specific parameter specifications. For example, the NYT uses <meta name="author" content="NICK BILTON" /> for its author information; that format would be known to the engine via a specifications database. Each Wikipedia community would be responsible for coding these specifications (except for a small starter set, provided as examples), in the same way that communities are responsible for TemplateData for the new VisualEditor extension.
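To make the specifications-database idea concrete, here is a minimal sketch in Python using only the standard library. The NYT_SPEC mapping and the extract_citation_fields function are hypothetical names invented for illustration; only the <meta name="author" content="NICK BILTON" /> format comes from the example above.

```python
# Hypothetical sketch of a per-source "specifications database": map each
# citation field to the <meta> tag name a given site uses, then pull those
# fields out of the page. Standard library only.
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.meta[attrs["name"]] = attrs["content"]

# Assumed spec entry for nytimes.com: citation field -> meta tag name.
# (The "pdate" name is an assumption, not confirmed NYT markup.)
NYT_SPEC = {"author": "author", "date": "pdate"}

def extract_citation_fields(html, spec):
    """Return {citation_field: value} for every field named in the spec."""
    parser = MetaExtractor()
    parser.feed(html)
    return {field: parser.meta.get(meta_name)
            for field, meta_name in spec.items()}

sample = ('<html><head><meta name="author" content="NICK BILTON" />'
          '<meta name="pdate" content="20131115" /></head></html>')
print(extract_citation_fields(sample, NYT_SPEC))
# → {'author': 'NICK BILTON', 'date': '20131115'}
```

A real engine would fetch the page, pick the spec by domain, and fall back to generic heuristics when no spec exists, but the core lookup is this simple.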

(Project idea suggested by John Broughton.)


Version: unspecified
Severity: enhancement



Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 2:34 AM
bzimport set Reference to bz57804.
bzimport added a subscriber: Unknown Object (MLST).

vladjohn2013 wrote:

This proposal has been listed at and we are filing a report to gather community feedback and share updates.

We are, in fact, investigating this in the context of VisualEditor, specifically the possibility of setting up a WMF web service that both VisualEditor and wikitext editors could use to quickly generate citations. It'd be ideal to be able to share code with Zotero, which already has a comprehensive suite of URL-to-citation translators.


This project was listed as a possible project for the FOSS Outreach Program for Women and I'd very much like to work on it.

I've written a very rough proposal here:

However, I would need a mentor to help nail down the specs of the project and then oversee it. I don't want to step on anyone's toes regarding the project Erik mentioned (a WMF web service to provide citations to both the visual editor and the wikitext editor) - it sounds like some of what I proposed might replicate that. I'd be happy to work on this from that angle as well, but I'm not sure what other people have done already.

Is there anyone on this thread who could mentor such a project? Thanks!


Marielle - I'd be happy to work with you on nailing down the specs, and I'd also be happy to assist with QA (testing). If you do decide to create a bot to clean up naked URLs in references/citations, I can also assist you with getting permission to run the bot at (per [[WP:BOTREQ]]).

However, I can't offer technical assistance - it's been far too many years since I did programming, in languages that aren't particularly useful today.

Finally, I note that the list of existing tools, at , omits a significant number of things.

See for a more comprehensive list, and also see .

John- thank you for the links, they were very helpful!

The project for an engine to provide citations to VE was approved for the May 2014 round of OPW. The extension is tentatively called although I'm not set on the name.

Since you mention NYTimes, they're proving to be quite problematic because they treat scrapers as logged out users (so scrapers get redirected to a log-in page), and they've also disallowed caching by Google so you can't use the cached page as a back-up solution. If anyone has any ideas let me know!
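One way a scraper might at least detect this failure mode is to compare the final URL, after following redirects, against the one it requested; a hop to a log-in page means the fetched HTML is not the article. This is a hedged sketch: the looks_like_login_redirect helper and the "login"/"myaccount" markers are assumptions for illustration, not confirmed NYT behaviour.

```python
# Sketch: detect that a fetch was redirected to a log-in page instead of
# the requested article, so the engine can skip or flag it rather than
# emit a citation for the wrong page. Standard library only.
from urllib.parse import urlparse

# Substrings that commonly appear in log-in page URLs (an assumption).
LOGIN_MARKERS = ("login", "signin", "myaccount", "auth")

def looks_like_login_redirect(requested_url, final_url):
    """Return True if the final URL appears to be a log-in page
    rather than the page originally requested."""
    if requested_url == final_url:
        return False
    parsed = urlparse(final_url)
    haystack = (parsed.netloc + parsed.path).lower()
    return any(marker in haystack for marker in LOGIN_MARKERS)

print(looks_like_login_redirect(
    "http://www.nytimes.com/2013/11/15/some-article.html",
    "https://myaccount.nytimes.com/auth/login"))
# → True
```

Detection doesn't solve the problem, of course; actually getting the article text would still need some authenticated or whitelisted access arrangement.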

Qgil claimed this task.
Qgil added a project: Citoid.

Resolving, then. Check Citoid for details.