
Page curation marking pages with citations as having no citations
Open, Needs Triage · Public

Description

Recently discovered this when I saw one of my own creations in the new pages feed: https://en.wikipedia.org/wiki/Pietro_Campori

It's marked as having no citations, but every version of the page has had at least one reference. It uses the shortened footnotes format, {{sfn}}.

Event Timeline

Restricted Application added a project: Collaboration-Team-Triage. · Jun 20 2017, 1:23 AM
Restricted Application added a subscriber: Aklapper.
MusikAnimal added a subscriber: MusikAnimal. · Edited · Jun 20 2017, 1:56 AM

Yup, the {{sfn}} format is what's throwing it off. Page Curation specifically looks for <ref> tags.

Off the top of my head, I believe there's one place in the code where we parse the page. In that same block we could look for citation-like markup (e.g. <sup>[integer]</sup>). There's probably also a more MediaWiki-ish way to detect references.
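
To make the idea concrete, here is a rough Python sketch of both checks: looking for <ref> tags or {{sfn}}-style templates in the wikitext, and looking for footnote markers of the form <sup>[integer]</sup> in the rendered HTML. The function names and regular expressions are assumptions made for illustration; none of this is PageTriage code, and real pages may use other citation templates that this does not cover.

```python
import html
import re

# Assumption: citations in wikitext appear as <ref> tags or as
# shortened-footnote templates such as {{sfn}} / {{sfnp}}.
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)
SFN_TEMPLATE = re.compile(r"\{\{\s*sfn[a-z]*\s*\|", re.IGNORECASE)

# Rendered footnote marker: <sup ...>[1]</sup>, possibly wrapping a link.
SUP_ELEMENT = re.compile(r"<sup\b[^>]*>(.*?)</sup>", re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")
BRACKETED_INT = re.compile(r"\[\d+\]")


def wikitext_has_citation(wikitext: str) -> bool:
    """Return True if the wikitext contains a <ref> tag or an sfn-style template."""
    return bool(REF_TAG.search(wikitext) or SFN_TEMPLATE.search(wikitext))


def rendered_has_citation(rendered: str) -> bool:
    """Return True if the rendered HTML contains a <sup>[integer]</sup> marker."""
    for match in SUP_ELEMENT.finditer(rendered):
        # Drop nested tags (e.g. the <a> link) and decode entities like &#91;.
        inner = html.unescape(ANY_TAG.sub("", match.group(1))).strip()
        if BRACKETED_INT.fullmatch(inner):
            return True
    return False
```

Checking the rendered HTML has the advantage of not caring which template produced the footnote, but it is still pattern matching and could misfire on unrelated bracketed superscripts.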

We could probably add something to the Cite extension that records information about the page's references on the ParserOutput object (somewhat similar to how tracking categories and the link/category arrays work), and then check that in PageTriage.
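
As a language-neutral illustration of that approach, the Python sketch below models the flow: the citation code records a reference count on the parse result, and the triage code reads that count instead of scanning for <ref> tags. Everything here (`ParserOutputSketch`, `record_reference`, `has_no_citations`, the `reference-count` key) is a hypothetical stand-in and not actual MediaWiki, Cite, or PageTriage API.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical property name; not a real MediaWiki page property.
REF_COUNT_KEY = "reference-count"


@dataclass
class ParserOutputSketch:
    """Toy stand-in for a parser output object; not the MediaWiki class."""
    properties: Dict[str, int] = field(default_factory=dict)


def record_reference(output: ParserOutputSketch) -> None:
    """Called by the (hypothetical) citation code once per footnote it renders."""
    output.properties[REF_COUNT_KEY] = output.properties.get(REF_COUNT_KEY, 0) + 1


def has_no_citations(output: ParserOutputSketch) -> bool:
    """What the (hypothetical) triage check would ask instead of scanning markup."""
    return output.properties.get(REF_COUNT_KEY, 0) == 0
```

Assuming templates such as {{sfn}} ultimately produce their footnotes through the same citation code during parsing, the count would include them without the triage side needing to know about individual templates.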

Just to note another example that I found today, in case it is helpful: https://en.wikipedia.org/wiki/Action_of_22_August_1917 is a page referenced using {{sfn}}/Harvard-style references that also showed up as having no citations in the feed.

Restricted Application added a project: Growth-Team. · Oct 12 2019, 4:11 AM
Restricted Application added a subscriber: Liuxinyu970226.