Page curation marking pages with citations as having no citations
Closed, Resolved · Public · Bug Report

Description

Recently discovered this when I saw one of my own creations in the new pages feed: https://en.wikipedia.org/wiki/Pietro_Campori

It's marked as having no citations, but every version of the page had at least one reference. It uses the shortened footnotes format, {{sfn}}.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Yup, the {{sfn}} format is what's throwing it off. Page Curation specifically looks for <ref> tags.
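To illustrate why a check that only looks for <ref> tags misses these pages, here is a minimal Python sketch. It is not PageTriage's code (PageTriage is a PHP extension): the function names, the regular expressions, and the sample wikitext are all hypothetical, and only the {{sfn}} and {{harvnb}} templates are covered.

```python
import re

# Hypothetical patterns, for illustration only (not taken from PageTriage).
# Matches "<ref>", "<ref name=...>" and "<ref/>", but not "<references />".
REF_TAG = re.compile(r"<ref[\s>/]", re.IGNORECASE)
# Matches the start of {{sfn|...}} or {{harvnb|...}}, in any letter case.
SHORT_FOOTNOTE = re.compile(r"\{\{\s*(?:sfn|harvnb)\s*\|", re.IGNORECASE)


def has_ref_tag(wikitext: str) -> bool:
    """Check that only looks for <ref> tags, as described above."""
    return REF_TAG.search(wikitext) is not None


def has_citation(wikitext: str) -> bool:
    """Also treat shortened-footnote templates as references."""
    return has_ref_tag(wikitext) or SHORT_FOOTNOTE.search(wikitext) is not None


# Made-up wikitext that cites a source only through {{sfn}}.
sample = "Example sentence.{{sfn|Author|2000|p=1}}"
print(has_ref_tag(sample), has_citation(sample))
```

A page like the sample fails the first check and passes the second, which is the mismatch reported here.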

Off the top of my head, I believe there's one place in the code where we're parsing the page. In that same block we could look for citation-like markup (e.g. <sup>[integer]</sup>). There's probably also a more MediaWiki-ish way to detect references.
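As a rough sketch of the "citation-like markup" idea, the Python below scans rendered HTML for a <sup> element whose text is a bracketed integer. It is only an illustration under stated assumptions: the class and function names are hypothetical, it uses Python's standard html.parser rather than anything in MediaWiki, and the exact markup the parser emits for footnote markers is not taken from this task.

```python
import re
from html.parser import HTMLParser

# A footnote marker such as "[1]" or "[12]" (assumed shape, per the comment above).
MARKER = re.compile(r"^\[\d+\]$")


class CitationMarkerFinder(HTMLParser):
    """Hypothetical helper: sets `found` when a <sup> contains only "[integer]"."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth of open <sup> elements
        self.text = []      # text collected inside the outermost <sup>
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "sup":
            self.depth += 1
            if self.depth == 1:
                self.text = []

    def handle_endtag(self, tag):
        if tag == "sup" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0 and MARKER.match("".join(self.text).strip()):
                self.found = True

    def handle_data(self, data):
        if self.depth > 0:
            self.text.append(data)


def has_citation_marker(html: str) -> bool:
    finder = CitationMarkerFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

For example, `has_citation_marker('<p>Text<sup><a href="#n1">[1]</a></sup></p>')` is true, while an ordinary superscript such as `<sup>2</sup>` is ignored. Matching on rendered output is heuristic, which is one reason a more MediaWiki-native signal may be preferable.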

We could probably add something to the Cite extension that records information about the page's references on the ParserOutput object (somewhat similar to how tracking categories and the link/category arrays work), and then check that in PageTriage.

Just to note another example that I found today in case it is helpful: https://en.wikipedia.org/wiki/Action_of_22_August_1917 was a page referenced using {{sfn}}/Harvard style references that also showed up as no citations in the feed.

Restricted Application added a subscriber: Liuxinyu970226.

Change 833888 had a related patch set uploaded (by MPGuy2824; author: MPGuy2824):

[mediawiki/extensions/PageTriage@master] PageCuration: Detect {{Sfn}} as being a reference while applying the "No references" page_tag

https://gerrit.wikimedia.org/r/833888

Change 858991 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/PageTriage@master] [hygiene] Clean up checkReferenceTag method

https://gerrit.wikimedia.org/r/858991

Change 833888 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] PageCuration: Detect {{Sfn}} and {{Harvnb}} templates as references

https://gerrit.wikimedia.org/r/833888

MPGuy2824 changed the subtype of this task from "Task" to "Bug Report". · Nov 21 2022, 11:17 AM
MPGuy2824 moved this task from Code Review to Waiting for enwiki deploy on the PageTriage board.
MPGuy2824 moved this task from Waiting for enwiki deploy to Done on the PageTriage board.

Change 858991 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] [hygiene] Clean up checkReferenceTag method

https://gerrit.wikimedia.org/r/858991