Page MenuHomePhabricator

Clean up implementation for "follow" cases
Open, Needs TriagePublic8 Estimate Story Points


While rewriting the Cite extension, we tiptoed around two edge cases for the follow feature (see manual). This is something used on Wikisource when splitting long footnotes broken across pages in the published document. The PHP implementation of that feature causes considerable technical debt in Cite, this task's scope is to cleaned up most of the debt, or documented a path to fixing in the future. Don't change behavior if possible, but if there are changes please announce them as part of this task.

Existing logic:

  • In normal usage, <ref name="first">First text</ref> will precede <ref follow="first">Second text</ref>, producing a footnote marker for only the first reference, and the text "First text Second text" in the references section.
  • If the two refs are swapped so that follow comes before the "main" ref, the output will be broken. We won't show a footnote mark in the article body, but the follow ref's text will appear at the top of the references section, without any red error indicators. This is different than how any other erroring ref is handled, it seems justifiable to at least paint as an error, or make fully consistent with the other error types.
  • There are a ton of smaller edge cases, which may or may not be worth listing here.

After the in-depth investigation in T240858#5777804 we believe the current behavior was never a formal requirement, but an arbitrary fallback the original developer came up with, but a fallback that's not even relevant in production. We decided to consider the behavior a bug and fix it. The misplaced follow will not be displayed any more on top of the references list, but instead show an error in-place.


Suggested wording for the User-notice:

In 2010, a <ref follow="…"> syntax was introduced. It allows to merge footnotes that follow each other. It's meant to be used for digitized books on Wikisource. If the order of such consecutive references is wrong, no error was reported but the bad <ref> shown outside of the <references> list. This will change and a red error message be shown instead.


Related Gerrit Patches:
mediawiki/extensions/Cite : masterStandardize "follow" validation
mediawiki/extensions/Cite : masterRemove "follow" special case from ReferencesFormatter
mediawiki/extensions/Cite : masterSpecific error for missing follows parent
mediawiki/extensions/Cite : masterRemove broken "follow" special case from ReferenceStack
mediawiki/extensions/Cite : masterHandle "follow" in Cite

Event Timeline

awight created this task.Dec 16 2019, 1:50 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 16 2019, 1:50 PM

Change 553096 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] [WIP] Refactor follow edge case

awight updated the task description. (Show Details)Dec 16 2019, 8:46 PM
awight updated the task description. (Show Details)Dec 16 2019, 8:48 PM
awight set the point value for this task to 8.Dec 19 2019, 10:43 AM

Change 559671 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Standardize "follow" validation

Change 559672 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Remove "follow" special case from ReferencesFormatter

Change 559673 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Remove broken "follow" special case from ReferenceStack

Change 559755 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Specific error for missing follows parent

Change 559755 abandoned by Awight:
Specific error for missing follows parent

Squashed with I506e4dcd1151671f

What the patch does is removing an edge case where a <ref follow="…"> that points to a parent that either doesn't exist or appears later in the text can not be rendered as expected.

Example wikitext
<ref follow="a">and some more text</ref>
<ref name="a">Here is some text</ref>

== References ==
<references />
Resulting HTML (simplified)
<p><sup ><a >[1]</a></sup></p>

<ol class="references">
    <p >and some more text</p>
    <li ><a ></a> Here is some text</li>

Notice the misplaced <p> at the beginning of the list. This is not only rendered in a weird position, it's entirely disconnected to anything: neither connected to the parent nor to the position in the text where the <ref> originally appeared. I believe it's fair to assume this is useless to a user trying to read and understand a wiki page. The only benefit is that the content appears somewhere on the page.

The patch changes this behavior. The content of the broken <ref> is not rendered any more, nowhere on the page. Instead an error message shows up at the position of the broken <ref>.


Here is the patch that introduced the feature in 2010: The bogus <p> was already part of this original patch.

The patch author is He is not active since 2013. Via"ref+follow"+"ThomasV" I found:

I scanned all these (short) discussions to see if the <p> edge case is mentioned. And indeed, the author mentions it in the original RfC. But nobody picks the stray <p> up in any of the discussions.

TL;DR: It does not look like the discussed behavior was an actual requirement.

Next steps

This change should be announced via (hence the User-notice tag). I'm not sure if there is another communication channel to reach the Wikisource communities. Considering the history of the feature, it seems to be a very good idea to announce it in the frwikisource village pump, ideally in French. @Lea_Lacroix_WMDE might be able to help.

awight changed the point value for this task from 8 to 3.Wed, Jan 8, 12:43 PM
thiemowmde updated the task description. (Show Details)Wed, Jan 8, 4:21 PM
thiemowmde changed the point value for this task from 3 to 8.