
Suppress autogenerated reference list regardless of details subref glitches
Open, Needs Triage, Public

Description

Parser tests show that the <references /> tag is emitted during round-trip testing, even when it was autogenerated. This seems to be triggered by variants of the details subref syntax that are currently unsupported. The autogenerated tag should be suppressed in all cases, whether the subrefs are valid or not.

Example:

Running test Subreferencing attribute blocked without feature flag [html2wt]... FAILED!
/srv/docker-dev/mediawiki/extensions/Cite/tests/parser/subReferencing.txt:12
--- /tmp/mwParser-expectedswwYId        2025-03-19 14:18:57.754374968 +0000
+++ /tmp/mwParser-actualpoiphq  2025-03-19 14:18:57.754374968 +0000
@@ -1 +1,2 @@
-<ref name="a" details="abc">def</ref>
+<ref name="" details="def"></ref>
+<references />

Look in ReferenceListTagHandler.php, where checks for $dataMw->autoGenerated cause the function to return empty text.
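
For illustration only, the kind of check described above might look roughly like the following Python sketch. This is not the actual PHP handler: the DataMw class, its field names, and serialize_references are hypothetical stand-ins, and the real code in ReferenceListTagHandler.php may be structured quite differently.

```python
from dataclasses import dataclass


@dataclass
class DataMw:
    """Hypothetical stand-in for the data-mw info attached to a reference list node."""
    auto_generated: bool = False
    group: str = ""


def serialize_references(data_mw: DataMw) -> str:
    """Return wikitext for a reference list, or empty text if it was autogenerated."""
    # Assumption: an autogenerated list was never in the source wikitext,
    # so this sketch emits nothing for it.
    if data_mw.auto_generated:
        return ""
    if data_mw.group:
        return f'<references group="{data_mw.group}" />'
    return "<references />"
```

In this sketch, a list flagged as autogenerated serializes to an empty string, while an explicit list keeps its tag and optional group attribute.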

Use patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Cite/+/1129278 to help expose code paths which continue to emit the references tag. Run tests as follows:

./modules/mediawiki/bin/mwscript tests/parser/parserTests.php --wiki=dev --file=/srv/docker-dev/mediawiki/extensions/Cite/tests/parser/subReferencing.txt --html2wt

Event Timeline

awight changed the subtype of this task from "Spike" to "Task".

Change #1131002 had a related patch set uploaded (by Awight; author: Awight):

[mediawiki/extensions/Cite@master] Don't html2wt subreferences when they aren't supported

https://gerrit.wikimedia.org/r/1131002

Note to self: It seems that the autogeneration rules which should apply to wt2html were also applied in html2wt by mistake. Specifically, round-tripping to wikitext should not add an empty <references /> tag at the end of the document, if none existed before editing the page in VE. This is responsible for many of our parser tests not playing nice with wt2wt.

Change #1131002 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Don't html2wt subreferences when they aren't supported

https://gerrit.wikimedia.org/r/1131002

This turns out to be a very old behavior, probably unnoticed due to selective serialization. We'll fix it anyway, to support more round-trip testing.

Change #1133107 had a related patch set uploaded (by Awight; author: Awight):

[mediawiki/extensions/Cite@master] [WIP] Skip autogenerated reference lists when serializing to wikitext

https://gerrit.wikimedia.org/r/1133107

This turns out to be a very old behavior, probably unnoticed due to selective serialization. We'll fix it anyway, to support more round-trip testing.

No, specifically,

Autogenerated references ... are not suppressed when serializing because apparently that's the behaviour Parsoid clients want.

Change #1133107 abandoned by Awight:

[mediawiki/extensions/Cite@master] [WIP] Skip autogenerated reference lists when serializing to wikitext

Reason:

The tag should be included in wikitext, for some use cases such as Flow. These clients disable selser which makes it possible for the tag to appear in output.

https://gerrit.wikimedia.org/r/1133107

Change #1133368 had a related patch set uploaded (by Awight; author: Awight):

[mediawiki/extensions/Cite@master] [WIP] Clean up references tag handler

https://gerrit.wikimedia.org/r/1133368

Change #1133408 had a related patch set uploaded (by Awight; author: Awight):

[mediawiki/extensions/Cite@master] Normalize input wikitext for better round-trip coverage

https://gerrit.wikimedia.org/r/1133408

After learning more about the problem: this is actually desirable behavior in some use cases such as Flow and CX, where the reference list tag should be added close to the context of any new ref tags. It's also harmless in the main editing case thanks to selective serialization.

So I've edited our parser tests to provide more normalized inputs, increasing the number which can be switched to full round-trip tests.
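
As a rough illustration of why normalized inputs matter for full round-trip tests, here is a toy Python sketch. It assumes, purely for the example, that serialization always emits one canonical spelling of the tag; toy_serialize and round_trips are hypothetical helpers and do not reflect how Parsoid or parserTests.php actually work.

```python
import re


def toy_serialize(wikitext: str) -> str:
    """Toy stand-in for a wt2html + html2wt pass.

    Assumption for this sketch: the serializer always writes the
    reference list tag in one canonical spelling, "<references />".
    """
    return re.sub(r"<references\s*/>", "<references />", wikitext)


def round_trips(wikitext: str) -> bool:
    """A test input can be a full round-trip test only if it survives unchanged."""
    return toy_serialize(wikitext) == wikitext


normalized = "<ref>a</ref>\n<references />"
unnormalized = "<ref>a</ref>\n<references/>"
print(round_trips(normalized), round_trips(unnormalized))
```

Under this toy model, only the input that already uses the canonical spelling compares equal after the round trip, which is the reason for rewriting test inputs into normalized form.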

Change #1133411 had a related patch set uploaded (by Awight; author: Awight):

[mediawiki/extensions/Cite@master] Try to clarify comment about autogenerated <references> tag

https://gerrit.wikimedia.org/r/1133411

awight removed awight as the assignee of this task. (Apr 2 2025, 1:56 PM)
awight moved this task from Doing to Tech Review on the WMDE-TechWish-Sprint-2025-04-02 board.

Change #1133875 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/extensions/Cite@master] Clean up references tag handler, step 1

https://gerrit.wikimedia.org/r/1133875

Change #1133408 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Normalize input wikitext for better round-trip coverage

https://gerrit.wikimedia.org/r/1133408

Change #1133875 merged by jenkins-bot:

[mediawiki/extensions/Cite@master] Clean up references tag handler, step 1

https://gerrit.wikimedia.org/r/1133875

We found today that the autogenerated default reference list is inserted back into wikitext in some cases which were probably not intended. This seems to happen only if a non-default-group ref is present in the page?

Example:

<ref>Testing default group here</ref>
<ref group="g">Testing with a named group</ref>

We also found that an automatically generated group reference list shows up in VE, but it is empty (no refs) and displays as a generic tag:

image.png (76×193 px, 2 KB)

@awight Just to understand the urgency here better: Could the issues described in the ticket be responsible for user facing dirty diffs or is it more about internal states that should never be user facing?