false is probably buggy)
Closed, ResolvedPublicBUG REPORT
Actions

Assigned To

Authored By

	Bawolff
	Apr 21 2022, 4:26 AM

Description

Split from T299896

So right now, if you want to do just a metadata parser output with no html generated, you have to construct your parser output as new ParserOutput( null ).

Confusingly, if you do something like this:

$pout = new ParserOutput;
$pout->setText( null );
$pout->hasText(); // This returns true;

you don't get the result you expect. Once a ParserOutput is constructed there is no possible way to mark it as not containing text. Even if you never call ->setText, ->hasText() will still return true. This is despite the docs strongly implying that ->setText(null) is what you are supposed to do to mark the ParserOutput as having no text (However, it should be noted, that in almost every place where generate-html is set to false, the actual code does ->setText( '' ), notwithstanding the docs saying that is wrong).

However (with one exception) mediawiki, never constructs them that way, so in essence ParserOutput->hasText will always return true no matter what you do. This suggests that the "generate-html" => false option is probably subtly broken all throughout core.

I think what we should do here:

Make the default value of the first arg to the ParserOutput constructor be null instead of ''
Make ParserOutput::setText( null ) actually set the text to be null instead of being a no-op [Is this a breaking API change?]
The requirements here should be documented more clearly, namely, that if you are respecting generate-html=> false, you must have parser output text set to null. Many places in code will simply half generate the html, and have setText set to a value that contains half the html.

(I'm not that familiar with MCR, so I would appreciate feedback on if i am missing something here)

As an aside, I also wonder if the generate-html optimization is worth all this complexity, or if in the cases it is used, we eventually need to do a full parse later on in the request, and it causes us to parse the page twice where otherwise we'd be able to reuse the parse.

Details

Subject	Repo	Branch	Lines +/-
Clarify generate-html and make ParserOutput behave as expected	mediawiki/core	master	+46 -17
Assert that parser output without HTML has no text	mediawiki/extensions/Wikibase	master	+1 -2
Don’t assert text of parser output without HTML	mediawiki/extensions/Wikibase	master	+2 -1
Fix EntityHandlerTestCase::testPageProperties()	mediawiki/extensions/Wikibase	master	+6 -3

Customize query in gerrit

Related Objects

Mentioned In: T299896: Massmessage's content handler sometimes parses the delivery list as an empty HTML
Mentioned Here: T299896: Massmessage's content handler sometimes parses the delivery list as an empty HTML

Event Timeline

Bawolff created this task.Apr 21 2022, 4:26 AM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 21 2022, 4:26 AM

Bawolff updated the task description. (Show Details)Apr 21 2022, 4:33 AM

Bawolff updated the task description. (Show Details)Apr 21 2022, 5:12 AM

Bawolff updated the task description. (Show Details)Apr 21 2022, 6:46 AM

wctaiwan subscribed.Apr 22 2022, 1:00 AM

wctaiwan mentioned this in T299896: Massmessage's content handler sometimes parses the delivery list as an empty HTML.Apr 22 2022, 1:26 AM

Change 785247 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/core@master] Clarify generate-html and make ParserOutput behave as expected

https://gerrit.wikimedia.org/r/785247

gerritbot added a project: Patch-For-Review.Apr 22 2022, 5:57 AM

Change 785276 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Fix EntityHandlerTestCase::testPageProperties()

https://gerrit.wikimedia.org/r/785276

Change 785282 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Don’t assert text of parser output without HTML

https://gerrit.wikimedia.org/r/785282

Change 785297 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Assert that parser output without HTML has no text

https://gerrit.wikimedia.org/r/785297

Change 785276 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Fix EntityHandlerTestCase::testPageProperties()

https://gerrit.wikimedia.org/r/785276

ReleaseTaggerBot added a project: MW-1.39-notes (1.39.0-wmf.9; 2022-04-25).Apr 22 2022, 3:00 PM

Change 785282 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Don’t assert text of parser output without HTML

https://gerrit.wikimedia.org/r/785282

Change 785247 merged by jenkins-bot:

[mediawiki/core@master] Clarify generate-html and make ParserOutput behave as expected

https://gerrit.wikimedia.org/r/785247

ReleaseTaggerBot edited projects, added MW-1.39-notes (1.39.0-wmf.12; 2022-05-16); removed MW-1.39-notes (1.39.0-wmf.9; 2022-04-25).May 3 2022, 10:01 AM

Change 785297 merged by jenkins-bot: