Page MenuHomePhabricator

Parse API returns “saved in parser cache” comment even with disablelimitreport set
Closed, ResolvedPublic

Description

Wikibase gate-and-submit builds are failing (example build) because the action=parse API returns an unexpected HTML comment now:

AssertionError: expected '\n<!-- Saved in parser cache with key wikidb:pcache:idhash:29-0!canonical and timestamp 20211109100948 and revision id 34. Serialized with JSON.\n -->\n' to equal ''
+ expected - actual

-
-<!-- Saved in parser cache with key wikidb:pcache:idhash:29-0!canonical and timestamp 20211109100948 and revision id 34. Serialized with JSON.
- -->

Usually, this comment is only there when the disablelimitreport parameter is not set. Compare this English Wikipedia request with this Beta English Wikipedia Request using the same parameters – the Beta version has this extra comment as well:

\n<!-- Saved in parser cache with key enwiki:pcache:idhash:1-0!canonical and timestamp 20211109093033 and revision id 512115. Serialized with JSON.\n -->

Note that this only happens when using page=, not with text= + contentmodel=wikitext. (Which makes sense – with text=, you wouldn’t expect the result to be stored in the parser cache.)

I think this is a regression in MediaWiki core that should be fixed there, rather than being handled in the Wikibase API integration tests – other API users might also expect the action=parse&disablelimitreport= result to not have this comment. Move limit report rendering to ParserOutput looks like the most likely culprit to me.

Event Timeline

Move limit report rendering to ParserOutput looks like the most likely culprit to me.

Confirmed – I managed to reproduce the error locally (node_modules/.bin/mocha --timeout 0 tests/api-testing/LuaWikibaseIntegrationTest.js in the Wikibase directory, assuming you have an .api-testing.config.json config file there), and reverting that commit fixes the errors.

It looks like the commit also anticipates this API change, by updating the ApiParse tests:

diff --git a/tests/phpunit/includes/api/ApiParseTest.php b/tests/phpunit/includes/api/ApiParseTest.php
index 4e54c44729..b9bf2d2425 100644
--- a/tests/phpunit/includes/api/ApiParseTest.php
+++ b/tests/phpunit/includes/api/ApiParseTest.php
@@ -100,6 +100,9 @@ private function doAssertParsedTo( $expected, array $res, $warnings, callable $c
 
         $html = substr( $html, strlen( $expectedStart ) );
 
+        $possibleParserCache = '/\n<!-- Saved in (?>parser cache|RevisionOutputCache) (?>.*?\n -->)\n/';
+        $html = preg_replace( $possibleParserCache, '', $html );
+
         if ( $res[1]->getBool( 'disablelimitreport' ) ) {
             $expectedEnd = "</div>";
             $this->assertSame( $expectedEnd, substr( $html, -strlen( $expectedEnd ) ) );
@@ -112,7 +115,7 @@ private function doAssertParsedTo( $expected, array $res, $warnings, callable $c
         } else {
             $expectedEnd = '#\n<!-- \nNewPP limit report\n(?>.+?\n-->)\n' .
                 '<!--\nTransclusion expansion time report \(%,ms,calls,template\)\n(?>.*?\n-->)\n' .
-                '(\n<!-- Saved in (?>parser cache|RevisionOutputCache) (?>.*?\n -->)\n)?</div>$#s';
+                '</div>$#s';
             $this->assertRegExp( $expectedEnd, $html );
 
             $html = preg_replace( $expectedEnd, '', $html );

But isn’t this something that should be avoided?

Tentatively marking as a train blocker, since I assume it will affect other action=parse users as well.

Change 737685 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Refactor LuaWikibaseIntegrationTest

https://gerrit.wikimedia.org/r/737685

I'm not sure if this is a train blocker, it adds a HTML comment to an API response, it changes things but it doesn't break anything as far as I see. Tests breaking because heavily depending on the output is a bad pattern ("brittle test"). Also, you can possibly disable limit reporting for the test or the wiki by setting $wgEnableParserLimitReporting to false.

Change 737708 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Strip \u201Csaved in parser cache\u201D comments from parse response

https://gerrit.wikimedia.org/r/737708

Also, you can possibly disable limit reporting for the test or the wiki by setting $wgEnableParserLimitReporting to false.

Surely that can’t be the right fix, given that there’s already a disablelimitreport API parameter which we send…

But isn’t this something that should be avoided?

The update to the tests made it ready for reordering of the 2 comments in HTML. LimitReport comment now comes after the saved in parser cache comment, not before.

I don't think reordering 2 comments in HTML can be considered a breaking change. Whatever CI started failing because of it should be updated to not be so sensitive.

'disablelimitreport' is a different thing 'saved in parser cache' comment. These are 2 completely different comments from different systems. The reason 'saved in parser cache' started showing up is because we can not save more things in parser cache, and I guess one of your tests now has content that can be saved. There's never been a way to remove 'saved in parser cache' comment from the output.

The updates to the tests I've made were because limit report comment and saved in parser cache comment got reordered.

Agreed with @Ladsgroup that differences in HTML comments not a blocker.

'disablelimitreport' is a different thing 'saved in parser cache' comment. These are 2 completely different comments from different systems. The reason 'saved in parser cache' started showing up is because we can not save more things in parser cache, and I guess one of your tests now has content that can be saved. There's never been a way to remove 'saved in parser cache' comment from the output.

This might all be technically true, but the practical consequence I see is that, previously, disablelimitreport effectively removed the “saved in parser cache” comment, and now it doesn’t anymore. Even if it only did that by making the content uncacheable altogether, that doesn’t change the fact that the API output is now different, in my opinion.

Hm... Indeed you're right, it did remove it by making the content uncacheable. But it was a bad thing.
I still stand firm in my believe that it's a problem with tests, not with the patch :)

What's the possible fix on the server side? We can detect that 'disablelimitreport' was set, and then strip the 'saved in parser cache' comment from the output. But this would be a horrible hack for no reason. If it's possible to fix the tests - let's please do it. We can add the horrible hack but only if you need more time to fix the tests, the hack would only be temporary and will go away.

I have been looking into the code to understand the problem better and I have a hypothesis now. By putting debugger and such in calling the API, I can confirm that it doesn't add the report to the output if disablelimitreport is set but it still shows up in the result which leads to assuming that it's been part of ParserCache output. The fix would be to wait for a month for the old cases to expire.

Specially given the fact that the timestamp in the report doesn't change.

Hm... Indeed you're right, it did remove it by making the content uncacheable. But it was a bad thing.
I still stand firm in my believe that it's a problem with tests, not with the patch :)

What's the possible fix on the server side? We can detect that 'disablelimitreport' was set, and then strip the 'saved in parser cache' comment from the output. But this would be a horrible hack for no reason. If it's possible to fix the tests - let's please do it. We can add the horrible hack but only if you need more time to fix the tests, the hack would only be temporary and will go away.

Well, we can hack around it in our tests, but it’s not really nice either. @Michael said on Gerrit:

Mh, this is ok as a workaround for now to make CI green again, but I think I would prefer to remove this line again when the issue is fixed in core

As the API user, I just want the HTML parse for the Wikitext, and I’m not interested in extra comments the parser wants to add. (I found/remembered another piece of code that assumes there are no comments in the response, though that case probably won’t be broken by this change, since it uses text+contentmodel instead of page.) disablelimitreport has been a fairly reliable way to remove those comments so far – if you think it shouldn’t control the “saved in parser cache” comment, then maybe there could be a separate flag to disable that comment, but I think that should be communicated as a significant change in the output, at least.

From a technical perspective, it’s not clear to me why stripping the “saved in parser cache” comment would have to be such a horrible hack, when you just (apparently) managed to move the limit report around in such a way that parser output can be cached regardless of the “disable limit report” setting – can’t the “saved in parser cache” part be moved into a similar area separate from the main HTML

Change 737708 abandoned by Lucas Werkmeister (WMDE):

[mediawiki/extensions/Wikibase@master] Strip \u201Csaved in parser cache\u201D comments from parse response

Reason:

squashed into parent change

https://gerrit.wikimedia.org/r/737708

ok. I guess I agree. let's squash them together.

We can revert my original patch for now, squashing will take some time...

Change 737685 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Fix LuaWikibaseIntegrationTest

https://gerrit.wikimedia.org/r/737685

Alright, thanks! The Wikibase workaround got merged too, we can later remove the one line that regexps out the comment.

Change 737943 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Remove \u201Csaved in parser cache\u201D workaround

https://gerrit.wikimedia.org/r/737943

Change 737943 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Remove \u201Csaved in parser cache\u201D workaround

https://gerrit.wikimedia.org/r/737943

Lucas_Werkmeister_WMDE assigned this task to Pchelolo.

I think this is done, thanks everyone!