
[regression - wmf.26] frwiki Homepage SE module has 'cirrussearch-query-too-long' for default filters
Closed, ResolvedPublic

Description

  1. On frwiki go to Special:Homepage and return filters to the default state:

Difficulty - only easy filters
Topics - no selected topic.

  2. Reload the page - the following error is displayed:

    "No suggested edits are available at this time" and the Console has the following error (from GrowthTasksApi.prototype.handleError)
Fetching task suggestions failed: cirrussearch-query-too-long cirrussearch-query-too-long

"Search request is longer than the maximum allowed length. (2122 > 2048)"
  3. After reloading the page again, the SE module loads the card successfully.

Steps for users without enabled Homepage:

  1. On frwiki log in as a user who does not have Homepage enabled and enable it from the Preferences (or, for a user with an existing Homepage, click "Restore all default settings (in all sections)" in user Preferences and then enable the Homepage again).
  2. Go to Homepage and go through two initial steps in SE module (topic and task difficulty selection filters without making any selection). After clicking "Get suggestions" - the error is displayed.

SE_module_initial_state2.gif (917×683 px, 349 KB)

Notes

  • I did not see the issue on other wikis (cswiki, ruwiki, ukwiki, arwiki, hewiki, kowiki)
  • sometimes an SE card (even with the image present) appears and is quickly replaced with the "No suggested edits are available at this time" message.
  • even when a card is displayed and the counter says 1 of 200 suggestions - the navigation (the left/right arrows) is not active

Event Timeline

This is clearly caused by T269642: Newcomer tasks: Use search to revalidate cached tasks: the same API query works when I'm logged out. In theory the pageid filter shouldn't count towards the query length limit but apparently that's not working. No idea why the error would be specific to frwiki.

Tgr triaged this task as High priority.Jan 15 2021, 2:56 AM
Tgr moved this task from Incoming to In Progress on the Growth-Team (Current Sprint) board.

This is pretty broken - I don't get the error page but the AJAX request to the growthtasks API errors out and so the navigation arrows don't work.

I can also confirm that this is not happening on other wikis (or at least some other wikis). Maybe that's just because frwiki is large enough to have longer page IDs?

The relevant code change is this - it's trivial so I'm not sure how it could be wrong. There's also a test for it.

Here's the search query generated for my WMF account:

hastemplate:"À_wikifier" pageid:755476|6272494|12704750|12893208|12890749|12893250|12889613|770768|736257|327876|575575|13407816|13692932|10356852|1385279|13863818|686687|89649|337883|13483509|13006216|13601567|3154685|4710286|13442952|11852301|7482158|10637072|12271250|4484672|3110205|12637314|6063711|4521891|13184976|3476084|11725761|12081845|12886504|12244477|13500096|9495125|1387441|9211285|68659|486508|505229|3340460|171748|6260580|8372664|3646595|1410998|13764395|11574915|13164493|13539088|13559568|13371049|2692230|13408548|1264497|13262996|12425637|769292|1413523|6362918|11988397|3932336|581746|12591540|5862869|13755638|5686329|11832375|3112719|11586668|12848439|10923601|12463087|700119|13750840|11715613|12962211|3145851|13549754|920522|11979663|402014|780520|2787247|12905731|172903|845024|13016319|528533|13639288|12707065|12151658|1192801|1712118|13833917|12002920|5855601|13015990|8513497|3221831|1918437|4704118|12646567|2786818|744997|13392303|1418653|7360961|4857999|4849109|3093086|324588|11908087|9331731|13402008|6954528|10350427|11582036|11092626|1291953|3914806|2700059|7897644|12822247|1511053|9540288|13610027|340165|13497857|11149347|12372393|12472148|11540033|11838652|13488469|13530745|294090|2923091|991509|3260619|13809850|13233052|12508128|12482213|13243243|13860501|13604591|4215855|2877083|12900705|912433|7582502|7985810|12149044|12409586|6506585|983811|13710076|6605406|4966307|5967454|13433456|13433386|13423935|13724759|9099386|11421492|1330975|12573906|406081|12315955|4771437|661907|5261238|13741515|12062356|10466499|2943069|13223343|33944|10561954|12906620|791100|12987725|12881127|910210|5762613|1702188|353061|11400654|12858647|13409924|3192782|5086719|12594386|13713459|12502060|12960590|2394812|474829|13521373|12947360|13690782|12518615|12882825|3253257|12810273|13308286|10302471|12209537|1317497|12710194|7840942|11811707|12536496|11802454|13259323|5615075|4703063|12575132|2949746|612707|12540802|12960802|11860833|5271500|99678|650745|11661766|
4881820|4402807|4841662|2721052|12783875|12372625|6625689|13348991|905824|13294306|1397128|11237994|305126|13695279

An API query with that string indeed results in cirrussearch-query-too-long.

The same string fails on other wikis as well. So it's probably just that higher article count on frwiki results in a longer string.

Apparently we are hitting the hard limit of QueryStringRegexParser::QUERY_LEN_HARD_LIMIT which is unaffected by exemptions.

Change 656370 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Temporarily disable cache revalidation

https://gerrit.wikimedia.org/r/656370

Change 656301 had a related patch set uploaded (by Kosta Harlan; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@wmf/1.36.0-wmf.26] Temporarily disable cache revalidation

https://gerrit.wikimedia.org/r/656301

kostajh raised the priority of this task from High to Unbreak Now!.Jan 15 2021, 9:38 AM

Change 656301 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@wmf/1.36.0-wmf.26] Temporarily disable cache revalidation

https://gerrit.wikimedia.org/r/656301

Mentioned in SAL (#wikimedia-operations) [2021-01-15T09:58:24Z] <reedy@deploy1001> Synchronized php-1.36.0-wmf.26/extensions/GrowthExperiments/includes/NewcomerTasks/TaskSuggester/CacheDecorator.php: T272103 (duration: 00m 57s)

hashar lowered the priority of this task from Unbreak Now! to High.Jan 15 2021, 10:06 AM
hashar added a subscriber: hashar.

The issue is worked around in the code; the workaround has been confirmed to resolve the issue and has been deployed to all wikis. I am leaving this task open because it seems to me the code needs some further tweaking, but the UI is at least working now.

Thank you everyone for the investigation and the quick patch!

We could probably double QueryStringRegexParser::QUERY_LEN_HARD_LIMIT if this is sufficient for you? It was set to 2048 at the time because the assumption was that most queries were passed through a param of the request URI and could not be that big.

> We could probably double QueryStringRegexParser::QUERY_LEN_HARD_LIMIT if this is sufficient for you? It was set to 2048 at the time because the assumption was that most queries were passed through a param of the request URI and could not be that big.

I think that would probably work for us.

For a non-CirrusSearch fix, I think we should look at implementing T260522: Optimize number of results requested from API.

Thanks @dcausse, that would be plenty. The queries are like hastemplate:<list of templates> pageid:<list of pages>, with up to 250 pages, which fits into the limit on most wikis, frwiki just had larger page IDs. Maybe enwiki would have even larger ones, but that's just a 20% increase. Technically, the query could be arbitrarily long due to the template part, but in practice there are few templates and their names aren't long.
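As a rough check of the arithmetic above, here is a minimal Python sketch that estimates the worst-case query length for 250 page IDs of a given digit count. The function name `worst_case_length` is hypothetical, the limit value is the one from the error message in this task, and counting characters (rather than bytes) is an assumption about how the limit is measured.

```python
# Hypothetical helper: not part of GrowthExperiments or CirrusSearch.
HARD_LIMIT = 2048  # value reported in the error message "(2122 > 2048)"


def worst_case_length(template_part: str, id_digits: int, page_count: int = 250) -> int:
    """Length of '<template_part> pageid:<id>|<id>|...' when every ID has id_digits digits."""
    largest_id = str(10 ** id_digits - 1)  # e.g. "9999999" for 7 digits
    pageid_part = "pageid:" + "|".join([largest_id] * page_count)
    return len(template_part + " " + pageid_part)


template_part = 'hastemplate:"À_wikifier"'  # template filter from the query quoted in this task
seven = worst_case_length(template_part, 7)
eight = worst_case_length(template_part, 8)
print(seven, seven <= HARD_LIMIT)
print(eight, eight <= HARD_LIMIT)
```

Under these assumptions, 250 seven-digit IDs stay just under the limit while 250 eight-digit IDs exceed it, which is consistent with a wiki whose page IDs have mostly reached eight digits failing first.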

> For a non-CirrusSearch fix, I think we should look at implementing T260522: Optimize number of results requested from API.

Yeah, that's something we wanted to do anyway, it helps with a number of other performance aspects.

Another possibility would be to chunk the pageid: into multiple parts if it surpasses the query length limit. That seems easy to do, but it would make things slower.
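The chunking idea could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `chunk_page_ids` is a hypothetical name, the limit is measured in characters, and each resulting string would be sent as a separate search query (which is why it would be slower).

```python
# Hypothetical helper: illustrates splitting the pageid: list, not actual GrowthExperiments code.
def chunk_page_ids(template_part, page_ids, limit=2048):
    """Split page_ids into several '<template_part> pageid:a|b|c' queries,
    each at most `limit` characters long (assuming one ID always fits)."""
    prefix = template_part + " pageid:"
    queries = []
    current = []               # IDs collected for the query being built
    current_len = len(prefix)  # length of the query being built so far
    for page_id in page_ids:
        token = str(page_id)
        extra = len(token) + (1 if current else 0)  # +1 for the '|' separator
        if current and current_len + extra > limit:
            # Adding this ID would exceed the limit: close the current query.
            queries.append(prefix + "|".join(current))
            current = []
            current_len = len(prefix)
            extra = len(token)
        current.append(token)
        current_len += extra
    if current:
        queries.append(prefix + "|".join(current))
    return queries


ids = [12345678] * 250  # made-up eight-digit IDs, for illustration only
queries = chunk_page_ids('hastemplate:"À_wikifier"', ids)
print(len(queries), [len(q) for q in queries])
```

The results of the individual queries would then have to be merged by the caller.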

I wonder if it would be hard to issue multiple CirrusSearch queries in parallel. That would help in other places as well.

> Thanks @dcausse, that would be plenty. The queries are like hastemplate:<list of templates> pageid:<list of pages>, with up to 250 pages, which fits into the limit on most wikis, frwiki just had larger page IDs. Maybe enwiki would have even larger ones, but that's just a 20% increase. Technically, the query could be arbitrarily long due to the template part, but in practice there are few templates and their names aren't long.

I'll ship a quick patch to bump this up.

> I wonder if it would be hard to issue multiple CirrusSearch queries in parallel. That would help in other places as well.

Internally, Cirrus can send multiple queries to Elasticsearch. Sadly, I see no easy way to expose this without a significant refactoring.

Change 656459 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] Bump hard limit on query length

https://gerrit.wikimedia.org/r/656459

Change 656459 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Bump hard limit on query length

https://gerrit.wikimedia.org/r/656459

Change 656370 abandoned by Gergő Tisza:
[mediawiki/extensions/GrowthExperiments@master] Temporarily disable cache revalidation

Reason:
Thanks to I376a373853f355 this is not needed in master anymore.

https://gerrit.wikimedia.org/r/656370

Thanks all for the quick help.

Etonkovidova claimed this task.