
Enforce GrowthConfigValidation::MAX_TEMPLATES_IN_COLLECTION using CommunityConfiguration
Closed, Resolved · Public · 5 Estimated Story Points · Bug Report

Description

Background

GrowthExperiments adds a template collection feature to search, which lets users run search queries against groups of templates. The feature is used to exclude articles with infoboxes from Add Image. It is exposed to administrators through Community Configuration like this:

image.png (836×1 px, 195 KB)

Since Elasticsearch doesn't allow more than 1024 clauses in a boolean query, the total number of templates is limited. Currently, the limit is set at 500 infobox templates. Before the migration to CC2.0, GrowthExperiments enforced this restriction via its GrowthConfigValidation. However, the restriction was not carried over to Community Configuration 2.0.
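To make the missing restriction concrete, here is a minimal sketch of a save-time count check. It is written in Python purely for illustration and is not the GrowthExperiments code; the function name `check_template_count` and the placeholder template names are hypothetical, and the limit is passed in as a parameter because the exact value is discussed later in this task.

```python
from typing import Optional, Sequence


def check_template_count(templates: Sequence[str], max_templates: int) -> Optional[str]:
    """Return an error message if the list is too long, otherwise None."""
    if len(templates) > max_templates:
        return (
            f"Too many templates: {len(templates)} given, "
            f"at most {max_templates} allowed."
        )
    return None


# Hypothetical usage; 500 is the limit quoted in this task description.
LIMIT = 500
ok_list = [f"Infobox {i}" for i in range(LIMIT)]
too_long = ok_list + ["Infobox extra"]

print(check_template_count(ok_list, LIMIT))
print(check_template_count(too_long, LIMIT))
```

Rejecting the list at save time means an admin sees the error immediately, instead of the collection silently failing to exclude anything at search time.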

Problem

Currently, Special:CommunityConfiguration/GrowthSuggestedEdits allows admins to specify as many templates as they want. If they exceed the limit of 500, the template collection feature will not exclude anything. In other words, specifying more than 500 templates results in no articles being excluded from Add Image. This is the user-facing portion of the impact.

This also makes it more challenging to remove CC legacy, as a portion of GrowthConfigValidation (part of CC legacy) continues to be used (namely, the 500 constant).

Event Timeline

Restricted Application added a subscriber: Aklapper.
Urbanecm_WMF moved this task from Inbox to Estimated tasks backlog on the Growth-Team board.

Moving to Up Next, as we decided to prioritize tasks related to CC1.0 deprecation.

Michael changed the subtype of this task from "Task" to "Bug Report". Feb 17 2025, 5:21 PM
Urbanecm_WMF set the point value for this task to 5. Feb 17 2025, 5:27 PM

Sergio and I discussed this task today. Looking at the existing configuration, it turned out that one wiki, Polish Wikipedia, has more than 500 entries for this setting. Surprisingly, it has more than 500 entries in BOTH the legacy config and the new config. This is the edit that appears to have added templates to the legacy config on April 12th 2024: https://pl.wikipedia.org/w/index.php?title=MediaWiki:GrowthExperimentsConfig.json&diff=prev&oldid=73495657

That was before the migration to CommunityConfiguration in June 2024 (see T368121). So maybe the limit of 500 was already not being enforced in the legacy configuration?

Not sure where the "500" in the task description is coming from, but in code we seem to have a limit of 800, added in "Avoid references to TemplateCollectionFeature" in 2021. This would mean that the config on Polish Wikipedia is also within the limits, and we can just move forward with enforcing it in CC2.0 as well.

Change #1126557 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/GrowthExperiments@master] fix(CC2.0): enforce max-limit on InfoboxTemplates

https://gerrit.wikimedia.org/r/1126557

Change #1126557 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] fix(CC2.0): enforce max-limit on InfoboxTemplates

https://gerrit.wikimedia.org/r/1126557

Now storing more than 800 templates in the "Infobox templates"-section should fail. Not sure how feasible it is to test that...
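On testability, one option is a boundary test that generates placeholder template names and checks that exactly 800 entries pass while 801 fail. The sketch below is in Python and is only an illustration: the `maxItems`-style schema fragment, `validate_array`, and `make_templates` are all hypothetical, and the merged patch may express the limit differently.

```python
from typing import Any, Dict, List

# Hypothetical schema fragment; the real CC2.0 schema may be declared differently.
INFOBOX_TEMPLATES_SCHEMA: Dict[str, Any] = {"type": "array", "maxItems": 800}


def validate_array(value: Any, schema: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors (empty if the value is acceptable)."""
    if not isinstance(value, list):
        return ["value must be an array"]
    errors: List[str] = []
    max_items = schema.get("maxItems")
    if max_items is not None and len(value) > max_items:
        errors.append(f"array has {len(value)} items, maximum is {max_items}")
    return errors


def make_templates(count: int) -> List[str]:
    """Generate distinct placeholder template names for testing."""
    return [f"Infobox test {i}" for i in range(count)]


# Boundary cases: exactly at the limit, and one entry over it.
at_limit = validate_array(make_templates(800), INFOBOX_TEMPLATES_SCHEMA)
over_limit = validate_array(make_templates(801), INFOBOX_TEMPLATES_SCHEMA)
print(at_limit)
print(over_limit)
```

Generating the names programmatically avoids having to enter 801 templates by hand, which is the main obstacle to testing this manually.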

Etonkovidova subscribed.

"Now storing more than 800 templates in the "Infobox templates"-section should fail. Not sure how feasible it is to test that..."

I just checked for general regressions. plwiki has disabled both Add image tasks. GEInfoboxTemplates still lists lots of infobox templates though: https://pl.wikipedia.org/wiki/Specjalna:Konfiguracja_wiki/GrowthSuggestedEdits