Wikimedia Commons deepcategory searches return unexpected results for categories with spaces in name
Closed, Duplicate · Public · BUG REPORT

Description

Steps to replicate the issue:

On Wikimedia Commons, do a deepcategory search on a category with a space in the name. You can perform the search on either Special:MediaSearch or directly via the API.

Example API call for a deepcat search on Category:Wikipedia_23_cakes:
https://commons.wikimedia.org/w/api.php?action=query&format=json&uselang=en&generator=search&gsrsearch=filetype:bitmap%20deepcategory:%22Wikipedia%2023%20cakes%22&gsrlimit=40&gsroffset=0&gsrinfo=totalhits|suggestion&gsrprop=size|wordcount|timestamp|snippet&prop=info|imageinfo|entityterms&inprop=url&gsrnamespace=6&iiprop=url|size|mime&iiurlheight=180&wbetterms=label
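
For anyone who wants to reproduce this from a script, here is a minimal sketch of how the URL above can be assembled so that the spaces and quotes in the category name are percent-encoded. The function name `build_deepcat_url` and the reduced parameter set are my own choices for illustration; only the endpoint and the parameter names are taken from the call above.

```python
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_deepcat_url(category: str, limit: int = 40) -> str:
    """Build a deepcategory search URL for bitmap files (hypothetical helper)."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "search",
        # The category name is wrapped in double quotes because it contains spaces.
        "gsrsearch": f'filetype:bitmap deepcategory:"{category}"',
        "gsrlimit": limit,
        "gsrnamespace": 6,
        "gsrinfo": "totalhits|suggestion",
    }
    # quote_via=quote encodes spaces as %20 instead of "+";
    # safe=":|" leaves colons and pipes readable, as in the URL above.
    return f"{API_ENDPOINT}?{urlencode(params, safe=':|', quote_via=quote)}"


url = build_deepcat_url("Wikipedia 23 cakes")
print(url)
```

The resulting `gsrsearch` value is encoded the same way as in the URL I pasted, so the request itself should not be where the space gets lost.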

Each time I make the API call, I get back one of two possible responses, both of which have issues.

What happens?:

I most frequently get a 200 back with this body (which is incorrect, as there are results in the category):
{"batchcomplete":"","query":{"searchinfo":{"totalhits":0}}}

If I retry a few times, I eventually do get results back, but with a warning that the deepcat search failed:

{
   "batchcomplete":"",
   "warnings":{
      "search":{
         "*":"Deep category search SPARQL query failed"
      }
   },
   "query":{
      "searchinfo":{
         "totalhits":30
      },
      "pages":{
         "144728426":{
            "pageid":144728426,
...

I tried hitting this search 10 times in a row as a test. These were the results:

  • 0 results
  • 0 results
  • 0 results
  • Results but deepcat failed
  • 0 results
  • Results but deepcat failed
  • 0 results
  • 0 results
  • Results but deepcat failed
  • 0 results
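
The two response shapes can also be told apart in a script. Below is a rough sketch, assuming the response body has already been parsed from JSON; the function name `classify_response` and the outcome labels are mine, and the sample bodies are the two bodies shown above with the truncated `pages` part left out.

```python
from collections import Counter

SPARQL_WARNING = "Deep category search SPARQL query failed"

# The two bodies from this report ("pages" omitted, as it is truncated above).
EMPTY = {"batchcomplete": "", "query": {"searchinfo": {"totalhits": 0}}}
FAILED = {
    "batchcomplete": "",
    "warnings": {"search": {"*": SPARQL_WARNING}},
    "query": {"searchinfo": {"totalhits": 30}},
}


def classify_response(body: dict) -> str:
    """Label a parsed API response with one of the outcomes listed above."""
    warning = body.get("warnings", {}).get("search", {}).get("*", "")
    if SPARQL_WARNING in warning:
        return "results but deepcat failed"
    total = body.get("query", {}).get("searchinfo", {}).get("totalhits", 0)
    if total == 0:
        return "0 results"
    return "ok"


# Replay the ten attempts in the order listed above and count each outcome.
attempts = [EMPTY, EMPTY, EMPTY, FAILED, EMPTY, FAILED, EMPTY, EMPTY, FAILED, EMPTY]
tally = Counter(classify_response(body) for body in attempts)
print(tally)
```

In my ten attempts no response ever fell into the "ok" bucket, which is the outcome I would expect from a working deepcat search.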

What should have happened instead?:

Results returned without the deep category search SPARQL query failing.

Other information (browser name/version, screenshots, etc.):

This issue has been brought up on the Wikimedia Commons technical village pump here.

Event Timeline

Restricted Application added a subscriber: Aklapper.
SuperHamster renamed this task from Wikimedia Commons deepcategory searches returning unexpected results to Wikimedia Commons deepcategory searches return unexpected results for categories with spaces in name. · Jan 16 2026, 2:17 AM
SuperHamster updated the task description.

We are experiencing a similar issue (maybe?) on Nowiki. When querying e.g. deepcategory:"Svenske filmkomedier" or deepcategory:"Dataspill etter tema", the results are limited to the few articles in subcategories that only have one word (no space). Is this relevant?

Is this problem also the reason why this deepcat scan shows, as far as I can see, very incomplete results, and why this deepcat scan shows no results?

Looks like deepcat searches are working correctly again :) All linked searches above are now returning the expected results.

Indeed, this seems solved. However, the following still doesn't work, and I meant to create a separate issue about it just before this bug appeared. Is this a separate problem or the same one?

(This example is used to populate 2020s_maps_of_the_world_in_unidentified_languages. That category is how world maps, starting with at least the most relevant ones, get categorized by language, e.g. to better enable translations and, hopefully, eventually to give better search results that don't show maps in some niche language I can't read at the top when it isn't among my configured languages.)

All of the added cats are subcats of the cats in the prior scan.

Can you name a couple of categories that you think should be treated differently?

EBernhardson subscribed.

Go ahead and create a dedicated ticket. I'm not sure at first glance what's going on, but we can triage it.

Today I found a bug that occurs when the category name contains an "&".
Example: a deepcat search for deepcat:"Pellerin & Cie" is turned into this search: deepcategory:"Pellerin_ It returns no matches because the folder name is wrong: it is truncated at the "&".
I know there must be a patch for this, since my forked button that automates " " -deepcat:" " searches (such as "Pellerin & Cie" -deepcat:"Pellerin_&_Cie") has already had such a patch applied for some time and works well at this point. The fork is hosted and patched by User Samwilson.
I hope the regular deepcat function can be patched for this issue without breaking the opposite -deepcat searches.
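
I don't know where exactly the truncation happens, but one common way to get this symptom is a search term being placed into a URL query string without percent-encoding, so that the "&" is read as a parameter separator. A small sketch of that mechanism follows; the parameter name `search` and the helper `search_param_as_received` are only assumptions for illustration, not taken from the actual deepcat code.

```python
from urllib.parse import parse_qs, quote


def search_param_as_received(raw_query: str) -> str:
    """Return the value of the (assumed) "search" parameter after query parsing."""
    return parse_qs(raw_query)["search"][0]


term = 'deepcat:"Pellerin & Cie"'

# Unencoded: the "&" splits the query string, so everything after it is lost.
unescaped = search_param_as_received("search=" + term)

# Percent-encoded: the "&" becomes %26 and the full term survives.
escaped = search_param_as_received("search=" + quote(term, safe=""))

print(repr(unescaped))
print(repr(escaped))
```

If this is what is going on, encoding the category name before it is put into the URL would keep the "&" as part of the name.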

This doesn't seem to be related to the current bug. Could you open a separate ticket about your issue? Please include information such as which wiki this applies to.