
[L] MediaSearch: Trailing whitespace causes mismatch in the number of search results
Closed, Resolved · Public

Description

Image size filter 'Any' returns more images than "Small", "Medium", and "Large" filters combined when File type filter is used.

  1. On Special:MediaSearch, enter "Bristlecone pine" as the search term (no quotes). The filters should be at their defaults ("All image sizes"), and File type should be "jpg".
  2. The result set has infinite scroll.
  3. Select Image size filters (jpg is unchanged)
  4. Change the file type to tiff:
    • Select "All image sizes" - 22 results
    • Select "Small" - 0 results
    • Select "Medium" - 0 results
    • Select "Large" - 0 results

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 27 2020, 12:00 AM
AnneT added a comment. · Aug 27 2020, 8:06 PM

Weird, I just tried this on production and am not seeing the same results; there are many results for "Bristlecone pine" when filtered by jpg or tiff plus image size. I was able to see the specific file you mentioned when filtered by "small" and "jpg".

Numbers for what I saw:

Bristlecone pine search with jpg filter:

Bristlecone pine search with tiff filter:

  • "All image sizes" - 22 results
  • "Small" - 0 results
  • "Medium" - 0 results
  • "Large" - 22 results

I'm wondering if I missed a step in the process that caused the issue to occur for you @Etonkovidova....

Thanks, @AnneT - yes, indeed it was a mystery:)
Only after spending some time re-checking did I figure it out, I think: a trailing space or some punctuation at the end of the search term dramatically changes the returned results.

  1. Enter Bristlecone pine (either by typing or by pasting it into the search box) and add one trailing whitespace.
  2. Perform search for "Small" + "jpg" - 5 results will be returned:

https://commons.wikimedia.org/w/api.php?action=query&format=json&uselang=en&generator=mediasearch&prop=info%7Cimageinfo%7Cpageterms&inprop=url&gmssearch=Bristlecone%20pine%20&iiprop=url%7Csize%7Cmime&iiurlheight=180&wbptterms=label&gmsrawsearch=filetype%3Abitmap%7Cdrawing%20filemime%3Ajpeg%20fileres%3A%3C500&gmslimit=40&gmscontinue=

My steps:

  1. Type/paste Bristlecone pine with no trailing whitespace
  2. Perform search for "Small" + "jpg" - 0 results will be returned:

https://commons.wikimedia.org/w/api.php?action=query&format=json&uselang=en&generator=mediasearch&prop=info%7Cimageinfo%7Cpageterms&inprop=url&gmssearch=Bristlecone%20pine&iiprop=url%7Csize%7Cmime&iiurlheight=180&wbptterms=label&gmsrawsearch=filetype%3Abitmap%7Cdrawing%20filemime%3Ajpeg%20fileres%3A%3C500&gmslimit=40&gmscontinue=
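The two requests above differ only in the gmssearch value. A minimal Python sketch for rebuilding both URLs and counting the results side by side is given here. The parameter names and values are copied from the URLs above. The function names, the User-Agent string, and the assumption that results appear under "query" → "pages" in the JSON response are illustrative, not taken from the MediaSearch code.

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"

# Raw filter string decoded from the URLs above ("Small" + "jpg").
SMALL_JPG_FILTER = "filetype:bitmap|drawing filemime:jpeg fileres:<500"


def build_search_url(term, raw_filter=SMALL_JPG_FILTER):
    """Build the mediasearch API URL for `term`, preserving its whitespace."""
    params = {
        "action": "query",
        "format": "json",
        "uselang": "en",
        "generator": "mediasearch",
        "prop": "info|imageinfo|pageterms",
        "inprop": "url",
        "gmssearch": term,  # deliberately not stripped
        "iiprop": "url|size|mime",
        "iiurlheight": "180",
        "wbptterms": "label",
        "gmsrawsearch": raw_filter,
        "gmslimit": "40",
        "gmscontinue": "",
    }
    # quote_via=quote encodes spaces as %20 (not "+"), matching the URLs above.
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return f"{API_ENDPOINT}?{query}"


def count_results(term):
    """Fetch the first page of results and count them (network request).

    Assumption: generator results are listed under response["query"]["pages"],
    and an empty result set has no "query" key at all.
    """
    request = urllib.request.Request(
        build_search_url(term),
        # Hypothetical User-Agent; replace with one identifying your script.
        headers={"User-Agent": "trailing-whitespace-repro/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return len(data.get("query", {}).get("pages", {}))
```

Calling count_results("Bristlecone pine ") and count_results("Bristlecone pine") one after the other should show whether the discrepancy still occurs. Each call counts only the first page of up to 40 results.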

More:
Adding a punctuation mark to the end of Bristlecone pine (a dot, a colon, or a semi-colon) produces the same behavior as a trailing whitespace: "Small" + "jpg" returns 5 results. An animated gif in the original comment illustrates this.
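One way to check all of the reported variants in a single pass is sketched here in Python. The suffix list comes from the comment above. The helper names and the count_fn parameter are hypothetical: count_fn stands for any callable that takes a search string and returns a result count, for example a function that queries the API.

```python
# Suffixes reported in this task: none, trailing space, dot, colon, semi-colon.
SUFFIXES = ["", " ", ".", ":", ";"]


def query_variants(term):
    """Return the base term followed by each reported trailing variant."""
    base = term.rstrip()  # start from a term with no trailing whitespace
    return [base + suffix for suffix in SUFFIXES]


def compare_counts(term, count_fn):
    """Map each variant of `term` to the count reported by `count_fn`.

    `count_fn` is a hypothetical callable (search string -> int).
    """
    return {variant: count_fn(variant) for variant in query_variants(term)}
```

If the counts differ between the bare term and any suffixed variant, the mismatch described in this task is present.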

I checked the case with tiff - yes, the same reason for the discrepancy in the results. I guess it makes this bug much less severe (and not that interesting). I've updated the title to make it more specific.

Etonkovidova renamed this task from MediaSearch: Mismatch in the number of search results for Image size/File types filters to MediaSearch: Trailing whitespace causes mismatch in the number of search results. · Aug 28 2020, 1:46 AM
CBogen added a comment. · Sep 9 2020, 4:36 PM

Closing because we cannot reproduce in the UI. @Etonkovidova if there are other steps to reproduce, please reopen, thanks!

CBogen closed this task as Invalid. · Sep 9 2020, 4:36 PM
CBogen reopened this task as Open. · Sep 9 2020, 4:42 PM

Reopening because now we can reproduce...seems to be an inconsistent bug.

CBogen renamed this task from MediaSearch: Trailing whitespace causes mismatch in the number of search results to [L] MediaSearch: Trailing whitespace causes mismatch in the number of search results. · Sep 9 2020, 4:43 PM
Ramsey-WMF closed this task as Resolved. · Sep 29 2020, 3:14 PM
Ramsey-WMF claimed this task.

We can't reproduce this anymore so something got fixed (or the bug is intermittent). Will reopen if it pops up again.