
[Search] Bug: opensearch API doesn't default to resolving redirects
Open, Needs Triage, Public

Description

Search results show the title of the redirect page, not the actual page. These should be resolved by the API

Opensearch for autocomplete applies to current-Vector not new-Vector. (Everything else about this ticket is perfectly accurate.)

Current-Vector runs this API query for search autocomplete: https://en.wikipedia.org/w/api.php?action=opensearch&format=jsonfm&formatversion=2&search=grand%20buda&namespace=0&limit=10

If it was changed to add the redirects=resolve parameter like so: https://en.wikipedia.org/w/api.php?action=opensearch&format=jsonfm&formatversion=2&search=grand%20buda&namespace=0&limit=10&redirects=resolve

What I see: Screen Shot 2021-11-29 at 11.43.56 AM.png (220×634 px, 26 KB)
What I expect to see: Screen Shot 2021-11-29 at 11.44.58 AM.png (310×426 px, 36 KB)

Event Timeline


@DLynch can you add details either in a comment or the description regarding the nature of this issue, as well as the explanation of the search query you ran to demonstrate the issue?

alexhollender renamed this task from Search bug: opensearch API doesn't default to resolving redirects to [Search] Bug: opensearch API doesn't default to resolving redirects. Nov 22 2021, 10:29 PM

Opensearch for autocomplete applies to current-Vector not new-Vector. (Everything else about this ticket is perfectly accurate.)

Current-Vector runs this API query for search autocomplete: https://en.wikipedia.org/w/api.php?action=opensearch&format=jsonfm&formatversion=2&search=grand%20buda&namespace=0&limit=10

image.png (302×894 px, 34 KB)

If it was changed to add the redirects=resolve parameter like so: https://en.wikipedia.org/w/api.php?action=opensearch&format=jsonfm&formatversion=2&search=grand%20buda&namespace=0&limit=10&redirects=resolve

image.png (304×894 px, 35 KB)

The default could be changed, but the query the searchbox uses could also be amended to just include that parameter.
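To make the second option concrete, here is a minimal sketch of how a caller could build the opensearch query with and without redirects=resolve. The helper name opensearch_url and its resolve_redirects flag are hypothetical, made up for illustration; the parameters mirror the URLs quoted above, except that format=json is assumed instead of the debugging format jsonfm. This is not the code the searchbox actually runs.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs quoted in this ticket.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def opensearch_url(search, resolve_redirects=False, limit=10):
    """Build an opensearch autocomplete URL (hypothetical helper)."""
    params = {
        "action": "opensearch",
        "format": "json",  # assumption: json rather than jsonfm
        "formatversion": "2",
        "search": search,
        "namespace": "0",
        "limit": str(limit),
    }
    if resolve_redirects:
        # The one-parameter change proposed in the comment above.
        params["redirects"] = "resolve"
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(opensearch_url("grand buda"))
    print(opensearch_url("grand buda", resolve_redirects=True))
```

Keeping the flag on the caller's side leaves the API default untouched, so other consumers of opensearch would see no change.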

New-Vector uses the REST search API which has no ability (that I can see) to be told to follow redirects: https://www.mediawiki.org/wiki/API:REST_API/Reference#Autocomplete_page_title

Change 740695 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/core@master] Autocomplete search follow redirects

https://gerrit.wikimedia.org/r/740695

For discussion, untested, that patch is probably what'd be needed to make current-Vector follow redirects for autocomplete. I scoped it such that it won't spill over into any uses of SearchInputWidget, though it could be made even simpler if that's not a problem.

It won't do anything for new-Vector, which will require entirely separate development probably by the search team.

We're talking about 2 different bugs here. I've spun out a new ticket for the one in the description since the autocomplete search is not used in new Vector.
I've opened a bug for the REST API at T296671.

Is this related to this significant problem with categorization we started noticing at Commons? - https://commons.wikimedia.org/wiki/Commons:Village_pump#Category_autocomplete_is_now_case_sensitive

@DLynch @alexhollender , just to make sure I understand, the current behavior in the current Vector resolves redirects, but the new Vector does not? Is this ticket to bring behavior back to parity (this is what it seems like to me)? or is it asking for net new behavior? In the former case, it makes sense to fix this regression.

In the latter case, I just want to make sure that we are ok with potentially weird edge case behavior from resolving redirects for autocompletes. Re-posted examples from comments on connected ticket: https://docs.google.com/document/d/1HL54H8yYlGADEwX6mCh_q6Mdrfu1bxU0zsA95Gm0fmY/edit?usp=sharing

Can also try
Thelma Riley → Ozzy Osbourne
Corn → Maize
→ ↑ → [redirects to] Tsk Tsk Tsk
The Free Encyclopedia → Wikipedia

@DLynch @alexhollender , just to make sure I understand, the current behavior in the current Vector resolves redirects, but the new Vector does not? Is this ticket to bring behavior back to parity (this is what it seems like to me)? or is it asking for net new behavior? In the former case, it makes sense to fix this regression.

As far as I can tell, this issue is present in legacy Vector, new Vector, and all other skins. I didn't check the portal previously, but as you point out in the Google doc, the portal does not seem to have the issue (which seems like good news 🙂).

Screen Shot 2021-12-06 at 10.02.49 AM.png (362×676 px, 63 KB)

To confirm, the current behavior of the header search autocomplete does not resolve redirects regardless of skin. (Unless some skin out there decided to not rely on the automatic behavior from core's searchSuggest -- since new-Vector has reimplemented that, it's possible other skins did as well 🤷🏻‍♂️.)

This ticket requests a change in behavior, on the theory that showing the page you'll actually be taken to is probably-good. (As does T296671, it's just that it requires a more-complicated change to a separate system for new-Vector's reimplementation of search autocomplete to be affected.)

Is this related to this significant problem with categorization we started noticing at Commons? - https://commons.wikimedia.org/wiki/Commons:Village_pump#Category_autocomplete_is_now_case_sensitive

No it is unrelated to this, the broken behavior you've seen is most likely related to another issue (T295478 and followups T295705, T296897). It is being fully addressed on the search cluster but the problem you raised should be fixed by now.

Moving this ticket to watching/waiting until it becomes higher priority for desktop refresh, as per discussion with @ovasileva

Hang on, this needs a lot more consideration before implementation as it will be actively undesirable, even harmful, in some circumstances. While no user is going to be confused about seeing "A Tale of Two Cities" when searching for "Tale of Two Cities", the same is not going to be true in every case - examples:

  • Bush Twins → George W. Bush. Unless you know that this is a redirect to a section of the former president's article this is confusing and implies that George is a twin (he isn't).
  • Diaoyu Islands → Senkaku Islands. Unless you know that one is an alternate name for the other this is going to be very confusing (and given the sensitivity of place names in disputed territories there could be other issues too)
  • Cuba, Ohio → Cuba (disambiguation), Dark Angel (album) → Dark Angel. These will cause confusion as, as far as the reader is concerned, they have entered a precise search term - why is Wikipedia going to take them to an ambiguous title or a disambiguation page?

Readers will wonder what they have done wrong and how they can get where they want to go. This is bad UX.
Other problems include:

  • Making it much harder to visit the redirect page itself - e.g. to categorise it, nominate it for discussion, retarget it, overwrite it with an article, etc., especially if there is no "redirect from" note on the page.
  • Obscuring the utility of redirects - if people do not travel via a redirect there is no record of it being used, making it much harder for editors to distinguish redirects that are useful from those that are not and removing a source of information used when considering the best target for a redirect. These will make it harder for readers to find the content they are looking for.

There is definitely a benefit to hiding some redirects in the search suggestions, but not all and likely not when the redirect matches the search string. Which redirects should be hidden and which should not is not something that can be done other than by humans. See T24251 for a proposed solution to this (note that this task will not resolve that problem in all cases, so the tickets should not be merged).

Hang on, this needs a lot more consideration before implementation as it will be actively undesirable, even harmful, in some circumstances. While no user is going to be confused about seeing "A Tale of Two Cities" when searching for "Tale of Two Cities", the same is not going to be true in every case - examples:

  • Diaoyu Islands → Senkaku Islands. Unless you know that one is an alternate name for the other this is going to be very confusing (and given the sensitivity of place names in disputed territories there could be other issues too)

I agree. One suggestion that has been made before, which I like, is to include the redirect in the suggestion to make it clearer what's happening.

For example, for a query of diaoyu isl you could have a suggestion like this:

  • Senkaku Islands (from Diaoyu Islands)

Similarly:

  • corn → Maize (from Corn)
  • thelma r → Ozzy Osbourne (from Thelma Riley)
  • bush tw → George W. Bush (from Bush Twins)

(With the exact layout, formatting, and wording left as an exercise for a UX Designer.)
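As a rough illustration of the suggested labels, the sketch below formats a suggestion from a resolved target and the redirect that matched. The function name suggestion_label and the "(from ...)" wording are placeholders taken from the examples above, not an existing MediaWiki API.

```python
def suggestion_label(target, redirect=None):
    """Return the text for one autocomplete suggestion (hypothetical helper).

    target:   title of the page the reader will land on
    redirect: title of the redirect that matched the query, if any
    """
    if redirect is None or redirect == target:
        # Direct match: nothing to explain.
        return target
    return f"{target} (from {redirect})"


if __name__ == "__main__":
    print(suggestion_label("Senkaku Islands", "Diaoyu Islands"))
    print(suggestion_label("Maize", "Corn"))
    print(suggestion_label("Maize"))
```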

We already do this with full-text search results: einstien. The top results are:

  • Albert Einstein (redirect from Albert Einstien)
  • Bose–Einstein condensate (redirect from Bose-Einstien condensate)
  • Albert Einstein Memorial (redirect from Einstien Sitting Statue)

We also already ignore cases where the redirect is a substring of the article it redirects to, so searching for tale of two cities gives this:

  • A Tale of Two Cities

Not

  • A Tale of Two Cities (redirect from Tale of Two Cities)
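The substring rule described above could be sketched as follows. This only illustrates the idea; the helper name needs_redirect_note is hypothetical, and the case-insensitive comparison is an assumption, since the actual full-text search implementation is not shown in this ticket.

```python
def needs_redirect_note(redirect, target):
    """Decide whether a "(redirect from ...)" note adds information.

    Assumption: the note is skipped when the redirect title is contained
    in the target title, compared case-insensitively.
    """
    return redirect.casefold() not in target.casefold()


if __name__ == "__main__":
    # Contained in the target, so no note is needed.
    print(needs_redirect_note("Tale of Two Cities", "A Tale of Two Cities"))
    # A misspelling is not a substring, so the note is kept.
    print(needs_redirect_note("Albert Einstien", "Albert Einstein"))
```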

Which redirects should be hidden and which should not is not something that can be done other than by humans.

Automatically detecting small typos (Einstein/Einstien/Einsten) is doable for some writing systems, but not as easy for others. For example, in Chinese, a one-character difference could be equivalent to a one-word difference in English.
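One common way to approximate "small typo" is an edit-distance threshold. The sketch below uses plain Levenshtein distance, which is an assumption on my part and not something the search backend is stated to use; the names edit_distance and looks_like_typo and the threshold of 2 are illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def looks_like_typo(redirect, target, max_distance=2):
    """Hypothetical rule: treat a redirect as a typo of its target when
    the titles differ by at most max_distance edits."""
    return 0 < edit_distance(redirect, target) <= max_distance


if __name__ == "__main__":
    print(edit_distance("Einstein", "Einstien"))  # swapped letters count as two edits
    print(edit_distance("Einstein", "Einsten"))   # one missing letter
    print(looks_like_typo("Corn", "Maize"))
```

A fixed character threshold like this is exactly what does not carry over to writing systems such as Chinese, where a distance of 1 can already mean a different word.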

Also, to keep the mood light, note that there are some extreme edge cases that will test the limits of any UX design:

  • Lopado­temacho­selacho­galeo­kranio­leipsano­drim­hypo­trimmato­silphio­karabo­melito­katakechy­meno­kichl­epi­kossypho­phatto­perister­alektryon­opte­kephallio­kigklo­peleio­lagoio­siraio­baphe­tragano­pterygon (redirect from Lopadotemakhoselakhogameokranioleipsanodrimypotrimmatosilphiokarabomelitokatakekhymenokikhlepikossyphophattoperister-alektryonoptokephalliokigklopeleiolagōiosiraiobaphētraganopterýgōn)