
When searching by keyword, results sorted by relevance should prioritize family names in the title: please improve search results for articles with DEFAULTSORT
Closed, Declined · Public · Feature

Description

Feature summary (what you would like to be able to do and where):
Originally posted on the technical village pump of frwiki:

Why doesn't the article Edmond Saintonge appear when you search for the keyword "Saintonge"?

  • Currently, the drop-down list of the search form doesn't suggest results matching family names.
  • Results in the drop-down list are different from results on the Special Search page (sorted by relevance).
  • When searching by keyword, results sorted by relevance should prioritize family names in the title
  • Consider standardizing the suggested results of the two lists (see case 2)
  • Consider using DEFAULTSORT to show relevant results too (see case 3)

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):
Case 1 (with redirect):

  • go to enwiki and fill in the search form, for example with the keyword "Rossini" (in this example, you are a user searching for Gioachino Rossini)
  • the most famous "Rossini" is "Gioachino Rossini", who is correctly the first result
  • it is the first result because of the redirect "Rossini"
  • you easily found what you were searching for
Screenshot: example (one click to get my result)

Case 2 (with disambiguation and redirects):

  • go to frwiki and fill in the search form, for example with the keyword "Bonaparte" (in this example, you are a user searching for Napoleon)
  • the most famous "Bonaparte" is "Napoleon", who is not shown among the most relevant results in the drop-down list; the disambiguation page is the first result instead
  • in the drop-down list, click on the link "rechercher les pages contenant Bonaparte" ("search for pages containing Bonaparte"), which leads to the Special Search page
  • on the Special Search page, you can now see the list of pages with Bonaparte as a family name (sorted by relevance): the redirect from Napoléon Bonaparte is the second suggestion
  • the first suggestion is the redirect of the keyword "Bonaparte", which is correct
  • but 1) you can't see the disambiguation page here, and 2) you have found what you were searching for, but only after two clicks and two lists
Screenshots:
  • Drop-down list (I don't find my result)
  • Special Search page (two clicks to get it)

Case 3 (without disambiguation nor redirects, but with DEFAULTSORT):

  • go to frwiki and fill in the search form, for example with the keyword "Saintonge" (in this example, you are searching for a person less famous than Rossini or Napoleon, with the family name "Saintonge")
  • in the drop-down list, only results beginning with the keyword appear, so your result is not there
  • so, in the drop-down list, click on the link "rechercher les pages contenant Saintonge" ("search for pages containing Saintonge"), which leads to the Special Search page
  • on the Special Search page (sorted by relevance), you have to scroll down a lot before finding the person with the family name "Saintonge" on the second page of results, displayed after results such as "Alphonse de Poitiers" and "Angoumois", which don't have the keyword in the title
Screenshots:
  • Drop-down list (I don't find my result)
  • Special Search page, page 1 (I don't find my result)
  • Special Search page, page 2 (3 clicks to get it)

Benefits (why should this be implemented?):
Biographies are an important part of the articles on Wikipedia editions. Readers like searching for people on Wikipedia, so this may concern a lot of people.
Using DEFAULTSORT for relevant results should improve the process of searching for biographies using family names as keywords (as redirects and disambiguation pages already do).
Currently, different search forms give different results to users. This can confuse users who have no idea that different enhanced search features exist (Vue or CirrusSearch).
When possible, please consider making the results of the different lists consistent; this may help people understand how to fill in the forms and build habits.

Event Timeline

Thanks for the detailed report!
Adding DEFAULTSORT to autocomplete searches is a feature that we can enable on a per-wiki basis. Because of the way this tag is used, it can't be enabled on every wiki without prior evaluation (see T145427#3515817). This feature was first enabled on Mongolian Wikipedia a couple of weeks ago (see T327878). If this is something the frwiki community would like to experiment with, we could enable it.
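
To illustrate what this option would change for Case 3, here is a minimal sketch of prefix matching with and without the DEFAULTSORT key. It is not how CirrusSearch implements autocomplete; the `Page` class, the `suggest` function, and the sample sort key "Saintonge, Edmond" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    title: str
    # Value of the page's DEFAULTSORT key, if it has one (assumed sample data).
    defaultsort: Optional[str] = None


def suggest(pages: List[Page], prefix: str, use_defaultsort: bool = False) -> List[str]:
    """Return titles whose title (or, optionally, DEFAULTSORT key) starts with prefix."""
    needle = prefix.casefold()
    hits = []
    for page in pages:
        keys = [page.title]
        if use_defaultsort and page.defaultsort:
            keys.append(page.defaultsort)
        # A page is suggested if any of its keys begins with the typed text.
        if any(key.casefold().startswith(needle) for key in keys):
            hits.append(page.title)
    return hits


PAGES = [
    Page("Saintonge"),
    Page("Edmond Saintonge", defaultsort="Saintonge, Edmond"),
    Page("Angoumois"),
]

print(suggest(PAGES, "Saintonge"))                        # ['Saintonge']
print(suggest(PAGES, "Saintonge", use_defaultsort=True))  # ['Saintonge', 'Edmond Saintonge']
```

With title-only matching, "Edmond Saintonge" can never appear for the prefix "Saintonge"; matching on the sort key as well is what lets the family name surface it.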

Using DEFAULTSORT in Special:Search results is sadly not something we can enable easily without revisiting how we configure the Elasticsearch schema and how we train/tune the search criteria.

Regarding

Actually, different search forms give different results to user. This can confuse users who have no idea that different enhanced search features exist (Vue or CirrusSearch).

If you mean autocomplete vs Special:Search, these two search systems address very different purposes and I don't think they should necessarily display the same result set.

If this is something the frwiki community would like to experiment with we could enable it.

I'm opening a new discussion on frwiki to ask the community.

Another note from my conversation with @Patafisik_WMF: There are a lot of biographies on Wikipedia, so having Search prioritize names in some ways is something that might make sense. It is not clear how many searches are about people / names (as opposed to how many articles are about people), so this might need some more investigation.

For what it's worth, that also seems to align with Aisha's WDQS subgraph query analysis, where the human subgraph was the most queried one.
As mentioned before, while it makes sense given this conversation to boost the ranking of people/names, it will come at an unknown cost of having to push other results further down: when searching for 'Houston', should Whitney Houston outrank the city of Houston, TX? Maybe (they're pretty close). Should Marques Houston? Seems less likely that he is more relevant than the metropolitan area to most searches.
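
The trade-off can be sketched with a toy ranking function. The base scores below are invented purely for illustration and the multiplicative `name_boost` is an assumption, not the way the production ranking works; the point is only that a boost strong enough to help lesser-known people also starts to push the city down.

```python
from typing import List, Tuple

# (title, hypothetical base relevance score, is the page about a person?)
Candidate = Tuple[str, float, bool]


def rank(candidates: List[Candidate], name_boost: float = 1.0) -> List[str]:
    """Return titles ordered by score, multiplying person pages by name_boost."""
    def score(candidate: Candidate) -> float:
        _title, base, is_person = candidate
        return base * (name_boost if is_person else 1.0)

    return [c[0] for c in sorted(candidates, key=score, reverse=True)]


# Invented scores: the city slightly ahead of Whitney Houston, both well ahead
# of Marques Houston.
HOUSTON = [
    ("Houston", 10.0, False),
    ("Whitney Houston", 9.0, True),
    ("Marques Houston", 6.0, True),
]

print(rank(HOUSTON))                  # ['Houston', 'Whitney Houston', 'Marques Houston']
print(rank(HOUSTON, name_boost=1.2))  # ['Whitney Houston', 'Houston', 'Marques Houston']
print(rank(HOUSTON, name_boost=2.0))  # ['Whitney Houston', 'Marques Houston', 'Houston']
```

A mild boost only swaps the two close results, while a strong one ranks both people above the city, which is the unknown cost described above.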

Do we factor pageviews into MLR (I remember we do look at page popularity)? If people looking for Saintonge are unable to find it with on-wiki search, but are still getting there via external search, we would expect that page's pageviews to still be relatively high, so the ranking would already take into account the fact that people are looking for articles about people.

It looks like the discussion on https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/13_mars_2023#Recherche_avanc%C3%A9e_avec_DEFAULTSORT_? isn't getting much traction.

The Search Platform team feels that the right way to address this should not be about tweaking the ranking, but about providing ways to refine search along the way (faceted search for example).

Let's close this for now. If you feel strongly otherwise, please re-open.

Not to reopen the task, just leaving here a comment posted on itwiki (about searching for people on itwiki):

I do this kind of research using Google "first name last name" site:it.wikipedia.org
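
For reference, the query pattern from that comment can be built programmatically. This is a minimal sketch; the helper name `site_search_url` and the sample name are assumptions chosen for the example.

```python
from urllib.parse import urlencode


def site_search_url(full_name: str, wiki_host: str = "it.wikipedia.org") -> str:
    """Build a Google search URL for an exact-phrase name restricted to one wiki."""
    # Quotes request an exact phrase; site: restricts results to the given host.
    query = f'"{full_name}" site:{wiki_host}'
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("Gioachino Rossini"))
```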