
Improve search result order
Closed, DeclinedPublic

Description

Search results ought to take article size/quality into account. This would make Wikipedia more useful to users.
Example: if you search for “Henry football” or “Henry footballer”, the famous French player currently appears far down the list (#8 or #21), and many lesser-known or minor articles are displayed first. The Thierry Henry article has 100x the content of some of the others.

Steps to reproduce example:

  1. Go to https://en.wikipedia.org/w/index.php?title=Main_Page (desktop).
  2. Type “Henry football” in the search box.
  3. Click the “containing...” option that displays.
  4. Notice the “Thierry Henry” high quality article is ranked below many minor, small articles.
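
The steps above can also be approximated programmatically. The following is a minimal sketch, assuming the standard MediaWiki search API (`action=query`, `list=search`) on en.wikipedia.org; the helper names `build_search_url` and `rank_of` are hypothetical, and results returned by the API may not match the order shown by the “containing...” option on the desktop site exactly.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(query, limit=25):
    """Build a full-text search URL for the MediaWiki API (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def rank_of(title, titles):
    """Return the 1-based position of `title` in `titles`, or None if absent."""
    for position, candidate in enumerate(titles, start=1):
        if candidate == title:
            return position
    return None
```

Fetching the URL from `build_search_url("Henry football")` and passing the titles found under `query.search` in the JSON response to `rank_of("Thierry Henry", titles)` would give the rank being discussed in this task.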

Event Timeline

Aklapper changed the task status from Open to Stalled. Aug 24 2018, 9:27 AM

Hi @Zojj, thanks for taking the time to report this!
Unfortunately this report lacks some information. Please add a more complete description to this report.
That means a clear list of specific steps to reproduce the situation, so that nobody needs to guess how you performed each step (small details sometimes matter), a description of the actual and expected results after performing those steps, and a link to a public website where the issue can be seen.
You can edit the task description by clicking Edit Task.
Ideally, exact and clear steps to reproduce should allow any other person to follow these steps (without having to interpret those steps) and see the same results. Problems that others can reliably reproduce can get fixed faster. Thanks!

Aklapper changed the task status from Stalled to Open. Aug 24 2018, 12:09 PM
Aklapper added a project: CirrusSearch.

“Thierry Henry” is the 8th result here.

I don't think that "length" or "excellence" also necessarily implies "relevance"?

Size is already taken into account by the scoring mechanism (see text_word_count in P7481). Unfortunately, this does not help enough to move Thierry to the top.
I don't see a particular feature that we don't yet use that would push this page to the top.
Perhaps the football/soccer/American football ambiguity is part of the problem: Thierry is ranked #2 for “henry soccer”.
Thanks for reporting this, but I'm not sure we can do much for this particular query.
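
To illustrate why a size signal alone may not be enough to reorder results, here is a toy sketch. It is not CirrusSearch's actual formula: the functions `saturate` and `toy_score`, the weight, the constant `k`, and all input numbers are invented for illustration. It only assumes that a size feature such as text_word_count is bounded and added to a text-match score.

```python
def saturate(value, k):
    """Bounded signal in [0, 1): grows with `value` but never reaches 1."""
    return value / (value + k)


def toy_score(text_match, word_count, size_weight=0.5, k=1000):
    """Hypothetical score: text match plus a capped contribution from size."""
    return text_match + size_weight * saturate(word_count, k)


# Invented numbers: a short page that matches the query text well,
# and a page 100x longer that matches the query text less well.
short_page = toy_score(text_match=3.0, word_count=200)
long_page = toy_score(text_match=2.0, word_count=20000)

print(short_page > long_page)
```

Under these assumptions the size term can add at most `size_weight` to a score, so a page that is 100x longer still ranks lower when its text match is weaker by more than that amount.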

Thanks for the thoughts. I see that on the mobile search, the suggestions show a snippet of the article, e.g. for Thierry Henry it shows “French association football player”. In the particular search for “henry football”, I would think there is a way to prioritize on the snippet? Although I have no idea where this text is coming from...!
Also dcausse, I see “popularity” in that P7481 list, does it affect search order?

Zojj updated the task description.

> Although I have no idea where this text is coming from...!

Wikidata.

> does it affect search order?

For general info, see https://www.mediawiki.org/wiki/Extension:CirrusSearch/Scoring

EBjune triaged this task as Low priority.
EBjune subscribed.

As dcausse explains above, there's nothing we can do to improve the results of this particular query, so declining for now.