Finding and inserting templates: Random search results
Closed, Invalid | Public | BUG REPORT

Description

Reported on https://meta.wikimedia.org/wiki/Talk:WMDE_Technical_Wishes/Finding_and_inserting_templates#Feedback_and_a_suggestion

Example from hewiki: If I enter the search word "Newspaper", the first result is indeed temp:newspaper, but afterwards it gets kind of odd. Random templates used to quote various newspapers pop up (I haven't noticed any common denominator between them), and then there's temp:cite-journal for some reason. I know it isn't something awful, but it's weird.

Ideally, {{cite news}} should be displayed prominently in such a search. Also, an infobox, maybe "infobox publication" should be part of the search results.

Event Timeline

thiemowmde subscribed.

I would like to clear our backlog of recurring tasks like this one, which can all be explained by the same misunderstanding.

First things first. I believe the user isn't talking about a literal search for the English word "newspaper". It's probably about a search for "עיתון", the Hebrew word for "newspaper". Here is the same search via Google Translate for non-Hebrew speakers.

It helps to look at these search results via Special:Search. The snippets you can see under each search result often explain why a page was found. And indeed, all these pages contain the word either in the title or in a prominent place in the text, often multiple times.
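For anyone who prefers to inspect such results outside the browser, here is a minimal sketch that builds an equivalent query against the MediaWiki Action API (`list=search` with `srprop=snippet`). It assumes the standard hewiki API endpoint and that the template namespace has ID 10. The function name and defaults are made up for illustration, and the sketch only builds the URL, it does not send the request.

```python
from urllib.parse import urlencode

# Assumption: the usual Action API endpoint of Hebrew Wikipedia.
HEWIKI_API = "https://he.wikipedia.org/w/api.php"

def build_template_search_url(term, api_base=HEWIKI_API, limit=10):
    """Return an Action API URL that searches the template namespace for `term`.

    Hypothetical helper for illustration; it is not part of any Wikimedia tool.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": 10,    # template namespace (assumed default numbering)
        "srprop": "snippet",  # ask for the text snippet that explains the match
        "srlimit": limit,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)

url = build_template_search_url("עיתון")
print(url)
```

The snippets returned for each hit should correspond to what Special:Search shows under each result.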

We aren't really talking about a bug here, but about expectation management. It appears that users expect something rather specific just by typing "עיתון". But how should CirrusSearch know? It can't read minds. All it has is the word "עיתון". That word appears on 450 pages in the template namespace. Cirrus does its best to put these 450 pages into some useful order, with some definition of "relevance" that uses a mixture of factors like how often a template is used, how often and how prominently the word appears on a page (with page name and headlines being more relevant), and so on. That's not "random" but the nature of the algorithm Cirrus uses.
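To illustrate why such a ranking is deterministic rather than random, here is a toy sketch that combines the factors named above. This is not CirrusSearch's actual formula: the weights, the `TemplatePage` type and the function names are all invented for illustration, and headline matches are left out for brevity.

```python
import math
from dataclasses import dataclass

@dataclass
class TemplatePage:
    title: str
    text: str
    transclusions: int  # how often the template is used

def toy_score(page, term, title_weight=5.0, text_weight=1.0, usage_weight=0.5):
    """Toy relevance score; the weights are arbitrary assumptions."""
    title_hits = page.title.count(term)
    text_hits = page.text.count(term)
    if title_hits == 0 and text_hits == 0:
        return 0.0  # the page does not contain the word, so it is no match
    return (
        title_weight * title_hits                 # matches in the page name count most
        + text_weight * math.log1p(text_hits)     # repeated matches help, with diminishing returns
        + usage_weight * math.log10(1 + page.transclusions)  # widely used templates get a boost
    )

def rank(pages, term):
    """Return the titles of matching pages, best score first."""
    scored = [(toy_score(p, term), p) for p in pages]
    matches = [(s, p) for s, p in scored if s > 0]
    return [p.title for s, p in sorted(matches, key=lambda sp: sp[0], reverse=True)]

pages = [
    TemplatePage("cite journal", "cite a newspaper or journal, newspaper name", 50000),
    TemplatePage("newspaper", "newspaper infobox", 100),
    TemplatePage("unrelated", "nothing to see here", 10),
]
print(rank(pages, "newspaper"))  # the title match ranks first, the heavily used template second
```

The same input always yields the same order. A page that looks unrelated to the user can still rank high simply because it mentions the word and is used very often.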

It doesn't make much sense to expect Cirrus to behave more or less like prefixsearch did before. Changing that was the whole point of T274903.

This task specifically asks for some search results.

  • Template:Cite news contains the word "newspaper" and is indeed found at position #1 when searching for "newspaper". "עיתון" doesn't appear on the page.
  • "Infobox publication" probably refers to Template:עיתון. It's the first result when searching for "עיתון".

Either way, I'm afraid there is nothing actionable in the current task description.