Searching for strings starting with a # should not redirect to main page
Closed, Resolved, Public

Description

To replicate:

Expected result: The search results page is displayed, showing the same results I would get by opening https://www.mediawiki.org/wiki/Special:Search and typing #whatever into the input on that special page. The very same page should be displayed when searching through the search bar that comes with the skin.

Original task description

I wanted to search for some IRC channel on mediawiki.org and noticed that I was unable to search for it, or any other string starting with a #. I thought it might be a skin issue, so I changed from my default (Timeless) to Vector, but it also appeared there.

Typing #foo in the search bar (top-right on every page in Vector) will just redirect me to [[MediaWiki#Foo]], no matter which page I was on before.

Now the interesting part is: when I type in bar, it naturally searches for bar and the search results page appears, which has another keyword input at its top (now prefilled with bar). If I type #foo into that input box, it actually shows me search results for foo. Still, those results seem to ignore the #: testing with #wikimedia-operations and looking for a result that is actually about the IRC channel, the result page bolds only the wikimedia-operations part of the match, not the #.


I don't know if that's the same or a different issue, but even if I type #wikimedia-operations in the "Exactly this text" field, it still matches "Wikimedia Operations", which is not "exactly this text": it is missing the hash, and it has a space where I searched for a dash.

Event Timeline

This can currently be done using the insource regex search. We might need to think how to reword the "exactly this text" field in the new search interface since, as you've noted, it's not strictly that text. What it does is skip things such as stemming which converts cats into cat or simplifying the character set (depends on language used), but it still tokenizes into words and drops non-words such as # and -.
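The tokenization described above can be illustrated with a small sketch. This is not the actual search backend code: the `tokenize` and `phrase_matches` helpers are hypothetical, and the lowercasing step and the `\w+` word pattern are assumptions made for illustration only. It shows why a phrase query for #wikimedia-operations can still match "Wikimedia Operations" once # and - are dropped during tokenization.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into word tokens, dropping non-word characters such as '#' and '-'.

    Lowercasing is an assumption of this sketch, not a documented behavior.
    """
    return [token.lower() for token in re.findall(r"\w+", text)]


def phrase_matches(query: str, document: str) -> bool:
    """Return True if the query's tokens occur consecutively in the document's tokens."""
    query_tokens = tokenize(query)
    doc_tokens = tokenize(document)
    if not query_tokens:
        return False
    span = len(query_tokens)
    return any(
        doc_tokens[i:i + span] == query_tokens
        for i in range(len(doc_tokens) - span + 1)
    )


if __name__ == "__main__":
    print(tokenize("#wikimedia-operations"))
    print(phrase_matches("#wikimedia-operations", "The Wikimedia Operations team"))
```

Because both the query and the page text are reduced to the same two word tokens, the # and the dash never take part in the comparison.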

debt subscribed.

Using the regex search workaround should work. We'll go ahead and close this ticket — please re-open if needed. :)

Using the regex search workaround should work.

It's a way to search for a string beginning with a #, sure, so that'd solve the original task title. It's a horrible solution from a user experience point of view though: As a user, I want to be able to go to mediawiki.org, click into the top-right search bar (or wherever that search bar is located in your skin), type in #whatever there, press enter and get search results for #whatever. I think I should clarify what I mean:

  • If the 'smart' search engine gives me results for whatever (without #) that's totally acceptable - I know that we want to get our search results more into a "what did the user want" experience than the past "matches exactly, letter-for-letter, the text specified by the user" experience
  • If even the "exactly this text" search still gives me results for whatever instead of #whatever that's completely counter-intuitive to what I would expect from "exactly this text". On the other hand normal users (as opposed to developers) probably don't use weird symbols like # and - that much in search, so they might interpret the word 'exactly' differently than I do and be okay with the current behaviour. My opinion still is that said field should eventually be renamed to reflect that it's not an "exactly this text" search. But if the current opinion on your side is "we don't know what would be a better yet short and descriptive name" I can totally live with that.
  • If I don't get to a page with any search results at all but am instead redirected to https://www.mediawiki.org/wiki/MediaWiki#whatever , that sounds unacceptable to me. No matter which string I input in the top-right search bar (in this specific case: no matter whether it starts with a # or not), I should always get to a page displaying search results after sending the form. It's a search bar after all - if I input a string there, send the form, and the resulting page doesn't show search results at all, I consider that broken behaviour. I'll try to edit the task title/description to re-scope this task for that. Feel free to amend me :)
EddieGP renamed this task from Can't search for keywords beginning with a # to Searching for strings starting with a # should not redirect to main page. Jan 4 2018, 7:17 PM
EddieGP updated the task description.
debt triaged this task as Low priority. Jan 11 2018, 6:31 PM
debt moved this task from needs triage to Up Next on the Discovery-Search board.

We can start accepting # as part of a well-formed query; currently the special character causes us to treat the entire string as an invalid search query entry. We'll change this so that we ignore the # rather than discard the entire query string.

Other search engines also drop the # when searching — check out this on Google:

hash in search query.png (398×829 px, 106 KB)

So, being able to search Wikipedia for "#whatever" and get back results for "#whatever" is a big stretch, and it seems to be something that not very many people would ever use.

The attached patch resolves the issue where typing #anything into the search box sends the user to the main page. For the issue regarding "exactly this text", I've added a comment referencing this task on T182856, which I believe is the task for that issue in particular.
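The idea behind the fix can be sketched as follows. This is not the merged MediaWiki patch (which lives in mediawiki/core and is written in PHP); it is a minimal Python illustration, and the `is_fragment_only` and `route_search` helpers are hypothetical names. The assumption is that a term such as #foo has an empty page-name part, so treating it as a title jumps to the main page with a fragment; rejecting such terms as titles lets them fall through to the search results page.

```python
def is_fragment_only(term: str) -> bool:
    """Return True if the term has a '#' fragment but no page-name part before it."""
    page_part, separator, _fragment = term.strip().partition("#")
    return separator == "#" and page_part.strip() == ""


def route_search(term: str) -> str:
    """Decide how a search-bar submission is handled in this sketch.

    Returns "search" when the term should go straight to the results page,
    or "go" when a direct title match may be attempted first.
    """
    if is_fragment_only(term):
        # A fragment-only term is not a usable title, so show search results.
        return "search"
    return "go"


if __name__ == "__main__":
    for term in ["#whatever", "MediaWiki#Foo", "bar"]:
        print(term, "->", route_search(term))
```

With this check in place, #whatever is routed to the results page, while a term with a real page-name part such as MediaWiki#Foo can still be handled as a title.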

Change 459639 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/core@master] Go search to consider fragment only title invalid

https://gerrit.wikimedia.org/r/459639

Change 459639 merged by jenkins-bot:
[mediawiki/core@master] Go search to consider fragment only title invalid

https://gerrit.wikimedia.org/r/459639