Add current issues to "exactly this text" helptext
Open, Medium, Public

Description

Motivation
Although this should not be the case, quoted search terms currently don't differentiate between "ss" and "ß" (see T87136: ~"daß" should not match "dass").
The info text of "exactly this text" should reflect that.

Task
Update the info text to say

TODO WMDE-Design


Original request
This is a request from https://de.wikipedia.org/wiki/Wikipedia_Diskussion:Technische_Wünsche/Spezialisierte_Suche#Rückmeldungen_zur_Beta-Funktion

Story
As an editor, I want to correct old spellings. I search for "umfaßt" in the search field "Exactly this text" in order to change the spelling to "umfasst". The search, however, returns pages containing either "umfaßt" or "umfasst", which makes the results useless for my use case. It also means that the label "Exactly this text" is incorrect in this case.

See also https://phabricator.wikimedia.org/T182447 and https://phabricator.wikimedia.org/T87136.

Apparently, it is possible to get search results containing only "umfaßt" by using

insource:/umfaßt/

Question:

  • Can the field "Exactly this text" use insource, when a "ß" is involved?
  • If not, can a hint be added to the info popup next to the search field?

Bonus question:

  • Is it possible to have a dependent search that recognizes when words on a page are part of a quote? This would be ideal for this use case because the spelling "umfaßt" must not be changed when it's part of a quote.

Best,
Johanna

Event Timeline

Restricted Application added a subscriber: Aklapper.
Lea_WMDE renamed this task from Search for "Exactly this text" does not only find exact matches to Add current issues to "exactly this text" helptext. Feb 15 2018, 4:09 PM
Lea_WMDE triaged this task as Medium priority.
Lea_WMDE updated the task description.
Lea_WMDE moved this task from Backlog to Text stuff on the Advanced-Search board.

Can the field "Exactly this text" use insource, when a "ß" is involved?

I'm pretty sure we can do this, but I would suggest having a checkbox or some such to toggle between character-based and word-based searching. I came here from T182452, in which there is additional confusion about why searching "#wikimedia-operations" in the "exactly this text" field searches for the two words instead of the combined symbol. Basically, the field is not really an "exactly this text" search, but some word-based variant.

My suggestion would be to allow toggling between the word-based search (implemented by wrapping the term in quotes) and a character-based search using insource/regex. The insource/regex variant would need to escape regex metacharacters in the term and submit it as an insource query. So for the "#wikimedia-operations" example, the searches sent to the backend by advanced search would be:

type | generated
word based | "#wikimedia-operations"
char based | "#wikimedia-operations" insource:/\#wikimedia-operations/
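
To make the two variants concrete, here is a rough Python sketch of how the generated queries could be built. This is not the Advanced Search implementation: the function names and the SPECIAL_CHARS set are made up for illustration, and the set is only my assumption about which characters need a backslash in the regex dialect behind insource:/.../ (plus "/", since it delimits the pattern). Terms that themselves contain a double quote are not handled.

```python
# Hypothetical sketch, not the actual Advanced Search code.
# Assumed set of characters that need escaping inside insource:/.../,
# plus "/" because it terminates the pattern.
SPECIAL_CHARS = set('\\.?+*|{}[]()"#@&<>~/')


def escape_regex(term: str) -> str:
    """Prefix every assumed-special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


def word_based(term: str) -> str:
    """Word-based search: wrap the term in double quotes."""
    return '"' + term + '"'


def char_based(term: str) -> str:
    """Character-based search: quoted term plus an escaped insource regex."""
    return word_based(term) + " insource:/" + escape_regex(term) + "/"


if __name__ == "__main__":
    for term in ("#wikimedia-operations", "umfaßt"):
        print("word based:", word_based(term))
        print("char based:", char_based(term))
```

Keeping the quoted term in front of the insource clause follows the table above; presumably it lets the index narrow the candidate pages before the regex runs over the page source.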