
Incorrect results from insource with regex
Closed, Resolved · Public

Description

See discussion at https://en.wikipedia.org/wiki/Wikipedia_talk:AutoWikiBrowser#Finding_unlinked_.22Phi_Beta_Kappa.22

An editor is trying to find articles where the phrase "Phi Beta Kappa" is preceded by a space. These searches at enwiki incorrectly return "1977 in literature". In that article the phrase "Phi Beta Kappa" is preceded by a left square bracket:

Coetzee "Phi Beta Kappa"~0 insource:"/ Phi Beta Kappa/"
Coetzee "Phi Beta Kappa"~0 insource:"/[^\[]Phi Beta Kappa/"
Coetzee "Phi Beta Kappa"~0 insource:"/\\Phi Beta Kappa/"
Coetzee "Phi Beta Kappa"~0 insource:"/@@@ Phi Beta Kappa/"

The insource *is* being processed; the search

Coetzee "Phi Beta Kappa"~0 insource:"/Elephant Phi Beta Kappa/"

correctly returns no results.
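
As a rough illustration of the expected behaviour, here is a minimal Python sketch. It uses Python's `re` module, which is not the regex engine behind insource and differs from it in syntax, so it only shows what these two simple patterns ought to match. The two wikitext snippets and the `matches` helper are made up for the example and are not copied from the articles.

```python
import re

# Hypothetical wikitext snippets (assumptions, not taken from the real articles).
bracketed = "won the [[Phi Beta Kappa]] award"    # phrase preceded by "["
spaced = "elected to Phi Beta Kappa in 1977"      # phrase preceded by a space

# The first two regexes from the searches above, without the surrounding slashes.
patterns = {
    "space": r" Phi Beta Kappa",
    "not_bracket": r"[^\[]Phi Beta Kappa",
}


def matches(pattern, text):
    """Return True if the pattern occurs anywhere in the text (unanchored)."""
    return re.search(pattern, text) is not None


for name, pattern in patterns.items():
    print(name, matches(pattern, bracketed), matches(pattern, spaced))
```

Under these assumptions neither pattern should match the bracketed snippet, while both should match the one with a preceding space, which is why "1977 in literature" looks like a false positive.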

Event Timeline

John_of_Reading set the priority of this task to Needs Triage.
John_of_Reading updated the task description.
John_of_Reading added a project: CirrusSearch.
John_of_Reading subscribed.

I think the regex wasn't being processed because of the double quotation marks around it. The regex now needs only the slashes: "Phi Beta Kappa" insource:/ Phi Beta Kappa/. With that search, "1977 in literature" is not returned, and I spot-checked five percent of the 267 results to confirm that each one contains an instance preceded by a space.

Restricted Application added a subscriber: StudiesWorld.