Template field values of {{cite journal}} lost in indexing
Open, Low, Public

Description

This search: https://meta.wikimedia.org/wiki/Special:Search?search=Hasty&prefix=Research%3ANewsletter%2F20

should return this page: https://meta.wikimedia.org/wiki/Research:Newsletter/2015/March#cite_note-17

which contains the search string as part of a template:

{{Cite journal| [...] | last1 = Hasty| first1 = Robert T.| last2 = Garbalosa| first2 = Ryan C.| [...] }}

The other author names in that citation seem to have the same problem.

On the other hand, the problem doesn't seem to be the template itself, as other uses of the same template show up just fine in search results (example).

@EBernhardson observed on IRC:
"fwiw it looks to have been removed somewhere in the indexing, because it's in the source_text field and not the text field. [...] https://meta.wikimedia.org/wiki/Research:Newsletter/2015/March?action=cirrusdump
[...]
for now you can find it with this, but it doesn't do stemming so you wont find hastily when searching for hasty: https://meta.wikimedia.org/w/index.php?title=Special%3ASearch&profile=default&search=insource%3AHasty+prefix%3AResearch%3ANewsletter%2F20&fulltext=Search "
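The check described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the `action=cirrusdump` output parses as a JSON list of documents whose fields (including `text` and `source_text`) sit under a `_source` key, and the sample document below is hand-made to mimic the reported situation, not copied from the real dump. The function names `fields_containing` and `insource_search_url` are hypothetical helpers, not part of CirrusSearch.

```python
from urllib.parse import urlencode


def fields_containing(dump, term, fields=("text", "source_text")):
    """For each field, report whether `term` occurs (case-insensitively)
    in that field of any document in a parsed cirrusdump response.

    Assumption: `dump` is a list of dicts with the indexed fields under
    the "_source" key.
    """
    needle = term.lower()
    return {
        field: any(
            needle in str(doc.get("_source", {}).get(field, "")).lower()
            for doc in dump
        )
        for field in fields
    }


def insource_search_url(term, prefix, host="meta.wikimedia.org"):
    """Build the workaround search URL using the insource: keyword.

    As noted in the quote, insource: does no stemming, so only the
    exact term is matched.
    """
    params = {
        "title": "Special:Search",
        "profile": "default",
        "search": f"insource:{term} prefix:{prefix}",
        "fulltext": "Search",
    }
    return f"https://{host}/w/index.php?" + urlencode(params)


# Hand-made sample (NOT the real dump): the author name is present in the
# wikitext source but missing from the rendered-text field.
sample_dump = [
    {
        "_source": {
            "text": "Placeholder rendered text without the author name.",
            "source_text": "{{Cite journal| last1 = Hasty| first1 = Robert T.}}",
        }
    }
]

print(fields_containing(sample_dump, "Hasty"))
# {'text': False, 'source_text': True}
print(insource_search_url("Hasty", "Research:Newsletter/20"))
```

To run this against the live page, the JSON from the `?action=cirrusdump` URL quoted above would be parsed with `json.loads` and passed in as `dump`. A term that shows up only in `source_text` matches the symptom reported here.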

(Context: The archive search function of the Wikimedia Research Newsletter has become increasingly important as a way to quickly find half a decade's worth of coverage of academic research publications about Wikipedia, so it would be great to fix this one way or another.)

Event Timeline

Restricted Application added a subscriber: Aklapper.
Deskana moved this task from Needs triage to Search on the Discovery-ARCHIVED board.
Deskana subscribed.

Given this affects very few users, and there's a partial workaround, this is fairly low priority.

MPhamWMF subscribed.

Closing out low/est priority tasks over 6 months old with no activity within the last 6 months in order to clean out the backlog of tickets we will not be addressing in the near term. Please feel free to reopen if you think a ticket is important, but bear in mind that given current priorities and resourcing, it is unlikely that the Search team will pick up these tasks for the indefinite future. We hope that the requested changes have either been addressed by or made irrelevant by work the team has done or is doing -- e.g. upgrading Elasticsearch to a newer version will solve various ES-related problems -- or will be subsumed by future work in a more generalized way.

RhinosF1 removed a project: Discovery-Search.
RhinosF1 subscribed.

Re-opening tasks and removing from team workboard per IRC feedback given yesterday and discussion with MPham.