
Phabricator search hugely degraded in quality
Closed, Resolved, Public

Assigned To
Authored By
jcrespo
Nov 16 2017, 4:59 PM
Referenced Files
None
Tokens
"Manufacturing Defect?" token, awarded by Liuxinyu970226.
"Evil Spooky Haunted Tree" token, awarded by jcrespo.
"Manufacturing Defect?" token, awarded by mmodell.

Description

High priority because it is now causing coordination issues that can translate into unavailability.

If there is maintenance ongoing, that is ok, but please announce it on the lists with an ETA.

Keywords in titles no longer return results, or the results are shown at the bottom of the same page.

For example, I would expect "puppet mysql" to show T162070

or "db1022" to show T163778

Either they are not found, or they are relegated to the last results. This is consistent, and several people have reported the same issue:

[17:42] <moritzm> db1022 is a decomissioned host? can't find it in site.pp and searching for db1022 in phab doesn't show a decom ticket. I'm wondering since it's still listed in https://servermon.wikimedia.org/hosts/
[17:43] <moritzm> but hasn't received a puppet update since 1 month, 3 weeks either
[17:43] <jynus> phabricator search is broken
[17:43] <marostegui> https://phabricator.wikimedia.org/T163778
[17:44] <marostegui> moritzm: ^
[17:44] <moritzm> what, why didn't I find this?

Tags and user search work as expected.

Revisions and Commits

Event Timeline

There was a Phabricator update last night that required a search reindex, I think. (Not sure if they ran the reindex, but at least when I upgraded a test site it asked me to reindex.)

If it is temporary, that is great news. This would be resolved if an email with details could be sent to wikitech/ops/engineering.

For example, I would expect "puppet mysql" to show T162070

Works for me, it's just not among the first 100 results (pagination).
Listed when using the Maniphest search for open tasks only: https://phabricator.wikimedia.org/maniphest/query/iwbdgAd17cns/#R

or "db1022" to show T163778

It does for me when using the Maniphest search: https://phabricator.wikimedia.org/maniphest/query/f5ShhWyMxHS_/#R

Those were mentioned in the ticket, which triggers a reindex of them, IIRC from the last time we had search issues.

The reindex is almost finished, so quality should be restored to normal by now. If this is not the case then something else is wrong.

Search is definitely still not working for me: I search for db1022 and cannot find the at least 3 tickets with it in the title (but I do get a lot of unrelated tickets). Confirmed by other ops having the same experience:

[20:41] <bblack> the first hit for open+closed tasks searching "db1063" is https://phabricator.wikimedia.org/T247
[20:42] <bblack> which doesn't have the string db1063 anywhere within it I don't think

Please provide links to your search queries, so we can make sure that we talk about the same things. See T180706#3767493 for my links. Thanks!

@jcrespo we use elasticsearch for searches now so it's not going to be in the db.

Because of how phabricator is architected, it kinda does both. Elasticsearch seems to be used as a first pass filtering/sorting, but then phabricator pulls more info from the database (and potentially does more filtering).
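To illustrate the two-pass flow described above, here is a minimal sketch. It is not Phabricator's actual code: the `Task` record, the in-memory `DATABASE`, and the `index_first_pass` and `search` functions are all hypothetical stand-ins, assuming only that the index returns a ranked, capped list of IDs and that the database lookup applies further filtering afterwards.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    title: str
    is_open: bool


# Hypothetical stand-in for the database: task_id -> full record.
DATABASE = {
    1: Task(1, "Decommission db1022", False),
    2: Task(2, "puppet mysql cleanup", True),
    3: Task(3, "Unrelated frontend bug", True),
}


def index_first_pass(query, limit=100):
    """Stand-in for the search index: return matching IDs, best score first."""
    terms = query.lower().split()
    scored = []
    for task in DATABASE.values():
        # Score = number of query terms that appear in the title.
        score = sum(term in task.title.lower() for term in terms)
        if score:
            scored.append((score, task.task_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [task_id for _, task_id in scored[:limit]]


def search(query, open_only=False):
    """Second pass: load full records by ID, then filter on database fields."""
    records = [DATABASE[task_id] for task_id in index_first_pass(query)]
    if open_only:
        records = [record for record in records if record.is_open]
    return records
```

Under this model, a task can be ranked by the index and still disappear from the final list (for example a closed task under an open-only filter), and a task ranked below the first-pass `limit` never reaches the second pass at all.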

Ok two things I think are affecting things:

  1. I think the reindex just indexed a lot of commits that were previously not being seen by search. This is mostly noise when you are looking for tasks.
  2. Last night's update inadvertently published an experimental change to the way results are scored - the intention of the change was to boost more recent results but I'm not sure that was actually working as intended and I didn't mean to publish it to production without a feature flag. This was an oversight on my part and I've reverted that change.
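As a rough illustration of how a recency boost can reorder results, here is a minimal sketch. The formula, the `boosted_score` function, and its `half_life_days` and `weight` parameters are assumptions made for this example; the actual experimental change is not shown in this task.

```python
def boosted_score(text_score, age_days, half_life_days=365.0, weight=1.0):
    """Multiply a text relevance score by a boost that decays with age.

    Hypothetical formula: the boost halves every `half_life_days`, and
    `weight` controls how strongly recency counts. weight=0 disables it.
    """
    decay = 0.5 ** (age_days / half_life_days)
    return text_score * (1.0 + weight * decay)


# A strong match on a ten-year-old task versus a weak match on a new one.
old_strong = boosted_score(3.0, age_days=3650, weight=10.0)
new_weak = boosted_score(1.0, age_days=0, weight=10.0)
```

With a large enough `weight`, `new_weak` exceeds `old_strong`, so a recent but barely relevant result can outrank an old task that has the search term in its title. That is one way a boost of this kind could produce the symptoms reported here.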

No, when I say I search db1022, I mean I literally search for the word db1022; I do not think it is difficult to go to the search box and type that. I do not know if the searches are shareable between users.

Now they seem much better, is the change immediate?

I've now tried a couple of searches and for me this is already fixed. Fine-grained search quality is something that would take time to check, but for me this is like night and day; now I can find things the way I used to.

BTW, commit search is not a problem for me (my search box already defaults to open tasks, and I change it to all tasks if necessary).

greg assigned this task to mmodell.
greg triaged this task as Unbreak Now! priority.
greg subscribed.

Sorry and thanks all.

mmodell added a revision: Restricted Differential Revision. Nov 17 2017, 4:40 AM

Search ordering is weird again. I cannot find T181613 when searching for db1110, which should have only 3 or 4 references: https://phabricator.wikimedia.org/search/query/SMLv_VgvtkrG/#R

I'm pretty sure we aren't seeing the same issue as last time. Search in general seems to be functioning normally. However, it appears that something may have changed in the way Phabricator parses fulltext queries: db1110 now matches anything with "db", so it produces a large list of not-very-relevant results.
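The suspected behaviour can be reproduced with a minimal sketch. This is only one possible explanation, not a confirmed diagnosis: it assumes a tokenizer that splits on letter/digit boundaries and a query that matches when any term matches. The `split_alnum`, `whole_token`, and `matches` functions are hypothetical and are not taken from Phabricator.

```python
import re


def split_alnum(token):
    """Assumed tokenizer: split a word into runs of letters and runs of digits."""
    return re.findall(r"[a-z]+|[0-9]+", token.lower())


def whole_token(token):
    """Alternative tokenizer: keep each word intact."""
    return [token.lower()]


def terms(text, tokenize):
    """Collect the set of terms produced by `tokenize` for every word in `text`."""
    result = set()
    for word in text.split():
        result.update(tokenize(word))
    return result


def matches(query, title, tokenize):
    """OR semantics: the title matches if it shares any term with the query."""
    return bool(terms(query, tokenize) & terms(title, tokenize))
```

With `split_alnum`, the query db1110 becomes the terms "db" and "1110", so a title such as "Upgrade db1063 kernel" matches on "db" alone. With `whole_token`, only titles containing db1110 itself match.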

Status? If this isn't the same root cause, let's not reuse tasks, especially since this one is UBN!.

Yeah this probably deserves a new task. It doesn't appear to affect most queries.

Originally reported issue resolved.