| Status | Assigned | Task |
| Resolved | mmodell | T182088 Phabricator search degraded in quality for almost any query |
| Stalled | None | T182160 Develop tests for phabricator search to detect regressions / search quality issues |
| Open | None | T92630 Project tags can not be found via the fulltext search index |
Any repository would do as far as I am concerned! :)
The trouble I have is that I do not know how to contribute to rPHDEP. I vaguely remember a page on mediawiki that explained it, but I cannot find it now. I am reading the Phabricator, Diffusion and Category:Phabricator pages, but I do not see the instructions.
Should I just push to a topic branch?
Argh. Looks like I am doing something wrong. arc works fine (as far as I can see) in my home folder.
~$ arc version
arcanist 58f254840efe4b29b8c89684804fae7e2dfa525b (11 Oct 2017)
libphutil cb945f0205fab3b7683efe32ddae65eeb5e2b9af (30 Nov 2017)
But it fails in phab-deployment.
~/Documents/phabricator/phab-deployment$ arc version
Exception
Source file "phabricator/src/__phutil_library_init__.php" failed to load.
(Run with `--trace` for a full exception trace.)
~/Documents/phabricator/phab-deployment$ arc version --trace
ARGV '/Users/z/Documents/phabricator/arcanist/bin/../scripts/arcanist.php' 'version' '--trace'
LOAD Loaded "phutil" from "/Users/z/Documents/phabricator/libphutil/src".
LOAD Loaded "arcanist" from "/Users/z/Documents/phabricator/arcanist/src".
Config: Reading user configuration file "/Users/z/.arcrc"...
Config: Did not find system configuration at "/etc/arcconfig".
Working Copy: Reading .arcconfig from "/Users/z/Documents/phabricator/phab-deployment/.arcconfig".
Working Copy: Path "/Users/z/Documents/phabricator/phab-deployment" is part of `git` working copy "/Users/z/Documents/phabricator/phab-deployment".
Working Copy: Project root is at "/Users/z/Documents/phabricator/phab-deployment".
Config: Did not find local configuration at "/Users/z/Documents/phabricator/phab-deployment/.git/arc/config".
Loading phutil library from '/Users/z/Documents/phabricator/arcanist/src'...
Loading phutil library from 'phabricator/src'...
[2017-12-12 17:15:04] EXCEPTION: (Exception) Source file "phabricator/src/__phutil_library_init__.php" failed to load. at [<phutil>/src/moduleutils/PhutilBootloader.php:242]
arcanist(head=wmf/stable, ref.wmf/stable=58f254840efe), phutil(head=wmf/stable, ref.wmf/stable=cb945f0205fa)
  #0 PhutilBootloader::executeInclude(string) called at [<phutil>/src/moduleutils/PhutilBootloader.php:208]
  #1 PhutilBootloader::loadLibrary(string) called at [<phutil>/src/moduleutils/core.php:12]
  #2 phutil_load_library(string) called at [<arcanist>/scripts/arcanist.php:624]
  #3 arcanist_load_libraries(array, boolean, string, ArcanistWorkingCopyIdentity) called at [<arcanist>/scripts/arcanist.php:172]
If you want to measure search quality, the typical approach is:
- Create a dataset containing the set of queries you care about
- For each query, collect some results and grade each one on a scale of 0-3 for how good it is
- At test time, run all the queries and get the result lists
- Calculate NDCG@n per query, then average over all the queries. You might look at NDCG@3 and NDCG@10, but the right cutoff depends on how users use the search and how willing they are to scan down the list
- Monitor changes in NDCG
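The steps above could be sketched roughly as follows. This is only an illustration, not an existing test: the task IDs, the `judgments` layout and the `run_query` callable are made-up placeholders, and the gain formula (2^grade - 1 with a log2 position discount) is one common NDCG variant among several.

```python
import math


def dcg(grades):
    """Discounted cumulative gain for grades listed in ranked order."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 divides by 1.
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg_at_n(result_ids, graded, n):
    """NDCG@n for one query.

    result_ids: ranked list of document IDs returned by the search.
    graded: dict mapping document ID -> grade 0-3 (ungraded docs count as 0).
    """
    gains = [graded.get(doc, 0) for doc in result_ids[:n]]
    ideal = sorted(graded.values(), reverse=True)[:n]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(judgments, run_query, n):
    """Average NDCG@n over all queries in the graded dataset."""
    scores = [ndcg_at_n(run_query(q), graded, n) for q, graded in judgments.items()]
    return sum(scores) / len(scores)


# Hypothetical graded dataset: query -> {task ID: grade}.
judgments = {
    "search quality": {"T1": 3, "T2": 1},
    "project tags": {"T3": 2},
}

# Stand-in for the real search; returns a fixed ranking per query.
fake_results = {
    "search quality": ["T2", "T1", "T9"],
    "project tags": ["T3"],
}

score = mean_ndcg(judgments, lambda q: fake_results[q], n=3)
```

Storing `score` for each run and comparing it against the previous run is one way to do the "monitor changes" step.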
NDCG is probably the most common metric, but there might be other interesting ones. NDCG puts a lot of weight on the position of the result, but our users might only care whether the result is there or not. Precision@n might be interesting for this, as it disregards position and just checks whether docs are in the top N. This is basically the number of good docs in the top N divided by the number of good docs that could possibly be in the top N, per query. So for P@5, if a query has 1 good result and that result is in 3rd place, the query gets a 1. If there are 3 good results and only 2 make it into the top 5, it gets 2/3 (about 0.67).
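A minimal sketch of that calculation, following the definition given here (the denominator is the number of good docs that could fit in the top N; the textbook Precision@n divides by N instead). The function name and document IDs are illustrative only.

```python
def precision_at_n(result_ids, good_ids, n):
    """Fraction of the achievable good docs that appear in the top n.

    result_ids: ranked list of document IDs returned by the search.
    good_ids: set of document IDs judged good for this query.
    """
    possible = min(len(good_ids), n)  # at most n good docs can fit in the top n
    if possible == 0:
        return 0.0
    hits = sum(1 for doc in result_ids[:n] if doc in good_ids)
    return hits / possible


# One good result, found in 3rd place: full score.
one_good = precision_at_n(["a", "b", "g1", "c", "d"], {"g1"}, 5)

# Three good results, only two of them in the top 5: 2/3.
two_of_three = precision_at_n(
    ["g1", "x", "g2", "y", "z", "g3"], {"g1", "g2", "g3"}, 5
)
```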
The first part of this is now finished. We have the selenium/webdriver/mocha framework running, and all that remains is to write some tests for search results. I think I will run the tests against the production dataset and the production phabricator instance. Thanks @EBernhardson for the tips about search quality testing. I'll try to take this into account when building my tests. And thanks to @zeljkofilipin for setting up the testing framework and getting me up to speed on how this all works.