The current rescoring method is the same for all indices, which is convenient but will not work with our plans to integrate PageRank and page view data into scoring. The rescores need to be refactored so that these additional scoring signals can be turned on or off through configuration.
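As a rough illustration of what configurable rescoring could look like, the sketch below builds an Elasticsearch `rescore` clause from a named profile. It is not the CirrusSearch implementation: the profile names, the `popularity_score` field, the `enabled` flag, and the `build_rescore` helper are all assumptions made for this example. Only the general shape of the Elasticsearch rescore and `function_score` / `field_value_factor` query DSL is relied on.

```python
# Hypothetical rescore profiles; names, fields and values are illustrative only.
RESCORE_PROFILES = {
    "default": {
        "window_size": 8192,
        "score_mode": "multiply",
        "fields": [
            {"field": "incoming_links", "modifier": "log2p"},
        ],
    },
    "with_pageviews": {
        "window_size": 8192,
        "score_mode": "multiply",
        "fields": [
            {"field": "incoming_links", "modifier": "log2p"},
            # Assumed field holding page view data; can be switched off
            # by setting "enabled" to False.
            {"field": "popularity_score", "modifier": "log2p", "enabled": True},
        ],
    },
}


def build_rescore(profile_name, profiles=RESCORE_PROFILES):
    """Return a list of rescore clauses for the given profile.

    Fields marked with "enabled": False are skipped. If no field is
    active, an empty list is returned so no rescore is applied.
    """
    profile = profiles[profile_name]
    functions = [
        {
            "field_value_factor": {
                "field": spec["field"],
                "modifier": spec.get("modifier", "none"),
                "missing": 0,
            }
        }
        for spec in profile["fields"]
        if spec.get("enabled", True)
    ]
    if not functions:
        return []
    return [
        {
            "window_size": profile["window_size"],
            "query": {
                "score_mode": profile["score_mode"],
                "rescore_query": {"function_score": {"functions": functions}},
            },
        }
    ]
```

With this layout, an index that has no page view data would select the `default` profile, while one that does could select `with_pageviews` without any change to the query-building code.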
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Allow customization of rescore queries | mediawiki/extensions/CirrusSearch | master | +1 K -252
Status | Assigned | Task
---|---|---
Resolved | EBernhardson | T113439 [Epic] Utilize the analytics cluster to improve scoring relevancy
Declined | None | T113441 build hadoop job to calculate pagerank of a wikipedia
Resolved | EBernhardson | T116055 Build out hadoop job to calculate average page views over time for cirrussearch scoring purposes
Resolved | dcausse | T116016 Adjust rescoring methods to be able to optionally use additional fields for scoring information
Event Timeline
Change 249460 had a related patch set uploaded (by DCausse):
[WIP] Allow customization of rescore queries
I think we will have to add a task to adjust the weights. I don't have a good method for this other than using the hypothesis cluster with large wikis and &cirrusExplain.
A set of ambiguous queries where page views make sense as a signal would be very welcome.