Evaluate BM25 with the new fulltext query and the weighted sum with relforge/PaulScore and discernatron data
Closed, Resolved (Public)

Description

Once we have implemented the new fulltext query and the weighted sum, we should be ready to run a first offline evaluation.
We could run two evaluations, using:

  • PaulScore, which we used in the past; unfortunately, the interesting results it showed in offline testing were not confirmed by A/B testing
  • Discernatron data
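
To make the two evaluations concrete, here is a minimal sketch of how each score could be computed offline. It is not relforge's actual code. It assumes PaulScore is the per-session average of per-query scores, where a query's score is the sum of `factor ** position` over clicked results (0-based positions), and it assumes Discernatron grades are non-negative integers that can be fed into nDCG. The function names, the input shapes, and the default `factor` are hypothetical.

```python
import math
from typing import Sequence


def paul_score(sessions: Sequence[Sequence[Sequence[int]]], factor: float = 0.5) -> float:
    """Mean over sessions of the mean per-query click score.

    sessions[i][j] holds the 0-based result positions clicked for
    query j of session i (assumed input shape, not relforge's format).
    """
    if not 0 < factor < 1:
        raise ValueError("factor must be strictly between 0 and 1")
    session_means = []
    for queries in sessions:
        if not queries:
            continue  # a session without queries contributes nothing
        query_scores = [sum(factor ** pos for pos in clicks) for clicks in queries]
        session_means.append(sum(query_scores) / len(query_scores))
    return sum(session_means) / len(session_means) if session_means else 0.0


def ndcg_at_k(grades_in_ranked_order: Sequence[int], k: int = 10) -> float:
    """nDCG@k for one query, given human grades in the order results were returned.

    The ideal ranking is taken over the same graded results only,
    which is a simplification.
    """
    def dcg(grades: Sequence[int]) -> float:
        return sum((2 ** g - 1) / math.log2(rank + 2) for rank, g in enumerate(grades))

    ideal = dcg(sorted(grades_in_ranked_order, reverse=True)[:k])
    return dcg(grades_in_ranked_order[:k]) / ideal if ideal > 0 else 0.0
```

Both functions return a single number per query set, so the classic and BM25 indices can be compared by running the same queries against each and comparing the averages.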

Depending on the results, we could run an optimization plan to fine-tune the various settings.

Possible subtasks:

  • implement a small tool that loads discernatron scores into relforge.
  • index enwiki on relforge servers with production settings (classic similarity, #shards)
  • build a second enwiki index on relforge servers with BM25 and otherwise production settings (#shards)
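
The two indices above differ only in their similarity setting. The following is a rough sketch of the index-creation bodies, assuming Elasticsearch 5.x style similarity names (`classic` for TF/IDF, `BM25`). The helper name and the shard count are placeholders, not the production values, and the real CirrusSearch settings contain far more than this.

```python
import json
from typing import Any, Dict


def index_settings(similarity: str, shards: int) -> Dict[str, Any]:
    """Build a minimal index-creation body with a default similarity.

    Hypothetical helper; only the similarity type and shard count are set.
    """
    if similarity not in ("classic", "BM25"):
        raise ValueError("similarity must be 'classic' or 'BM25'")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "similarity": {"default": {"type": similarity}},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder shard count; use the production value for enwiki.
    for sim in ("classic", "BM25"):
        print(json.dumps(index_settings(sim, shards=1), indent=2))
```

Keeping everything except the similarity identical means any difference in scores can be attributed to BM25 rather than to index layout.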