
Metrics/Report tool for Relevance Lab
Closed, Resolved; Public

Description

Create a tool that computes basic stats and our first metric, zero results rate, and generates a report.

Inputs:

  • a comparison name
  • two files with one JSON blob per line in each file (output of T116869)
  • an output directory

Output:

  • a report, as specified below
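As a rough illustration of the inputs listed above, the command-line interface could be sketched as follows. The argument names (`name`, `file_a`, `file_b`, `out_dir`) and the function `parse_args` are hypothetical; the task does not specify them.

```python
import argparse


def parse_args(argv=None):
    """Parse the four inputs named in the task description.

    All argument names here are illustrative assumptions, not part of the task.
    """
    parser = argparse.ArgumentParser(
        description="Compute zero results rate metrics and write a report."
    )
    parser.add_argument("name", help="comparison name")
    parser.add_argument("file_a", help="first results file, one JSON blob per line")
    parser.add_argument("file_b", help="second results file, one JSON blob per line")
    parser.add_argument("out_dir", help="directory the report is written to")
    return parser.parse_args(argv)
```

Positional arguments keep the call short; named options would work equally well if more inputs are added later.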

Note: Computing metrics on the JSON files will probably require much the same input processing as Diffs (see T116870), but the output is far simpler. It may be possible to compute the diff and the metrics in a single pass, but for simplicity it is worth re-processing the JSON and computing the metrics separately.

The tool should count the raw number of queries with zero results in each file. Comparing the two files line by line, it should also note where they differ, record the direction of each change, and keep track of examples. (Keep in mind the need to add more complex metrics in the future, such as differences in the top 3 results.)
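The counting and line-by-line comparison described above might look roughly like the sketch below. It assumes each JSON blob carries its result count in a field called `totalHits`; that field name is a guess, since the actual format is whatever T116869 emits. The names `hit_count`, `compare_zero_results`, and `max_examples` are likewise hypothetical.

```python
import json
from itertools import zip_longest


def hit_count(line):
    """Return the result count for one JSON line, or None if the line is absent."""
    if line is None:
        return None
    # ASSUMPTION: "totalHits" is a hypothetical field name for the result count.
    return json.loads(line).get("totalHits", 0)


def compare_zero_results(lines_a, lines_b, max_examples=10):
    """Count zero-result queries per file and track changes in each direction."""
    stats = {
        "queries_a": 0, "queries_b": 0, "zero_a": 0, "zero_b": 0,
        "gained": 0, "lost": 0, "gained_examples": [], "lost_examples": [],
    }
    # zip_longest pads the shorter file with None so a length mismatch is counted.
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        hits_a, hits_b = hit_count(a), hit_count(b)
        if hits_a is not None:
            stats["queries_a"] += 1
            stats["zero_a"] += int(hits_a == 0)
        if hits_b is not None:
            stats["queries_b"] += 1
            stats["zero_b"] += int(hits_b == 0)
        if hits_a is None or hits_b is None:
            continue
        if hits_a == 0 and hits_b > 0:
            direction = "gained"   # zero -> non-zero
        elif hits_a > 0 and hits_b == 0:
            direction = "lost"     # non-zero -> zero
        else:
            continue
        stats[direction] += 1
        # Keep only a bounded list of query numbers as examples.
        if len(stats[direction + "_examples"]) < max_examples:
            stats[direction + "_examples"].append(number)
    return stats
```

Because the function accepts any iterable of lines, it can be fed open file handles directly, so neither file has to be loaded into memory.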

Output should be a report which includes:

Stats:

  • the comparison name
  • the path to the two results input files
  • the number of queries in each file (these should match, so make a note if they do not)

Metrics:

  • the percentage of queries with zero results in the first file
  • the percentage of queries with zero results in the second file, and the overall change with respect to the first file
  • the percentage of queries that went from zero to non-zero from the first to second file, along with a list of query numbers of examples
  • the percentage of queries that went from non-zero to zero from the first to second file, along with a list of query numbers of examples
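The arithmetic behind the metrics listed above can be sketched as follows. Two points are assumptions rather than anything the task states: the "overall change" is taken to be the difference in percentage points, and the transition percentages use the first file's query count as the denominator. The function and key names are hypothetical.

```python
def pct(part, whole):
    """Percentage of part in whole; 0.0 when whole is zero to avoid dividing by zero."""
    return 100.0 * part / whole if whole else 0.0


def zero_result_metrics(queries_a, queries_b, zero_a, zero_b, gained, lost):
    """Turn raw counts into the percentages the report asks for."""
    zrr_a = pct(zero_a, queries_a)
    zrr_b = pct(zero_b, queries_b)
    return {
        "zrr_a": zrr_a,
        "zrr_b": zrr_b,
        # ASSUMPTION: change is reported in percentage points (second minus first).
        "change": zrr_b - zrr_a,
        # ASSUMPTION: transitions are measured against the first file's query count.
        "zero_to_nonzero": pct(gained, queries_a),
        "nonzero_to_zero": pct(lost, queries_a),
    }
```

For instance, with 200 queries per file, 50 zero-result queries in the first file and 40 in the second, the zero results rate moves from 25% to 20%, a change of -5 percentage points.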

The report should be written to the provided directory and named report.EXT, with whatever extension is appropriate for the chosen format (txt, xml, html, etc.).
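Writing the report could be as simple as the sketch below, which assumes plain text as the format and so uses the name report.txt. The function `write_report`, its parameters, and the line layout are hypothetical and not taken from the mockup.

```python
import os


def write_report(out_dir, name, file_a, file_b, n_a, n_b, metric_lines):
    """Write a plain-text report to out_dir/report.txt and return its path."""
    os.makedirs(out_dir, exist_ok=True)  # create the output directory if needed
    lines = [
        f"Comparison: {name}",
        f"First file:  {file_a} ({n_a} queries)",
        f"Second file: {file_b} ({n_b} queries)",
    ]
    if n_a != n_b:
        # The task asks for a note when the query counts do not match.
        lines.append("NOTE: the two files contain different numbers of queries")
    lines.append("")
    lines.extend(metric_lines)
    path = os.path.join(out_dir, "report.txt")
    with open(path, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")
    return path
```

Passing the metric lines in as pre-formatted strings keeps the writer independent of which metrics exist, which may help when more complex metrics are added later.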

Example report (not necessarily in this exact format): see crude mockup report.

Event Timeline

TJones raised the priority of this task from to High.
TJones updated the task description.
TJones added subscribers: TJones, EBernhardson, dcausse, Smalyshev.

Change 251551 had a related patch set uploaded (by Tjones):
Initial commit for Metrics/Report tool for Relevance Lab

https://gerrit.wikimedia.org/r/251551

Change 251551 merged by jenkins-bot:
Initial commit for Metrics/Report tool for Relevance Lab

https://gerrit.wikimedia.org/r/251551