
Create the capability to send queries to 2 backends and compare results
Open, High · Public · 5 Estimated Story Points

Description

The capability to diff 2 different backends on a query or set of queries would be really helpful for the query-rewriting flow, both for those rewriting official queries (such as those listed as Examples in the Query UI) and also for end users rewriting their own queries before migrating them.

This could take the form of a script that one can run on their laptop to send a set of queries to 2 different nodes on eqiad, diff the results (potentially using something like ResultsCompare), and report the diff.

This should work on a large dataset as well, so it will be necessary to parallelize the work.

Some normalization will probably be needed to handle outputs that are equivalent but not byte-identical (for example, the same rows returned in a different order).
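A minimal sketch of order-insensitive comparison, assuming each result set has already been parsed into a list of rows and each row is a dict of variable name to value (the row shape and the helper names `normalize` and `equivalent` are assumptions for illustration, not part of any existing tool):

```python
import json
from collections import Counter

def normalize(rows):
    """Return a multiset of rows that ignores row order and key order."""
    # Serializing with sorted keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} identical.
    # A Counter (rather than a set) keeps duplicate rows significant.
    return Counter(json.dumps(row, sort_keys=True) for row in rows)

def equivalent(rows_a, rows_b):
    """True if both result sets contain the same rows, regardless of ordering."""
    return normalize(rows_a) == normalize(rows_b)

a = [{"item": "Q1", "label": "x"}, {"item": "Q2", "label": "y"}]
b = [{"label": "y", "item": "Q2"}, {"item": "Q1", "label": "x"}]
print(equivalent(a, b))
```

Queries that explicitly request an ordering would likely need an order-sensitive comparison instead, so a real tool may want this to be switchable per query.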

This task covers part of the larger project explored in T422521.

AC:

  • Script takes as input a set of queries and runs them on 2 nodes
  • Script works on a large batch of queries too (TBD exactly how large)
  • Script diffs the results and reports the diffs in some output
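The acceptance criteria above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: `fetch_a` and `fetch_b` are hypothetical callables that send a query to one backend and return parsed rows (list of dicts), the backends here are in-memory stubs rather than real nodes, and a thread pool is just one possible way to parallelize.

```python
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def normalize(rows):
    # Order-insensitive multiset of rows (assumed row shape: dict of name -> value).
    return Counter(json.dumps(r, sort_keys=True) for r in rows)

def diff_query(query, fetch_a, fetch_b):
    """Run one query on both backends and collect rows unique to each side."""
    a, b = normalize(fetch_a(query)), normalize(fetch_b(query))
    return {
        "query": query,
        "only_a": sorted((a - b).elements()),  # rows backend A returned and B did not
        "only_b": sorted((b - a).elements()),  # rows backend B returned and A did not
    }

def diff_all(queries, fetch_a, fetch_b, max_workers=8):
    """Diff a batch of queries in parallel; return only the queries that differ."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        reports = list(pool.map(lambda q: diff_query(q, fetch_a, fetch_b), queries))
    return [r for r in reports if r["only_a"] or r["only_b"]]

# Stub backends standing in for the two nodes (hypothetical data).
backend_a = {"q1": [{"x": "1"}, {"x": "2"}], "q2": [{"x": "3"}]}
backend_b = {"q1": [{"x": "2"}, {"x": "1"}], "q2": [{"x": "4"}]}

report = diff_all(["q1", "q2"], backend_a.__getitem__, backend_b.__getitem__)
for entry in report:
    print(json.dumps(entry, indent=2))
```

In this sketch an exception raised by either fetch propagates and aborts the batch; a real script would probably record timeouts and errors per query so that one failing query does not hide the rest of the report.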

Event Timeline

A slightly different take on this is whether the answers that are being diff'ed are correct/definitive.

That probably requires comparing the results of three or more engines.
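One way to picture the three-engine idea is a simple majority vote over normalized results. The sketch below assumes results are lists of row dicts keyed by a hypothetical engine name; `fingerprint` and `majority_answer` are illustrative names. Agreement between engines suggests, but does not prove, that an answer is correct.

```python
import json
from collections import Counter

def fingerprint(rows):
    # Order-insensitive, hashable summary of a result set.
    return tuple(sorted(json.dumps(r, sort_keys=True) for r in rows))

def majority_answer(results_by_engine):
    """Return (agreeing_engines, dissenting_engines).

    If no result is shared by a strict majority of engines,
    return (None, all_engines).
    """
    votes = Counter(fingerprint(rows) for rows in results_by_engine.values())
    winner, count = votes.most_common(1)[0]
    if count * 2 <= len(results_by_engine):
        return None, sorted(results_by_engine)
    agree = sorted(e for e, rows in results_by_engine.items()
                   if fingerprint(rows) == winner)
    dissent = sorted(e for e in results_by_engine if e not in agree)
    return agree, dissent

results = {
    "engine_a": [{"x": "1"}],
    "engine_b": [{"x": "1"}],
    "engine_c": [{"x": "2"}],
}
print(majority_answer(results))
```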

lerickson set the point value for this task to 5. (Apr 13 2026, 4:15 PM)
gmodena renamed this task from "Create a tool to send queries to 2 backends and compare results" to "Create the capability to send queries to 2 backends and compare results". (Tue, Apr 21, 3:23 PM)
gmodena updated the task description.