Being able to diff the results of two different backends for a query or set of queries would be really helpful for the query-rewriting flow, both for those rewriting official queries (such as those listed as Examples in the Query UI) and for end users rewriting their own queries before migrating them.
This could take the form of a script that can be run from a laptop: it sends a set of queries to two different nodes in eqiad, diffs the results (potentially using something like ResultsCompare), and reports the diff.
The script should also work on a large set of queries, so the work will need to be parallelized.
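One possible shape for the parallel part, as a rough sketch rather than a proposed implementation: a thread pool that runs each query against both backends and collects one record per query. The names `compare_batch`, `fetch_a`, `fetch_b`, and `compare` are hypothetical; the two fetch callables stand in for whatever actually sends a query to each node, and `compare` stands in for the diffing step (ResultsCompare or anything else). Threads are assumed to be adequate because the work is mostly waiting on the network.

```python
from concurrent.futures import ThreadPoolExecutor


def compare_batch(queries, fetch_a, fetch_b, compare, max_workers=8):
    """Run every query against two backends in parallel and diff the results.

    fetch_a / fetch_b: hypothetical callables taking a query string and
    returning that backend's result. compare: hypothetical callable taking
    the two results and returning a diff. Returns one record per query,
    in the same order as the input.
    """

    def run_one(query):
        try:
            diff = compare(fetch_a(query), fetch_b(query))
            return {"query": query, "diff": diff, "error": None}
        except Exception as exc:
            # One failing query (timeout, HTTP error, ...) is recorded
            # in the report instead of aborting the whole batch.
            return {"query": query, "diff": None, "error": repr(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, which keeps the report stable.
        return list(pool.map(run_one, queries))
```

`max_workers` would need to be tuned so that the batch does not overload the nodes being compared; the default of 8 here is arbitrary.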
There will probably need to be some normalization to handle outputs that are equivalent but technically different (for example, the same results returned in a different order).
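As an illustration of what order normalization could look like, here is a minimal sketch. It assumes the backends return results in the W3C SPARQL 1.1 Query Results JSON layout (`results.bindings` holding one object per row); that format is an assumption, not something this task specifies. The function names `normalize_results` and `diff_results` are hypothetical, and this is not how ResultsCompare works.

```python
import json
from collections import Counter


def normalize_results(result):
    """Turn a result document into an order-insensitive multiset of rows.

    Assumes the SPARQL JSON results layout: result["results"]["bindings"]
    is a list of row objects. Duplicate rows are counted, not collapsed.
    """
    bindings = result.get("results", {}).get("bindings", [])
    rows = Counter()
    for binding in bindings:
        # sort_keys makes the serialized row independent of key order.
        rows[json.dumps(binding, sort_keys=True)] += 1
    return rows


def diff_results(left, right):
    """Return the rows that appear on only one side, ignoring row order."""
    a, b = normalize_results(left), normalize_results(right)
    return {
        "only_left": sorted((a - b).elements()),
        "only_right": sorted((b - a).elements()),
    }
```

Ignoring row order would hide real differences for queries that use `ORDER BY`, so those would likely need an order-sensitive comparison instead. Other equivalences (blank node labels, numeric formatting) are not handled by this sketch.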
This task covers part of the larger project explored in T422521.
AC:
- Script takes as input a set of queries and runs them on 2 nodes
- Script works on a large batch of queries too (TBD exactly how large)
- Script diffs the results and reports the diffs in some output