Decide what to do with Wikidiff3
Closed, Resolved (Public)

Description

Wikidiff3 has been lying around forever, but using it requires a configuration change that probably nobody has bothered to make. To keep it from bitrotting indefinitely, we need to figure out how it compares to DairikiDiff in performance and diff quality, and decide which of the two should be used. Unless there is a very visible tradeoff between performance and quality, I don't think we should keep both alternatives around.

Event Timeline

Change 284003 had a related patch set uploaded (by MaxSem):
WIP: Make wikidiff3 the only diff engine

https://gerrit.wikimedia.org/r/284003

Can you describe the corpus you used at https://diff-forge.wmflabs.org/wiki/Special:DiffCompare and summarise the results?

Corpus: out of 100k diffs from enwiki's recent changes (RC), those where the two algorithms produce different output (~6K diffs total). Results:

root@localhost:[wiki]> select count(*), dv_vote from diff_votes group by dv_vote;
+----------+---------+
| count(*) | dv_vote |
+----------+---------+
|       45 |      -1 |
|      148 |       0 |
|       37 |       1 |
+----------+---------+
3 rows in set (0.00 sec)

Where -1 means wikidiff3 is better, 0 means no meaningful difference, and +1 means DairikiDiff is better.
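
To put the tally above in proportion, here is a minimal sketch that turns the three counts into percentages. The counts are copied from the query output. The names `votes`, `labels` and `vote_shares` are made up for illustration and are not part of the DiffCompare tool.

```python
# Vote counts copied from the query output above (dv_vote -> count).
votes = {-1: 45, 0: 148, 1: 37}

# Human-readable labels for each vote value (illustrative wording).
labels = {-1: "wikidiff3 better", 0: "no difference", 1: "DairikiDiff better"}


def vote_shares(counts):
    """Return each vote's share of all votes, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {vote: round(100 * n / total, 1) for vote, n in counts.items()}


shares = vote_shares(votes)
for vote in sorted(votes):
    print(f"{labels[vote]:>20}: {votes[vote]:4d} ({shares[vote]}%)")
```

That is 230 votes in total, so only a small part of the ~6K differing diffs was actually rated, and most of the rated ones came out as a tie.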

Change 284003 merged by jenkins-bot:
Make wikidiff3 the only diff engine

https://gerrit.wikimedia.org/r/284003

MaxSem claimed this task.