
Investigate what the ideal change detection threshold would be
Closed, Resolved | Public | 8 Story Points

Description

Motivation
The change detection threshold determines whether two paragraphs are treated as the same paragraph with changes, or as two different paragraphs where the first was deleted and the second added. T180259 shows several scenarios where the current threshold does not always produce the best results.
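To make the role of the threshold concrete, here is a minimal sketch in Python. It is not the wikidiff2 implementation: the similarity measure (`difflib.SequenceMatcher`), the function names and the direction of the comparison (similarity above the threshold means "changed") are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Character-based similarity in [0, 1]; 1.0 means identical.

    Assumption: difflib's ratio stands in for whatever measure wikidiff2 uses.
    """
    return SequenceMatcher(None, old, new, autojunk=False).ratio()


def classify(old: str, new: str, threshold: float = 0.2) -> str:
    """Decide how a pair of paragraphs would be presented in a diff.

    Assumption: a pair whose similarity exceeds the threshold is shown as one
    changed paragraph; otherwise as a deletion followed by an addition.
    """
    if similarity(old, new) > threshold:
        return "changed"
    return "deleted+added"


if __name__ == "__main__":
    print(classify("The quick brown fox", "The quick brown fax"))
    print(classify("abc", "xyz"))
```

Under these assumptions, lowering the threshold makes more pairs count as "changed", and a threshold close to zero treats almost any pair that way.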

Task
Investigate whether other threshold values bring better results. Document the process, e.g. on the test wiki or on any other persistent page that can be edited at a later stage and linked to.
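One way such an investigation could be set up is to sweep candidate thresholds over a set of hand-labeled paragraph pairs and count how often the classification matches the expected one. The sketch below is only an illustration: the sample pairs are invented, and the similarity measure and the "similarity above threshold means changed" rule are assumptions, not wikidiff2's actual behavior.

```python
from difflib import SequenceMatcher


def similarity(old: str, new: str) -> float:
    """Character-based similarity in [0, 1] (assumed stand-in measure)."""
    return SequenceMatcher(None, old, new, autojunk=False).ratio()


def accuracy(pairs, threshold: float) -> float:
    """Fraction of pairs whose classification matches the hand-made label.

    Each pair is (old, new, expected_changed), where expected_changed is True
    if a human reader would call the pair "the same paragraph, changed".
    """
    hits = sum(
        (similarity(old, new) > threshold) == expected_changed
        for old, new, expected_changed in pairs
    )
    return hits / len(pairs)


def sweep(pairs, candidates):
    """Map each candidate threshold to its accuracy on the labeled pairs."""
    return {t: accuracy(pairs, t) for t in candidates}


# Invented sample data, for illustration only.
PAIRS = [
    ("abc", "abc", True),    # identical text
    ("abcd", "abxy", True),  # half of the characters kept
    ("abc", "xyz", False),   # nothing in common
]

if __name__ == "__main__":
    for threshold, score in sweep(PAIRS, [0.015, 0.2, 0.25, 0.6]).items():
        print(f"{threshold}: {score:.2f}")
```

With a real test set, the labeled pairs would come from actual diffs rather than toy strings.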

Note
T181404: Make change detection threshold configurable from php, T182571: Create a test set for wikidiff2 and T183352: Investigate if character runs are better than character counts for threshold are prerequisites for this task.

Event Timeline

Lea_WMDE moved this task from Proposed to Todo on the WMDE-QWERTY-Team board. Nov 28 2017, 1:44 PM
Lea_WMDE updated the task description. Dec 11 2017, 12:04 PM

I tried to fix some of the regressions from the regression ticket. With a value of 0.2 (instead of 0.25), the following are fixed:

https://de.wikipedia.org/w/index.php?diff=170728571
https://it.wikipedia.org/w/index.php?title=Colle_Vento&diff=prev&oldid=7897666

The next example is harder: 0.145 seems to be the magic border (see line 90):

https://de.wikipedia.org/w/index.php?title=Internationale_Mathematik-Olympiade&diff=167483670&oldid=167457198

Also, the threshold alone does not seem to be enough to handle what is going on here:

https://en.wikipedia.org/w/index.php?title=Marbella_Cup&diff=646838927&oldid=645151368
(Setting it to 0.015 fixes this completely and 0.08 fixes most of the cases, but then the threshold might be completely useless for most other cases.)

We imported all diffs linked here and on other pages in our test-wiki:
https://wmde-wikidiff2-unpatched.wmflabs.org/core/index.php/Main_Page

You can see the old and new version there!

Feel free to add more!

WMDE-Fisch updated the task description. Feb 6 2018, 4:14 PM

For testing see the automated testing environment

WMDE-Fisch set the point value for this task to 8. Feb 6 2018, 4:37 PM

@jkroll can you please summarize your findings? You said 0.2 would be the best compromise for the threshold.

I would say a default value of 0.2 looks pretty good for English. Some slightly annoying edge cases exist for any value, which could be fixed by special-case code. I also investigated "character runs" as an alternative to the character-based similarity but found no improvement.

jkroll closed this task as Resolved. Feb 20 2018, 4:20 PM