Show changes in moved text chunks (C++)
Open, Needs TriagePublic


Currently of diff pages, moved paragraphs are considered a deletion and a subsequent addition. This leads to the situation that people always need to read through all moved paragraphs to make sure, that nothing has changed apart from the position.

Using the currently existing display methods, show changes in moved text chunks. This means, that only the part that changed is highlighted.

The diff engine used on mediawiki production wikis is wikidiff2. This is a c++ library, so the code needs to be written in C++.

This is part of fulfilling wish #2 of the 2015 German-speaking community wishlist survey: T139603: Show text changes when moving text chunks (#2)

Related Objects

Lea_WMDE created this task.Sep 27 2016, 4:30 PM
Restricted Application added a project: TCB-Team. · View Herald TranscriptSep 27 2016, 4:30 PM

Change 319866 had a related patch set uploaded (by WMDE-Fisch):
Extend WikiDiff2 to show changes inside moved paragraphes

The case where a paragraph appears to have been moved to several places, and words deleted from the inserted copies, can't be displayed correctly. The workaround which is currently implemented shows only the first inserted paragraph as "moved". Users will rarely, if ever, see this edge case, so it is probably sufficient. Other suggestions to solve this are welcome.

Example diff:

It turns out that Tool Labs doesn't have the Wikidiff2 extension installed. Mediawiki falls back to some default diff implementation in that case (probably PHP) which produces slightly different output than Wikidiff2. I have set up two Labs instances with *unpatched* and with *patched* wikidiff2, so we can show a side-by-side diff of the diff code with and without our patch, yes that sentence makes sense.

Example: before | after

The wikis pull data from the same MariaDB server, so changes made in one wiki will show up in the other.

Change 356582 had a related patch set uploaded (by WMDE-Fisch; owner: MaxSem):
[mediawiki/php/wikidiff2@master] Better distinction of change vs add/delete actions

Change 319866 had a related patch set uploaded (by WMDE-Fisch; owner: WMDE-Fisch):
[mediawiki/php/wikidiff2@master] Extend WikiDiff2 to show changes inside moved paragraphs

@MaxSem Thanks for looking into I updated it. Could you review this and so we can get it merged?
@tstarling You could also review this, thanks!

@MaxSem updated again :) Anything else you want changed? Or can you give a +1? Thanks

MaxSem added a comment.Aug 9 2017, 5:59 PM

Left my reviews. One thing I'd like is to have a fuzz test of some kind. DiffTests work nice to test diffs on a bunch of real-life changes (and I actually once discovered a segfault with it), however your changes are the biggest thing happening to the code base in a decade. I'll try hacking something up today.

I've created a basic fuzz test at - if you could add some fuzzing targeted at your new code, I would be more confident about +2ing this project.

jkroll added a comment.Sep 5 2017, 1:16 PM

@MaxSem I added some code to your fuzz test which hits the other code paths in the patch. It just creates shuffled lines basically.

Tobi_WMDE_SW removed jkroll as the assignee of this task.Nov 1 2017, 7:49 AM
Tobi_WMDE_SW added a subscriber: jkroll.
Lea_WMDE updated the task description. (Show Details)Nov 28 2017, 11:20 AM
Alsee added a subscriber: Alsee.Dec 3 2017, 11:11 AM

Thanks for working on this. I reviewed some of the sample diffs.

This diff should not show either of the paragraphs as moved. It should be a side-by-side plus minus pair, then a minus blank line, then another side-by-side plus minus pair.

This diff should show the Ginger Rogers wurde durch... paragraph as not moved. (As a side-by-side plus minus pair.)

This diff swaps to paragraphs. One should show as moved, the other should be a side-by-side plus minus pair. It would be preferable if the longer paragraph was shown as not-moved.

Hi Alsee, thanks for reporting the diffs, we will look into it!