Factor diff logic out of EntityContent
Open, NormalPublic

Description

EntityContent currently implements the part of the diffing logic that handles redirects. This should be factored out into an EntityContentDiffer. Plus maybe an EntityContentPatcher, though I think patching can and perhaps should be handled by the differ, too - that would keep the knowledge about the data structures used for diffing in a single class.

Details

Reference
bz67238
bzimport raised the priority of this task from to High.
bzimport set Reference to bz67238.
bzimport added a subscriber: Unknown Object (MLST).
daniel created this task.Jun 28 2014, 7:42 AM

How does this relate to EntityContentDiff, EntityContentDiffView, EntityDiffVisualizer, and other classes?

And, if this isn't what EntityContentDiff does, where should this code live?

daniel added a subscriber: daniel.Nov 24 2014, 11:00 AM

EntityContent wraps an Entity. It acts as a glue between the Wikibase world (Entity objects) and the MediaWiki world (Content objects). (Some types of) EntityContents can be redirects, Entities never are.

EntityDiff represents a diff between two entities. EntityContentDiff is a union of an EntityDiff and a RedirectDiff, so it is able to represent the transition to and from a redirect, and changes of the redirect target, as well as regular entity edits.

EntityDiffVisualizer generates HTML from an EntityDiff; EntityContentDiffView generates HTML from an EntityContentDiff (using an EntityDiffVisualizer).

EntityContent::getDiff returns an EntityContentDiff, using an EntityDiffer for diffing two proper Entities; The logic used when redirects are involved is implemented directly, which isn't really ideal. EntityContent shouldn't rely on the array structure representation of redirects to generate a diff, it shouldn't know or care about how redirects are represented as arrays.

To solve this, it would be nice to factor that logic out into an EntityContentDiffer, analogous to the EntityDiffer. That's what this bug is about.

Thanks, Daniel, for that detailed response.

However, I'm not clear which arrays you are referring to; there aren't any arrays explicitly in EntityContent::getDiff. The warningest mention of arrays relating to redirects seems to be EntityContent::getRedirectData.

Could you clarify for me where EntityContent is relying on the array structure of redirects?

Lydia_Pintscher removed a subscriber: Unknown Object (MLST).
daniel added a comment.Dec 3 2014, 3:52 PM

In EntityContent::getDiff(), we have:

$redirectDiffOps = $differ->doDiff(
	$fromContent->getRedirectData(),
	$toContent->getRedirectData()
);

As you noted, getRedirectData() returns an array representation of the redirects, and $differ->doDiff() operates directly on them. This logic is trivial, but still does not belong into EntityContent. There should be an EntityContentDiffer encapsulating all that nastiness;

Furthermore, EntityContentDiffer should also implement the logic in getPatchedCopy(); then, EntityContent would no longer need getRedirectData(), which would then be moved to EntityContentDiffer, which would then encapsulate the knowledge about how redirects are diffed and patched.

daniel lowered the priority of this task from High to Normal.Feb 4 2016, 2:36 PM
daniel set Security to None.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 4 2016, 2:36 PM
matej_suchanek added a project: Technical-Debt.
matej_suchanek removed a subscriber: Wikidata-bugs.