Show diffs for all slots [MCR]
Closed, ResolvedPublic

Description

When showing a diff view, show the diffs for all slots.

As a baseline, show the diffs one below the other, separated by h2 headers that show the human readable slot name, omitting headings for slots that did not changed (empty diffs). The navigation header should only be shown once. The UX should be revisited later.

This task does not specify any requirements about the content view shown below the diffs.

Related Objects

daniel created this task.May 15 2018, 9:22 AM
Restricted Application added a project: Wikidata. · View Herald TranscriptMay 15 2018, 9:22 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
CCicalese_WMF assigned this task to Tgr.May 15 2018, 3:18 PM

Change 441924 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] [WIP][MCR] Split DifferenceEngine into page-level and slot-level part

https://gerrit.wikimedia.org/r/441924

Change 441925 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] [WIP][MCR] Show diffs for all slots

https://gerrit.wikimedia.org/r/441925

Ladsgroup moved this task from incoming to monitoring on the Wikidata board.Jun 28 2018, 3:36 PM
Vvjjkkii renamed this task from Show diffs for all slots [MCR] to 4xcaaaaaaa.Jul 1 2018, 1:09 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii removed Tgr as the assignee of this task.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii edited subscribers, added: Tgr; removed: gerritbot, Aklapper.
CommunityTechBot assigned this task to Tgr.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot renamed this task from 4xcaaaaaaa to Show diffs for all slots [MCR].
CommunityTechBot edited subscribers, added: gerritbot, Aklapper; removed: Tgr.

Copy of the commit message of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/441924 as written by @Tgr, for the sake of discussion here:

[WIP][MCR] Split DifferenceEngine into page-level and slot-level part

In preparation for showing multi-slot diffs, split out from
DifferenceEngine a PageDifferenceEngine class deals with
page-level tasks (determining the revisions to be diffed, handling
permissions, showing the top diff header, putting together the
combined diff) while DifferenceEngine can actually focus on
diff-generation (rendering a slot diff, creating slot headers).

Several extensions subclass DifferenceEngine, and that subclassing
covers both aspects. To maintain a semblance of B/C,
PageDifferenceEngine becomes a parent class of DifferenceEngine,
and the content type of the main slot determines which
PageDifferenceEngine subclass is used, while each slot uses the
DifferenceEngine matching its content type.

This is not great; ideally we would have a SlotDiffHandler for each
slot and a PageDiffHandler for combining them, with SlotDiffHandlers
selected by the ContentHandler and PageDiffHandlers selected by the
PageTypeHandler (and probably a separate DiffController class for
interpreting the request and selecting revisions), but extensions
need to be able to override both functionalities with custom
DifferenceEngines for a deprecation cycle, so that will have to wait.

Some methods have a different implementation in DifferenceEngine and
PageDifferenceEngine (mainly to iterate through all slot engines);
when the page-level engine gets subclassed and those methods are
overridden, and the page has non-main slots, things might break.
There is no way to avoid that. The affected methods are:

  • setContent
  • setTextLanguage
  • localiseDiff

TODO:

  • figure out on which level caching should be handled

A copy of my response to @Tgr at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/441924#message-988f7c87660b1daba3ca441b29a92f4c6c036a47, copied for the sake of further discussion here:


You wrote:

To maintain a semblance of B/C,
PageDifferenceEngine becomes a parent class of DifferenceEngine,
and the content type of the main slot determines which
PageDifferenceEngine subclass is used, while each slot uses the
DifferenceEngine matching its content type.

I see two alternatives:

  1. Introduce PageDiffHandler and SlotDiffHandler. Make DifferenceEngine delegate to them as appropriate. Deprecate DifferenceEngine, explaining that core code can only start using PageDiffHandler directly once all extensions have been migrated.
  1. Introduce PageDiffHandler and SlotDiffHandler. Make PageDiffHandler check whether ContentHandler::createDifferenceEngine() returns anything different from the default DifferenceEngine base class. If so, delegate to it. Deprecate DifferenceEngine and ContentHandler::createDifferenceEngine(), and start using PageDiffHandler directly in core.

Also: Instead of PageDiffHandler and SlotDiffHandler I would introduce (Revision)DiffView and (Content)DiffRenderer. I don't think we need diff handlers per slot role. The DiffRenderer should be determined by ContentHandler based on the content model, and should be oblivious of the slot role.

Copy of @Tgr's comment at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/441924#message-edcc2029a8d911da3ec680fb373cd10f9b0323a2, for further discussion here:

So one constraint to refactoring here is what callers expect a DifferenceEngine subclass. That can happen two ways, by calling ContentHandler::createDifferenceEngine or by implementing one of the hooks which have a DifferenceEngine as a parameter (GetDifferenceEngine, DifferenceEngineAfterLoadNewText, DifferenceEngineLoadTextAfterNewContentIsLoaded, DifferenceEngineMarkPatrolledLink, DifferenceEngineMarkPatrolledRCID, DifferenceEngineNewHeader, DifferenceEngineOldHeader, DifferenceEngineRenderRevisionAddParserOutput, DifferenceEngineShowDiff, DifferenceEngineShowEmptyOldContent, DifferenceEngineShowDiffPageMaybeShowMissingRevision, AbortDiffCache, DiffViewHeader, ArticleContentOnDiff). I made some quick analysis of entry points:
https://www.mediawiki.org/wiki/User:Tgr_(WMF)/DifferenceEngine/entry_points
and it's actually a lot less bad as that sounds, most of those hooks are unused in Gerrit at least (that or I messed something up) and the ones that are used only do trivial things with the diff engine (use it to get the context and such) with the sole exception of SocialProfile calling getRevisionHeader/markPatrolledLink. So probably there is not much that could be broken outside of Gerrit either.
Also, with the exception of SpecialUndelete::showDiff calling generateContentDiffBody, all the called methods are page-level, so DifferenceEngine could be used as the page-level object, all the slot-level methods could be hard-deprecated and proxied to the main slot, and the slot-level objects could be a different class.

The other contstraint is subclassing, and that is almost always used for slot-level functionality:
https://www.mediawiki.org/wiki/User:Tgr_(WMF)/DifferenceEngine/overrides
(MobileFrontend/showDiffPage and Wikibase/getDiffLang are the only exceptions).
So there are three options:

  1. go with the current inheritance-based patch, and in a follow-up step move functionality into new classes (PageDiffRenderer/SlotDiffRenderer or whatever), deprecate the existing method and turn them into proxies, have calling code switch to the new classes.
  2. declare DifferenceEngine to be the page-level handler, strip all slot-level functionality out of it, create a new slot-level class, update all extensions to subclass that class instead (a breaking change with no deprecation period; affects 8 extensions in Gerrit), deal with the MobileFrontend and Wikibase use cases in a different way.
  3. declare DifferenceEngine to be the slot-level handler, strip all page-level functionality out of it, create a new page-level class, update all hooks (except GetDifferenceEngine) and ContentHandler::createDifferenceEngine to return the new class, update all callers (a breaking change with no deprecation period; affects 9 extensions in Gerrit)

(or go for maximum breakage and do 2 and 3 both)

Technically only the first keeps B/C, and it's also the least amount of work, but leaves the biggest mess. OTOH it does not define any new interface so that can be done later in a less rushed fashion, and diff rendering could definitely use a cleaner interface (all those hooks, ugh) so I'd opt for that.

@Tgr thank you for doing the survey!

I'm ok with doing the sub-classing approach as an intermedia solution if we can't find a better alternative that can be done quickly.

How about this option:

  • Declare DifferenceEngine to be the page-level handler
  • Deprecate all slot-level functionality out of it, and delegate it to the main slot.
  • Create a new slot-level SlotDiffRenderer class.
  • Create an adapter that maps the new slot-level interface to the old slot-level methods on DifferenceEngine (which are conveniently public).
  • Instantiate one DifferenceEngine directly, and determine the SlotDiffRenderer for each slot by calling call ContentHandler::getSlotDiffRenderer().
  • ContentHandler::getSlotDiffRenderer() calls ContentHandler::createDifferenceEngine() - if that returns a subclass that has the old methods, getSlotDiffRenderer() wraps that in an adapter and returns it. Otherwise, getSlotDiffRenderer() returns the default SlotDiffRenderer.

Would this work, or am I missing something? Is this to complex to get done quickly?

Tgr added a comment.Jul 10 2018, 9:49 PM

Thanks, that's nicer than what I wrote for option 2. The only potential problem I can see is when something uses createDifferenceEngine to get an instance, and the content handler or GetDifferenceEngine hook handler subclasses it, and the subclass overrides something in a way that breaks delegation. (In Gerrit there is only one case where that would happen, with SpecialUndelete::showDiff calling generateContentDiffBody; that would have to be fixed manually.) That level of breakage is unavoidable though - option 1 has the same problem with subclassing and option 3 would result in loads of B/C breaks (I've realized that I forgot about direct object creation and apparently there's a metric ton of that).

All the hooks that receive a DifferenceEngine object operate on the page level (except DifferenceEngineShowDiff and DifferenceEngineShowEmptyOldContent which could arguably go both ways, and maybe DifferenceEngineAfterLoadNewText / DifferenceEngineLoadTextAfterNewContentIsLoaded ) so those won't need to be updated either. Sounds like the best plan so for.

Change 445200 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] [WIP] Render multi-slot diffs

https://gerrit.wikimedia.org/r/445200

Change 441924 abandoned by Gergő Tisza:
[WIP][MCR] Split DifferenceEngine into page-level and slot-level part

Reason:
Abandoning in favor of https://gerrit.wikimedia.org/r/c/mediawiki/core/ /445200

https://gerrit.wikimedia.org/r/441924

pmiazga added a subscriber: pmiazga.Aug 6 2018, 8:52 PM
Tgr added a comment.Aug 13 2018, 1:54 PM

Went through the high and medium priority steps of the test plan.

  • MobileFrontend diffs break (use TextSlotDiffRenderer instead of InlineDifferenceEngine). MobileFrontend hand-creates the InlineDiffRenderer so it does not get used for the main slot; I don't think the diff patch can be improved to prevent that from happening (DifferenceEngine::getSlotDiffRenderers() could check its own class and replace TextSlotDiffRenderers with itself when it's not the base class, but that's scary and would probably break way more things). It needs to be fixed in MobileFrontend (and other extensions using a similar pattern).
  • Diffing with non-existent old revisions is somewhat broken due to a previous patch (T201218: Viewing page's first revision via diff gives error).
  • There are a bunch of callers which pass in a content object so non-main slots get ignored (edit conflict, for example). I think that's outside scope (better to keep patches small when possible) but should be tracked and eventually converted.
  • I get an exception when I try to make a Wikibase slot in a non-Wikibase namespace. I think that's expected.
  • Couldn't figure out what "diff should reflect user language" for Wikibase means.
  • Could not test VisualEditor (npm has random failures, probably because it hogs up all the memory and gets killed, and I haven't found a good way to detect or repair broken installs). Will have to test it on Beta.

In short, I think the diff patch is OK, but MobileFrontend needs to be updated first to use a diff engine customization method that core can understand.

Went through the high and medium priority steps of the test plan.

  • MobileFrontend diffs break (use TextSlotDiffRenderer instead of InlineDifferenceEngine). MobileFrontend hand-creates the InlineDiffRenderer so it does not get used for the main slot; I don't think the diff patch can be improved to prevent that from happening (DifferenceEngine::getSlotDiffRenderers() could check its own class and replace TextSlotDiffRenderers with itself when it's not the base class, but that's scary and would probably break way more things). It needs to be fixed in MobileFrontend (and other extensions using a similar pattern).

We'll need a ticket for that, and we need to figure out who can do this, and when.

How urgent is this? Do I understand correctly that this causes fatal errors when clicking certain links on the live site?

  • There are a bunch of callers which pass in a content object so non-main slots get ignored (edit conflict, for example). I think that's outside scope (better to keep patches small when possible) but should be tracked and eventually converted.

I agree, but then we need a task for this. Can probably go on the MCR backlog pile, doesn't lock SDC -- unless one of these things that pass Content objects needs to work with SDC.

  • I get an exception when I try to make a Wikibase slot in a non-Wikibase namespace. I think that's expected.

Hm yea, it's hard to test this before we have MediaInfo integrated with MCR. We can probably wait for that to happen, and then test that on beta.

  • Couldn't figure out what "diff should reflect user language" for Wikibase means.

Wikibase diffs should be rendered in the user language. This mainly affects value rendering: labels of referenced items, number format, etc. Probably not relevant for diffs of label or description.

  • Could not test VisualEditor (npm has random failures, probably because it hogs up all the memory and gets killed, and I haven't found a good way to detect or repair broken installs). Will have to test it on Beta.

ok

Tgr added a comment.Aug 13 2018, 3:21 PM

We'll need a ticket for that, and we need to figure out who can do this, and when.

T201842: Use ContentHandler to obtain DifferenceEngine in MobileFrontend

How urgent is this? Do I understand correctly that this causes fatal errors when clicking certain links on the live site?

No, it invokes DifferenceEngine's error handling (whereas previously these pages were special-cased to not show error messages). I added examples to T201218: Viewing page's first revision via diff gives error. The difference is not huge so I don't think it is super urgent.

I agree, but then we need a task for this. Can probably go on the MCR backlog pile, doesn't block SDC -- unless one of these things that pass Content objects needs to work with SDC.

T201848: Make DifferenceEngine callers pass revisions, not contents

Wikibase diffs should be rendered in the user language. This mainly affects value rendering: labels of referenced items, number format, etc. Probably not relevant for diffs of label or description.

Yeah, I actually did figure out in the last moment (and it worked), just forgot to update the comment.

Change 441925 abandoned by Gergő Tisza:
[WIP][MCR] Show diffs for all slots

Reason:
Abandoning in favor of https://gerrit.wikimedia.org/r/c/mediawiki/core/ /445200

https://gerrit.wikimedia.org/r/441925

I also did couple of tests before and I found that:

  • MobileFrontend uses different DiffEngine (it's hardcoded), this is something you have to talk to Reading Web
  • I also remember there is a work on a different engine that nicely shows when stuff is moved around (instead of just showing added/removed) it shows "moved" with an arrow that points to the new location (@Jdlrobson will know more)
  • at least on my machine populateContentTablesthrows strange mysql errors:
Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback') in /vagrant/mediawiki/includes/libs/rdbms/lbfactory/LBFactory.php on line 707
Wikimedia\Rdbms\DBTransactionStateError from line 1298 of /vagrant/mediawiki/includes/libs/rdbms/database/Database.php: Cannot execute query from Wikimedia\Rdbms\Database::ping while transaction status is ERROR.
  • When creating a test revision:
>>> $pu->setContent( 'slot', $c ); // 'slot' is the slot name, should be the same across revisions
InvalidArgumentException with a message 'Only the main slot is presently supported

I couldn't test the second slot as I cannot insert another slot (the error above)

I checked some diffs, also diffs with no changes/removing/undeleting stuff - looked fine

I didn't manage to do anything from Low&Medium priority,

Tgr added a comment.Aug 13 2018, 8:36 PM

Thanks!

  • MobileFrontend uses different DiffEngine (it's hardcoded), this is something you have to talk to Reading Web

Yeah, that's tracked in T201842.

  • I also remember there is a work on a different engine that nicely shows when stuff is moved around (instead of just showing added/removed) it shows "moved" with an arrow that points to the new location (@Jdlrobson will know more)

AFAIK that's on a much lower level (inside the wikidiff2 module).

Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-rollback') in /vagrant/mediawiki/includes/libs/rdbms/lbfactory/LBFactory.php on line 707
Wikimedia\Rdbms\DBTransactionStateError from line 1298 of /vagrant/mediawiki/includes/libs/rdbms/database/Database.php: Cannot execute query from Wikimedia\Rdbms\Database::ping while transaction status is ERROR.

Seems like an error handler itself erroring out. Doesn't really tell what the original error is.

>>> $pu->setContent( 'slot', $c ); // 'slot' is the slot name, should be the same across revisions
 InvalidArgumentException with a message 'Only the main slot is presently supported

Yeah, that's my mistake. Used to work but not anymore. I just used sudo $pu->slotsUpdate->modifyContent( 'slot', $c ) instead.

Moving back to review as all dependencies (452470, 452413) are also up for review.

Change 445200 merged by jenkins-bot:
[mediawiki/core@master] [MCR] Render multi-slot diffs

https://gerrit.wikimedia.org/r/445200