Context
We saw failing tests in Cite triggered by the change made in Maintain keyIndexes when rebuilding InternalList ( 1196452 )
See https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php81/45819/console
Steps
At this point it's not entirely clear if the tests fail due to their complex setup, or if there's a use case that we're breaking. So this task is about To make sure it's not a latter we're reviewing and improving the tests affected by this change.
- Improve failing ve.dm.MWReferenceModel test to narrow down the issue
- Improve failing ve.dm.Transaction test test to narrow down the issue
- https://gerrit.wikimedia.org/r/c/1199001
- https://gerrit.wikimedia.org/r/c/1199283
- https://gerrit.wikimedia.org/r/c/1199828
- Figure out why the test is failing and if it's necessary
- Change the test
- Improve test coverage in the VE library
- https://gerrit.wikimedia.org/r/c/VisualEditor/VisualEditor/+/1202061
- https://gerrit.wikimedia.org/r/c/VisualEditor/VisualEditor/+/1202074
- https://gerrit.wikimedia.org/r/c/VisualEditor/VisualEditor/+/1202119
Insights:
The failing ve.dm.MWReferenceModel test was setup in a quite complicated way. So I'm reducing that complexity to narrow it down on what really needs to be tests here.
It's a bit more complicated with the failing transaction test, but here's my understanding so far:
- The transaction test merges two documents and checks if the transactions created for that merge make sense
- Merging two documents will also trigger the merge of the InternalList of each document
- The later is done in ve.dm.InternalList.merge()
- In there the code checks for duplicate keys used in both lists and tries to suggest a solution on how to deal with these by providing a mapping
- Here the test fails because the ref that should get merged into the list get's a different listkey than expected
- Note, that the listKey of the new ref is still changed as expected to auto/1 nevertheless
- Also so InternalItem of the ref that should be merged into the doc is not merged anymore but left out
InternalList.merge() input
I can see different input for ve.dm.InternalList.merge() when the ve.dm.Transaction test is executed with or without 1196452. Especially when it comes to the keyIndexes.
Without the change
With the change
What's weird here is that there are keys with mwReferences// and with mwReferences/ and also mwReferences/null seems weird.
( Fixed in https://gerrit.wikimedia.org/r/c/1199828 )
Why is the InternalItem removed?
According to a comment in the code in ve.dm.InternalList.merge() checks for duplicate keys and discards the other InternalItem when duplicates are found. The linked references will be re-mapped to the duplicate in the original. This also means that the content of the reference that gets merged in will be lost. With that in mind the behavior might be more correct with 1196452.
OTOH:
There's another comment in the code in ve.dm.TransactionBuilder.newFromDocumentInsertion() that states:
any differences between internal items that doc and newDoc have in common will be resolved in newDoc's favor
So both comments contradict each other and without 1196452 it seems that neither of them is true when we assume that the keyIndexes is unintentionally incomplete.
But also with 1196452 only the former comment is true but not the latter.
What's the test newFromDocumentInsertion testing?
ve.dm.TransactionBuilder.newFromDocumentInsertion is mainly used for these different use cases:
ve.ce.ClipboardHandler
- newFromDocumentInsertion is called in afterPasteInsertExternalData to merge the data.
- Before afterPasteInsertExternalData is called it's made sure that listKeys don't conflict by calling ve.dm.LinearData.prototype.remapInternalListKeys
- So in this use case we will not run into conflicting listKeys it seems and testing the correct remapping of these in newFromDocumentInsertion should not be a valid use case
ve.dm.VisualDiff
- newFromDocumentInsertion is used to merge the old version of the document into the new version, so that removes references can be made visible in the diff.
- So we need to make sure, that references that do not exist in the new version of the doc are merge from the old version.
- When it comes to the InternalList this means, that we might want to retrieve lost or disconnected InternalListItems. Merging the two versions of the document will give us exactly that by adding these items back and restoring the mapping.
- But: Conflicting listKeys should not be an issue here. If there's a conflict, it means the listKey exists on the old and new version, we don't care about these.
- So in that case it does not matter if keyIndexes is set up correctly or not. InternalList.merge() will make sure that missing keys are added.
- Furthermore: When the keyIndexes are set, InternalList.merge() will also make sure, that the items of the old doc do not override anything in the new doc but instead will be remapped to their new content.
ve.dm.MWReferenceModel.insertInternalItem & ve.dm.MWReferenceModel.updateInternalItem
- These methods are used to update/insert content that goes into the InternalItemNodes.
- That content is the content of the references.
- It's not supported to use references in that content so nesting is not possible.
- Therefor conflicts with listKeys between the content that's inserted and the main document is not possible.
ve.ui.MWGalleryDialog.insertOrUpdateNode
- The caption on galleries can be edited in VE in a separate dialog
- When saving the document edited in the dialog will be merged into the main document
- References can be used in both documents I played with this a lot and saw no regressions
ve.dm.MWImageModel.updateImageNode
- Same as the ve.ui.MWGalleryDialog.insertOrUpdateNode I saw no regressions
ve.ui.TableAction.importTable
- This method is used by the ve.ce.ClipboardHandler in there remapInternalListKeys is also called before merging anything
- So I also do not expect regressions here
Cloned InternalLists vs tests
In all of the above cases newFromDocumentInsertion is called in places where the inputs were at least once cloned from a document or from data. So in either case, the InternalList does not have any information about keyIndexes because it's not copied.
So ve.dm.InternalList.merge() would never run into issues with conflicting listKeys. Therefore testing newFromDocumentInsertion with populated InternalItems and keyIndexes seems to be something that's not covering any real world cases.
But Maintain keyIndexes when rebuilding InternalList ( 1196452 ) changes the above assumptions. So now it's made sure, that the InternalList is always intact with keyIndexes. Conflicts between listKeys would now be handled during merge and might lead to different and unexpected outcomes.
ve.ce.ClipboardHandler
This should still not be an issue in ve.ce.ClipboardHandler. As already mentioned ve.dm.LinearData.prototype.remapInternalListKeys make sure that there are no conflicts when merging. So if the new code make sure that keyIndexes stay intact it should not be a problem for the ClipboardHandler.
ve.dm.VisualDiff
As stated above the diffing is totally fine if InternalList.merge() is called with correctly set keyIndexes. It might even get a better setup doc to work on, since only missing content is merged.
Conclusion
It seems a bug to me, that the keyIndexes are lost while cloning InternalList. InternalList.merge() seems to depend on it and it seems that the code using it also expects that. That's why the clipboard handler remaps indexes to avoid conflicts and also for the diff it seems to make more sense to work with a document that includes removed references but also just updates the others.
IMO the current test setup in the Cite extension testing newFromDocumentInsertion tests something that's not realistic and should be changed. It might make more sense to test the behavior with populated indexes on both sides or have another test that includes the remapping step happening in the ClipboardHandler.



