See: https://meta.wikimedia.org/w/index.php?title=Community_Engagement_Insights/2018_Report/Communications_Department&diff=18339687&oldid=18339625
The next diff shows what should have happened. This was all very simple editing: place the cursor in the middle of a paragraph, press Return, add list formatting, and remove a few words.
This is still reproducible on that page.
Looks like we send HTML like this to Parsoid:
<p id="mwUg">In examining this question by gender, we can observe some differences. We cannot say whether they are significant. When asked about using media channels for learning about features and services from the Wikimedia Foundation we observed the following: </p><ul><li><p id="mwUg">68% of males reported using at least one channel. 80% of females reported using at least one channel.</p></li><li><p id="mwUg">A higher proportion of males used Wikimedia projects pages. Female editors reported a higher use of mailing lists, social media, the Wikimedia Foundation blog, and conferences.</p></li></ul>
Note how all of the paragraphs have the same id="mwUg" attribute.
I don't know whether this is a VE bug or a Parsoid bug.
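To make the duplicated ids in the HTML above easy to spot, here is a minimal sketch of a check that lists every id value appearing on more than one opening tag. The function name findDuplicateIds is hypothetical and is not part of VisualEditor or Parsoid. It scans an HTML string with regular expressions, which is a simplification: it assumes quoted attribute values with no ">" inside them, and a real check would walk the DOM instead.

```javascript
// Hypothetical helper, not a VisualEditor or Parsoid API.
// Returns the id values that occur on more than one opening tag.
function findDuplicateIds(html) {
  const counts = new Map();
  // Collect opening tags only (skip closing tags and comments/doctype).
  const openingTags = html.match(/<[^/!][^>]*>/g) || [];
  for (const tag of openingTags) {
    // Require whitespace before "id=" so that e.g. data-id is not matched.
    const match = /\sid=(["'])(.*?)\1/.exec(tag);
    if (match) {
      counts.set(match[2], (counts.get(match[2]) || 0) + 1);
    }
  }
  // Keep only the ids seen at least twice.
  return [...counts].filter(([, n]) => n > 1).map(([id]) => id);
}
```

It could be run against a string such as the innerHTML of the document that is about to be saved; an empty array would mean no id is repeated.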
@Esanders I think data-parsoid/data-mw are intended never to affect visual rendering, so there should be no reason to duplicate the id attributes on copied content. For cut-and-paste, sure, preserve the original ID, but any later paste should assign new IDs. (Or just do a pass over the output before you send it to Parsoid and delete all but the first appearance of a given id.)
There are probably corner cases (headings were mentioned) where for some reason copying the ID gives better behavior; we should add additional *non-data-parsoid*/*non-data-mw* attributes to control that behavior.
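The "delete all but the first appearance of a given id" pass suggested above could look roughly like this. It is only a sketch under stated assumptions: dropDuplicateIds is a hypothetical name, not an existing VisualEditor or Parsoid function, and it works on an HTML string with regular expressions (assuming quoted attribute values with no ">" inside them) rather than on the DOM that VE actually holds. It also does nothing about the corner cases, such as headings, where keeping the copied id might be preferable.

```javascript
// Hypothetical helper, not a VisualEditor or Parsoid API.
// Keeps the first occurrence of each id attribute and strips later repeats.
function dropDuplicateIds(html) {
  const seen = new Set();
  // Visit each tag, then look for an id attribute inside it.
  return html.replace(/<[^>]+>/g, (tag) =>
    // Require whitespace before "id=" so that e.g. data-id is left alone.
    tag.replace(/\s+id=(["'])(.*?)\1/g, (attr, quote, id) => {
      if (seen.has(id)) {
        return ""; // repeated id: remove the whole attribute
      }
      seen.add(id);
      return attr; // first occurrence: keep unchanged
    })
  );
}
```

Applied to the HTML quoted earlier in this task, the first paragraph would keep id="mwUg" and the two paragraphs inside the list items would lose it, so the serializer would no longer see the same id three times.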
I cannot reproduce this bug right now. I tried on mediawiki.org and on meta on the same page (and accidentally even saved one of my test edits). So, either this was some peculiar edge case on the page or this has been resolved in the interim because of changes in VE or Parsoid.
My attempt to reproduce:
>> ve.init.target.doc.body.innerHTML
"<p id=\"mwAg\">This is a paragraph. This is sentence two of the paragraph.</p>"
>> ve.init.target.docToSave.body.innerHTML
"<p id=\"mwAg\">This is a paragraph. </p><ul><li><p id=\"mwAg\">This is sentence </p></li><li><p id=\"mwAg\">of the paragraph.</p></li></ul>"
That looks fine. I think we must have fixed this sometime since this bug was filed.
This was probably some edge case. If you look closely at the diff, there were some dirty diffs of the gallery as well, which indicates that something else was going on. We cannot reproduce this, so I am closing it out for now. If this were a common bug, we would have lots of reports and easy ways to reproduce it.