See session below that illustrates the problem.
[subbu@earth:~/work/wmf/parsoid] echo '<div>foo' > /tmp/wt [subbu@earth:~/work/wmf/parsoid] php bin/parse.php < /tmp/wt > /tmp/old.html [subbu@earth:~/work/wmf/parsoid] sed 's/<\/div>/<\/div><p>boo<\/p>/g' < /tmp/old.html > /tmp/new.html [subbu@earth:~/work/wmf/parsoid] php bin/parse.php --html2wt < /tmp/new.html <div>fooboo [subbu@earth:~/work/wmf/parsoid] php bin/parse.php --selser --oldtextfile /tmp/wt --oldhtmlfile /tmp/old.html < /tmp/new.html <div>fooboo [subbu@earth:~/work/wmf/parsoid] sed 's/<\/div>/<\/div><p>boo<\/p>/g;s/"autoInsertedEnd":true/"autoInsertedEnd":false/' < /tmp/old.html > /tmp/new.html [subbu@earth:~/work/wmf/parsoid] php bin/parse.php --html2wt < /tmp/new.html <div>foo</div>boo [subbu@earth:~/work/wmf/parsoid] php bin/parse.php --selser --oldtextfile /tmp/wt --oldhtmlfile /tmp/old.html < /tmp/new.html <div>foo</div>boo
Original report my @matmarex
Parsoid seems to never insert the closing tag for an unclosed element. This is usually good since it avoids dirty diffs, but in some cases adding it is necessary to preserve the layout.
For example, consider this VE edit, where there is a <div ...> element on the page that is missing the closing tag. I typed a new paragraph after the end of that element in VE, but after saving the edit, it appeared inside of it instead (and a paragraph break is missing) – a </div> tag should have been added:
The same problem occurs with an unclosed table {| ... where the closing |} is not inserted.