In T157418 (RFC: Make some aspects of Tidy's whitespace stripping behavior part of wikitext parsing "spec"), we made leading/trailing whitespace in list items, table cells, and headings insignificant, so it no longer makes it out to the HTML output. However, this means that Parsoid no longer has any information about the original whitespace in these wikitext constructs.
This does not matter for unedited content or newly added content.
- For unedited content, the selective serialization algorithm preserves original wikitext as is.
- For newly added content, wikitext norms (as coded in Parsoid's html -> wt code) require that whitespace be added where appropriate, for readability.
All good so far. However, for original list items / headings / table cells that got edited, without any additional work, we'll start seeing "dirty diffs" since Parsoid will start trimming leading/trailing whitespace from these.
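To make the three cases concrete, here is a minimal Python sketch of a toy list serializer. This is not Parsoid's code: `Item`, `serialize_item`, and the `edited` flag are hypothetical names made up for illustration, and the exact output for edited items (trimmed text directly after the bullet) is an assumption about what "trimming" looks like in practice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Toy stand-in for a list item (hypothetical, not a Parsoid type)."""
    text: str                  # content as exposed in HTML, whitespace already trimmed
    src: Optional[str] = None  # original wikitext line; None for newly added items
    edited: bool = False       # True if an HTML client changed the content


def serialize_item(item: Item) -> str:
    if item.src is not None and not item.edited:
        # Unedited: reuse the original wikitext verbatim.
        return item.src
    if item.src is None:
        # Newly added: apply the readable norm (space after the bullet).
        return "* " + item.text
    # Edited original (assumed status quo): the HTML no longer carries the
    # original whitespace, so the trimmed text follows the bullet directly.
    return "*" + item.text


original = ["*  apples ", "* pears"]
items = [
    Item(text="apples", src=original[0]),                     # untouched
    Item(text="green pears", src=original[1], edited=True),   # edited
    Item(text="plums"),                                       # newly added
]
output = [serialize_item(i) for i in items]
```

In this sketch the untouched item round-trips exactly and the new item gets a space after the bullet, but the edited item goes from `* pears` to `*green pears`: the editor only changed a word, yet the diff also shows a whitespace change.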
But, there is no clear solution that works well in all cases. Here are 3 possibilities:
- If we leave things as is, this will cause dirty diffs in edited original content as above.
- If we add readable whitespace always (for all content, not just newly added content), this will cause dirty diffs in the other direction. For example, list items that didn't have whitespace after bullets will get it.
- If we add additional logic to Parsoid to figure out how the original content looked and preserve it, that whitespace will get locked in forever for all edits done via VisualEditor (or other such HTML clients). There is no available mechanism for these clients to tell Parsoid to add/remove that whitespace. Editors wishing to alter whitespace would have to directly edit wikitext in a source editor.
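The three possibilities can be compared side by side with a small Python sketch. Everything here is hypothetical and only illustrates the trade-off: the function names are invented, the sketch handles single-level `*` bullets only, and recovering whitespace with a regex on the original source line is an assumption, not a description of how Parsoid would do it.

```python
import re


def keep_trimmed(original_src: str, new_text: str) -> str:
    # Possibility 1: leave things as is; edited items lose their whitespace.
    return "*" + new_text


def always_readable(original_src: str, new_text: str) -> str:
    # Possibility 2: always add a space after the bullet.
    return "* " + new_text


def preserve_original(original_src: str, new_text: str) -> str:
    # Possibility 3: recover the whitespace around the old content from the
    # original source line and reapply it around the new content.
    m = re.fullmatch(r"\*([ \t]*)(.*?)([ \t]*)", original_src)
    if m is None:
        # Not a single-level bullet line: fall back to the readable norm.
        return always_readable(original_src, new_text)
    lead, trail = m.group(1), m.group(3)
    return "*" + lead + new_text + trail


sources = ["*tight", "*  wide "]
results = {
    policy.__name__: [policy(src, "edited") for src in sources]
    for policy in (keep_trimmed, always_readable, preserve_original)
}
```

Only `preserve_original` reproduces the original spacing for both source lines (`*edited` and `*  edited `), which is why it avoids dirty diffs. It also shows the lock-in: the spacing comes entirely from the old source, so an HTML client has no input that could change it.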
Thoughts? My hunch is that we'll probably gravitate towards solution 3 (preserving the original whitespace).