Page MenuHomePhabricator

VisualEditor causes extra edit conflict noise when headings include whitespace
Closed, DeclinedPublic

Description

Steps to reproduce:

  • Given that I'm editing a page which includes whitespace around section titles, e.g. "== section name =="
  • Cause an edit conflict in regular article text.

The conflict will include whitespace changes, it seems that VE transforms the headings to "==section name==".
In the example below, the only text edited was "mine" and "other":

image.png (474×764 px, 21 KB)

Event Timeline

Could you please expand on steps to reproduce.

Trying to reproduce on enwiki or ruwiki doesn't make VE trim the whitespaces. If it's your local setup - do you have restbase?

matmarex subscribed.

I think this is basically the expected behavior.

As I understand it, Parsoid HTML doesn't include metadata about the internal whitespace of wikitext headings, and Parsoid relies on selser to avoid these dirty diffs (selective serialization – detecting unchanged fragments of the HTML and using the original wikitext for those fragments). Selser is only available when you're using RESTBase, because otherwise Parsoid doesn't receive the original HTML to compare to the new HTML. (There just isn't a parameter for it in the API; if there was, we would use it in VE.)

Tagging Parsoid so that someone can verify my poor explanation :)

I see, thank you. I found that a local VE edit actually does introduce this whitespace change as you explained, so it isn't a conflict issue but a general Parsoid thing. I'll close this task...

I think this is basically the expected behavior.

As I understand it, Parsoid HTML doesn't include metadata about the internal whitespace of wikitext headings, and Parsoid relies on selser to avoid these dirty diffs (selective serialization – detecting unchanged fragments of the HTML and using the original wikitext for those fragments). Selser is only available when you're using RESTBase, because otherwise Parsoid doesn't receive the original HTML to compare to the new HTML. (There just isn't a parameter for it in the API; if there was, we would use it in VE.)

Tagging Parsoid so that someone can verify my poor explanation :)

Yes, correct. But, Parsoid internally does reparse the page to get original HTML for running selser (I will have to look at code to see if there are exceptions to that), but, Parsoid parses aren't always deterministic ( T93715 ) but are probably more deterministic than not with Parsoid/PHP, so, even in the no-RESTBase case, dirty diffs should be minimized, but yes, we don't guarantee similar serialization behavior. Just to satisfy my own curiosity, I might check play around to see why this happened. But, yes, declining it is the right resolution for the task.

But, Parsoid internally does reparse the page to get original HTML for running selser (I will have to look at code to see if there are exceptions to that), but, Parsoid parses aren't always deterministic ( T93715 ) but are probably more deterministic than not with Parsoid/PHP

See https://github.com/wikimedia/parsoid/blob/master/src/WikitextContentModelHandler.php#L24-L58