VE removes space in wikitext in list elements starting with wikilinks
Closed, Resolved · Public

Description

As originally reported in https://de.wikipedia.org/wiki/Wikipedia:Technik/Text/Edit/VisualEditor/R%C3%BCckmeldungen#Leerzeichen_fehlt:_*Liste :

When editing list elements starting with a wikilink, VE removes the space between "*" and "[[":
https://de.wikipedia.org/w/index.php?title=Benutzer:Tkarcher/Spielwiese&diff=next&oldid=196793725

(Expected wikitext output would be * [[Test]], not *[[Test]])
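
To check other pages for the same regression, a minimal sketch along these lines may help. It compares the wikitext before and after an edit and reports list items whose marker lost its following space. The function name `lost_leading_spaces` and the single-space pattern are assumptions made for illustration; none of this comes from VE or Parsoid.

```python
import re

# A wikitext list item whose marker (*, #, : or ;) is followed by exactly one space.
LIST_ITEM = re.compile(r"^([*#:;]+) (\S.*)$")


def lost_leading_spaces(old_wikitext: str, new_wikitext: str) -> list:
    """Return the old list-item lines that reappear in the new text minus the space."""
    new_lines = set(new_wikitext.splitlines())
    lost = []
    for line in old_wikitext.splitlines():
        match = LIST_ITEM.match(line)
        if not match:
            continue
        # The same item with the space after the marker removed.
        collapsed = match.group(1) + match.group(2)
        if line not in new_lines and collapsed in new_lines:
            lost.append(line)
    return lost


if __name__ == "__main__":
    before = "* [[Test]]\n* plain item\n"
    after = "*[[Test]]\n* plain item\n"
    print(lost_leading_spaces(before, after))
```

This only catches items that are otherwise unchanged, so it would miss an item whose content was also edited in the same save.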

Event Timeline

Tkarcher created this task. Feb 13 2020, 9:14 PM
Restricted Application added a subscriber: Aklapper. Feb 13 2020, 9:14 PM
JTannerWMF moved this task from To Triage to Triaged on the VisualEditor board. Mar 10 2020, 3:33 PM
JTannerWMF added a project: Parsoid.
JTannerWMF added a subscriber: JTannerWMF.

Tagging Parsoid for visibility.

LGoto triaged this task as Low priority. Mar 13 2020, 4:12 PM
LGoto moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.
Pols12 added a subscriber: Pols12. Edited Jun 6 2020, 8:15 PM

Please look at this diff.

In list 1, the leading space was removed each time I formatted the first word of a list item.
Note that when I remove the formatting from the first word, the leading space remains.

In list 2, I only formatted the first word of one item and added a new paragraph before the list. Result: every item whose first word was formatted lost its leading space, if it had one.

This issue in list 2 can cause unexpectedly large diffs: example on fr.wp (look at the diff after the == Œuvre == addition).

ssastry raised the priority of this task from Low to Medium. Jun 8 2020, 10:59 PM

Change 604035 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] [WIP] Preserve leading space, even for non-text nodes

https://gerrit.wikimedia.org/r/604035

Arlolra claimed this task. Jun 9 2020, 5:06 PM
Arlolra added a comment. Edited Jun 9 2020, 5:41 PM

Please look at this diff.

Here's the same edit made today,
https://www.mediawiki.org/w/index.php?title=User:Arlolra/sandbox&type=revision&diff=3902422&oldid=3902417&diffmode=source

It's unclear why untouched lines have diffs.

The patch in T245206#6206247 fixes the cases from the expected diff today.

Are there steps to reproduce the other part?

Change 604035 merged by jenkins-bot:
[mediawiki/services/parsoid@master] html2wt: Newly inserted elements shouldn't disrupt whitespace heuristics

https://gerrit.wikimedia.org/r/604035

Pols12 added a comment. Edited Jun 9 2020, 10:01 PM

Are there steps to reproduce the other part?

Yes: to reproduce the list 2 bug, you must add a new paragraph just before the list (in addition to formatting the first word of one element).

Thanks!

VE seems to be duplicating the id when inserting that paragraph,

<p id="mwDg"></p><p id="mwDg">list 2 (all elements starts with space, some have first word formatted):</p>

which, in turn, makes Parsoid think the list is newly inserted,

<p data-parsoid='{"dsr":[389,461,0,0]}' data-parsoid-diff='{"id":15580374,"diff":["children-changed","subtree-changed"]}'></p><meta typeof="mw:DiffMarker/deleted" data-parsoid="{}"/><meta typeof="mw:DiffMarker/deleted" data-parsoid="{}"/><meta typeof="mw:DiffMarker/deleted" data-parsoid="{}"/><p data-parsoid='{"dsr":[389,461,0,0]}' data-parsoid-diff='{"id":15580374,"diff":["inserted"]}'>list 2 (all elements starts with space, some have first word formatted):</p><meta typeof="mw:DiffMarker/inserted" data-parsoid="{}"/>
<ul data-parsoid='{"dsr":[462,666,0,0]}' data-parsoid-diff='{"id":15580374,"diff":["inserted"]}'>
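
A quick way to confirm the duplicated id in HTML coming out of the editor is to count `id` attributes. The following is a minimal sketch using only Python's standard `html.parser`; the helper names `IdCollector` and `duplicate_ids` are made up for this example and are not part of VE or Parsoid.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id attribute value appears in the fed HTML."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html: str) -> list:
    """Return the sorted id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(value for value, count in collector.ids.items() if count > 1)


if __name__ == "__main__":
    snippet = '<p id="mwDg"></p><p id="mwDg">list 2</p>'
    print(duplicate_ids(snippet))
```

Any id reported here would be a candidate for the kind of confusion described above, where the original node can no longer be told apart from the inserted one.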

Continuing from some of the discussion in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/604035,

The space is considered a separator for an element node and emitSepForNode will restore it on its own. Without it, you end up with duplicated spaces.

we only reuse original separators if we're !$state->inModifiedContent,
https://github.com/wikimedia/parsoid/blob/dd05f9126175f6f4651f80b4f387fc33cb2c38ea/src/Html2Wt/Separators.php#L626
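
As a rough illustration of that rule, here is a toy model in Python. It is not Parsoid's code: the function names and the empty-string fallback are assumptions, chosen only to mirror the symptom reported in this task (the space after `*` disappearing once the item counts as modified).

```python
def pick_separator(original_sep, in_modified_content, fallback_sep=""):
    """Reuse the separator from the original source only outside modified content.

    fallback_sep is a hypothetical default; the real serializer derives its
    separators from more involved constraints.
    """
    if original_sep is not None and not in_modified_content:
        return original_sep
    return fallback_sep


def serialize_list_item(content, original_sep, in_modified_content):
    """Emit a single bullet item using the chosen separator."""
    return "*" + pick_separator(original_sep, in_modified_content) + content


if __name__ == "__main__":
    # Untouched item: the original " " separator is reused.
    print(serialize_list_item("[[Test]]", " ", in_modified_content=False))
    # Item treated as modified or newly inserted: the fallback is used instead.
    print(serialize_list_item("[[Test]]", " ", in_modified_content=True))
```

In this model, a list wrongly flagged as newly inserted always takes the fallback path, which matches the list 2 behavior reported above.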

Change 605678 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/vendor@master] Bump Parsoid to 0.12.0-a17

https://gerrit.wikimedia.org/r/605678

Change 605678 merged by jenkins-bot:
[mediawiki/vendor@master] Bump Parsoid to 0.12.0-a17

https://gerrit.wikimedia.org/r/605678

Arlolra closed this task as Resolved. Jun 29 2020, 9:00 PM

VE seems to be duplicating the id when inserting that paragraph,

Opened up a broader discussion for that in T256687

Restricted Application added a project: User-Ryasmeen. Jun 29 2020, 9:00 PM
Pols12 added a comment. Edited Aug 20 2020, 11:15 PM

I’m not sure I understand what happened in this diff on French Wiktionary: the item starting with ''Contactée lost its leading space, but the following one did not…
Should I open a new task?

EDIT: another similar example (where the list hasn’t been deliberately edited at all)