Spaces added before colons and links in some cases
Closed, Resolved (Public)

Description

See https://fr.wikipedia.org/w/index.php?title=COE&diff=prev&oldid=93166267 for example.

I only removed the link, but VisualEditor added a space before the colons.

Another example: https://fr.wikipedia.org/w/index.php?title=MSS&diff=prev&oldid=93166411

There were already two spaces, but VisualEditor added a third.


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=51024

bzimport added a project: Parsoid. Via Conduit, Nov 22 2014, 1:43 AM
bzimport set Reference to bz48570.
Ltrlg created this task. Via Legacy, May 17 2013, 7:43 AM
Elitre added a comment. Via Conduit, Jun 23 2013, 10:33 AM

The same happens when you edit the first word of a sequence "word + word + wikilink": an unnecessary space is added before the link. Examples:
http://it.wikipedia.org/w/index.php?title=Amazon.com&diff=prev&oldid=59604951 -
http://it.wikipedia.org/w/index.php?title=Google&diff=59622535&oldid=59377529 (<<talmente popolare ch in [[lingua inglese|inglese]] >>).
This last one was deliberate vandalism on my part to test the issue; I rolled it back afterwards.

Jdforrester-WMF added a comment. Via Conduit, Jun 23 2013, 7:59 PM

(In reply to comment #0)

See https://fr.wikipedia.org/w/index.php?title=COE&diff=prev&oldid=93166267
for example.

I only removed the link, but VisualEditor added a space before the colons.

Another example:
https://fr.wikipedia.org/w/index.php?title=MSS&diff=prev&oldid=93166411

There were already two spaces, but VisualEditor added a third.

These have now been fixed; sorry for the difficulty.

(In reply to comment #1)

The same happens when you edit the first word of a sequence "word + word +
wikilink": an unnecessary space is added before the link. Examples:
http://it.wikipedia.org/w/index.php?title=Amazon.com&diff=prev&oldid=59604951 -
http://it.wikipedia.org/w/index.php?title=Google&diff=59622535&oldid=59377529
(<<talmente popolare ch in [[lingua inglese|inglese]] >>).
This last one was deliberate vandalism on my part to test the issue; I rolled
it back afterwards.

I *think* this is also fixed but I'm not absolutely sure; please re-open if you can still reproduce this.

Elitre added a comment. Via Conduit, Aug 2 2013, 11:11 AM

We unfortunately can:
http://en.wikipedia.org/w/index.php?title=Little_Mosque_on_the_Prairie&diff=prev&oldid=566820635

It seems to be triggered by removing an apostrophe (').
Another user verified this on http://en.wikipedia.org/wiki/Indian_Bank (in preview mode only, without saving): removing the ' after "Binny&Co" causes a space to be added right before the next wikilink.

Thryduulf added a comment. Via Conduit, Aug 2 2013, 11:23 AM

It's not just ' either; I also saw it when removing - and " in my sandbox: https://en.wikipedia.org/w/index.php?title=User%3AThryduulf%2Fsandbox&diff=566841354&oldid=566840716

Spaces were added before internal and external links, except in the first instance, where a space was added before a colon instead.

Thryduulf added a comment. Via Conduit, Aug 2 2013, 11:26 AM

Could the fix to bug 2035 be related to this?

Thryduulf added a comment. Via Conduit, Aug 2 2013, 11:26 AM

Ignore last comment, should be: Could the fix to bug 52035 be related to this?

Esanders added a comment. Via Conduit, Aug 19 2013, 4:51 PM

This looks like 1. a Parsoid issue, 2. fixed in master. Reassigning to Parsoid to confirm.

Ltrlg added a comment. Via Conduit, Sep 17 2013, 8:09 PM

See also bug 51024

Thryduulf added a comment. Via Conduit, Sep 17 2013, 10:01 PM

*** Bug 51024 has been marked as a duplicate of this bug. ***

Thryduulf added a comment. Via Conduit, Sep 17 2013, 10:02 PM

Bug 51024 has several more examples of this bug occurring if anyone is looking for them.

ssastry added a comment. Via Conduit, Sep 17 2013, 11:42 PM

Confirmed on simple test case. Investigating.

[subbu@earth tests] echo "foo : bar" > /tmp/wt
[subbu@earth tests] node parse < /tmp/wt > /tmp/old.html
[subbu@earth tests] cat /tmp/old.html | sed 's/bar/bars/g;' > /tmp/new.html
[subbu@earth tests] node parse --html2wt --selser --oldtextfile /tmp/wt --oldhtmlfile /tmp/old.html < /tmp/new.html
foo : bars
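
The round-trip check above can also be automated on the wikitext side. The following is a minimal sketch, not part of Parsoid or VisualEditor: the helper names `space_runs` and `grown_space_runs` are hypothetical, and the input strings are made up to illustrate the symptom reported in this task (a run of spaces before a colon or link growing after an unrelated edit). It assumes both texts contain the same sequence of colons and link openers, so that they can be paired by position.

```python
import re

# A run of spaces (possibly empty) followed by a colon or a run of "[" link openers.
_BEFORE_MARKUP = re.compile(r"( *)(:|\[+)")


def space_runs(wikitext):
    """List (token, number of spaces directly before it) for each colon or link opener."""
    return [(m.group(2), len(m.group(1))) for m in _BEFORE_MARKUP.finditer(wikitext)]


def grown_space_runs(old, new):
    """Return (index, token, spaces_before_old, spaces_before_new) where the run grew.

    Tokens are paired by order of appearance, so both texts must contain the
    same sequence of colons and link openers (an assumption of this sketch).
    """
    old_runs, new_runs = space_runs(old), space_runs(new)
    if [tok for tok, _ in old_runs] != [tok for tok, _ in new_runs]:
        raise ValueError("colon/link tokens differ; cannot pair them by position")
    return [
        (i, tok, before, after)
        for i, ((tok, before), (_, after)) in enumerate(zip(old_runs, new_runs))
        if after > before
    ]


if __name__ == "__main__":
    # Made-up inputs: the edit only touches a word, yet the space run grows.
    print(grown_space_runs("foo : bar", "foo  : bars"))          # [(0, ':', 1, 2)]
    print(grown_space_runs("word [[link]]", "words  [[link]]"))  # [(0, '[[', 1, 2)]
```

Pairing tokens by position keeps the sketch short but makes it unsuitable for edits that add or remove colons or links; in that case it raises `ValueError` rather than guessing.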

gerritbot added a comment. Via Conduit, Sep 18 2013, 1:04 AM

Change 84701 had a related patch set uploaded by Subramanya Sastry:
(Bug 48570) Fix subtle selser bug handling separator-only nodes

https://gerrit.wikimedia.org/r/84701

Catrope added a comment. Via Conduit, Sep 19 2013, 6:02 PM

*** Bug 50637 has been marked as a duplicate of this bug. ***

gerritbot added a comment. Via Conduit, Sep 20 2013, 4:56 PM

Change 84701 merged by jenkins-bot:
(Bug 48570) Fix subtle selser bug handling separator-only nodes

https://gerrit.wikimedia.org/r/84701

Elitre added a comment. Via Web, Feb 7 2015, 5:06 PM

Could this be happening again? At cs.wp they are filtering edits that add extra spaces, like this or this (I don't have much info at the moment).

ssastry added a comment. Via Web, Feb 9 2015, 3:06 PM

@Elitre: This doesn't look like a space being added before colons or links. Could they be null edits or undone edits that left behind spaces in VE, which Parsoid is then preserving?

Elitre added a comment. Via Web, Feb 9 2015, 3:38 PM

(Sorry if I'm adding to the wrong task; I think a more relevant one was closed as a duplicate of this one. I haven't heard from cs.wp yet, but IMHO there are too many occurrences for that to be the case.)

Ironholds removed a subscriber: Ironholds. Via Web, Feb 11 2015, 4:55 AM
Elitre added a comment. Via Web, Feb 11 2015, 5:12 PM

According to a cs.wp editor, such changes are intentional, so there is possibly no need to investigate further. Thanks.
