Wrap invisible characters in new content with <span type="mw:Entity">
Open, Lowest, Public

Description

RoanKattouw: Ignoring copy-paste bugs and other input issues (T85941), I expect you'd get \u00A0, never &nbsp; unless there was <span rel="mw:Entity">&nbsp;</span> in the source
cscott: note that generating \u00A0 is very unfriendly to old-school editors editing the wikitext, who can't tell that it's a \u00A0 and not a \u0020
RoanKattouw: I'd be happy for Parsoid to map \u00A0 --> &nbsp; (HTML -> wt)
RoanKattouw: I'd also be happy for VE to write &nbsp; rather than \u00A0 when hypothetically inserting them, although right now we always insert Unicode characters
cscott-free: yeah, it certainly blurs the line of responsibility between Parsoid and VE.
cscott-free: but it's a bit more consistent for Parsoid to use the <span> to unambiguously indicate when the author wanted an entity in the wikitext. Punting it back over to VE to generate a <span> for this particular case.
cscott-free: or maybe, more generally to insert a <span> for any characters in the \Zs class, which are otherwise invisible.
***James_F nods.
cscott-free: let's say I open a phab for "use <span> around \Zs" targeted to VE, and then we can bat it around a bit.
RoanKattouw: I don't know about <span>s
RoanKattouw: But we could do entities
cscott-free: <span type="mw:Entity"> i mean
cscott-free: which is the way to indicate to Parsoid that you'd like that character represented as an entity
RoanKattouw: There's a strong case for VE-as-an-HTML-editor to generate &nbsp; rather than raw \u00A0, but a very weak case for it to generate magic spans
cscott-free: semantically, when parsing the HTML there is no difference between &nbsp; and \u00A0
cscott-free: i'm not sure we even see the difference in our DOM
RoanKattouw: Hmm
RoanKattouw: Then I'm not sure we can produce the difference either
cscott-free: you can wrap it with <span type="mw:Entity"> ;)
cscott-free: technically, parsoid could play selser games and default to entity representations in wikitext for newly-created content which would otherwise be invisible. but i feel like that's a UX decision which should really be in the editor. you might decide there are other characters which you'd rather see in wikitext. plus I don't really like adding differences between selser and non-selser pathways.
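
The proposal discussed above can be sketched roughly as follows. This is only an illustration in Python, not VisualEditor or Parsoid code: the helper names (`wrap_invisible`, `to_entity`, `is_invisible_space`) and the small entity table are hypothetical. The attribute string is also an assumption: the chat writes `type="mw:Entity"` (and once `rel="mw:Entity"`), whereas Parsoid's DOM output appears to use `typeof="mw:Entity"`, so it is kept in a single constant that can be changed. The first assertion illustrates the point that an HTML parser sees no difference between `&nbsp;` and \u00A0.

```python
import html
import unicodedata

# Assumption: attribute used to mark an entity span. The chat writes
# type="mw:Entity"; Parsoid's DOM appears to use typeof="mw:Entity".
ENTITY_ATTR = 'typeof="mw:Entity"'

# Hypothetical table of named entities for a few common space separators.
# Anything not listed falls back to a numeric character reference.
NAMED_ENTITIES = {
    "\u00a0": "&nbsp;",
    "\u2002": "&ensp;",
    "\u2003": "&emsp;",
    "\u2009": "&thinsp;",
}


def is_invisible_space(ch):
    """True for Unicode category Zs characters other than the plain U+0020 space."""
    return ch != " " and unicodedata.category(ch) == "Zs"


def to_entity(ch):
    """Return a named entity if we know one, else a hex character reference."""
    return NAMED_ENTITIES.get(ch, "&#x{:X};".format(ord(ch)))


def wrap_invisible(text):
    """Serialize newly typed text to HTML, wrapping each non-ASCII Zs character
    in an entity span so a serializer could emit it as an entity in wikitext."""
    out = []
    for ch in text:
        if is_invisible_space(ch):
            out.append("<span {}>{}</span>".format(ENTITY_ATTR, to_entity(ch)))
        else:
            out.append(html.escape(ch, quote=False))
    return "".join(out)


# Once parsed, the entity and the raw character are the same string,
# which is why the wrapper span is needed to preserve the author's intent.
assert html.unescape("&nbsp;") == "\u00a0"

wrapped = wrap_invisible("10\u00a0km")
```

Here `wrapped` holds the text with the no-break space replaced by an entity span, while ordinary spaces pass through untouched. Which characters deserve this treatment is, as noted in the discussion, a UX decision; the Zs test is just one possible rule.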

cscott created this task. Jan 6 2015, 7:10 PM
cscott updated the task description.
cscott raised the priority of this task from to Needs Triage.
cscott added a project: VisualEditor.
cscott added a subscriber: cscott.
Jdforrester-WMF triaged this task as Lowest priority. Mar 5 2015, 9:44 PM
Jdforrester-WMF set Security to None.
Jdforrester-WMF moved this task from To Triage to Backlog on the VisualEditor board.