[Story] Mark and show invisible characters in diff
Closed, Duplicate · Public

Description

so, in hewiki, we occasionally find "invisible direction markers". these are invisible unicode characters that act like their html counterparts, &rlm; and &lrm;.

these can be destructive and puzzling at times, because they mangle wiki-code (e.g., when such a character appears between the first and second [ of an internal link, it's not an internal link anymore).

when such markers are added or removed, the DIFF is confusing, because both sides look identical on the screen, and one has to use external tools (e.g., browser extensions such as "Unicode Analyzer") to understand what's going on.
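to illustrate the problem, here is a minimal python sketch (not code from any existing diff tool). it compares two strings that render the same and lists every inserted or removed character by code point. the function name `describe_changes` and the sample link text are made up for this example.

```python
import difflib
import unicodedata


def describe_changes(old: str, new: str) -> list[str]:
    """List each inserted or deleted character by code point and Unicode name."""
    changes = []
    # autojunk=False so no character is ever treated as "junk" and skipped.
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            for ch in old[i1:i2]:
                changes.append(f"- U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}")
        if tag in ("insert", "replace"):
            for ch in new[j1:j2]:
                changes.append(f"+ U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}")
    return changes


old = "[[Main Page]]"
new = "[\u200e[Main Page]]"  # hypothetical edit: a stray mark between the brackets

print(old == new)                  # False, although both look alike on screen
print(describe_changes(old, new))  # ['+ U+200E LEFT-TO-RIGHT MARK']
```

this is roughly what the external tools do for us today: they name the character instead of rendering it.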

methinks diff should indicate such changes better.

there are other invisible characters, but these two (specifically U+200E and U+200F) keep popping up in hewiki, and for us it would be enough to indicate those.

as a developer, i'd probably strive to mark _any_ invisible character. a short search found 3 more: mongolian vowel separator (U+180E), zero-width space (U+200B) and zero-width no-break space (U+FEFF). there are probably others.
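a rough sketch of what "marking" could mean, in python. the names `KNOWN_INVISIBLE`, `is_invisible` and `mark_invisible` are hypothetical, and the `<U+XXXX>` placeholder format is just one possible rendering. treating the unicode general category "Cf" (format) as "invisible" is an assumption: it is a heuristic that catches the characters above, not a complete definition.

```python
import unicodedata

# The five code points named in this task; the list is not exhaustive.
KNOWN_INVISIBLE = {0x200E, 0x200F, 0x180E, 0x200B, 0xFEFF}


def is_invisible(ch: str) -> bool:
    """Heuristic: a listed code point, or any character in category Cf."""
    return ord(ch) in KNOWN_INVISIBLE or unicodedata.category(ch) == "Cf"


def mark_invisible(text: str) -> str:
    """Replace each invisible character with a visible <U+XXXX> placeholder."""
    return "".join(
        f"<U+{ord(ch):04X}>" if is_invisible(ch) else ch for ch in text
    )


print(mark_invisible("[\u200e[Main Page]]"))  # [<U+200E>[Main Page]]
```

a diff view could run something like this over the changed segments only, so that ordinary text is left as it is and only the hidden characters get a visible marker.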

peace.

Event Timeline

Kipod raised the priority of this task from to Needs Triage.
Kipod updated the task description.
Kipod subscribed.
Restricted Application added a subscriber: Aklapper.
JanZerebecki renamed this task from "Mark and show invisible characters in diff" to "[Story] Mark and show invisible characters in diff". Sep 24 2015, 10:57 AM
JanZerebecki triaged this task as Medium priority.
JanZerebecki set Security to None.
JanZerebecki moved this task from incoming to needs discussion or investigation on the Wikidata board.

Is this specific to Wikidata or a general diff problem?

Lydia_Pintscher changed the task status from Open to Stalled. Oct 6 2015, 8:57 AM

I can remember fixing external identifiers that contained hidden characters; there is no indication of these in the diff view.

I don't see how this is Wikidata specific or how this is related to the Diff extension.

Instead, this sounds like T5672, which has been marked as a duplicate of T15466.