Page MenuHomePhabricator

Add bidi isolation to HTML tags in CodeMirror 6
Open, Needs TriagePublic

Description

See CodeMirror docs: https://codemirror.net/examples/bidi/

Background

HTML tags and similar markup may appear jumbled due to the standard algorithm used for character placement in RTL:

Screenshot from 2024-02-29 15-31-28.png (49×393 px, 6 KB)

CodeMirror should detect such markup and apply bidirectional isolation to it, so that it appears like:

Screenshot from 2024-04-09 21-43-29.png (41×372 px, 6 KB)

Acceptance criteria

  • In CodeMirror 6, HTML and similar tags (such as <ref>) used in wikitext should appear as left-to-right, even on a page using an RTL language.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
MusikAnimal renamed this task from Add bidi siolation to HTML tags in CodeMirror 6 to Add bidi isolation to HTML tags in CodeMirror 6.Mar 6 2024, 8:21 PM

Change 1009363 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirror 6: Add bidi isolation to HTML tags

https://gerrit.wikimedia.org/r/1009363

Change 1009363 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirror 6: Add bidi isolation to HTML tags

https://gerrit.wikimedia.org/r/1009363

Change 1011163 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CM6: move bidiIsolation to be part of CodeMirrorModeMediaWiki

https://gerrit.wikimedia.org/r/1011163

Change 1011163 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CM6: move bidiIsolation to be part of CodeMirrorModeMediaWiki

https://gerrit.wikimedia.org/r/1011163

Per T357795 it's deployed on all wikis on Beta. Would it be helpful to keep at least one wiki still running CM5?

Otherwise, you could compare/contrast with a production wiki where it's not yet deployed, such as enwiki.

Per T357795 it's deployed on all wikis on Beta. Would it be helpful to keep at least one wiki still running CM5?

Ah, right, thanks.

I am finding some articles on en-rtl beta wiki where the span.cm-bidi-isolate does not go around the whole of the tag. For example https://en-rtl.wikipedia.beta.wmflabs.org/w/index.php?title=Phonos_Alignment&action=edit, where it is only around the brackets (and not the first one).

Should the spans also go around links and templates? That seems to be happening here and here but not in most places.

What I am seeing is quite like in the task description. The closing tag is on the left. For example:

codemirror_rtl.png (61×344 px, 5 KB)

I am finding some articles on en-rtl beta wiki where the span.cm-bidi-isolate does not go around the whole of the tag. For example https://en-rtl.wikipedia.beta.wmflabs.org/w/index.php?title=Phonos_Alignment&action=edit, where it is only around the brackets (and not the first one).

I think this is because they are self-closing tags. I'll look into a fix.

Should the spans also go around links and templates? That seems to be happening here and here but not in most places.

Bizarre! It is not coded to do this to links and templates, though I was almost wondering if it should (if contextually it is written LTR).

What I am seeing is quite like in the task description. The closing tag is on the left.

My apologies. The flow of the tags should not be changed, so the closing tag still comes first (but it's actually last because the page is RTL!). I've updated the task description with an accurate screenshot.

Change #1018394 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorBidiIsolation: fix for when the tag is the first element

https://gerrit.wikimedia.org/r/1018394

Change #1018394 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorBidiIsolation: fix for when the tag is the first element

https://gerrit.wikimedia.org/r/1018394

@dom_walden All the examples from T358804#9680122 had self-closing tags, and with that fixed they look okay to me now. Please have another look at your convenience, and thanks so much for uncovering this! I will wait for your OK before proceeding with deployment to our first pilot RTL wiki.

@MusikAnimal Is it supposed to look different in Firefox/Chrome compared to Safari?

Chromium:

codemirror_rtl_chromium.png (162×588 px, 8 KB)

Safari:

codemirror_rtl_safari.png (105×544 px, 19 KB)

I find it hard to tell where the cursor is, especially in Firefox or Chromium. Safari is a bit easier to understand. But I am not used to RTL editing.

@MusikAnimal Is it supposed to look different in Firefox/Chrome compared to Safari?

No... :(

This should be fun to fix… and by that I mean I have no idea how I'm going to do it, lol. You don't see errors in the JS console, do you? I used SauceLabs to do a quick test of the official CodeMirror example, and it looks correct in Safari. So, I believe the issue is something in our code :(

I wonder if the Safari compatibility issue should be a separate task. It should be looked into, but in the grand scheme of things, it doesn't seem to be a major issue (assuming there are no JS errors).

I find it hard to tell where the cursor is, especially in Firefox or Chromium. Safari is a bit easier to understand. But I am not used to RTL editing.

I've noticed that too. Sometimes even the cursor seems offset. Are you seeing that too?

For this I'll ask some colleagues for help. Maybe some of the behaviour is expected.

The <br> tag here is not being fully nested in the .cm-bidi-isolate span https://en-rtl.wikipedia.beta.wmflabs.org/w/index.php?title=Br&action=edit

This should be fun to fix… and by that I mean I have no idea how I'm going to do it, lol. You don't see errors in the JS console, do you? I used SauceLabs to do a quick test of the official CodeMirror example, and it looks correct in Safari. So, I believe the issue is something in our code :(

I don't see any errors in the console.

I find it hard to tell where the cursor is, especially in Firefox or Chromium. Safari is a bit easier to understand. But I am not used to RTL editing.

I've noticed that too. Sometimes even the cursor seems offset. Are you seeing that too?

It depends what you mean by "offset". But, yeah, it does not seem to appear in the right place to my LTR eyes.

The <br> tag here is not being fully nested in the .cm-bidi-isolate span https://en-rtl.wikipedia.beta.wmflabs.org/w/index.php?title=Br&action=edit

This is possibly an issue with the parsing of apostrophes. If we do instead boldened syntax (no extraneous apostrophes), things look better. Definitely still a bug, I just don't know how common it will be, and it seems to sort of be an issue for LTR as well. T304567 may be related.

I find it hard to tell where the cursor is, especially in Firefox or Chromium. Safari is a bit easier to understand. But I am not used to RTL editing.

So it seems this is somewhat expected since the page language is RTL but there is only LTR content. If you replace it with RTL content, things are a lot better. But I still see some weirdness with cursor movement, and the Home and End keys don't see to work like I'd expect. I'll look into it.


I really want to get this shipped to hewiki, so I think for now I'll disable bidi isolation unless a URL query param ?cm6bidi=1 is provided. That way we can still ask folks to try it out, and iterate on it until it's stable or good enough to re-enable by default. Bidi isolation was more of an "icing on the cake" kind of feature, anyway. I think most are used to not having it (there may be a gadget or something).

Change #1019431 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: put bidi isolation behind URL feature flag

https://gerrit.wikimedia.org/r/1019431

Change #1019432 had a related patch set uploaded (by MusikAnimal; author: MusikAnimal):

[mediawiki/extensions/CodeMirror@master] Add $wgCodeMirrorRTL to control rollout to RTL wikis

https://gerrit.wikimedia.org/r/1019432

Change #1019431 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] CodeMirrorModeMediaWiki: put bidi isolation behind URL feature flag

https://gerrit.wikimedia.org/r/1019431

Change #1019432 merged by jenkins-bot:

[mediawiki/extensions/CodeMirror@master] Add $wgCodeMirrorRTL to control rollout to RTL wikis

https://gerrit.wikimedia.org/r/1019432

The <br> tag here is not being fully nested in the .cm-bidi-isolate span https://en-rtl.wikipedia.beta.wmflabs.org/w/index.php?title=Br&action=edit

This is possibly an issue with the parsing of apostrophes. If we do instead boldened syntax (no extraneous apostrophes), things look better. Definitely still a bug, I just don't know how common it will be, and it seems to sort of be an issue for LTR as well. T304567 may be related.

I believe I've confirmed this in fact the same bug as T304567. This has been an issue since CodeMirror was first deployed in 2016, so I don't think it's a show-stopper for bidi isolation or deployment to RTL wikis, unless linguistically they are more likely to use italic syntax with an apostrophe next to it. I don't think that is the case.

I still see some weirdness with cursor movement, and the Home and End keys don't see to work like I'd expect. I'll look into it.

This and related issues appear to be the main issue with bidi isolation. If I can manage to fix cursor movement, I think we're mostly OK to move forward.

Currently not working on this. Will work on refinement later.