
Mixed use of RTL and LTR script with titles and usernames in a single line without isolation produce incorrect arrangement
Open, Needs Triage, Public

Description

See this link. The source written in the section was ":Yes, please read their constitution: [[s:ar:دستور سوريا]]. --[[Special:Contributions/60.26.9.220|60.26.9.220]] 08:38, 3 April 2019 (UTC)"; however, in my browser (Chrome) it was rendered like this.

Event Timeline

Amire80 renamed this task from Mixed Use of RTL and LTR script in a single line produce incorrect arrangmenet. to Mixed use of RTL and LTR script with titles and usernames in a single line without isolation produce incorrect arrangement. Apr 6 2019, 8:06 AM

This happens because the Unicode bidi algorithm tries to re-arrange numbers and letters from different alphabets and cannot guess the user's intention correctly. This is especially common on MediaWiki sites with page titles, usernames, and IP addresses.

In this particular case, when editing in wiki syntax, the Arabic title is separated from the IP address numbers by "Special:Contributions" so it looks correct (to people who are accustomed to reading a lot of wiki syntax), but when the page is rendered in HTML, the numbers immediately follow the Arabic letters, so they jump to the wrong side of the page title.
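To see why the rendered text is ambiguous, it can help to look at the bidi class of each character. The following is a minimal sketch using Python's standard `unicodedata` module; the helper names `bidi_classes` and `strong_classes` are made up for illustration, and the sample string is the tail of the rendered comment (the last word of the Arabic title, the punctuation, and the IP address). It only inspects character classes and does not run the full bidi algorithm.

```python
import unicodedata

# Bidi classes that carry a strong direction:
# L (left-to-right), R and AL (right-to-left).
STRONG = {"L", "R", "AL"}

def bidi_classes(text):
    """Return the Unicode bidi class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

def strong_classes(text):
    """Return only the classes that have a strong direction."""
    return [cls for cls in bidi_classes(text) if cls in STRONG]

# Tail of the rendered comment: last word of the Arabic title,
# then ". --", then the IP address shown as the link text.
rendered = "\u0633\u0648\u0631\u064a\u0627. --60.26.9.220"

print(bidi_classes(rendered))
print(strong_classes(rendered))

# In the wikitext, "Special:Contributions" sits between the two.
print(strong_classes("Special:Contributions"))
```

In the rendered string the only strongly directional characters are the Arabic letters (class AL); the digits, full stops, hyphens, and space are all weak or neutral, so nothing anchors the IP address as left-to-right. In the wikitext, "Special:Contributions" supplies strong left-to-right letters in between, which is why the source looks fine.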

Ideally, all page titles, usernames, and IP addresses should be bidi-isolated using the <bdi> tag or CSS such as unicode-bidi: isolate (more robust and semantic, but may have compatibility issues) or display: inline-block (more compatible, but less semantic and may have other issues). Since links, by their nature, usually point to titles and usernames, the isolation should be applied to links.

Until this bug is resolved comprehensively by applying isolation consistently everywhere, I can recommend applying the solutions above in templates that mention usernames or page titles, and adding <span dir="rtl">title</span> or <bdi>title</bdi> when mentioning RTL page titles in LTR wikis (and vice versa).
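As a rough illustration of the <bdi> workaround, here is a small Python sketch of a helper that wraps a title or username before it is placed into HTML. The function names `isolate` and `signature_line` are hypothetical and are not part of MediaWiki; this only shows the shape of the output, not how MediaWiki or a template would implement it.

```python
from html import escape

def isolate(text):
    """Wrap a page title, username, or IP address in <bdi>.

    The text is HTML-escaped first so that it cannot inject markup.
    """
    return "<bdi>" + escape(text) + "</bdi>"

def signature_line(page_title, username):
    """Build a fragment like the one in the task description,
    with the title and the username isolated from each other."""
    return "{}. --{}".format(isolate(page_title), isolate(username))

# Arabic title from the description, written with escapes, plus the IP.
title = "\u062f\u0633\u062a\u0648\u0631 \u0633\u0648\u0631\u064a\u0627"
print(signature_line(title, "60.26.9.220"))
```

Because each name sits in its own <bdi> element, a browser that supports the tag should lay out the Arabic title and the IP address independently, so the digits are not pulled into the right-to-left run.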

Here's a screenshot I made of the example in the description, in case the linked screenshot ever breaks:

Bildschirmfoto_2023-02-12_14-54-54.png (30×646 px, 9 KB)

On Wikidata I've added some CSS to MediaWiki:Common.css which adds unicode-bidi: isolate to links (diff) because RTL usernames were affecting the text after them, e.g. from this page:

Before I added the CSS:

Bildschirmfoto_2023-02-13_02-17-35.png (25×281 px, 4 KB)

After:

Bildschirmfoto_2023-02-13_02-26-25.png (25×279 px, 4 KB)