VisualEditor: Support "Language conversion blocks" for multi-script wikis
Closed, Resolved · Public · 40 Estimated Story Points

Description

The "language conversion blocks" are a wikitext feature that allows users to define text content in parallel scripts. The highest-profile case is Chinese, which has two major writing systems and automated conversion between them, but there are 28 others, some of which have no automated conversion. VE will therefore need not just to mark the text with an appropriate <span>, but also to help users understand when they have edited text that needs editing two or three times (possibly in scripts they cannot or do not want to use). See https://meta.wikimedia.org/wiki/Wikipedias_in_multiple_writing_systems

Documentation of the feature (focussed on the syntax) is here: https://www.mediawiki.org/wiki/Writing_systems/Syntax
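
As a rough illustration of the syntax, here is a minimal Python sketch that splits one conversion block into its flags and per-variant text. It only covers the two forms quoted later in this thread, -{R|foo}- and -{zh-cn:Foo;zh-tw:Bar}-. The function name parse_block and the returned dictionary layout are made up for this example, and the sketch assumes no nesting and no ";" or "}-" inside the text, which real wikitext does not guarantee.

```python
import re

# One non-nested block: "-{", optional "FLAGS|", a body, then "}-".
BLOCK_RE = re.compile(r"-\{(?:([A-Za-z;+\-]*)\|)?(.*?)\}-", re.DOTALL)


def parse_block(block: str) -> dict:
    """Split a single -{...}- block into flags and variant texts (hypothetical helper)."""
    m = BLOCK_RE.fullmatch(block.strip())
    if m is None:
        raise ValueError("not a single -{...}- block: %r" % block)
    flags = [f for f in (m.group(1) or "").split(";") if f]
    body = m.group(2)
    if "R" in flags:
        # "R" is the flag used in -{R|foo}- in this thread: keep the body as-is.
        return {"flags": flags, "raw": body}
    variants = {}
    for rule in body.split(";"):
        if not rule.strip():
            continue
        code, sep, text = rule.partition(":")
        if not sep:
            # No "variant:text" pair; this sketch simply treats the body as raw text.
            return {"flags": flags, "raw": body}
        variants[code.strip()] = text.strip()
    return {"flags": flags, "variants": variants}


print(parse_block("-{R|foo}-"))
print(parse_block("-{zh-cn:Foo;zh-tw:Bar}-"))
```

The variant mapping is the part an editor would need to surface to the user, since changing one variant's text may mean the others need changing too.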

Parsoid will need to add support for this first, which is T43716: [EPIC] Support language variant conversion in Parsoid.

Details

Reference: bz47411

Subject	Repo	Branch	Lines +/-
Inspectors for editing LanguageConverter markup	mediawiki/extensions/VisualEditor	master	+938 -3
Context item for LanguageConverter markup	mediawiki/extensions/VisualEditor	master	+265 -2
Display LanguageConverter markup in VisualEditor	mediawiki/extensions/VisualEditor	master	+744 -0


Related Objects

Status	Subtype	Assigned	Task
Open		None	T265163 Create a system to encode best practices into editing experiences
Open		Trizek-WMF	T331946 [RELEASE TICKET] Make Edit Check (references) available to all newcomers at all Wikipedias
Open		None	T52000 Enable VisualEditor by default for all users of all Wikimedia wikis
Open		None	T51999 Enable VisualEditor by default for all users of all Wikipedias
Resolved		None	T93388 Enable VisualEditor by default for all users of all "phase 7" Wikipedias
Resolved		None	T97315 Please enable VisualEditor at the Kazakh Wikipedia
Open		None	T132495 Enable VisualEditor by default for all users of all Wikivoyages
Stalled		None	T136996 Enable VisualEditor by default for all users of the Chinese Wikivoyage
Resolved		Jdforrester-WMF	T53792 VisualEditor: Non-English Wikipedia issues (tracking)
Resolved		cscott	T49411 VisualEditor: Support "Language conversion blocks" for multi-script wikis
Open		None	T43716 [EPIC] Support language variant conversion in Parsoid
Open		None	T21044 Document LanguageConverter
Resolved		cscott	T53587 Parsoid needs to run findVariantLink or some equivalent thing
Invalid		• GWicke	T48658 Tpl-style encapsulation for <include> and lang-variant conversions
Resolved		liangent	T45547 MediaWiki needs a fictitious variant for English for easier variant development work
Resolved		thiemowmde	T156280 Wikibase assumes English doesn't have a variant
Open		None	T54661 Preprocessor/Parser irregularities with -{...}- variant constructs.
Resolved		cscott	T146304 Preprocessor should handle -{...}- variant constructs in template arguments
Resolved		cscott	T153761 Incorrect parser output if -{{ appears in wikitext
Resolved		• Elitre	T165175 Support communications around the preprocessor fixups
Resolved		cscott	T146305 Parser should protect -{...}- variant constructs in links
Resolved		cscott	T54192 Markups in alt param of <gallery> are "eaten" during parsing
Resolved		cscott	T54190 <gallery> with \|link=<external link> doesn't work on wikis with LanguageConverter
Resolved		cscott	T153135 doBlockLevels breaks with embedded language converter markup
Resolved		cscott	T153140 -{ ... }- markup breaks tables
Open		None	T153265 Language converter source text and language names cannot use <nowiki> escaping.
Duplicate	BUG REPORT	None	T353501 new Parsoid cannot parse the converter wikitext syntax
Resolved		cscott	T153341 Export LanguageConverter enabled status in page info from core
Open		None	T204966 Production use of LanguageConverter for read views of Phase 2A languages
Open		None	T204968 Production use of LanguageConverter for read views of Phase 2B languages
Open		None	T204969 Production use of LanguageConverter for read views of Phase 2C languages
Open		None	T222328 [extlink] parsing - link cannot contain language variant or extension tags
Resolved	BUG REPORT	Jgiannelos	T305383 [BUG] Kazakh Wikipedia Character mapping
Open		None	T320733 Support and document how language conversion work with multidirectional wikitext <=> HTML conversion on language-conversion-supported extensions.
Resolved		Jdforrester-WMF	T95674 Decide on a strategy for supporting language variants in VisualEditor (or decide to give up)

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 1:19 AM

• bzimport added projects: VisualEditor-EditingTools, I18n.

• bzimport set Reference to bz47411.

Jdforrester-WMF created this task. Apr 19 2013, 2:02 PM

Jdforrester-WMF added a project: VisualEditor. Nov 23 2014, 10:47 PM

Jdforrester-WMF moved this task from To Triage to Freezer on the VisualEditor board. Nov 24 2014, 1:29 AM

Jdforrester-WMF edited projects, added VisualEditor-ContentLanguage; removed I18n, VisualEditor-EditingTools. Dec 3 2014, 3:01 AM

Jdforrester-WMF set Security to None.

Jdforrester-WMF merged a task: T49913: VisualEditor: Support language variant conversion labels. Apr 27 2015, 5:49 PM

Jdforrester-WMF added a parent task: T97315: Please enable VisualEditor at the Kazakh Wikipedia.

Jdforrester-WMF added a subtask: T95674: Decide on a strategy for supporting language variants in VisualEditor (or decide to give up).

Jdforrester-WMF added a subscriber: Waihorace.

• Tbayer subscribed. Aug 23 2015, 10:59 PM

Restricted Application added a subscriber: Aklapper. Aug 23 2015, 10:59 PM

Jdforrester-WMF added a parent task: T93388: Enable VisualEditor by default for all users of all "phase 7" Wikipedias. Aug 23 2016, 6:18 PM

Jdforrester-WMF added a parent task: T136996: Enable VisualEditor by default for all users of the Chinese Wikivoyage.

Jdforrester-WMF added a project: Epic. Sep 1 2016, 11:59 PM

Jdforrester-WMF updated the task description.

Jdforrester-WMF set the point value for this task to 40.

Krinkle unsubscribed. Sep 2 2016, 12:18 AM

Jdforrester-WMF changed the task status from Open to Stalled. Sep 2 2016, 6:24 PM

Shizhao added a project: Chinese-Sites. Dec 27 2016, 8:48 AM

Shizhao moved this task from Backlog to Extensions/Skins on the Chinese-Sites board. Dec 27 2016, 8:54 AM

Liuxinyu970226 changed the status of subtask T43716: [EPIC] Support language variant conversion in Parsoid from Open to Stalled. Jan 1 2017, 5:44 AM

Legoktm changed the status of subtask T43716: [EPIC] Support language variant conversion in Parsoid from Stalled to Open. Jan 1 2017, 10:04 AM

cscott changed the task status from Stalled to Open. May 31 2017, 4:49 PM

@cscott Any updates for us on this since it's reopened? :-)

Change 356739 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/extensions/VisualEditor@master] WIP: Specialized inspector for LanguageConverter markup

https://gerrit.wikimedia.org/r/356739

gerritbot added a project: Patch-For-Review. Jun 1 2017, 9:16 PM

@Deskana sure: There's fully functioning Parsoid support in https://gerrit.wikimedia.org/r/140235 (T43716 phase 1). VE currently "alienates" these blocks, which means they are invisible and uneditable but round-trip fine if you don't touch them.
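
To make the "alienated but round-trips fine" behaviour concrete, here is a toy Python sketch of the general idea, not VE's actual implementation: markup the editor does not understand is stored verbatim as an opaque node, edits only touch ordinary text, and serialization emits the opaque node exactly as it arrived. The Text and Alien classes and the to_model, edit_text and serialize helpers are all hypothetical names for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass(frozen=True)
class Text:
    value: str  # editable plain text


@dataclass(frozen=True)
class Alien:
    html: str  # original markup, stored verbatim and never interpreted


Node = Union[Text, Alien]


def to_model(tokens: List[Tuple[str, str]]) -> List[Node]:
    """Turn (kind, source) pairs into nodes; anything unrecognised becomes an Alien."""
    return [Text(src) if kind == "text" else Alien(src) for kind, src in tokens]


def edit_text(model: List[Node], fn: Callable[[str], str]) -> List[Node]:
    """Apply an edit to Text nodes only; Alien nodes pass through untouched."""
    return [Text(fn(n.value)) if isinstance(n, Text) else n for n in model]


def serialize(model: List[Node]) -> str:
    """Emit text as-is and aliens exactly as they arrived."""
    return "".join(n.value if isinstance(n, Text) else n.html for n in model)


span = '<span typeof="mw:LanguageVariant"></span>'
model = to_model([("text", "before "), ("mw:LanguageVariant", span), ("text", " after")])
edited = edit_text(model, str.upper)
print(serialize(edited))
```

Because the alien node carries its source unchanged, editing the surrounding text cannot corrupt it, but it also cannot be displayed or edited, which is the gap this task is about.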

I'm currently working on some basic support for displaying/editing the language converter rules in VE (which is what this task is about). Very rough start of a patch above; there are some representation issues to work out. VE is a little weak on directly editing generated content. I'll copy some discussion with @Catrope here:

(05:17:05 PM) cscott-free: Roan: is there any precedent in VE for an inline BranchNode ?
(05:17:30 PM) cscott-free: that is, a Node that behaved like an annotation: you could edit inside it, and it was laid out inline.
(05:17:43 PM) RoanKattouw: No I don't think so
(05:18:01 PM) RoanKattouw: There are inline nodes like images
(05:18:14 PM) cscott-free: yes, but they are ve.ce.LeafNodes, not allowed to have any contents
(05:18:22 PM) RoanKattouw: But no inline nodes that have children
(05:18:33 PM) cscott-free: even BlockImage is a LeafNode, and you have to click into the inspector to edit the caption
(05:18:38 PM) RoanKattouw: Yeah
(05:18:52 PM) RoanKattouw: So, I would like us to have inline editing of image captions
(05:19:01 PM) RoanKattouw: But it wouldn't be done that way
(05:19:09 PM) RoanKattouw: What is the application you have in mind?
(05:19:34 PM) cscott-free: I tried to use toDataElement to convince the dm that my <span> was actually an annotation, but that didn't work quite right with generated content.
(05:20:24 PM) cscott-free: -{R|foo}- is a silly sort of <nowiki>, right? But it's represented by Parsoid as <span typeof="mw:LanguageVariant" data-mw-variant='{....text:"foo"}'></span>
(05:20:43 PM) RoanKattouw: What does that syntax mean again?
(05:21:01 PM) cscott-free: It just means "protect foo from language conversion"
(05:21:09 PM) RoanKattouw: Aha OK
(05:21:20 PM) cscott-free: but there are other similar forms
(05:21:22 PM) RoanKattouw: Does it also output Foo?
(05:21:25 PM) cscott-free: yes
(05:21:37 PM) RoanKattouw: Then shouldn't the span be not empty?
(05:21:49 PM) cscott-free: well... maybe.
(05:22:16 PM) RoanKattouw: I mean, at least for display purposes you would want that, right?
(05:22:22 PM) cscott-free: see https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec/Language_conversion_blocks#Alternative_2
(05:22:45 PM) RoanKattouw: Also, am I allowed to have annotations inside?
(05:22:56 PM) RoanKattouw: Can part of foo be bold, or a link?
(05:23:01 PM) cscott-free: yes, you can have annotations inside
(05:22:56 PM) cscott-free: the idea is that eventually we'll do the actual language conversion client-side, and so we'd fill the empty spans with the correct thing based on the currently-selected variant
(05:23:12 PM) RoanKattouw: Oh, I see
(05:23:30 PM) RoanKattouw: So it behaves kind of like a reference then
(05:23:42 PM) RoanKattouw: Or a mix between that and an auto numbered link
(05:23:43 PM) cscott-free: in other cases, there might be more than one possible output in there, and I only want to see one of them (at a time)
(05:23:49 PM) RoanKattouw: Right
(05:24:10 PM) RoanKattouw: So the content isn't text then, it's HTML
(05:24:14 PM) cscott-free: yeah
(05:24:38 PM) cscott-free: and it's got the usual "could be block, could be inline" thing that MWTransclusionNode deals with
(05:24:55 PM) RoanKattouw: That's a tricky one, I'd recommend picking edsanders and David's brains too
(05:25:12 PM) RoanKattouw: The fact that it can be block is very annoying
(05:25:23 PM) RoanKattouw: That makes it hard to make it an annotation
(05:26:01 PM) cscott-free: RoanKattouw: yes, and it's something I'd eventually like to fix on the PHP side. But there are a few cases like -{zh-cn:==Foo==;zh-tw:==Bar==}-
(05:26:15 PM) cscott-free: which should really be rewritten as == -{zh-cn:Foo;zh-tw:Bar}- ==
(05:27:19 PM) cscott-free: usual annoying story about balance, templates, markup boundaries, etc.
(05:26:14 PM) RoanKattouw: Right
(05:25:13 PM) cscott-free: At any rate, I guess the easiest way to get started here is to implement it as a 'boring' LeafNode and not allow direct editing, you'll have to use an inspector
(05:27:15 PM) RoanKattouw: If it's only annotated text, then you could make it an annotation that would have to do some magic to generate its own content and feed the right content back into data-mw
(05:29:03 PM) cscott-free: yeah, with an annotation i just hit a roadblock in ve.dm.Converter:getDomSubtreeFromData which doesn't give me a way to return the annotation *and* the data contained as an array from toDataElement, the way that nodes can
(05:29:30 PM) cscott-free: so I might patch that and handle the case where toDataElement returns an array of length > 1
(05:29:45 PM) cscott-free: but figured I'd check to make sure I wasn't missing something obvious first
(05:30:01 PM) RoanKattouw: Yeah you'll have to invent something new here
(05:30:16 PM) RoanKattouw: In both directions
(05:30:31 PM) RoanKattouw: Or add pre- and post-processing steps
(05:30:15 PM) cscott-free: I might also just give in and generate explicit <span typeof="mw:LanguageVariant/raw">foo</span> for some of these cases
(05:30:24 PM) cscott-free: which would be a more direct analog of <nowiki>
(05:30:43 PM) RoanKattouw: Also look at how nowiki works on the way out
(05:31:07 PM) RoanKattouw: It's not really the same, it drops the annotation, but it might give you ideas
(05:31:04 PM) cscott-free: If you did -{foo<div>bar}- then I *think* the usual HTML5 treebuilding would split the <span> over the <div> and you'd effectively get -{foo}-<div>-{bar}-
(05:31:32 PM) cscott-free: that works for the "raw output" case, but things get really hairy if there are multiple alternatives.
(05:31:38 PM) RoanKattouw: Well if the resulting HTML is too weird, you can just alienate it
(05:32:25 PM) RoanKattouw: Which might even happen automatically because of special treatment of the mw: prefix and protections against misnesting
(05:33:13 PM) cscott-free: yeah, right now VE is alienating everything which is fine but because the <span>s are empty the result is that the content goes missing
(05:33:43 PM) cscott-free: again, maybe an indicator that this whole "empty span to be filled with converted output" idea isn't all that hot. we'll see.
(05:34:20 PM) cscott-free: anyway, I think using a Node and just dealing with non-direct editing is the way to go for the crappy-first-draft.
(05:34:32 PM) cscott-free: although i'm curious how you planned to allow direct figure caption editing
(05:41:02 PM) RoanKattouw: cscott: Basically, make captions work the way references work, with their contents being in an internalList item or subdocument, then create a surface on that subdoc and embed it in the image frame
(05:41:22 PM) RoanKattouw: The subdocuments thing is a refactor I started in 2014 and never finished
(05:42:25 PM) RoanKattouw: You could also do it without changing the DM representation if you are able to make a surface for a subset of the document
(05:42:54 PM) RoanKattouw: But the change in representation would allow inline images to retain captions
(05:44:55 PM) cscott-free: Yeah, I just would want to be able to cursor seamlessly "into" the embedded subdoc.
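
The "fill the empty spans client-side based on the currently-selected variant" idea from the discussion above could look roughly like the Python sketch below. The chat elides the real contents of data-mw-variant, so the JSON layout used here (a "text" key for the raw case, a "variants" mapping otherwise) is invented for illustration and should not be read as Parsoid's actual schema. The function name and the fallback parameter are likewise assumptions.

```python
import json
from typing import Optional


def render_variant_span(data_mw_variant: str, selected: str,
                        fallback: Optional[str] = None) -> str:
    """Pick the content to show inside an empty LanguageVariant span.

    data_mw_variant is a JSON string in a HYPOTHETICAL layout:
      {"text": "foo"}                                  raw, unconverted content
      {"variants": {"zh-cn": "Foo", "zh-tw": "Bar"}}   one alternative per variant
    """
    data = json.loads(data_mw_variant)
    if "text" in data:
        # Raw case, like -{R|foo}-: the same content for every variant.
        return data["text"]
    variants = data.get("variants", {})
    if selected in variants:
        return variants[selected]
    if fallback in variants:
        return variants[fallback]
    # Nothing suitable: leave the span empty rather than guess.
    return ""


raw = '{"text": "foo"}'
twoway = '{"variants": {"zh-cn": "Foo", "zh-tw": "Bar"}}'
print(render_variant_span(raw, "zh-tw"))
print(render_variant_span(twoway, "zh-tw"))
print(render_variant_span(twoway, "zh-hk", fallback="zh-cn"))
```

Only one alternative is shown at a time, matching the point in the chat that there might be more than one possible output and the user only wants to see one of them.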

Change 358396 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/extensions/VisualEditor@master] Node inspector for LanguageConverter markup

https://gerrit.wikimedia.org/r/358396

Change 361921 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/extensions/VisualEditor@master] WIP: Dialog for editing LanguageConverter markup

https://gerrit.wikimedia.org/r/361921

See https://www.mediawiki.org/wiki/Parsoid/Language_conversion#Testing_LanguageConverter_with_Parsoid_and_VisualEditor for setup & test hints.

Srdjan subscribed. Jul 21 2017, 3:36 PM

Change 356739 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Display LanguageConverter markup in VisualEditor

https://gerrit.wikimedia.org/r/356739

ReleaseTaggerBot added a project: MW-1.30-release-notes (WMF-deploy-2017-07-25_(1.30.0-wmf.11)). Jul 25 2017, 5:00 PM

Change 358396 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Context item for LanguageConverter markup

https://gerrit.wikimedia.org/r/358396

ReleaseTaggerBot edited projects, added MW-1.30-release-notes (WMF-deploy-2017-08-15 (1.30.0-wmf.14)); removed MW-1.30-release-notes (WMF-deploy-2017-07-25_(1.30.0-wmf.11)). Aug 11 2017, 3:00 PM

Change 361921 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Inspectors for editing LanguageConverter markup

https://gerrit.wikimedia.org/r/361921

Jdforrester-WMF closed subtask T95674: Decide on a strategy for supporting language variants in VisualEditor (or decide to give up) as Resolved. Sep 15 2017, 4:36 PM

ReleaseTaggerBot edited projects, added MW-1.30-release-notes (WMF-deploy-2017-09-19 (1.30.0-wmf.19)); removed MW-1.30-release-notes (WMF-deploy-2017-08-15 (1.30.0-wmf.14)). Sep 15 2017, 5:00 PM

I believe there's still more outstanding here, right?

In T49411#3691289, @Deskana wrote:

I believe there's still more outstanding here, right?

Yes, see T43716: [EPIC] Support language variant conversion in Parsoid - phase 1 support is complete, which brings VE to parity with the PHP parser -- in both cases, as soon as you start to edit the page you're confronted with the raw un-converted text (not the text in your preferred variant).

VE can do better (in theory) -- that's "phase 3" of T43716. But that's probably another bug. Since VE is now at parity with PHP source editing, I think this task can be resolved, and we'll open specific bugs for deficiencies in the current UX (ie, T182910). I'll open a new task for the VE side of Parsoid's phase 3, once we get there, but that's a new feature.

Restricted Application added a project: User-Ryasmeen. Dec 14 2017, 7:50 PM

Liuxinyu970226 unsubscribed. Dec 15 2017, 4:21 AM

Shizhao moved this task from Extensions/Skins to Closed on the Chinese-Sites board. Dec 20 2017, 7:06 AM

cscott mentioned this in T277546: Can't create language variant markup from VisualEditor. Mar 16 2021, 3:52 PM

Winston_Sung subscribed. Jul 27 2021, 8:22 AM