This is a parent task for various subtasks having to do with irregularities in parsing -{...}- constructs, especially when they contain embedded vertical bar characters. See the subtasks for specific bugs in the different places in the parser and preprocessor where irregularities have been found.
Description
Details
- Reference
- bz52661
Subject | Repo | Branch | Lines +/-
---|---|---|---
Other language converter bugs (test case tweaks) | mediawiki/core | master | +1 -2
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T43716 [EPIC] Support language variant conversion in Parsoid
Open | | None | T54661 Preprocessor/Parser irregularities with -{...}- variant constructs.
Resolved | | cscott | T146304 Preprocessor should handle -{...}- variant constructs in template arguments
Resolved | | cscott | T146305 Parser should protect -{...}- variant constructs in links
Resolved | | cscott | T54192 Markups in alt param of <gallery> are "eaten" during parsing
Resolved | | cscott | T54190 <gallery> with \|link=<external link> doesn't work on wikis with LanguageConverter
Resolved | | cscott | T153135 doBlockLevels breaks with embedded language converter markup
Resolved | | cscott | T153140 -{ ... }- markup breaks tables
Open | | None | T153265 Language converter source text and language names cannot use <nowiki> escaping.
Duplicate | BUG REPORT | None | T353501 new Parsoid cannot parse the converter wikitext syntax
Event Timeline
From IRC:
(07:36:15 PM) TimStarling: but it wouldn't be too hard to make the preprocessor annotate it, the same way it does with links
(07:36:33 PM) TimStarling: you know the preprocessor is responsible for expanding templates
(07:36:55 PM) TimStarling: but it marks up links for the sole purpose of getting correct template DOM
(07:37:28 PM) TimStarling: e.g. for parameter splitting in {{ a [[b|c]] }}
(07:37:45 PM) TimStarling: it would probably be beneficial for -{}- to be handled in the same way
(07:38:15 PM) TimStarling: then {{ a -{b|c}- }} would work in the intuitive way
(07:38:10 PM) cscott-free: yes. i think i'm going to add [[File:foobar.jpg|-{R|rawcaption}-]] as a parser test and open a bugzilla for that. for the future.
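The parameter-splitting point from the IRC log can be illustrated with a minimal Python sketch. This is not the MediaWiki preprocessor (which is written in PHP and builds a DOM); `split_template_args` and its delimiter table are hypothetical names invented for illustration. It only shows why a `|` inside `[[...]]` or `-{...}-` has to be skipped when splitting template arguments.

```python
def split_template_args(body: str) -> list[str]:
    """Split a template body on '|' only outside [[...]] and -{...}-.

    Hypothetical helper for illustration; not MediaWiki code.
    """
    # Assumed delimiter pairs: links (already protected today) and
    # language converter constructs (the proposed addition).
    closers = {"[[": "]]", "-{": "}-"}
    parts: list[str] = []
    current: list[str] = []
    stack: list[str] = []  # closers we are still waiting for
    i = 0
    while i < len(body):
        two = body[i:i + 2]
        if stack and two == stack[-1]:
            # Innermost open construct closes here.
            stack.pop()
            current.append(two)
            i += 2
        elif two in closers:
            # A protected construct opens here.
            stack.append(closers[two])
            current.append(two)
            i += 2
        elif body[i] == "|" and not stack:
            # Top-level pipe: argument boundary.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(body[i])
            i += 1
    parts.append("".join(current))
    return parts


naive = " a -{b|c}- ".split("|")                 # pipe inside -{ }- splits the argument
protected = split_template_args(" a -{b|c}- ")   # argument stays in one piece
```

With this sketch, `{{ a -{b|c}- }}` yields a single argument, the "intuitive" behaviour described above, whereas a plain split on `|` cuts the variant construct in half.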
Change 78330 had a related patch set uploaded by Cscott:
Add parserTests for language converter markup.
btw There's another wikitext snippet that isn't handled well currently:
;-{zh-cn:AAA;zh-tw:BBB}-
Is this resolvable with the preprocessor change?
(In reply to comment #3)
;-{zh-cn:AAA;zh-tw:BBB}-
Is this resolvable with the preprocessor change?
Yes, I believe this has the same root cause.
(In reply to comment #4)
(In reply to comment #3)
;-{zh-cn:AAA;zh-tw:BBB}-
Is this resolvable with the preprocessor change?
Yes, I believe this has the same root cause.
Lists are not handled by the preprocessor. The issue here is that the list handler (doBlockLevels) is not aware of -{ }- either and (wrongly) recognizes the embedded colon as a single-line dt/dd pair.
Right. But if the preprocessor lifts out the -{...}- constructs, then doBlockLevels won't get confused. So yes, same root cause.
If you reintroduce language conversion blocks only after doBlockLevels is done, then you'll need to find a different way to parse the contents of those blocks independently of the main content.
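A rough Python sketch of the dt/dd confusion and of the "lift out, then restore" idea discussed above. Everything here is an assumption for illustration: the function names, the placeholder format, and the non-nesting regex are hypothetical and are not how doBlockLevels or the preprocessor actually work. As noted in the previous comment, the lifted contents would still need to be parsed separately.

```python
import re

MARK = "\x7f"  # hypothetical placeholder delimiter
# Assumption: non-nested -{...}- only; real markup can nest.
VARIANT_RE = re.compile(r"-\{.*?\}-", re.DOTALL)
PLACEHOLDER_RE = re.compile(re.escape(MARK) + r"VARIANT(\d+)" + re.escape(MARK))


def naive_dt_dd(line: str):
    """Split ';term:definition' at the first colon, unaware of -{ }-."""
    term, sep, definition = line[1:].partition(":")
    return term, (definition if sep else None)


def protected_dt_dd(line: str):
    """Lift -{...}- out before splitting, then put it back."""
    stash: list[str] = []

    def lift(match: re.Match) -> str:
        stash.append(match.group(0))
        return f"{MARK}VARIANT{len(stash) - 1}{MARK}"

    def restore(text):
        if text is None:
            return None
        return PLACEHOLDER_RE.sub(lambda m: stash[int(m.group(1))], text)

    term, definition = naive_dt_dd(VARIANT_RE.sub(lift, line))
    return restore(term), restore(definition)


line = ";-{zh-cn:AAA;zh-tw:BBB}-"
broken = naive_dt_dd(line)    # colon inside -{ }- taken as the dt/dd separator
fixed = protected_dt_dd(line)  # whole construct stays in the term
```

In the naive version the term becomes `-{zh-cn` and the definition `AAA;zh-tw:BBB}-`; with the construct lifted out first, the line is a single term with no definition.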
Also:
-{zh-cn:[[Category:A]];zh-tw:[[Category:B]];}-
This shouldn't be in both A and B (should it?). We don't want the category to depend on the variant. So maybe it *should* be in both?
I think it should be in neither. (gwicke agrees.)
[[Category:foo]] would add the page to the 'foo' category. In a variant where foo=>bar, it might appear as [[Category:foo|bar]] and be edited that way by VE, but that wouldn't change the category of the page. Category links inside -{...}- would be forbidden (that is, parsed as plain text).
Change 311849 had a related patch set uploaded (by C. Scott Ananian):
WIP: protect language converter markup in the preprocessor.
Change 312066 had a related patch set uploaded (by C. Scott Ananian):
Other language converter bugs (test case tweaks)
There are also irregularities in how lists and tables with language converter markup are handled; see https://gerrit.wikimedia.org/r/312066
Change 312066 abandoned by C. Scott Ananian:
Other language converter bugs (test case tweaks)
Reason:
Squashed into https://gerrit.wikimedia.org/r/327127
I added T153761 as blocker since it would be nice to have a test for that case (I see things are moving: T146305#2891350).
There's an issue with autolink URLs, like:
-{en-us:http://elevator.com;en-gb:http://lift.net}-
because the autolink regexp won't stop at the semicolon and thus will grab the en-gb and break the language converter nesting. See T166429: Getting a unclean output with {{#property:P856}} on site which enables Language Converter.
Not entirely sure this is fixable; it seems to be a genuine priority mismatch between autolink and language converter constructs.
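To make the autolink problem concrete, here is a small Python sketch. The pattern below is a deliberately simplified stand-in, not MediaWiki's actual autolink regexp; the only assumption that matters is that `;` is treated as a legal URL character, which is what lets the match run past the variant separator.

```python
import re

# Hypothetical, simplified autolink pattern (NOT the real MediaWiki one):
# a URL runs until whitespace, angle brackets, or square brackets.
AUTOLINK_RE = re.compile(r"https?://[^\s<>\[\]]+")

text = "-{en-us:http://elevator.com;en-gb:http://lift.net}-"

match = AUTOLINK_RE.search(text)
grabbed = match.group(0)

# The first URL swallows the ';en-gb:' separator and the closing '}-',
# so the language converter construct is no longer intact afterwards.
swallows_separator = ";en-gb:" in grabbed
swallows_closer = grabbed.endswith("}-")
```

Simply excluding `;` from the pattern is not an obvious fix, since semicolons are legal inside URLs, which is consistent with the priority mismatch described above.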