Ordered application of annotations to avoid fragmentation (e.g. ''[[Foo|Fo]]''[[Foo|o]])
Open, Low, Public

Description

Fairly self-explanatory (and weird).


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=49985

bzimport set Reference to bz50098.

Annotations get applied to the text in the order in which the user applied them; we should probably make links higher-priority. Thus [[Foo|F'''''o''o''']] rather than [[Foo|F]]'''''[[Foo|o]]''[[Foo|o]]'''.
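A minimal sketch of what "links higher-priority" could mean, assuming each character carries a dictionary of annotations. This is not VisualEditor's actual data model; `PRIORITY`, `WRAP` and `serialize` are hypothetical names made up for illustration. Annotations earlier in the priority list are emitted outermost, so they are never split by lower-priority ones:

```python
from itertools import groupby

# Hypothetical priority order: earlier kinds are emitted outermost.
PRIORITY = ["link", "bold", "italic"]

# Hypothetical wikitext wrappers for each annotation kind.
WRAP = {
    "link": lambda target, inner: f"[[{target}|{inner}]]",
    "bold": lambda _value, inner: f"'''{inner}'''",
    "italic": lambda _value, inner: f"''{inner}''",
}


def serialize(chars, priority=PRIORITY):
    """chars is a list of (character, {kind: value}) pairs."""
    if not priority:
        return "".join(ch for ch, _ in chars)
    kind, rest = priority[0], priority[1:]
    out = []
    # Split into maximal runs sharing the same value for this kind,
    # then serialize the lower-priority kinds inside each run.
    for value, run in groupby(chars, key=lambda c: c[1].get(kind)):
        inner = serialize(list(run), rest)
        out.append(inner if value is None else WRAP[kind](value, inner))
    return "".join(out)


chars = [
    ("F", {"link": "Foo"}),
    ("o", {"link": "Foo", "bold": True, "italic": True}),
    ("o", {"link": "Foo", "bold": True}),
]
print(serialize(chars))  # [[Foo|F'''''o''o''']]
```

Note that this naive ordering always puts the link outermost, so a link whose whole anchor is italic would come out as [[Foo|''Bar'']].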

Except you'd probably rather have ''[[Foo|Bar]]'' than [[Foo|''Bar'']]

(In reply to comment #2)

> Except you'd probably rather have ''[[Foo|Bar]]'' than [[Foo|''Bar'']]

Yeah; so it should only break out when it's not entirely nested? I spoke to Roan about this; he says it's a relatively major change in DM, and that he did "about a quarter" of the work for it as part of DM rewrite 2 (or similar). Pull from release?

To re-visit this, some rules I think encapsulate what we want:

  • Spanning annotations should never be broken (because that changes the render/interaction result), in the following order:
    • Links
    • Superscript / Subscript
    • Underline / Strikethrough

    So:

    • <a href="Foo"><i>Foo</i>Bar</a> -> [[Foo|''Foo''Bar]], not ''[[Foo]]''[[Foo|Bar]]
    • <u>Foo<b>Bar</b></u> -> <u>Foo'''Bar'''</u>, not <u>Foo</u>'''<u>Bar</u>'''
    • <u>Foo<a href="Bar">Bar</u>Baz</a> -> <u>Foo</u>[[Bar|<u>Bar</u>Baz]], not <u>Foo[[Bar]]</u>[[Bar|Baz]]
    • <sup>Foo<s>Bar</sup>Baz</s> -> <sup>Foo<s>Bar</s></sup><s>Baz</s>, not <sup>Foo</sup><sup><s>Bar</s></sup><s>Baz</s>

  • Annotations on a link's anchor that is otherwise identical to its target should be applied outside the link

    So:

    • <a href="Foo"><i>Foo</i></a> -> ''[[Foo]]'', not [[Foo|''Foo'']]

  • Otherwise, annotations should be minimally-spanning

    So:

    • <a href="Foo"><i>Bar</i></a> -> ''[[Foo|Bar]]'', not [[Foo|''Bar'']]
    • <i><b>Foo</b>Bar</i> -> '''''Foo'''Bar'', not '''''Foo'''''<nowiki />''Bar''

Does this achieve what we want? (Obviously some of this is already done by Parsoid.)
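As a rough illustration of how the link rules above might combine, here is a sketch restricted to a single link with italic formatting only. The function name `link_to_wikitext` and the `(text, is_italic)` segment representation are assumptions made for this example, not anything that VE or Parsoid actually exposes:

```python
from itertools import groupby


def link_to_wikitext(target, segments):
    """segments is a non-empty list of (text, is_italic) pairs forming the anchor."""
    text = "".join(t for t, _ in segments)
    if all(italic for _, italic in segments):
        # Italic covers the whole anchor: apply it outside the link, and
        # drop the piped anchor when it is identical to the target.
        link = f"[[{target}]]" if text == target else f"[[{target}|{text}]]"
        return f"''{link}''"
    # Italic covers only part of the anchor (or none of it):
    # keep the link unbroken and put the formatting inside.
    parts = []
    for italic, group in groupby(segments, key=lambda s: s[1]):
        run = "".join(t for t, _ in group)
        parts.append(f"''{run}''" if italic else run)
    inner = "".join(parts)
    return f"[[{target}]]" if inner == target else f"[[{target}|{inner}]]"


print(link_to_wikitext("Foo", [("Foo", True), ("Bar", False)]))  # [[Foo|''Foo''Bar]]
print(link_to_wikitext("Foo", [("Foo", True)]))                  # ''[[Foo]]''
print(link_to_wikitext("Foo", [("Bar", True)]))                  # ''[[Foo|Bar]]''
```

The three calls correspond to the link examples in the list above; handling the other annotation types and mis-nested input would need considerably more than this.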

*** Bug 52912 has been marked as a duplicate of this bug. ***

In the meantime, can we suggest workarounds to "get the code right"?
For example, to avoid [[Foo|''Bar'']] you should italicize first, and link only after that. I added this to the Italian user guide; are there other tips I might be missing? Thanks.

Bug 51422 seems to cover the same ground as this, is it worth keeping as a separate case or better to merge?

(In reply to comment #7)

> Bug 51422 seems to cover the same ground as this, is it worth keeping as a
> separate case or better to merge?

Merge - thanks for the spot.

*** Bug 51422 has been marked as a duplicate of this bug. ***
*** Bug 54092 has been marked as a duplicate of this bug. ***
*** Bug 51054 has been marked as a duplicate of this bug. ***
*** Bug 73201 has been marked as a duplicate of this bug. ***
Jdforrester-WMF edited a custom field. Apr 1 2015, 7:11 PM

Now that T105239: Enable scrubWikitext=1 in VisualEditor's save route to Parsoid is done, this should only impact non-MW users in terms of save output. Still a blocker for sane RTC, but that's not a priority.

Jdforrester-WMF lowered the priority of this task from "Normal" to "Low". Jul 10 2015, 6:11 PM
Jdforrester-WMF changed the title from "VisualEditor: Ordered application of annotations to avoid fragmentation (e.g. ''[[Foo|Fo]]''[[Foo|o]])" to "Ordered application of annotations to avoid fragmentation (e.g. ''[[Foo|Fo]]''[[Foo|o]])". Jul 10 2015, 6:21 PM
Jdforrester-WMF removed Esanders as the assignee of this task.
Jdforrester-WMF edited a custom field.

Hmm, so is the simplest case of [[Foo|''Foo'']] being handled via Parsoid now (converted to ''[[Foo]]''), or am I misunderstanding the comments above?

> Hmm, so is the simplest case of [[Foo|''Foo'']] being handled via Parsoid now (converted to ''[[Foo]]''), or am I misunderstanding the comments above?

I haven't read all the comments above, but that scenario is currently not handled. https://www.mediawiki.org/wiki/Talk:Parsoid/Normalizations lists normalizations that are still on our plate.

https://www.mediawiki.org/wiki/Parsoid/Normalizations#Tag_minimization_.28.3Ci.3E.2F.3Cb.3E_tags.29 might handle some of the i/b scenarios that matter.

Is this still an issue? We've implemented tag minimization for <a> tags and it has been in production for a long time now. https://www.mediawiki.org/wiki/Parsoid/Normalizations#Tag_minimization_.28.3Ca.3E_tags.29

In the merged example, VE is generating <p><b><a href="Eat" rel="mw:WikiLink">Foo</a></b><a href="Eat" rel="mw:WikiLink">d</a></p> and Parsoid is turning it into '''[[Eat|Foo]]'''[[Eat|d]] which is a pretty faithful representation of VE's stupid DOM.
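To make the fragmentation concrete, here is a small sketch that reads the DOM quoted above into flat runs and then merges adjacent runs that share an href into one link. It uses Python's standard `html.parser`; `RunCollector` and `to_wikitext` are hypothetical helpers written for this example, they handle only `<a>` and `<b>`, and they are not how VE or Parsoid actually work:

```python
from html.parser import HTMLParser
from itertools import groupby


class RunCollector(HTMLParser):
    """Collects (text, href, is_bold) runs from a small HTML fragment."""

    def __init__(self):
        super().__init__()
        self.href = None
        self.bold = False
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
        elif tag == "b":
            self.bold = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.href = None
        elif tag == "b":
            self.bold = False

    def handle_data(self, data):
        self.runs.append((data, self.href, self.bold))


def to_wikitext(html):
    collector = RunCollector()
    collector.feed(html)
    collector.close()
    out = []
    # Adjacent runs with the same href are emitted as a single link.
    for href, group in groupby(collector.runs, key=lambda r: r[1]):
        inner = "".join(f"'''{t}'''" if bold else t for t, _, bold in group)
        out.append(inner if href is None else f"[[{href}|{inner}]]")
    return "".join(out)


html = ('<p><b><a href="Eat" rel="mw:WikiLink">Foo</a></b>'
        '<a href="Eat" rel="mw:WikiLink">d</a></p>')
print(to_wikitext(html))  # [[Eat|'''Foo'''d]]
```

One caveat with this approach: two genuinely separate, adjacent links to the same target would also be merged, so a real normalization pass would need to decide whether that is acceptable.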

Ah, I see. At one point (actually in my very first set of commits to Parsoid), I had implemented a complex minimization algorithm that would have dealt with this and other complex scenarios. I removed it in favour of a simpler algorithm, since the complex one couldn't keep up with all the DOM changes and other complexities that arrived over time, and it kept breaking. But I'll keep this in mind for future enhancements of our DOM normalization, unless VE gets there first.

Sure. But even if Parsoid does (even more) magic stuff, we should do it properly in VE.

Ltrlg added a subscriber: Ltrlg. Dec 5 2015, 5:16 PM
