
Parentheticals: Words incorrectly concatenated due to too simple removing of spaces
Closed, Duplicate, Public

Description

Bug caused by the fix for T69225 (Hovercards: Space before removed parentheses should also be removed in the extract) in src/preview/model.js.

https://de.wikipedia.org/w/index.php?title=Deutscher_Orden&oldid=164154458 has the sentence
Mit dem Johanniter- und dem Malteserorden steht er in der (Rechts-)Nachfolge der Ritterorden aus der Zeit der Kreuzzüge.

Popups/Hovercards removes parentheses, the text inside them, and any spaces in front of them without checking whether that is always appropriate, so the text in the popup is missing a space between der and Nachfolge:
Mit dem Johanniter- und dem Malteserorden steht er in derNachfolge der Ritterorden aus der Zeit der Kreuzzüge.

The code should check via look-ahead/look-behind whether the corresponding closing parenthesis is followed by a punctuation mark or by a letter, number, etc., and act accordingly.
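
A minimal sketch of such a check, for illustration only. The function names are hypothetical and the "naive" pattern is an assumption about the current behaviour, not the actual code in src/preview/model.js. It keeps the leading whitespace only when a letter or number directly follows the closing parenthesis, using Unicode property escapes rather than an ASCII-only character list:

```javascript
// Hypothetical helpers; not the actual Popups implementation.

// Assumed current behaviour: drop the parenthetical together with
// any whitespace in front of it, regardless of what follows.
function stripParentheticalsNaive(text) {
  return text.replace(/\s*\([^()]*\)/g, '');
}

// Context-aware variant: inspect the character after the closing
// parenthesis before deciding what to do with the leading whitespace.
function stripParentheticalsContextual(text) {
  return text.replace(/(\s*)\([^()]*\)/gu, (match, leading, offset, whole) => {
    const rest = whole.slice(offset + match.length);
    // A letter or number follows: keep the whitespace so words do not merge.
    if (/^[\p{L}\p{N}]/u.test(rest)) {
      return leading;
    }
    // Punctuation, whitespace, or end of text follows: drop the whitespace.
    return '';
  });
}

const sentence = 'steht er in der (Rechts-)Nachfolge der Ritterorden.';

console.log(stripParentheticalsNaive(sentence));
// → steht er in derNachfolge der Ritterorden.

console.log(stripParentheticalsContextual(sentence));
// → steht er in der Nachfolge der Ritterorden.
```

This only addresses the concatenation; it does not handle nested parentheses, and whether a letter/number test is adequate for scripts that do not separate words with spaces remains open.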

As different languages use different punctuation marks, I doubt that the entire approach and its underlying idea are actually I18n safe.

Event Timeline

Aklapper created this task. Apr 5 2017, 12:46 AM
Jdlrobson added subscribers: Prtksxna, Jdlrobson.

Detecting punctuation in code is tricky :)
Should we revert the fix for T69225 until this is fixed?

Jdlrobson triaged this task as Normal priority. Apr 6 2017, 4:58 PM
Jdlrobson moved this task from Incoming to Needs Prioritization on the Readers-Web-Backlog board.

We should really look into the I18n aspect of it before reverting or fixing anything.

Jdlrobson changed the task status from Open to Stalled. May 2 2017, 5:00 PM

Until we've worked out a proposal in T113094

ovasileva changed the task status from Stalled to Open. Jun 20 2017, 1:19 PM

We are at a fork in the road.

  1. This should be done by editors (explicit and less error prone but less coverage)
  2. We continue to guess and strip them programmatically (at risk of extra bugs)

We need to make a decision about T91344 imo to continue this.

Jdlrobson renamed this task from Words incorrectly concatenated due to too simple removing of spaces to Parentheticals: Words incorrectly concatenated due to too simple removing of spaces. Jun 22 2017, 6:36 PM
Jdlrobson changed the task status from Open to Stalled. Jul 6 2017, 6:13 PM

We need to make a decision about T91344 imo to continue this.

bearND added a subscriber: bearND. Jul 20 2017, 8:25 PM

Fixing this is a good step in the right direction. FWIW, in this case I think keeping the parentheses would be preferable, since they clarify the noun that follows. I know that this is hard to determine programmatically. Having said that, editors could also just remove the parentheses from "in der (Rechts-)Nachfolge" to make it "in der Rechtsnachfolge" if it is really important enough.