Page MenuHomePhabricator

ContentTranslation fails to adapt links with spaces
Closed, ResolvedPublic

Description

Links with spaces are not adapted.

To reproduce:

  • Translate Róisín Murphy from English to Hebrew.
  • In the "Early life" section in the source column, click the "Roman Catholic" link.

Observed: the link is not adapted, even though https://en.wikipedia.org/wiki/Catholic_Church has a corresponding article in he.wikipedia.

In the fetchLinkPairs function, the link titles are cached with underscores ('Catholic_Church'), but they are read with spaces ('Catholic Church').

This is a regression, I remember fixing this months ago.

Event Timeline

Amire80 created this task.Jun 27 2015, 9:46 AM
Amire80 updated the task description. (Show Details)
Amire80 raised the priority of this task from to High.
Amire80 added a project: ContentTranslation.
Amire80 added a subscriber: Amire80.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 27 2015, 9:46 AM

This happens both in production and in http://he.wikipedia.beta.wmflabs.org/ .

Amire80 renamed this task from ContentTranslation fails to adapt some links to ContentTranslation fails to adapt links with spaces.Jun 27 2015, 4:07 PM
Amire80 updated the task description. (Show Details)
Amire80 set Security to None.
Amire80 assigned this task to santhosh.Jun 28 2015, 8:50 AM
Amire80 added a subscriber: santhosh.
Nikerabbit updated the task description. (Show Details)Jun 28 2015, 9:04 AM

The cause for this seems to be that the code accesses the global cache.linkPairs variable directly. This means that some code call title.toString directly while other code relies on stored title as a text which is in different format.

My recommendation is to make linkPair fetching encapsulated in it's own class which can be instantiated as an object with getters that perform the appropriate normalization. This requires some refactoring so that the current users of the global variable can handle promises instead. This will also solve race conditions if there are any, because the getters can check that the data is loaded before accessed.

Change 221568 had a related patch set uploaded (by Santhosh):
Consistently use title.toText() for cache access

https://gerrit.wikimedia.org/r/221568

Change 221568 merged by jenkins-bot:
Consistently use title.toText() for cache access

https://gerrit.wikimedia.org/r/221568

Change 221605 had a related patch set uploaded (by KartikMistry):
Consistently use title.toText() for cache access

https://gerrit.wikimedia.org/r/221605

Change 221605 merged by jenkins-bot:
Consistently use title.toText() for cache access

https://gerrit.wikimedia.org/r/221605

Is T104126 an actual bug or just technical debt?