
Monument listings are missing some text in extlinks in Parsoid Read Views on ruwikivoyage
Closed, Resolved · Public

Description

From @ssastry :

Okay, so, this ruwikivoyage diff is representative of all the biggest diffs. Two issues: (a) Minor issue: margin/padding in the listing of monuments at the top of the page, which we can ignore but which introduces some noise in visual diffing. (b) Real issue: the monument listings are missing some text in the extlinks.
There are lots of pages with monument listings -- some with 100s of rows. But, looks like possibly something easy that could be fixed.
But the margin/padding issue is present on lots of other pages too, so if that is fixed, it will reduce noise as well.

Event Timeline


Looking at the page source for the legacy parser output, the linked text in the HTML is just "галерея" ("gallery"), the same as with Parsoid. So it looks like some JS script is adding the additional (0) or (33) or whatever other count is present.

Turns out it is this line in the CulturalHeritageImagesCount gadget that breaks with Parsoid HTML. The category name is here.

The href in Parsoid HTML is href="https://commons.wikimedia.org/wiki/Category:Protected%20areas%20of%20Russia/0630016" whereas in legacy HTML, it is href="https://commons.wikimedia.org/wiki/Category:Protected_areas_of_Russia/0630016".

So, either Parsoid should switch its href encoding to use "_" for spaces instead of "%20", or we should make the gadget agnostic to "%20" vs. "_".

But I'm not sure we can easily make that href encoding change, since there might be clients consuming Parsoid HTML now that would break if we did.

listingTableHtml.findCommonsCategory(pageType.parentCategoryName) could also be tweaked to do space/underscore normalization.
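
As a rough illustration of what that space/underscore normalization could look like, here is a minimal sketch. It is not the gadget's actual code: `normalizeTitle` and `hrefMatchesCategory` are hypothetical helper names, and the sketch assumes the category name is matched as a substring of the link's href.

```javascript
// Hypothetical helpers for illustration only; not taken from the
// CulturalHeritageImagesCount gadget.

// Reduce a title or href to one canonical form: percent-decode it,
// then turn spaces into underscores. Both the "%20" (Parsoid) and
// "_" (legacy) spellings end up identical.
function normalizeTitle(text) {
  let decoded;
  try {
    decoded = decodeURIComponent(text);
  } catch (e) {
    // Malformed percent-escape: fall back to the raw text.
    decoded = text;
  }
  return decoded.replace(/ /g, '_');
}

// Assumption: the lookup is a substring match of the parent category
// name against the link href.
function hrefMatchesCategory(href, parentCategoryName) {
  return normalizeTitle(href).includes(normalizeTitle(parentCategoryName));
}
```

With helpers along these lines, both hrefs quoted above would match a parent category written either as "Protected areas of Russia" or as "Protected_areas_of_Russia".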

What's the actual HTML tag which is generating these hrefs, though? Generally speaking category links, etc, should be done by the skin, so I'd be surprised if Parsoid was generating them in a different format than core. But if this is an <a> tag maybe?

Change #1087215 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Make interwiki and language link hrefs consistent with wikilink hrefs

https://gerrit.wikimedia.org/r/1087215

We're going to fix this by patching the gadget for now, after which time we can resolve this task. T379645: Parsoid does not convert underscores to spaces for interwiki links has been opened for any future reconsideration of the Parsoid design decision.

This edit fixes the gadget and the rendering.