"<translate>" and <tvar|*> tag visible in Page Previews: HtmlFormatter only flattens spans
Closed, Resolved · Public

Description

Problem

It seems that Page Previews shows the "<translate>" tag in the preview when the destination page has it. It shouldn't be displayed.

It's possible this only happens if a translate tag is not closed in the current section and the exintro parameter is used (see T168743), e.g.:

<languages/>
<translate> <!-- open translate tag in section 0 -->
<!--T:1-->
This page links instructions for common '''administrative tasks''' which you may wish to perform once your [[<tvar|inst-guide>Manual:Installation guide</>|installation]] of MediaWiki is completed.
== New section ==
</translate> <!-- closed translate tag in section 1 -->
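
The failure mode in the example above can be modelled roughly as follows. This is a simplified sketch, not the TextExtracts implementation: `intro_section` and `has_unclosed_translate` are hypothetical helpers, and cutting at the first `== Heading ==` line only approximates what exintro does.

```python
import re

# Assumption: a section heading is any line of the form "== ... ==".
_HEADING = re.compile(r"^==.*==[ \t]*$", flags=re.MULTILINE)


def intro_section(wikitext: str) -> str:
    """Return everything before the first heading (a rough stand-in for exintro)."""
    match = _HEADING.search(wikitext)
    return wikitext if match is None else wikitext[:match.start()]


def has_unclosed_translate(text: str) -> bool:
    """True if more <translate> tags are opened than closed in the text."""
    return text.count("<translate>") > text.count("</translate>")
```

With the wikitext above, the whole page is balanced, but the intro alone contains an opening tag with no matching close, which is the situation the task describes.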

Replication steps

without explaintext

OR https://www.mediawiki.org/wiki/Special:ApiSandbox#action=query&format=json&maxage=300&prop=extracts&titles=Manual%3AConfiguration&redirects=1&formatversion=2&exchars=525&exintro=1

  • Note &lt;translate&gt; in the response.

With explaintext

OR https://www.mediawiki.org/wiki/Special:ApiSandbox#action=query&format=json&maxage=300&prop=extracts&titles=Manual%3AConfiguration&redirects=1&formatversion=2&exchars=525&exintro=1&explaintext=1

  • Note <translate> in the response
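
The two requests above can also be built programmatically. The sketch below only constructs the query URL and checks an extract string for the leaked tag; it assumes the standard `https://www.mediawiki.org/w/api.php` endpoint, and `build_extract_url` and `leaks_translate_tag` are illustrative names, not part of any library.

```python
from urllib.parse import urlencode

# Assumption: the API sandbox parameters map one-to-one onto api.php query parameters.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def build_extract_url(title: str, plain_text: bool = False) -> str:
    """Build the prop=extracts query used in the replication steps."""
    params = {
        "action": "query",
        "format": "json",
        "maxage": 300,
        "prop": "extracts",
        "titles": title,
        "redirects": 1,
        "formatversion": 2,
        "exchars": 525,
        "exintro": 1,
    }
    if plain_text:
        params["explaintext"] = 1
    return API_ENDPOINT + "?" + urlencode(params)


def leaks_translate_tag(extract: str) -> bool:
    """Check for the tag in both its raw form and its HTML-escaped form."""
    return "<translate>" in extract or "&lt;translate&gt;" in extract
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the returned extract to `leaks_translate_tag` would reproduce both checks.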

Cause

T168743

Examples

Hovering over either link in steps 2 and 3 below shows the problem:

  1. go to https://meta.wikimedia.org/wiki/Technical_Collaboration
  2. hover over "Community Engagement", which points to https://meta.wikimedia.org/wiki/Community_Engagement
  3. hover over "Volunteer Supporters Network", which points to https://meta.wikimedia.org/wiki/Volunteer_Supporters_Network

In that page, there is also the link "Improve collaboration with communities in product development", which is a more complex case because anchors and HTML are involved in the destination page: https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2016-2017/Final#Program_7:_Improve_collaboration_with_communities_in_product_development

AC

Possible workarounds

  • Any <translate> tags encountered should be "flattened" inside the filterContent method
  • Any <tvar|.*> tags encountered should be "flattened" inside the filterContent method. Note that the tag name will always begin with "tvar|", but the text after the pipe is arbitrary, e.g. <tvar|wmf>

This can be fixed either in the HtmlFormatter (upstream) or in the ExtractFormatter.
A test is provided (for the translate tag only); however, it only seems to fail locally without HHVM.
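
As a rough illustration of what "flattening" means here (dropping the tag but keeping its contents), the sketch below uses a regular expression on unescaped text. It is not the HtmlFormatter or ExtractFormatter code; `flatten_translate_markup` is a hypothetical helper, and treating every bare `</>` as a tvar close is an assumption.

```python
import re

# Matches <translate>, </translate>, any <tvar|name> opener,
# and the bare </> that closes a tvar (assumed to have no other use).
_TRANSLATE_MARKUP = re.compile(r"</?translate>|<tvar\|[^>]*>|</>")


def flatten_translate_markup(text: str) -> str:
    """Remove Translate markup tags while keeping the text they wrap."""
    return _TRANSLATE_MARKUP.sub("", text)
```

For example, `[[<tvar|wmf>Special:MyLanguage/Wikimedia Foundation</>|Wikimedia Foundation]]` becomes `[[Special:MyLanguage/Wikimedia Foundation|Wikimedia Foundation]]`. HTML-escaped input (`&lt;translate&gt;`) would need unescaping first.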

Event Timeline

QuimGil created this task. Jun 14 2017, 7:05 AM
Restricted Application added a subscriber: Aklapper. Jun 14 2017, 7:05 AM
bmansurov updated the task description. Jun 14 2017, 3:31 PM
bmansurov updated the task description.
bmansurov moved this task from Incoming to Needs Prioritization on the Readers-Web-Backlog board.
Jdlrobson edited projects, added TextExtracts; removed Page-Previews.
Jdlrobson added a subscriber: Jdlrobson.

I could have sworn there was a ticket open for this... The underlying issue is in TextExtracts so tagging appropriately.

ovasileva triaged this task as High priority. Jun 20 2017, 12:11 PM
ovasileva lowered the priority of this task from High to Normal. Jun 20 2017, 4:42 PM

As mentioned yesterday, we'll need to do some digging to work out where the problem lies before committing to work. Right now we don't know at which layer this issue happens, why it happens, or whether it's trivial to fix or a sign of a bigger problem. This impacts meta.wiki and mediawiki.

What's the easiest way to get the <translate> tag working locally? If I can do this, I am happy to try and isolate the issue here.

Change 360984 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/TextExtracts@master] Test for T167852

https://gerrit.wikimedia.org/r/360984

Jdlrobson renamed this task from "<translate>" tag visible in Page Previews to "<translate>" tag visible in Page Previews: HtmlFormatter only flattens spans. Jun 22 2017, 9:45 PM
Jdlrobson removed a project: Patch-For-Review.
Jdlrobson updated the task description.
Jdlrobson updated the task description.
Jdlrobson moved this task from Needs Prioritization to Upcoming on the Readers-Web-Backlog board.
Jdlrobson added subscribers: ovasileva, phuedx.

After some investigation, I think this is well defined and ready. Moving to upcoming. cc @phuedx @ovasileva

Jdlrobson updated the task description. Jun 22 2017, 9:53 PM
Jdlrobson updated the task description. Jun 23 2017, 5:57 PM
Jdlrobson updated the task description.

RESTBase, which PP is fetching previews from in this case, doesn't set the explaintext parameter:

The explaintext parameter isn't the problem here.
explaintext returns extracts as plain text instead of limited HTML.
Limited HTML has a similar problem, caused by not being aware of the translate tags' existence. When parsed, it thinks it has plain text. T107206 may be related to this problem?

  1. I can't reproduce this issue on-wiki with or without the explaintext parameter set:

That's because you are using the wrong page. I've updated the description but it's Community_Engagement that you should be requesting the extract from.

Mibad. That's why I mentioned the explaintext parameter.


It should flatten translate tags

After looking at the API response, is that what the problem is here? Will flattening translate tags fix the following, which appears in the HTML and plain text extract from the example in the description?

[[&lt;tvar|wmf&gt;Special:MyLanguage/Wikimedia Foundation&lt;/&gt;|Wikimedia Foundation]]

Jdlrobson renamed this task from "<translate>" tag visible in Page Previews: HtmlFormatter only flattens spans to "<translate>" and <tvar|*> tag visible in Page Previews: HtmlFormatter only flattens spans. Jun 23 2017, 6:29 PM
Jdlrobson updated the task description.

After some more debugging, this is just one way of solving it.
The fundamental issue, if we go up a level, is this: T168743

Then let's fix the fundamental issue… Thanks for getting to the root of the issue, @Jdlrobson!

Our estimates were a bit all over the place: 3, 3, 5, 8, coffee.
The code is a little scary and spread across multiple places.
The 3s felt like we were doing this for other tags.

We feel like we should not work around this issue, as this is going to bite us later when we actually want to localise these messages.
Olga pointed out that the links in the description are working now. Did something change?

Jdlrobson updated the task description. Jun 27 2017, 6:21 PM
Jdlrobson updated the task description.
Jdlrobson updated the task description. Jun 27 2017, 6:25 PM

Is there a way we can test this on the beta cluster? Currently templates are showing up within article text, so no way to know.

No. The Beta Cluster is a wiki farm – a single server hosting multiple wikis – with each wiki corresponding to a different language. Installing the Translate extension on the Beta Cluster doesn't make much sense. We should probably do this on the staging server with the main wiki being multilingual.

Change 360984 abandoned by Jdlrobson:
Test for T167852

https://gerrit.wikimedia.org/r/360984

Can be signed off next train (Wed 5th) on mediawiki. You may need to edit the page to force a flush of the existing HTML.

I think we also need to take the time to document the behavior on-wiki. I'll add a pre-signoff AC.

phuedx updated the task description. Jun 30 2017, 7:45 AM

Sign off is blocked till 12th. No train this week.

Jdlrobson added a comment. Edited Jul 11 2017, 12:23 AM

@phuedx hey not sure what you have in mind about updating docs.

  • This relates to TextExtracts or Translate extension so should probably be documented there... not on Popups
  • Not 100% sure what you want to document here. That it will not be translated?

Note: We can actually sign this tomorrow or Wednesday (train is 19:00–21:00 UTC Tuesday).

@phuedx hey not sure what you have in mind about updating docs.

  • This relates to TextExtracts or Translate extension so should probably be documented there... not on Popups

But it does directly affect the behavior of Popups as it should now only display previews in the content language of the page regardless of whether the Translate extension is used.

I say should because…

  • Not 100% sure what you want to document here. That it will not be translated?

Is that accurate? AFAICT any balanced translate parser tags in the extract will be translated whereas imbalanced ones won't. Am I understanding the changes correctly?

Jdlrobson assigned this task to phuedx. Jul 11 2017, 10:31 PM

The fix is live in 1.30.0-wmf.9, and I've verified it works for the mediawiki Configuration example.

We'll need to wait till tomorrow for the meta fix.

AFAICT any balanced translate parser tags in the extract will be translated whereas imbalanced ones won't. Am I understanding the changes correctly?

Correct. Sorry for confusion.

I've had a go at editing: https://www.mediawiki.org/wiki/Extension:Popups#Known_problems

@phuedx feel free to edit wording and would you mind signing off?

I've tweaked the wording a little to clarify that we're requesting previews in the content language of the page (?uselang=content). Thanks, @Jdlrobson!

We'll need to wait till tomorrow for the meta fix.

TextExtracts also caches extracts in memcached with a TTL of wgParserCacheExpiryTime. We'll have to wait for pages to either be touched or fall out of the TextExtracts cache naturally.
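
To make the waiting period concrete, here is a toy model of a cache whose entries disappear either when the TTL elapses or when the page's touched timestamp changes. It assumes the touched timestamp is part of the cache key; `ExtractCache` is purely illustrative and is not the TextExtracts or memcached code.

```python
import time


class ExtractCache:
    """Toy TTL cache keyed by (page id, touched timestamp)."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds  # stands in for wgParserCacheExpiryTime
        self._clock = clock
        self._entries = {}

    def set(self, page_id, touched, extract):
        # Store the extract together with its absolute expiry time.
        self._entries[(page_id, touched)] = (self._clock() + self._ttl, extract)

    def get(self, page_id, touched):
        # A new touched value produces a different key, so a stale extract is never hit.
        entry = self._entries.get((page_id, touched))
        if entry is None:
            return None
        expires_at, extract = entry
        if self._clock() >= expires_at:
            # The entry has fallen out of the cache naturally.
            del self._entries[(page_id, touched)]
            return None
        return extract
```

Under this model, an extract cached before the fix keeps being served until the page is edited (new touched value) or the TTL runs out.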

phuedx closed this task as Resolved. Jul 12 2017, 9:31 AM

Per the above.