
Extend translations graph to show also published translations that need review
Open, Medium, Public


The Content Translation Key metrics dashboard shows in multiple graphs the translations published over time.

Some of the published articles may be published with potentially too much unmodified content (T190279). These are added to a special tracking category (T190798) on each wiki, but it is hard to get the overall number for all languages. Extending the graph to include those articles would be useful.

The graph aggregating all published translations can be expanded with a new line named "Published needing review" to represent the articles created despite the "too much unmodified content" warning. As part of a related ticket (T210138), deleted translations could also be integrated.

Event Timeline

Pginer-WMF triaged this task as Medium priority.Nov 19 2018, 5:59 PM
Pginer-WMF edited projects, added ContentTranslation; removed CX-analytics.
Pginer-WMF edited projects, added CX-analytics; removed ContentTranslation.
Pginer-WMF added a subscriber: Amire80.

Change 476480 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/extensions/ContentTranslation@master] Add a counter for published translations with unreviewd MT

Change 476480 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Add a counter for published translations with unreviewed MT

Petar.petkovic moved this task from In Review to QA on the Language-Team (Language-2018-October-December) board.

@Petar.petkovic, I don't see the new line in the graph. Is this waiting for deployment or is further development needed?

I checked today (the task is on my list of tasks to check after wmf.8 deployment) - the link displays the error in the Console:

Wed, 12 Dec 2018 23:23:12 GMT

TypeError: e is undefined scripts.js:36:7482

The link was working last week.

This is what I think happened: @Amire80 updated the tab names, so that the new link is (cf. but the link in the task description was not updated.

Thx, @Nikerabbit - is working, but the new graph "Published needing review" is still not present.

santhosh added a subscriber: santhosh.

I don't know if @Amire80 implemented anything for the language-reportcard. The patches above are my patches that add a graph to Grafana. See the graph about "Published translations with high amount of unreviewed MT".

I will unassign and assign to Amir

Given that this may take some more time due to technical complexities, I created a ticket for an initial report that will provide some initial insights and useful data for upcoming communications: T218020: Measure percentage of translations published with and without the expected level of modified content

Pginer-WMF raised the priority of this task from Medium to High.Apr 5 2019, 4:40 PM

Change 502312 had a related patch set uploaded (by Amire80; owner: Amire80):
[mediawiki/extensions/ContentTranslation@master] Log need-review events

Change 502312 merged by jenkins-bot:
[mediawiki/extensions/ContentTranslation@master] Log need-review events

So I have the events being logged, and to show them on the chart, I'll have to do something like this regularly:

  select
    count(distinct(concat_ws(',', event.sourceTitle, event.sourceLanguage, event.targetTitle, event.targetLanguage, event.token))) as count
  from ...
  where
    event.action = 'need-review' and
    year = 2019 and
    month = 5 and
    day = 3;

This will have to be in hive. As a first step, I'll probably make a new dashboard for this, and once it works, I'll merge the existing dashboard into it.

A question for @mforns: The query above selects just one number, per day. If I understand correctly, the output is supposed to include two columns: a date and a number. If I'm going to use RU, what's the right way to add the number? And what's the most robust way to pass this date to the query in a way that will show it as a column, and will break it correctly for the year/month/day conditions?

When we use RU for Hive, we have to use a script instead of the query.
That is because RU doesn't yet have a Hive client, so we use a bash script that calls hive -e "<query>".
The way RU passes dates (and other params) to the script is different from the way it passes dates to sql files.
In a nutshell, to add a date column in a Hive query (bash script) use:

    '$1' AS date,

$1 is the first parameter that RU passes to the script, which is the date in question.
You can find this and other information in the RU documentation:
Also, take a look at this example of another Hive-based RU report:
You can basically copy the way hive is called (hive -e "..." 2> /dev/null | grep -v parquet.hadoop).
And also, copy the way $1 is used.
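To make the advice above concrete, here is a minimal sketch of what such a script might look like. It is not the script that was deployed. It assumes that RU passes the date as YYYY-MM-DD in $1, and EVENTS_TABLE is a hypothetical placeholder, since the thread does not name the table. The build_query helper is also just an illustrative name.

```shell
#!/bin/bash
# Sketch of a Hive-based RU report script (assumptions, not from the thread):
#   - $1 is the report date, formatted YYYY-MM-DD
#   - EVENTS_TABLE is a hypothetical placeholder for the real events table

EVENTS_TABLE="${EVENTS_TABLE:-placeholder_db.placeholder_events}"

# Print the Hive query for a single day, with the date as the first column.
build_query() {
    q_date="$1"
    q_year="${q_date%%-*}"      # 2019-05-03 -> 2019
    q_rest="${q_date#*-}"       # 2019-05-03 -> 05-03
    q_month="${q_rest%%-*}"     # 05-03 -> 05
    q_day="${q_rest#*-}"        # 05-03 -> 03
    # Strip one leading zero so the values read as plain integers (05 -> 5)
    q_month="${q_month#0}"
    q_day="${q_day#0}"
    cat <<EOF
SELECT
    '$q_date' AS date,
    COUNT(DISTINCT(CONCAT_WS(',', event.sourceTitle, event.sourceLanguage, event.targetTitle, event.targetLanguage, event.token))) AS count
FROM $EVENTS_TABLE
WHERE
    event.action = 'need-review' AND
    year = $q_year AND
    month = $q_month AND
    day = $q_day;
EOF
}

# Only call Hive when a date argument was actually passed.
if [ "$#" -ge 1 ]; then
    if ! command -v hive > /dev/null 2>&1; then
        echo "hive client not found" >&2
        exit 1
    fi
    # Same invocation pattern as quoted above: drop stderr and parquet noise
    hive -e "$(build_query "$1")" 2> /dev/null | grep -v parquet.hadoop
fi
```

Splitting the date inside the script keeps the year/month/day partition conditions in sync with the date column, so the same $1 value drives both.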

I eventually decided to do this in mysql, and perhaps later move everything to hive if it's desirable. This is supposed to make the initial deployment easier.

Change 509007 had a related patch set uploaded (by Amire80; owner: Amire80):
[analytics/limn-language-data@master] Add need-review chart to published CX2 translation

Change 509007 merged by jenkins-bot:
[analytics/limn-language-data@master] Add need-review chart to published CX2 translation

OK, so now thanks to @mforns the chart works at . However, this task asks to combine the chart in the "CX2 translations that need review" tab with the one in the "CX2 translations" tab, as a single tab with two lines. I guess that this may be possible by editing the JSON configuration at , but I'm not sure how exactly.

We may want to review the task request, probably incorporating the new data in the new Superset dashboard.
We may consider deprecating the specific CX2 tags since that is now the only version (no longer coexisting with CX1).

ldelench_wmf lowered the priority of this task from High to Medium.Jul 27 2021, 5:05 PM
ldelench_wmf moved this task from Triage to Current Quarter on the Product-Analytics board.
Pginer-WMF renamed this task from Extend CX2 translations graph to show also published translations that need review to Extend translations graph to show also published translations that need review.Aug 3 2021, 4:08 PM
Pginer-WMF updated the task description.