
ContentTranslation doesn't know that an article already exists in the Norwegian Bokmål Wikipedia
Open, Medium · Public · BUG REPORT

Description

List of steps to reproduce (step by step, including full links if applicable):

What happens?:

  • Many/most of the top suggestions already have articles in Norwegian Bokmål
  • Some examples from just now: World War I, Joseph Stalin, Elizabeth I, Georgia (country)

What should have happened instead?:

  • It should not suggest articles that already exist in the Norwegian Bokmål Wikipedia.

Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc.: MediaWiki 1.39.0-wmf.7 (800f261), ContentTranslation 670863b


This is most likely yet another iteration of our no/nb problems. The Norwegian Bokmål Wikipedia is located at https://no.wikipedia.org/ and has the database name nowiki, but its language code is (correctly) set to nb. My guess is that CX checks on Wikidata whether an article exists in nbwiki, when it should check whether the article exists in nowiki.
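The suspected mismatch can be sketched in Python as follows. This only illustrates the guess above; it is not the actual code of ContentTranslation or of any suggestion service. The names SPECIAL_WIKI_PREFIXES, sitelink_key and missing_in_target are hypothetical, the sitelink sets are made-up sample data, and the only mapping taken from this report is nb to no.

```python
# Hypothetical table of language codes whose wiki uses a different
# domain/database prefix. Only the nb -> no entry comes from this report.
SPECIAL_WIKI_PREFIXES = {"nb": "no"}


def sitelink_key(language_code):
    """Return the wiki database name to look up for a language code."""
    prefix = SPECIAL_WIKI_PREFIXES.get(language_code, language_code)
    return prefix + "wiki"


def missing_in_target(candidates, target_language):
    """Keep only the titles that have no article in the target wiki.

    candidates maps a source title to the set of wiki database names
    that already have an article on the same topic.
    """
    key = sitelink_key(target_language)
    return [title for title, wikis in candidates.items() if key not in wikis]


# Sample data (assumed): one topic that exists in nowiki, one that does not.
candidates = {
    "World War I": {"enwiki", "nowiki"},
    "Some missing topic": {"enwiki"},
}

print(sitelink_key("nb"))
print(missing_in_target(candidates, "nb"))
```

A naive lookup that appends "wiki" directly to the language code would search for nbwiki, find nothing, and treat every topic as missing, which matches the behaviour described in this report.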

Event Timeline

Pginer-WMF triaged this task as Medium priority. Jun 6 2022, 11:47 AM

Change 854473 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/extensions/ContentTranslation@master] nowiki: Use the 'no' domain code for getting translation suggestions

https://gerrit.wikimedia.org/r/854473

Change 854473 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] nowiki: Use the 'no' domain code for getting translation suggestions

https://gerrit.wikimedia.org/r/854473

I'm sorry, but this has started happening again (don't know when it started).

To reproduce:

My suggestions for translations from English to Norwegian Bokmål:

Screenshot_20250308_130006.png (1×1 px, 213 KB)

The list includes Greenland and (for reasons I don't quite understand) a bunch of actors; Greenland already has an article, of course, as do most of these actors.

  • When you click one of these to (try to) start translating, however, it warns you that the article already exists:

Screenshot_20250308_130438.png (219×714 px, 22 KB)

Text with yellow background = "This page already exists in Norwegian Bokmål"

@jhsoby I tested this again, and I no longer see the issue.
More importantly, a new version of the CX dashboard is coming, and it will replace the current dashboard.
You can access the new dashboard at https://no.wikipedia.org/w/index.php?title=Special:ContentTranslation&unified-dashboard=1&filter-type=automatic&filter-id=previous-edits&active-list=draft&from=en&to=nb#/

Could you please help verify whether the recurring no/nb issues are resolved there? Thanks in advance.

@santhosh It still happens to me, both in the old and new dashboards. Screenshots attached.
It is not as noticeable with the new dashboard, since it seems to give more varied suggestions.

Old dashboard

Screenshot_20250319_091305.png (770×1 px, 178 KB)

New dashboard

Screenshot_20250319_091542.png (916×1 px, 114 KB)

I had to refresh the suggestions a few times for the selection in the second screenshot to appear, but there we already have articles for both the first and the second of the three suggestions.

A translation that was suggested to me (that I forgot to screenshot) was the article on Vladimir Putin. In the new dashboard, as opposed to the old, when I started translating it I didn't get a warning that the article existed. It let me start translating, and only when I went to publish did it warn me that I would be overwriting an existing article. This seems like a step back from the behaviour of the old dashboard, where at least it would warn me before I started translating.

Change #1129205 had a related patch set uploaded (by Santhosh; author: Santhosh):

[research/recommendation-api@master] Consider special language codes while checking for article existence

https://gerrit.wikimedia.org/r/1129205

"It let me start translating, and only when I went to publish did it warn me that I would be overwriting an existing article. This seems like a step back from the behaviour of the old dashboard, where at least it would warn me before I started translating."

This is being discussed at https://phabricator.wikimedia.org/T386895#10574040

Change #1129205 merged by jenkins-bot:

[research/recommendation-api@master] Consider special language codes while checking for article existence

https://gerrit.wikimedia.org/r/1129205

Change #1131128 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update recommendation-api to 2025-03-25-091801-production

https://gerrit.wikimedia.org/r/1131128

Change #1131128 merged by jenkins-bot:

[operations/deployment-charts@master] Update recommendation-api to 2025-03-25-091801-production

https://gerrit.wikimedia.org/r/1131128

Mentioned in SAL (#wikimedia-operations) [2025-03-26T16:34:43Z] <kart_> Updated recommendation-api-ng to 2025-03-25-091801-production (T306508)

@jhsoby The patch for this issue has been deployed. Could you please review it and confirm whether the fix works as expected? I saw this has been tested a couple of times, and it looks like you know more about where to test to see if this is ok now. Let us know if you encounter any further issues.

Hey @jhsoby! I'm following up as it's been a little while. The patch for this issue was deployed over a week ago, and we haven’t heard back if you came across anything else. We’ll go ahead and consider this resolved for now. If you run into any remaining issues or notice anything unexpected, feel free to reopen the task or comment and we’ll be happy to take another look. Thanks again!

Hi @GMikesell-WMF! Sorry, I didn't have time to check before.

I did a check just now, and sadly the problem is not solved yet:

Screenshot_20250408_094126.png (568×1 px, 46 KB)

We already have an article on the Minority Report film: https://no.wikipedia.org/wiki/Minority_Report

Screenshot_20250408_094326.png (571×1 px, 60 KB)

We already have an article on donkeys ("Esel" from the screenshot): https://no.wikipedia.org/wiki/Tamesel

Screenshot_20250408_094618.png (570×1 px, 49 KB)

We already have an article on the Arab World ("الوطن العربي" from the screenshot): https://no.wikipedia.org/wiki/Den_arabiske_verden