Link translated articles with other languages through Wikidata automatically
Closed, Resolved | Public | 1 Estimated Story Points

Description

Once a user publishes an article, the next step is normally to add the language links (i.e., link to equivalent articles in Wikidata). This is a manual step that can be automated.

We need to check whether the legal terms the user accepts when translating cover their contributions to Wikidata, so that we can do this on the user's behalf.

This was reported at https://www.mediawiki.org/w/index.php?title=Topic:S6w7xbz00oie9c9t&topic_showPostId=sa6xct4j7tq0rea6#flow-post-sa6xct4j7tq0rea6

Event Timeline

Amire80 triaged this task as Medium priority. (Feb 4 2015, 12:04 AM)
Amire80 set Security to None.
Amire80 subscribed.

"Normal" prio; it's an important feature, but arguably less urgent than breaking bugs.

I don't see the concern here. There is no protectable interest in merely the fact that this article is a translation of the other article, so no one needs to consent to it being added to the database.

(But I may be missing something - I'm not sure I completely follow the relevant workflow.)

Thanks a lot for the quick reply, @LuisV_WMF!

The workflow is quite simple:

  1. Translate the article "Eviatar Banai" from English to Catalan using ContentTranslation.
  2. Click the "Publish" button.
  3. The article appears as a usual wiki page in the Catalan Wikipedia.
  4. A site link to the new Catalan page is automatically added to Wikidata item https://www.wikidata.org/wiki/Q5418308 . This is the same thing that is done when the user presses the "Add links" element in the empty interlanguage links list area, but we can avoid this manual step, because we already know that the article is a translation.

I just tested it, and there is no legal text in the Add links dialog, and based on that I'd assume that no legal text will be needed in our workflow.

@Lydia_Pintscher, anything to add?

Sounds good to me, too.
Are you covering the case when there is no Wikidata item yet?

I guess that we shall have to cover it. Thanks for the suggestion.

Before you write any actual code here, feel free to contact me or someone else from the team. I have worked on similar things before and am probably able to show you what we already got, depending on how you actually want to implement this (server side vs. client side etc.).

Arrbee lowered the priority of this task from Medium to Low. (Feb 25 2015, 3:24 PM)
Arrbee moved this task from Backlog to Candidates on the LE-Sprint-83 board.

@hoo, thanks for offering help.

I plan to work on it this week.

The scenario goes more or less like this:

  1. A Danish-speaking user opens the translation interface (Special:ContentTranslation) and loads the article https://de.wikipedia.org/wiki/Aase_Birkenheier , with the intention to translate to Danish.
  2. When the translation is ready, the user presses the "Publish" button.
  3. This takes the translated text and sends it to the cxpublish API.
  4. The translated text is HTML (the translation interface is contenteditable; not VE). The cxpublish API converts it to wiki syntax using Parsoid and creates a wiki page at https://da.wikipedia.org/wiki/Aase_Birkenheier .
  5. When the API action is over and the page is created, the user sees a notification that the page was created and a link to the page.

Till here everything is already implemented. Step 4 is backend; everything else is frontend.

What this task wants to do is to add da:Aase_Birkenheier as a sitelink to https://www.wikidata.org/wiki/Q18601567 without any more interaction with the user. In the case of Aase Birkenheier, a Wikidata item already exists, but it's also possible that one doesn't exist yet, so it will have to be created. Functionally, it is supposed to be very similar to what happens when the user runs the "Add links" tool, the entry point to which appears in Wikipedia after the interlanguage links list, but without any user interaction at all, because we already know that the newly created Danish page corresponds to the German one.

Given the above, should it be a step at the end of the backend cxpublish API? Or maybe a separate step initiated by the frontend after cxpublish finishes? Or something else?

Tips, opinions, examples, etc. - all welcome.

@santhosh, @Nikerabbit - you are welcome to chime in as well, of course.

Hi, I don't have a lot of time today (actually I'm not working the whole week and next week), but I still want to give some initial ideas. You can do it in the front end and make use of https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/master/client/resources/wikibase.client.PageConnector.js (that's what the Add links dialog uses) or you can make use of the wblinktitles api https://www.wikidata.org/w/api.php?action=help&modules=wblinktitles (that module can't perform item merges, but I guess that's not an issue here).

Using that api is probably the easiest and most stable thing to do. Binding against the Wikibase extension in PHP is probably not that much of a good idea given that we don't really have stable PHP interfaces.

Thanks!

ContentTranslation is developed mostly with Wikipedia in mind, but doesn't forget non-Wikimedia sites. What would be the right way to check that the current site is a Wikibase client? Check that $wgWBClientSettings is defined?

Server side we're using defined( 'WBC_VERSION' ) for this... on the client (JavaScript), you would probably check for the existence of the RL modules.
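
For the client-side check, a minimal sketch might look like the following. It assumes the loader exposes `getState( name )` the way `mw.loader` does (returning `null` for modules that are not registered). The helper name `isWikibaseClient`, the stand-in `fakeLoader`, and the module name `wikibase.client.PageConnector` (guessed from the file name linked above) are assumptions for illustration and should be verified against the wiki in question.

```javascript
// Hypothetical helper: reports whether a ResourceLoader-style loader knows a module.
// `loader` must expose getState( name ), as mw.loader does; getState returns
// null for modules that are not registered.
function isWikibaseClient( loader, moduleName ) {
	return loader.getState( moduleName ) !== null;
}

// Assumed module name, guessed from the PageConnector file name; verify it on your wiki.
const PAGE_CONNECTOR_MODULE = 'wikibase.client.PageConnector';

// Stand-in loader so the sketch also runs outside a browser.
const fakeLoader = {
	states: { 'wikibase.client.PageConnector': 'registered' },
	getState( name ) {
		return Object.prototype.hasOwnProperty.call( this.states, name ) ?
			this.states[ name ] :
			null;
	}
};

console.log( isWikibaseClient( fakeLoader, PAGE_CONNECTOR_MODULE ) );
console.log( isWikibaseClient( fakeLoader, 'no.such.module' ) );
```

In the browser, the same helper would presumably be called as `isWikibaseClient( mw.loader, PAGE_CONNECTOR_MODULE )` before attempting to add any link.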

Arrbee moved this task from Done to In Progress on the LE-Sprint-85 board.
Arrbee updated the task description.
Arrbee edited projects, added LE-Sprint-86; removed LE-Sprint-85.

I think we just need wblinktitles api

api.php?action=wblinktitles&fromsite=enwiki&fromtitle=Hydrogen&tosite=dewiki&totitle=Wasserstoff with an edit token

Yes, that would work. You can also use the wbsetsitelink api, if you prefer (that way you don't need to know about other articles).

Since the purpose is linking articles translated from other wikis, I think their titles and site ids are already known. Aren't they?
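
To make the two options concrete, here is a rough sketch of the request parameters for each module. The parameter names (`fromsite`, `fromtitle`, `tosite`, `totitle` for wblinktitles; `id`, `linksite`, `linktitle` for wbsetsitelink) follow the API help pages; the function names, the placeholder token, and the use of the Aase Birkenheier example are assumptions for illustration. Nothing is sent over the network here; the sketch only builds the body of a POST request.

```javascript
// Builds the parameters for action=wblinktitles, which links two known pages.
function buildLinkTitlesParams( fromSite, fromTitle, toSite, toTitle, token ) {
	return {
		action: 'wblinktitles',
		format: 'json',
		fromsite: fromSite,
		fromtitle: fromTitle,
		tosite: toSite,
		totitle: toTitle,
		token: token
	};
}

// Builds the parameters for action=wbsetsitelink, which needs only the item id
// and the new page, not the source article.
function buildSetSitelinkParams( itemId, linkSite, linkTitle, token ) {
	return {
		action: 'wbsetsitelink',
		format: 'json',
		id: itemId,
		linksite: linkSite,
		linktitle: linkTitle,
		token: token
	};
}

// Encodes a flat parameter object as an application/x-www-form-urlencoded body.
function toFormBody( params ) {
	return new URLSearchParams( params ).toString();
}

// Placeholder token; a real edit token must be fetched from the API first.
const exampleToken = 'EXAMPLE_TOKEN+\\';

const linkTitlesBody = toFormBody(
	buildLinkTitlesParams( 'dewiki', 'Aase Birkenheier', 'dawiki', 'Aase Birkenheier', exampleToken )
);
const setSitelinkBody = toFormBody(
	buildSetSitelinkParams( 'Q18601567', 'dawiki', 'Aase Birkenheier', exampleToken )
);

console.log( linkTitlesBody );
console.log( setSitelinkBody );
```

Because ContentTranslation already knows both the source and target titles, either parameter set could be assembled at publish time; which one fits better depends on whether the item id is at hand.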

Change 214119 had a related patch set uploaded (by Amire80):
WIP: Add a Wikibase link

https://gerrit.wikimedia.org/r/214119

OK, so in https://gerrit.wikimedia.org/r/214119 I first tried to do it with raw API calls, and it didn't work, so I switched back to PageConnector, which I had tried originally, and it still doesn't do the right thing. @hoo, everybody points at you as the PageConnector expert, and it's probably something very simple; can you please take a look? Thanks!

... And I think that I managed to get it to work! Should be ready for review.

A bit of background: I cheated while developing, because I don't have a local Wikibase installation for testing. To test I copied the function from https://gerrit.wikimedia.org/r/#/c/214119/13/modules/wikibaselink/ext.cx.wikibase.link.js to https://ca.wikipedia.org/wiki/Usuari:Amire80/common.js , and then I translated the article https://ca.wikipedia.org/wiki/Assaf_Amdurski - and it worked, the links were added automatically and without any disruption.

Change 214119 merged by jenkins-bot:
Add a Wikibase link after publishing a page

https://gerrit.wikimedia.org/r/214119

According to this comment, when an existing article is translated into the user namespace, Wikidata links are modified to point to the user namespace. They should not be modified. We should just add links when new articles are created in the main namespace.
I think we already checked this, but a closer look may be needed if it still fails under some circumstances.
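
The rule described here can be expressed as a small guard. This is only a sketch: the shape of the `publishResult` object and the name `shouldAddSitelink` are hypothetical and not taken from the ContentTranslation code. The only fixed fact it relies on is that MediaWiki's main (article) namespace has id 0.

```javascript
// MediaWiki's main (article) namespace has id 0; the User namespace has id 2.
const MAIN_NAMESPACE_ID = 0;

// Hypothetical guard: link to Wikidata only when a new page was created
// in the main namespace. `publishResult` is an assumed shape.
function shouldAddSitelink( publishResult ) {
	return publishResult.isNewPage === true &&
		publishResult.namespaceId === MAIN_NAMESPACE_ID;
}

console.log( shouldAddSitelink( { isNewPage: true, namespaceId: 0 } ) );
console.log( shouldAddSitelink( { isNewPage: true, namespaceId: 2 } ) );
console.log( shouldAddSitelink( { isNewPage: false, namespaceId: 0 } ) );
```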

This was updated in production yesterday. The comment in ca.wikipedia probably refers to something that happened shortly before the update.

I just tested it carefully:

It definitely doesn't happen anymore. You can see that no link was added to the Wikidata item. Compare it with https://www.wikidata.org/w/index.php?title=Q134396&action=history , where links to test pages were auto-added and manually deleted.

As part of the work on version 2, we asked advanced users to create articles, and several reported that Wikidata links were not added automatically when publishing (1) (2) (3) (4).

We may want to investigate whether this is a regression or, if the links simply take too long to appear (e.g., due to Wikidata caching), evaluate what can be done to reduce the confusion.