Link translated articles with other languages through Wikidata automatically
Closed, Resolved · Public · 1 Story Points

Description

Once a user publishes an article, the next step is normally to add the language links (i.e., link to equivalent articles in Wikidata). This is a manual step that can be automated.

We need to check whether the legal terms the user accepts when translating cover their contributions to Wikidata, so that we can do this on the user's behalf.

This was reported at https://www.mediawiki.org/w/index.php?title=Topic:S6w7xbz00oie9c9t&topic_showPostId=sa6xct4j7tq0rea6#flow-post-sa6xct4j7tq0rea6

Event Timeline

Amire80 triaged this task as Normal priority. Feb 4 2015, 12:04 AM
Amire80 set Security to None.
Amire80 added a subscriber: Amire80.

"Normal" prio; it's an important feature, but arguably less urgent than breaking bugs.

I don't see the concern here. There is no protectable interest in merely the fact that this article is a translation of the other article, so no one needs to consent to it being added to the database.

(But I may be missing something - I'm not sure I completely follow the relevant workflow.)

Thanks a lot for the quick reply, @LuisV_WMF!

The workflow is quite simple:

  1. Translate the article "Eviatar Banai" from English to Catalan using ContentTranslation.
  2. Click the "Publish" button.
  3. The article appears as a usual wiki page in the Catalan Wikipedia.
  4. A site link to the new Catalan page is automatically added to Wikidata item https://www.wikidata.org/wiki/Q5418308 . This is the same thing that is done when the user presses the "Add links" element in the empty interlanguage links list area, but we can avoid this manual step, because we already know that the article is a translation.

I just tested it, and there is no legal text in the Add links dialog, and based on that I'd assume that no legal text will be needed in our workflow.

@Lydia_Pintscher, anything to add?

No need that I can see.

Sounds good to me, too.
Are you covering the case when there is no Wikidata item yet?

I guess that we shall have to cover it. Thanks for the suggestion.

hoo added a comment. Feb 4 2015, 10:47 AM

Before you write any actual code here, feel free to contact me or someone else from the team. I have worked on similar things before and am probably able to show you what we already got, depending on how you actually want to implement this (server side vs. client side etc.).

Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. Feb 9 2015, 9:54 AM
Arrbee edited a custom field. Feb 25 2015, 5:54 AM
Arrbee moved this task from Backlog to Candidates on the LE-Sprint-83 board. Feb 25 2015, 3:24 PM
Arrbee lowered the priority of this task from Normal to Low.
Arrbee moved this task from Candidates to Backlog on the LE-Sprint-83 board. Feb 25 2015, 3:39 PM
Amire80 claimed this task. Feb 25 2015, 3:54 PM

@hoo, thanks for offering help.

I plan to work on it this week.

The scenario goes more or less like this:

  1. A Danish-speaking user opens the translation interface (Special:ContentTranslation) and loads the article https://de.wikipedia.org/wiki/Aase_Birkenheier , with the intention to translate to Danish.
  2. When the translation is ready, the user presses the "Publish" button.
  3. This takes the translated text and sends it to the cxpublish API.
  4. The translated text is HTML (the translation interface is contenteditable; not VE). The cxpublish API translates it to wiki syntax using Parsoid and creates a wiki page with the title https://da.wikipedia.org/wiki/Aase_Birkenheier .
  5. When the API action is over and the page is created, the user sees a notification that the page was created and a link to the page.

Up to this point, everything is already implemented. Step 4 is backend; everything else is frontend.

The goal of this task is to add da:Aase_Birkenheier as a sitelink to https://www.wikidata.org/wiki/Q18601567 without any further interaction with the user. In the case of Aase Birkenheier, a Wikidata item already exists, but it's also possible that one doesn't exist yet, in which case it will have to be created. Functionally, this should be very similar to what happens when the user runs the "Add links" tool, whose entry point appears in Wikipedia after the interlanguage links list, but without any user interaction at all, because we already know that the newly created Danish page corresponds to the German one.

Given the above, should it be a step at the end of the backend cxpublish API? Or maybe a separate step initiated by the frontend after cxpublish finishes? Or something else?

Tips, opinions, examples, etc. - all welcome.

@santhosh, @Nikerabbit - you are welcome to chime in as well, of course.

hoo added a comment. Mar 2 2015, 9:13 PM

Hi, I don't have a lot of time today (actually I'm not working the whole week and next week), but I still want to give some initial ideas. You can do it in the front end and make use of https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/master/client/resources/wikibase.client.PageConnector.js (that's what the Add links dialog uses) or you can make use of the wblinktitles api https://www.wikidata.org/w/api.php?action=help&modules=wblinktitles (that module can't perform item merges, but I guess that's not an issue here).

Using that api is probably the easiest and most stable thing to do. Binding against the Wikibase extension in PHP is probably not that much of a good idea given that we don't really have stable PHP interfaces.

Thanks!

ContentTranslation is developed mostly with Wikipedia in mind, but it doesn't forget non-Wikimedia sites. What would be the right way to check that the current site is a Wikibase client? Check that $wgWBClientSettings is defined?

hoo added a comment. Mar 4 2015, 3:38 AM

Server side we're using defined( 'WBC_VERSION' ) for this... on the client (JavaScript), you would probably check for the existence of the RL modules.
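
The client-side check described above could look roughly like the following sketch. It assumes the relevant ResourceLoader module is named `wikibase.client.PageConnector` (an assumption; verify the name against the installed Wikibase version). The helper `isWikibaseClientAvailable` is hypothetical and takes the loader as a parameter so it can be exercised outside a browser; on a wiki you would pass `mw.loader`, whose `getState()` returns `null` for modules that are not registered.

```javascript
// Hypothetical helper: reports whether a ResourceLoader module is usable.
// `loader` is expected to behave like mw.loader, i.e. getState( name )
// returns null when the module is not registered on this wiki.
function isWikibaseClientAvailable( loader, moduleName ) {
    // Assumed module name; adjust to whatever the installed Wikibase registers.
    var name = moduleName || 'wikibase.client.PageConnector';
    var state = loader.getState( name );
    // Treat unregistered (null/undefined) and 'missing' modules as unavailable.
    return state !== null && state !== undefined && state !== 'missing';
}
```

In a browser this would be called as `isWikibaseClientAvailable( mw.loader )` before attempting to load the module with `mw.loader.using()`.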

Amire80 moved this task from Backlog to In Progress on the LE-Sprint-83 board. Mar 9 2015, 11:19 PM
Amire80 moved this task from Backlog to In Progress on the LE-Sprint-85 board. Apr 19 2015, 5:58 PM
Arrbee moved this task from Long term to CX5 on the ContentTranslation board. Apr 20 2015, 6:08 AM
Arrbee moved this task from In Progress to Done on the LE-Sprint-85 board. Apr 21 2015, 8:05 AM
Arrbee moved this task from Done to In Progress on the LE-Sprint-85 board.
Arrbee updated the task description.
Arrbee edited projects, added LE-Sprint-86; removed LE-Sprint-85.
Arrbee moved this task from Backlog to In Progress on the LE-Sprint-86 board. Apr 21 2015, 8:09 AM
Amire80 moved this task from In Progress to Blocked on the LE-Sprint-86 board. May 6 2015, 7:51 PM
Amire80 updated the task description. May 12 2015, 12:39 PM
He7d3r added a subscriber: He7d3r. May 14 2015, 4:04 PM
Arrbee edited projects, added LE-Sprint-87; removed LE-Sprint-86. May 25 2015, 8:02 AM
Arrbee moved this task from Backlog to Blocked on the LE-Sprint-87 board. May 25 2015, 8:15 AM

I think we just need wblinktitles api

api.php?action=wblinktitles&fromsite=enwiki&fromtitle=Hydrogen&tosite=dewiki&totitle=Wasserstoff with an edit token
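
As a rough illustration of that call, the sketch below builds the request body for `wblinktitles`. The helper name `buildLinkTitlesParams` and the placeholder token are hypothetical; the parameter names are the ones shown in the URL above, plus `format`. Write actions like this one have to be sent as a POST to the repository's `api.php`, with a real CSRF (edit) token for the logged-in user.

```javascript
// Hypothetical helper: builds the POST body for a wblinktitles request.
// `token` must be a real edit token obtained from the repo (e.g. wikidata.org);
// the value used below is only a placeholder.
function buildLinkTitlesParams( fromSite, fromTitle, toSite, toTitle, token ) {
    return new URLSearchParams( {
        action: 'wblinktitles',
        fromsite: fromSite,
        fromtitle: fromTitle,
        tosite: toSite,
        totitle: toTitle,
        token: token,
        format: 'json'
    } );
}

// MediaWiki tokens end in "+\", so they must be URL-encoded;
// URLSearchParams takes care of that when serialised.
var linkTitlesParams = buildLinkTitlesParams(
    'enwiki', 'Hydrogen', 'dewiki', 'Wasserstoff', 'PLACEHOLDER+\\'
);
console.log( linkTitlesParams.toString() );
```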

hoo added a comment. May 25 2015, 6:49 PM

Yes, that would work. You can also use the wbsetsitelink api, if you prefer (that way you don't need to know about other articles).
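
For comparison, a `wbsetsitelink` request might be assembled as in the sketch below. The helper name `buildSetSitelinkParams` and the placeholder token are hypothetical; the item ID, site ID and title are the Aase Birkenheier example from earlier in this task. This variant assumes the Wikidata item already exists and its ID is known; when there is no item yet, it would have to be created first by some other call.

```javascript
// Hypothetical helper: builds the POST body for a wbsetsitelink request.
// `itemId` is the existing Wikidata item; `linkSite`/`linkTitle` identify
// the page to attach. `token` must be a real edit token from the repo.
function buildSetSitelinkParams( itemId, linkSite, linkTitle, token ) {
    return new URLSearchParams( {
        action: 'wbsetsitelink',
        id: itemId,
        linksite: linkSite,
        linktitle: linkTitle,
        token: token,
        format: 'json'
    } );
}

var setSitelinkParams = buildSetSitelinkParams(
    'Q18601567', 'dawiki', 'Aase Birkenheier', 'PLACEHOLDER+\\'
);
console.log( setSitelinkParams.toString() );
```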

Since the purpose is linking articles translated from other wikis, I think their titles and site ids are already known. Aren't they?

Amire80 moved this task from Blocked to In Progress on the LE-Sprint-87 board. May 27 2015, 11:57 AM

Change 214119 had a related patch set uploaded (by Amire80):
WIP: Add a Wikibase link

https://gerrit.wikimedia.org/r/214119

OK, so in https://gerrit.wikimedia.org/r/214119 I tried to do it first with raw API calls, and it didn't work, so I tried it by using PageConnector, as I tried at first, and it still doesn't do the right thing. @hoo, everybody points at you as the PageConnector expert, and it's probably something very simple; can you please take a look? Thanks!

... And I think that I managed to get it to work! Should be ready for review.

Amire80 moved this task from In Progress to In Review on the LE-Sprint-87 board. May 28 2015, 2:37 PM

A bit of background: I cheated while developing, because I don't have a local Wikibase installation for testing. To test I copied the function from https://gerrit.wikimedia.org/r/#/c/214119/13/modules/wikibaselink/ext.cx.wikibase.link.js to https://ca.wikipedia.org/wiki/Usuari:Amire80/common.js , and then I translated the article https://ca.wikipedia.org/wiki/Assaf_Amdurski - and it worked, the links were added automatically and without any disruption.

Change 214119 merged by jenkins-bot:
Add a Wikibase link after publishing a page

https://gerrit.wikimedia.org/r/214119

Arrbee moved this task from In Review to Done on the LE-Sprint-87 board. Jun 1 2015, 7:14 AM
KartikMistry closed this task as Resolved. Jun 5 2015, 4:32 AM
KartikMistry added a subscriber: KartikMistry.
Pginer-WMF reopened this task as Open. Jun 19 2015, 7:33 AM

According to this comment, when an existing article is translated into the user namespace, Wikidata links are modified to point to the user namespace. They should not be modified. We should just add links when new articles are created in the main namespace.
I think we already checked this, but a closer look may be needed if it still fails under some circumstances.
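
The rule stated above (only link pages created in the main namespace) can be expressed as a small guard, sketched here. The function name `shouldAddSitelink` is hypothetical and this is not necessarily how the merged patch does it. In MediaWiki the main namespace has ID 0 and the User namespace has ID 2; in a browser the ID could be obtained with `new mw.Title( targetTitle ).getNamespaceId()`.

```javascript
// Namespace ID of the main (article) namespace in MediaWiki.
var MAIN_NAMESPACE_ID = 0;

// Hypothetical guard: only pages published to the main namespace should
// get a Wikidata sitelink; drafts in User: or other namespaces should not.
function shouldAddSitelink( namespaceId ) {
    return namespaceId === MAIN_NAMESPACE_ID;
}
```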

Amire80 closed this task as Resolved. Jun 19 2015, 7:41 AM

This was updated in production yesterday. The comment in ca.wikipedia probably refers to something that happened shortly before the update.

I just tested it carefully:

It definitely doesn't happen anymore. You can see that no link was added to the Wikidata item. Compare it with https://www.wikidata.org/w/index.php?title=Q134396&action=history , where links to test pages were auto-added and manually deleted.

Thanks for double-checking @Amire80!

Pginer-WMF added a comment. Edited Jun 13 2018, 10:16 AM

As part of the work on version 2, we asked advanced users to create articles, and several reported that Wikidata links were not added automatically when publishing (1) (2) (3) (4).

We may want to investigate whether this is a regression or, if the links simply take too long to update (e.g., due to Wikidata caching), evaluate what can be done to reduce the confusion.