Populate language links on MDWiki based on Wikidata item
Open, Medium, Public

Description

MDWiki supports linking to other language editions of Wikipedia. For example, if you add [[de:Mutterschaft Minderjähriger]] to the article on teenage pregnancy https://mdwiki.org/wiki/Teenage_pregnancy you will get a German language link instead of a page link to “De:Mutterschaft Minderjähriger” somewhere in the article.

MDWiki currently uses UnlinkedWikibase to associate its articles with their Wikidata items. In parallel, https://www.wikidata.org/wiki/Property:P11143 is used to associate Wikidata items with MDWiki articles. UnlinkedWikibase has no interaction with the database, meaning we can't use page_props to populate lang_links.

Event Timeline

Note that UnlinkedWikibase is installed

Harej triaged this task as Medium priority. (Jan 29 2024, 10:47 PM)
Harej renamed this task from Language dropdown for MDWiki to Proposal for article language dropdown via UnlinkedWikibase. (Apr 30 2024, 3:54 AM)
Harej updated the task description.
Harej added a subscriber: Doc_James.
Harej renamed this task from Proposal for article language dropdown via UnlinkedWikibase to Improve UnlinkedWikibase to support retrieval of page links. (Apr 30 2024, 4:44 PM)
Harej removed Harej as the assignee of this task.
Harej updated the task description.
Harej updated the task description.

I'm not sure I understand. The language selector is for switching to other language versions in the same project, but MDWiki only has one language edition, doesn't it? That is, de.mdwiki.org isn't a thing. So you'd want the language list to somehow know that you want to link to Wikipedias rather than, say, Wikisources, but I'm not sure how it could.

(Although I must admit, I haven't really dug into the details of how the language selector is built, nor how it is generalised between skins.)

All the languages of Wikipedia are stored here, so basically we want the system to pretend MDWiki is a language of Wikipedia, just like Simple Wikipedia is.

[image: image.png]

I suppose MDWiki would be a "language" of Wikipedia.

I am looking more into the internals of how UnlinkedWikibase works. If I am not mistaken, it seems to only provide Lua methods to access Wikidata items, so that e.g. infoboxes still work. But it has nothing to do with the page_props table which is where you would normally define Wikidata IDs. This would make UnlinkedWikibase a poor fit for this purpose. Do I understand that correctly?

(Although I must admit, I haven't really dug into the details of how the language selector is built, nor how it is generalised between skins.)

I don't think it's necessary to get into that if we add the Wikipedia articles as language links in the lang_links table, which would require MDWiki to recognize itself as a sort of Wikipedia.

Harej renamed this task from Improve UnlinkedWikibase to support retrieval of page links to Populate language links on MDWiki based on Wikidata item. (May 11 2024, 10:37 PM)
Harej updated the task description.

If we want to avoid bringing UnlinkedWikibase into this, what we could instead do is:

  • Have a maintenance script that retrieves, from the Wikidata Query Service, a list of Wikipedia articles connected to items that are associated with MDWiki via P11143.
  • Use this mapping to update lang_links on a periodic basis.
  • Somehow convince MDWiki.org that it is a language edition of Wikipedia.
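
The periodic update step in the list above can be sketched as a set difference between the rows already stored and the rows the mapping implies. This is only an illustration, not the script anyone on this task wrote: the function names and sample data are hypothetical, and it assumes the standard MediaWiki langlinks columns (ll_from, ll_lang, ll_title), which is the table this thread calls lang_links.

```python
from typing import Dict, Set, Tuple

# One row of the langlinks table: (ll_from, ll_lang, ll_title).
Row = Tuple[int, str, str]


def desired_rows(page_ids: Dict[str, int],
                 sitelinks: Dict[str, Dict[str, str]]) -> Set[Row]:
    """Build the rows that should exist, given local page IDs and
    a mapping of local title -> {language code: Wikipedia title}."""
    rows: Set[Row] = set()
    for title, page_id in page_ids.items():
        for lang, target in sitelinks.get(title, {}).items():
            rows.add((page_id, lang, target))
    return rows


def plan_sync(current: Set[Row], desired: Set[Row]) -> Tuple[Set[Row], Set[Row]]:
    """Return (rows to insert, rows to delete) to bring current in line with desired."""
    return desired - current, current - desired


# Hypothetical sample data: page ID 7 stands in for a local article.
page_ids = {"Gout": 7}
sitelinks = {"Gout": {"de": "Gicht", "fr": "Goutte (maladie)"}}
current = {(7, "de", "Gicht"), (7, "es", "Gota")}

to_insert, to_delete = plan_sync(current, desired_rows(page_ids, sitelinks))
```

Computing a plan first means a bad or vandalized sitelink is removed on the next run rather than lingering, which is the concern raised later in this thread.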

@Skizzerz @Bawolff interested in your thoughts.

Would be wonderful if we could have the system believe that MDWiki is a language like Simple EN...

  • Have a maintenance script that retrieves, from the Wikidata Query Service, a list of Wikipedia articles connected to items that are associated with MDWiki via P11143.

My own counterargument to this is that the Wikidata query to retrieve this information is very expensive:

SELECT ?item ?value ?article
WHERE {
  ?item wdt:P11143 ?value.
  ?article schema:about ?item;
           schema:isPartOf [ wikibase:wikiGroup "wikipedia" ].
}

This times out on the regular Wikidata Query Service, and even on my own Wikidata Query Service it takes several minutes. Sure, you don't need to run the query all that often, but if an error or vandalism got in, it wouldn't be removed until the next update. Even if you forced an update, it could still take several minutes for the query to return. Not ideal.

A less direct, but less expensive, route is to just get the mapping of Wikidata IDs and then retrieve the sitelinks from each item individually. https://query.wikidata.org/#SELECT%20%3Fitem%20%3Fvalue%0AWHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP11143%20%3Fvalue.%0A%7D
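
A rough sketch of the per-item route, assuming the sitelinks are read through the Wikidata wbgetentities API with props=sitelinks. The helper names and the exclusion list are my own assumptions, the list of non-Wikipedia sites ending in "wiki" is probably incomplete, and the sample response is made up for illustration.

```python
from typing import Dict, List
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

# Site IDs that end in "wiki" but are not Wikipedia language editions.
# Assumption: this list is illustrative and may be incomplete.
NON_WIKIPEDIA = {"commonswiki", "specieswiki", "metawiki", "wikidatawiki",
                 "mediawikiwiki", "sourceswiki"}


def entities_url(item_ids: List[str]) -> str:
    """URL for one wbgetentities request covering a batch of item IDs."""
    params = {"action": "wbgetentities", "ids": "|".join(item_ids),
              "props": "sitelinks", "format": "json"}
    return API + "?" + urlencode(params)


def wikipedia_links(response: dict) -> Dict[str, Dict[str, str]]:
    """Map item ID -> {language code: article title}, Wikipedia sitelinks only."""
    result: Dict[str, Dict[str, str]] = {}
    for qid, entity in response.get("entities", {}).items():
        links: Dict[str, str] = {}
        for site, link in entity.get("sitelinks", {}).items():
            if site.endswith("wiki") and site not in NON_WIKIPEDIA:
                # "zh_min_nanwiki" -> "zh-min-nan"
                lang = site[:-len("wiki")].replace("_", "-")
                links[lang] = link["title"]
        result[qid] = links
    return result


# Made-up response fragment in the shape wbgetentities returns.
sample = {"entities": {"Q123": {"sitelinks": {
    "dewiki": {"site": "dewiki", "title": "Gicht"},
    "zh_min_nanwiki": {"site": "zh_min_nanwiki", "title": "Example"},
    "commonswiki": {"site": "commonswiki", "title": "Category:Gout"},
    "enwikiquote": {"site": "enwikiquote", "title": "Gout"},
}}}}
links = wikipedia_links(sample)
```

As far as I know the API caps the number of IDs per request (50 for most clients), so a real script would need to batch the item list.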

Every article on MDWiki lists its Wikidata ID at the top, if that helps: https://mdwiki.org/w/index.php?title=Gout&action=edit

An optimizer hint should help with the query timing out: https://query.wikidata.org/#SELECT%20%3Fitem%20%3Fvalue%20%3Farticle%0AWHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP11143%20%3Fvalue.%0A%20%20hint%3APrior%20hint%3ArunFirst%20%22true%22.%0A%20%20%3Farticle%20schema%3Aabout%20%3Fitem%3B%0A%20%20%20%20%20%20%20%20%20%20%20schema%3AisPartOf%20%5B%20wikibase%3AwikiGroup%20%22wikipedia%22%20%5D.%0A%7D
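
For readability, the hinted query in the link above decodes to the SPARQL held in QUERY below: the only change from the original query is the hint:Prior hint:runFirst line, which asks the optimizer to evaluate the P11143 pattern first. The sparql_url helper is a hypothetical convenience of mine, assuming the public endpoint at https://query.wikidata.org/sparql and its format=json parameter.

```python
from urllib.parse import urlencode

# Query text decoded from the link in the comment above.
QUERY = """SELECT ?item ?value ?article
WHERE {
  ?item wdt:P11143 ?value.
  hint:Prior hint:runFirst "true".
  ?article schema:about ?item;
           schema:isPartOf [ wikibase:wikiGroup "wikipedia" ].
}"""


def sparql_url(query: str,
               endpoint: str = "https://query.wikidata.org/sparql") -> str:
    """Build a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = sparql_url(QUERY)
```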

Change #1034435 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/extensions/UnlinkedWikibase@master] Add optional feature to get interlanguage links from wikibase

https://gerrit.wikimedia.org/r/1034435

Change #1034435 merged by jenkins-bot:

[mediawiki/extensions/UnlinkedWikibase@master] Add optional feature to get interlanguage links from wikibase

https://gerrit.wikimedia.org/r/1034435

Hi, langlinks are displayed in the sidebar in the Vector skin, like here, but not in Vector 2022. It would also be useful to be able to access langlinks through the API.

I've added some documentation about this feature: https://www.mediawiki.org/wiki/Extension:UnlinkedWikibase#Interlanguage_links (please edit as you see fit).

That might be all that remains for this task.

Oh sorry, no there's also the langlinks and langlinkscount API results.

We have language links partly working, but it is hit and miss and I am not sure why.

I suspect this is due to the new caching mechanism, in which the data is fetched by a job in the job queue. I've been thinking that this should change, and at least the item specified with {{#unlinkedwikibase: id=Q123 }} should be fetched immediately (during parsing). And perhaps up to another five items. The idea of keeping it smaller is that some pages might request tens or hundreds of items, and doing that during parsing makes for a bad experience.
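
To make the proposed split concrete, here is a small sketch of the idea as I read it: the item named in the parser function is fetched during parsing, along with up to five others, and the rest go to the job queue. It is not the extension's actual code; split_fetches and extra_limit are hypothetical names.

```python
from typing import List, Tuple


def split_fetches(primary: str, referenced: List[str],
                  extra_limit: int = 5) -> Tuple[List[str], List[str]]:
    """Split item IDs into (fetch during parsing, defer to the job queue).

    The primary item is always fetched eagerly, plus at most
    extra_limit other distinct items in the order they are referenced.
    """
    eager = [primary]
    deferred: List[str] = []
    seen = {primary}
    for qid in referenced:
        if qid in seen:
            continue  # skip duplicates, including the primary item
        seen.add(qid)
        if len(eager) <= extra_limit:  # eager = primary + up to extra_limit others
            eager.append(qid)
        else:
            deferred.append(qid)
    return eager, deferred


eager, deferred = split_fetches(
    "Q123", ["Q1", "Q2", "Q123", "Q3", "Q4", "Q5", "Q6", "Q7"])
```

A page that requests hundreds of items then costs at most six fetches at parse time.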

That makes sense. Will it pull in all the language links with one request?
And then the Wikidata element in the sidebar with another request?

J

It'll get all language links with one request. The sidebar link doesn't actually need any request (although we should probably normalize it so it's using the canonical Wikibase ID, in case of redirects).

I am not sure if we have made progress on getting a Wikidata sidebar link?

Best
J

That was done in T376843.

It looks like MDWiki is still on version 8c45cd2 from 20 September 2024, so is missing out on the sidebar link and better caching. I've tagged a new version (although I don't think you're installing via Composer).

So I was wrong above in T355725#10270793, it's not the new caching that's breaking things, it's hopefully the fact that that's missing! :-)

So, upgrade and let's see where things are at.

Thanks Sam

Ryan, when will we be updating to see if this fixes the Wikidata sidebar issue?

James

@Doc_James UnlinkedWikibase has been updated on MDWiki to the latest commit.

So Ryan has updated UnlinkedWikibase; however, I am still only seeing the three languages that are listed locally, not all languages.

[image: image.png]

Not sure if anything further is needed.

J
