[Task] Add Lua function to get Wikibase entity by site link (title)
Closed, Resolved · Public

Description

Please add a Lua function similar to mw.wikibase.getEntityObject for getting an entity by site link (title), just like Special:ItemByTitle or the wbgetentities API module with the titles parameter.

On cswiki, articles translated from other language wikis are marked with the Translated template (with the source wiki, article and revision parameters filled in). The template could detect (using Module:Wikidata) whether the article is connected with that source article and categorize it if not. But if that source article is connected with a different article, it is a false positive (it could be categorized too, but in a different maintenance category). I cannot single out these false positives, because I cannot find out (using a Lua function in Module:Wikidata) the Q-id for a given wiki:page.

On enwiki, the Module:Wikidata talk page has another request from @czar: QID lookup from enwp article title.

Event Timeline

Lydia_Pintscher raised the priority of this task from Medium to High. Aug 22 2015, 11:01 AM
Lydia_Pintscher added a subscriber: Bene.
Jonas renamed this task from Add mw.wikibase.getEntityObject by site link (title) Lua function to [Task] Add mw.wikibase.getEntityObject by site link (title) Lua function. Sep 10 2015, 4:19 PM
Jonas updated the task description.
Jonas set Security to None.

I suppose it can't be done easily because the usage tracker would have to be site&title-based and not id-based, right?

> I suppose it can't be done easily because the usage tracker would have to be site&title-based and not id-based, right?

For hits (where we can find an Item), tracking that as a sitelink usage of the Item would work (as sitelinks are unique). If you want to handle usage tracking beyond this, I'm afraid you'll need a more sophisticated approach.

Link to the enwiki discussion on the subject: https://en.wikipedia.org/wiki/Wikipedia_talk:Wikidata/Archive_4#QID_lookup_from_enwp_article_title

I came here looking for the same answer. I know the wiki and the page name and want to look up the Q-ID.

@Candalua, please help me understand how this task proposal is in the scope of the developer wishlist (https://www.mediawiki.org/wiki/Developer_Wishlist#Scope). Thank you!

The scope of the survey includes the MediaWiki platform (core software, APIs, developer environment, enablers for extensions, gadgets, templates, bots, dumps), the Wikimedia server infrastructure, the contribution process, and documentation.

The proposed Lua function would make life easier for template and module developers, fostering a deeper integration between Wikidata and its clients.

This proposal is selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.

Dvorapa updated the task description.
Tgr renamed this task from [Task] Add mw.wikibase.getEntityObject by site link (title) Lua function to [Task] Add Lua function to get Wikibase entity by site link (title). Feb 5 2017, 8:53 AM
Tgr updated the task description.

I just wondered if there are any updates on this. In case it is still stuck, we may want to make the blockers transparent.

Change 355230 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] Adds mw.wikibase.getItemIdForLink to Scribunto

https://gerrit.wikimedia.org/r/355230

Change 357126 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] Adds mw.wikibase.getEntityIdForTitle to Scribunto

https://gerrit.wikimedia.org/r/357126

To summarize the current state of the work on this task:

The plan is to have at least one function mw.wikibase.getEntityIdForTitle( pageTitle ) -> string|null that returns the entity id of the item connected to the page with title pageTitle on the current wiki. It is implemented in https://gerrit.wikimedia.org/r/357126
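
For readers who want to see the planned call in context, here is a minimal sketch of a helper around it. The helper name qidForTitle, the injected wikibase table and the stand-in data are assumptions made for illustration; inside a real Scribunto module you would pass mw.wikibase instead of the stand-in.

```lua
-- Sketch only: thin wrapper around the planned lookup function.
-- `wikibase` is injected so the logic can be tried outside MediaWiki;
-- in a Scribunto module this would be mw.wikibase.
local function qidForTitle(wikibase, pageTitle)
    if type(pageTitle) ~= 'string' or pageTitle == '' then
        return nil
    end
    -- Returns an entity id string, or nil when no item is connected.
    return wikibase.getEntityIdForTitle(pageTitle)
end

-- Hypothetical stand-in for mw.wikibase with illustrative data.
local fakeWikibase = {
    getEntityIdForTitle = function(title)
        local known = { ['Douglas Adams'] = 'Q42' }
        return known[title]
    end,
}

print(qidForTitle(fakeWikibase, 'Douglas Adams'))
print(qidForTitle(fakeWikibase, 'No such page'))
```

Guarding against empty input keeps template calls with a missing parameter from reaching the lookup at all.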

Some questions remain:

  • Should mw.wikibase.getEntityIdForTitle take a string, a mw.title object, or either as its parameter?
  • Should we have a new mw.wikibase.getEntityIdForGlobalTitle function that allows specifying a global site id like "enwiki"?
  • Instead of having a mw.wikibase.getEntityIdForGlobalTitle, should we resolve interwiki prefixes like "s:fr:" to site ids, so that entity ids can be retrieved from interwiki links using mw.wikibase.getEntityIdForTitle (e.g. mw.wikibase.getEntityIdForTitle('s:fr:Auteur:Jean-François_Champollion') would return Q260 even if called from English Wikipedia)? This approach is probably more convenient for Lua module writers because interwiki links are usually written this way in wikitext.

In T74815#3314217, @Tpt wrote:

Some questions remain:

The mw.title object is a strong weapon, which solves all remaining questions: If mw.wikibase.getEntityIdForTitle(titleString) is a shorthand for mw.wikibase.getEntityIdForTitle(mw.title.new(titleString)), we will get a simple and consistent API and we can refer the user to mw.title for all peculiarities. No additional ("Global") function will be needed.

Let us also have shorthands:

  • getEntityId() → getEntityIdForCurrentPage()
  • getEntityId(title) → getEntityIdForTitle(title)

Finally, it should be considered that a given title may refer to a redirect, which itself may or may not be connected to a WD entity. In the latter case, the user may want to follow the redirect (or the redirect chain, if multiple redirects are allowed) to the (first) page that is connected. This behavior could be controlled by a table of options supplied to the function as a second parameter. Alternatively, the user can resolve this using mw.title:redirectTarget.
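
To make the redirect idea concrete, here is a rough sketch of how a module author might follow redirects by hand. The function name qidFollowingRedirects, the maxHops limit and the stand-in data are assumptions; in Scribunto, lookup would be mw.wikibase.getEntityIdForTitle and newTitle would be mw.title.new, whose title objects expose redirectTarget (false when the page is not a redirect) and prefixedText.

```lua
-- Sketch only: try the title itself, then walk the redirect chain until a
-- connected page is found or the hop limit is reached.
local function qidFollowingRedirects(lookup, newTitle, pageTitle, maxHops)
    maxHops = maxHops or 3 -- assumed default, not from the task
    local name = pageTitle
    for _ = 0, maxHops do
        local id = lookup(name)
        if id then
            return id
        end
        local title = newTitle(name)
        local target = title and title.redirectTarget
        if not target then
            return nil -- not a redirect (or invalid title): give up
        end
        name = target.prefixedText
    end
    return nil
end

-- Hypothetical stand-ins with made-up pages and a made-up item id.
local redirects = { ['Old name'] = 'New name' }
local items = { ['New name'] = 'Q1' }
local function fakeLookup(title)
    return items[title]
end
local function fakeNewTitle(title)
    local target = redirects[title]
    return {
        prefixedText = title,
        redirectTarget = target and { prefixedText = target } or false,
    }
end

print(qidFollowingRedirects(fakeLookup, fakeNewTitle, 'Old name'))
```

The hop limit is there so that a redirect loop cannot keep the function running.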

In T74815#3314217, @Tpt wrote:

The plan is to have at least one function mw.wikibase.getEntityIdForTitle( pageTitle ) -> string|null that returns the entity id of the item connected to the page with title pageTitle on the current wiki.

No, the syntax should not be tied to the "current wiki", as this makes it of little use for Commons, Wikisource, etc. There should be an alternative syntax mw.wikibase.getEntityIdForTitle( pageTitle, globalSiteId ) -> string|null, with globalSiteId the same as in the mw.wikibase.entity:getSitelink function.

> No, the syntax should not be tied to the "current wiki", as this makes it of little use for Commons, Wikisource, etc. There should be an alternative syntax mw.wikibase.getEntityIdForTitle( pageTitle, globalSiteId ) -> string|null, with globalSiteId the same as in the mw.wikibase.entity:getSitelink function.

That is why I mentioned possibly adding another function named something like mw.wikibase.getEntityIdForGlobalTitle( globalSiteId, pageTitle ) -> string|null. It could also be implemented by an optional second parameter to the mw.wikibase.getEntityIdForTitle function, or by adding some code that would understand that "w:fr:Foo" refers to the page with title "Foo" on frwiki. So starting with a simple mw.wikibase.getEntityIdForTitle that deals with the "current wiki" only does not rule out any improvements in the future.

Someone please review @Tpt's patch mentioned in T74815#3314217. The most important functionality is already there and we have been waiting for it for years. That part of the API is not going to be affected by the discussion of the final API.

@matej_suchanek pointed out during the review that mw.wikibase.getEntityIdForTitle(pageTitle) should add an entry to pagelinks, i.e. (if I am correct) record the target page as linked from the current page, and I think that he is right. However, I believe it is OK to proceed without that now and leave it for the "improvements in the future", as @Tpt said. In the majority of cases, the target page will be linked by other means anyway. For example, a flag template, which uses property P41 (flag image), will create a clickable link to the article about the country.
Note that this has nothing to do with usage tracking (T142093).

For the sake of naming consistency, maybe the new function's name should be mw.wikibase.getEntityIdForSitelink because we already have functions mw.wikibase.sitelink and mw.wikibase.entity:getSitelink. The latter has an optional parameter globalSiteId with values like "enwiki".

Change 357126 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Adds mw.wikibase.getEntityIdForTitle to Scribunto

https://gerrit.wikimedia.org/r/357126

mw.wikibase.getEntityIdForTitle( pageTitle ) seems to work as expected at testwiki:Template talk:Country ID.

Although I advocated for support of mw.title, I now suggest keeping things simple and, as the next step, adding mw.wikibase.getEntityIdForTitle( pageTitle, globalSiteId ) (both parameters are strings), as proposed by @Jarekt in T74815#3314832.
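
As an illustration of what the two-parameter form would enable, here is a sketch of the cswiki check from the task description. The function name classifyTranslation, the three result labels and the stand-in data are assumptions; the sketch also assumes mw.wikibase.getEntityIdForCurrentPage and mw.wikibase.getSitelink( itemId ) are available on the client, with the stand-in table playing the role of mw.wikibase.

```lua
-- Sketch only: decide which maintenance bucket a translated article falls in.
local function classifyTranslation(wikibase, sourceSiteId, sourceTitle)
    local localId = wikibase.getEntityIdForCurrentPage()
    local sourceId = wikibase.getEntityIdForTitle(sourceTitle, sourceSiteId)
    if sourceId and sourceId == localId then
        return 'connected'
    end
    -- The source article's item already has a sitelink on this wiki, so it
    -- is connected to a different local page (the false-positive case).
    if sourceId and wikibase.getSitelink(sourceId) then
        return 'source-linked-elsewhere'
    end
    return 'not-connected'
end

-- Hypothetical stand-in for mw.wikibase; item ids and titles are made up.
local function fakeWikibase(currentId)
    local bySite = { enwiki = { Alpha = 'Q100', Beta = 'Q200' } }
    local localSitelinks = { Q100 = 'Alfa', Q200 = 'Jiná stránka' }
    return {
        getEntityIdForCurrentPage = function()
            return currentId
        end,
        getEntityIdForTitle = function(title, siteId)
            return bySite[siteId] and bySite[siteId][title]
        end,
        getSitelink = function(itemId)
            return localSitelinks[itemId]
        end,
    }
end

print(classifyTranslation(fakeWikibase('Q100'), 'enwiki', 'Alpha'))
print(classifyTranslation(fakeWikibase('Q100'), 'enwiki', 'Beta'))
```

A template could map each label to its own maintenance category, which is the separation the task description asks for.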

It seems that some usage is even tracked, although I am not sure whether it is 100% correct. Using the new function leads to the following text appearing on the Edit source page:

Wikidata entities used in this page

To summarize, mw.wikibase.getEntityIdForTitle currently allows retrieving the id of the item connected to a page of the current wiki from its title as a string.

The improvement paths I see are:

  1. Allow the parameter to be a mw.title.Title object.
  2. Allow retrieving the item id for a page on another wiki. I see two possible implementations:
    • Resolve page title prefixes like w:en:. This may be very convenient because it is the way interwiki links are currently stored in wikitext.
    • Add a second parameter (or another function, mw.wikibase.getEntityIdForGlobalTitle) that takes the site id (like enwiki).
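
The two implementation options can be related with a small sketch that turns a prefixed title into a site id plus title and then uses the two-parameter call. The prefix table, the function names splitInterwiki and qidForLink, and the stand-in lookup are assumptions for illustration; real interwiki resolution depends on each wiki's interwiki map and is more involved than this.

```lua
-- Sketch only: hypothetical mapping from project prefix to site id suffix.
local PROJECT_SUFFIX = { w = 'wiki', s = 'wikisource' }

-- Splits "w:fr:Foo" into "frwiki", "Foo". Returns nil plus the unchanged
-- link when no known prefix is present.
local function splitInterwiki(link)
    local project, lang, title = link:match('^(%a+):([%a%-]+):(.+)$')
    local suffix = project and PROJECT_SUFFIX[project]
    if not suffix then
        return nil, link
    end
    -- Global site ids use underscores where language codes use hyphens.
    return lang:gsub('%-', '_') .. suffix, title
end

-- `lookup` stands for mw.wikibase.getEntityIdForTitle inside Scribunto.
local function qidForLink(lookup, link)
    local siteId, title = splitInterwiki(link)
    if siteId then
        return lookup(title, siteId)
    end
    return lookup(title)
end

-- Hypothetical stand-in lookup mirroring the example from the discussion.
local function fakeLookup(title, siteId)
    if siteId == 'frwikisource' and title == 'Auteur:Jean-François_Champollion' then
        return 'Q260'
    end
    return nil
end

print(qidForLink(fakeLookup, 's:fr:Auteur:Jean-François_Champollion'))
```

Keeping the prefix parsing in module code like this would leave the core function with plain string parameters, while still letting module authors pass links as they appear in wikitext.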

I have high hopes that T99899 will be implemented in the future as well. It would be nice if the solution we pick for this task helped, or at least did not hurt, T99899.

Change 355230 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Allow specifying a non-local globalSiteId in getEntityIdForTitle

https://gerrit.wikimedia.org/r/355230

matej_suchanek assigned this task to Tpt.
matej_suchanek removed a project: Patch-For-Review.
matej_suchanek updated the task description.
matej_suchanek removed a subscriber: matej_suchanek.