
[Epic] Wikidata 3rd party client (Instant Wikidata)
Open, Low, Public

Assigned To
None
Authored By
Qgil
Mar 26 2013, 12:46 AM

Description

Currently Wikidata is only set up to directly serve data to Wikimedia projects.
The goal of this project is to also allow 3rd party clients to consume Wikidata data in the same way.

Possibly this could work similarly to InstantCommons (https://www.mediawiki.org/wiki/InstantCommons), which is a way to use Wikimedia Commons content on third-party wikis.

We could have implementations of EntityLookup that access entities via the API, which would work for this use case. Unless we have some cache invalidation mechanism, Wikidata content on third-party wikis would only be updated when pages are purged or edited (which might be sufficient for an initial implementation of 3rd party access).
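
For illustration, here is a minimal sketch of such an API-backed EntityLookup, assuming the EntityLookup interface from Wikibase DataModel Services and the wbgetentities API module; the class name and the injected HTTP/deserializer callables are hypothetical, not an existing implementation:

```php
<?php
// Hypothetical sketch: an EntityLookup that fetches entities from the
// wikidata.org API instead of a local database. Error handling omitted.

use Wikibase\DataModel\Entity\EntityId;
use Wikibase\DataModel\Services\Lookup\EntityLookup;

class ApiBasedEntityLookup implements EntityLookup {

	/** @var callable Assumed HTTP helper: string $url => string $responseBody */
	private $httpGet;

	/** @var callable Assumed deserializer: array $entityJson => entity object */
	private $deserializeEntity;

	public function __construct( callable $httpGet, callable $deserializeEntity ) {
		$this->httpGet = $httpGet;
		$this->deserializeEntity = $deserializeEntity;
	}

	public function getEntity( EntityId $entityId ) {
		$id = $entityId->getSerialization();
		$url = 'https://www.wikidata.org/w/api.php'
			. '?action=wbgetentities&format=json&ids=' . urlencode( $id );
		$response = json_decode( ( $this->httpGet )( $url ), true );
		$entityJson = $response['entities'][$id] ?? null;
		if ( $entityJson === null || isset( $entityJson['missing'] ) ) {
			return null;
		}
		return ( $this->deserializeEntity )( $entityJson );
	}

	public function hasEntity( EntityId $entityId ) {
		return $this->getEntity( $entityId ) !== null;
	}
}
```

With something like this wired into the client, purging a page would re-run the parse and hence re-fetch the data, matching the invalidation behaviour described above.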

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:18 AM
bzimport set Reference to bz46556.
bzimport added a subscriber: Unknown Object (MLST).

[replacing wikidata keyword by adding CC - see bug 56417]

vladjohn2013 wrote:

Hi, this project is still listed at https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#3rd_party_client

Should this project still be listed on that page? If not, please remove it. If it still makes sense, it could be moved to the "Featured projects" section if it has community support and mentors.

It definitely still makes sense but at the moment we can't take it on. I removed it.

Closing this as there is no use keeping it around.

Restricted Application added a subscriber: Aklapper.
Addshore renamed this task from Wikidata 3rd party client to [Epic] Wikidata 3rd party client. Aug 24 2015, 9:37 AM
Addshore lowered the priority of this task from Medium to Low.
Addshore set Security to None.
Addshore updated the task description.
Addshore added a subscriber: Tarrow.

So, as we have slowly been creating more and more libraries, such as https://github.com/wmde/WikibaseDataModelServices, tackling something like this starts to seem easier.

I created a public draft where I simply shove some of my lookup implementations using the Wikidata API into the WikibaseClient:
https://gerrit.wikimedia.org/r/#/c/233224/

As this just works, I can now include data from Wikidata on my local test wiki. Of course, more things need to be considered:

  1. The patch above uses my wikibase-api library, which can be found at http://github.com/addwiki/wikibase-api. This in turn requires the Guzzle HTTP library (https://packagist.org/packages/guzzle/guzzle), which would likely never be approved for use on the Wikimedia cluster, simply because it is a pile of code that we don't own / maintain ourselves.
    1. The easiest but suckiest way of doing this would be to just fork the Wikibase repo and add the code this needs, but that is ugly.
    2. Another option would be to factor everything this needs out of WikibaseClient and into a WikibaseClientLib or something similar, which could be used in both implementations of the Client (API and dispatch)!
    3. Another option would be to provide a way for an extension to override the lookups created in the WikibaseClient class; an extension could then wrap WikibaseClient and override those lookups. This would mean that all of the lovely client code can stay where it is now (although we should seriously push forward with trying to split LIB, REPO and CLIENT! T75863).
  2. Having not worked on Client much I can't say for sure, but I presume the data is currently retrieved when the page is parsed and remains the same until the page is reparsed. As an initial version this is probably fine, and clients can purge their pages to get fresh data, but other ways forward should be considered:
    1. Have a maintenance script that reloads all, or the oldest, previously loaded information.
    2. On page load, if the information is X days old, reload it from Wikidata? (See the sketch after this list.)
  3. As this client heavily depends on the Wikibase API being reasonably stable, we should seriously consider pushing forward with versioning the API (T92961); otherwise changes to the API would instantly break all of these API-based clients. We could also look at using RESTBase for this: https://www.mediawiki.org/wiki/RESTBase
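
To make option 2.2 above concrete, here is a rough sketch of a lazy "reload if older than X days" wrapper; both collaborators here are hypothetical placeholders, not existing Wikibase or wikibase-api classes:

```php
<?php
// Hypothetical sketch of point 2.2: keep a fetch timestamp next to each
// locally stored entity and re-fetch lazily once it is older than X days.

class StaleRefreshingLookup {
	private const MAX_AGE = 7 * 24 * 60 * 60; // "X days", here 7

	/** @var object Assumed API-backed lookup with getEntity( string $id ) */
	private $apiLookup;

	/** @var object Assumed local store of [ 'fetchedAt' => int, 'entity' => mixed ] */
	private $localStore;

	public function __construct( $apiLookup, $localStore ) {
		$this->apiLookup = $apiLookup;
		$this->localStore = $localStore;
	}

	public function getEntity( string $id ) {
		$record = $this->localStore->get( $id );
		if ( $record !== null && time() - $record['fetchedAt'] < self::MAX_AGE ) {
			return $record['entity']; // fresh enough, no API round trip
		}
		$entity = $this->apiLookup->getEntity( $id );
		$this->localStore->set( $id, [ 'fetchedAt' => time(), 'entity' => $entity ] );
		return $entity;
	}
}
```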

Other things we need to consider include:

  • Currently this would not work alongside an actual WikibaseClient installation; right now it would have to be one or the other. We would need to think about renaming the property parser function, and the Lua stuff, in this extension?
  • If something like this gets used a lot, it will of course result in more requests to our API. A good way of detecting this would be to have the client pass a header that we can easily identify, so we can track the usage.
  • We should probably create a PrefetchingEntityLookup in the wikibase-api library so that we can prefetch entities rather than doing one API call for each entity used (see the sketch after this list).
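
As a sketch of that last bullet: wbgetentities accepts multiple ids joined with "|" (up to 50 per request for normal users), so a prefetching lookup can turn N per-entity requests into one batch. The batch-fetch callable and buffer shape here are assumptions, not the actual wikibase-api interface:

```php
<?php
// Hypothetical sketch of a PrefetchingEntityLookup: announce all ids a page
// will need up front, fetch them in chunks of 50, and serve later
// getEntity() calls from the in-memory buffer.

class PrefetchingEntityLookup {

	/** @var callable Assumed batch fetcher: string[] $ids => array keyed by id */
	private $batchFetch;

	/** @var array<string,mixed> */
	private $buffer = [];

	public function __construct( callable $batchFetch ) {
		$this->batchFetch = $batchFetch;
	}

	public function prefetch( array $ids ) {
		$missing = array_diff( $ids, array_keys( $this->buffer ) );
		foreach ( array_chunk( array_values( $missing ), 50 ) as $chunk ) {
			// One request, e.g. action=wbgetentities&ids=Q1|Q2|...&format=json
			$this->buffer += ( $this->batchFetch )( $chunk );
		}
	}

	public function getEntity( string $id ) {
		if ( !array_key_exists( $id, $this->buffer ) ) {
			$this->prefetch( [ $id ] ); // fall back to a single-entity fetch
		}
		return $this->buffer[$id] ?? null;
	}
}
```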

We could also try to implement all of the API stuff within the client itself and continue having a single extension. The property parser function etc. could then be expanded to accept more arguments / handle larger arguments. Currently we can use things like {{#property:P122|from=Q12}}, but we could also allow {{#property:http://www.wikidata.org/entity/P122|from=http://www.wikidata.org/entity/Q122}}.
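
Accepting both forms could come down to a small normalisation step like the following; the function name is hypothetical and the pattern is limited to wikidata.org concept URIs for the sake of the sketch:

```php
<?php
// Hypothetical helper: normalise a parser function argument that is either
// a bare entity id ("Q122") or a full concept URI to a bare id.

function normalizeEntityIdArgument( string $arg ): ?string {
	// Full concept URI, e.g. http://www.wikidata.org/entity/Q122
	if ( preg_match( '!^https?://www\.wikidata\.org/entity/([PQ]\d+)$!', $arg, $m ) ) {
		return $m[1];
	}
	// Bare id, e.g. P122 or Q122
	if ( preg_match( '!^[PQ]\d+$!', $arg ) ) {
		return $arg;
	}
	return null; // unrecognised argument
}
```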

All thoughts and comments welcome of course, sorry for this slightly rambly comment!

We are training thousands of students each month on an old Wikipedia mirror at the University of Jordan because we can't link a new Wikipedia dump to Wikidata. This is very frustrating. We have to use a pre-Wikidata copy so we can render the pages, and it is all obsolete technology. Students still think we have to use bots to connect to other languages.
Working on live Wikipedia is frustrating as well, as the community destroys most of the students' work, forcing them to stop trying altogether and giving the teachers an extremely hard time grading the work. A non-working mirror means switching back to other options (away from Wikipedia) to conduct the course.

Lydia_Pintscher renamed this task from [Epic] Wikidata 3rd party client to [Epic] Wikidata 3rd party client (Instant Wikidata). Dec 5 2017, 5:50 PM
Lydia_Pintscher added a subscriber: ChristianKl.

I'm currently working on an overview document with the team here to make sure we're talking about the same things and can start tackling them: https://docs.google.com/document/d/1YYIuQzcWz2cH9zTUbbfiBrTecVxMWAu3_9-U4DcKzEU/edit#

This might be useful for adding references on third-party wikis, in terms of using bibliographic metadata from Wikidata in references.

Change 233224 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] WIP DRAFT WOO - Instant Wikidata client?

https://gerrit.wikimedia.org/r/233224

Change 233224 abandoned by Addshore:
WIP DRAFT WOO - Instant Wikidata client?

https://gerrit.wikimedia.org/r/233224

Linked an old patch to phab, nothing to get excited about!

@Samwilson I just stumbled upon https://www.mediawiki.org/wiki/Extension:UnlinkedWikibase and was wondering how much of what we want here is covered by it.

@Lydia_Pintscher I'm not sure — I'm still experimenting. But so far it's pretty fun, and it's working well for me: it provides a Lua function called mw.ext.UnlinkedWikibase.getEntity( id ) which does what it sounds like. The main things I'm still concerned about are caching and performance. I'm wondering if it'd be better to move the API calls to the job queue; although that would mean the parse isn't final, it would at least be faster. Then again, it might not matter too much: it might be easier to just cache for 24 hours or something. I'm very much open to suggestions!

You can have a play with it at https://freo.wiki if you like.
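
For what it's worth, the "just cache for 24 hours" idea mentioned above can be expressed in a few lines with MediaWiki's WANObjectCache; whether UnlinkedWikibase actually works this way is an assumption, and fetchEntityJson() is a hypothetical helper:

```php
<?php
// Sketch only: cache the entity JSON for a day so repeated parses of pages
// using the same entity do not each hit the remote API.

use MediaWiki\MediaWikiServices;

function getCachedEntityJson( string $id ): ?array {
	$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();
	return $cache->getWithSetCallback(
		$cache->makeKey( 'instant-wikidata', 'entity', $id ),
		$cache::TTL_DAY, // re-fetch from the remote API at most once per 24 hours
		static function () use ( $id ) {
			return fetchEntityJson( $id ); // hypothetical remote-API helper
		}
	);
}
```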

We would love to have this for MDWiki.org.

Definitely useful to communities I'm close to! Came across this recently in a wikicite / shared source metadata conversation.

Instant WD, Instant Commons, and Q/P reuse in federation will together give people strong reasons to choose MediaWiki when deciding on a platform for knowledge collaboration.

Michael subscribed.

(Removing MediaWiki-extensions-WikibaseClient, because this task is about an external client, not about what exists in the code under the name "client". Not sure if there is any other project that would be useful to add here.)

@Michael forgive my confusion, isn't part of this being able to use one Repository with multiple Clients, or to enable WikibaseClient to draw from both a local Repository and the global Wikidata repository?

In T48556#9613585, @Sj wrote:

@Michael forgive my confusion, isn't part of this being able to use one Repository with multiple Clients, or to enable WikibaseClient to draw from both a local Repository and the global Wikidata repository?

I think the root issue is that "WikibaseClient" is somewhat misleadingly named. MediaWiki-extensions-WikibaseClient is about the specific section of the Wikibase code (sub-extension) that uses direct database access to get the repository data.

This project, as I understand it, is about creating an alternative that instead uses network APIs over the public internet. This goes more in the direction of Wikibase - Federated Properties or, as mentioned above, MediaWiki-extensions-UnlinkedWikibase.

I'm also adding Wikibase (3rd party installations) and, for visibility, Wikibase Suite Team.