Represent editions as interwiki links on Wikisource
Open, Needs Triage, Public

Description

There are works on Wikisource that have several editions/translations in the same language, for instance "The Raven":
https://www.wikidata.org/wiki/Q22726
The editions are linked with the properties "edition" (P747) and "edition or translation of" (P629). Each of the items contains a sitelink to a different Wikisource; for that reason the links don't show up on Wikisource.
It is necessary to collect all the sitelinks from the items in this star-like structure and show them in the sidebar of each of the editions. The "edit" link would then behave differently, linking to the item representing the work.
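The collection step described above can be sketched roughly as follows. This is a minimal illustration over an in-memory dictionary, not the Wikibase API: the item IDs (`Q_work`, `Q_ed_en`, `Q_ed_fr`), the page titles, and the function name `collect_star_sitelinks` are all made up for the example. Only the property numbers P629 and P747 come from the task description.

```python
from typing import Dict, List

# Hypothetical stand-in for Wikidata items. Each item has sitelinks plus
# optional P629 ("edition or translation of") and P747 ("edition") values.
ITEMS: Dict[str, dict] = {
    "Q_work": {"sitelinks": {}, "P747": ["Q_ed_en", "Q_ed_fr"]},
    "Q_ed_en": {"sitelinks": {"enwikisource": "The Raven (1845)"}, "P629": ["Q_work"]},
    "Q_ed_fr": {"sitelinks": {"frwikisource": "Le Corbeau"}, "P629": ["Q_work"]},
}


def collect_star_sitelinks(items: Dict[str, dict], edition_id: str) -> Dict[str, List[str]]:
    """Gather sitelinks of the work and of its other editions, grouped by site."""
    found: Dict[str, List[str]] = {}
    for work_id in items[edition_id].get("P629", []):
        work = items[work_id]
        # The centre of the star (the work) plus every other point (the editions).
        related = [work_id] + [e for e in work.get("P747", []) if e != edition_id]
        for related_id in related:
            for site, title in items[related_id].get("sitelinks", {}).items():
                found.setdefault(site, []).append(title)
    return found


links = collect_star_sitelinks(ITEMS, "Q_ed_en")
```

Note that a site can end up with several titles here, which is exactly the display question discussed later in this thread.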

Event Timeline

A similar problem exists on Incubator. Several languages may have articles on identical topics and should have links to each other.

Indeed, not the same. Only similar. I wonder if one technique could remedy both; "possibly" is a gut feeling at best. :-)

Thanks for pointing me to the task.

Just to give the issue another twist, here is something in between the original issue (different editions) and that mentioned by Purodha:

For Tóbiás Coberus https://www.wikidata.org/wiki/Q868104 there is one "person" article in the Hungarian Wikipedia, and two transcribed articles from biographical encyclopedias in the German Wikisource: https://www.wikidata.org/wiki/Q21235363 (ADB article) and https://www.wikidata.org/wiki/Q23020668 (BLKÖ article). The items are related in Wikidata by property P921 and, inversely, by P1343 (indirectly, with a qualifier!).

The task would be to provide the Wikisource view of one of these biography articles with links to (at least) the "corresponding" Wikipedia articles, since the articles themselves are just biography entries in nature, so quite comparable.

Definitely not the "described by source" and "main subject" relationship. Separately published works about the same subject are associated, but there is no direct relationship between them.

This ticket is about the direct relationship between a work and its editions or its translations (interwiki), and then hopefully leading to the second generation relationship between an edition in one language, and a translation in another (as an interlanguage link).

What I was aiming at: There seems to be a general problem underlying this concrete issue: Wikidata items connected to some WS page typically (?) have exactly one sitelink (namely to the WS page in question) but still there are ~closely related~ items "in other languages" (*and* also the same language) which could be presented. So a rather general mechanism could be thought of involving traversing up and back down a ~suitable~ WD property and its inverse - but what property that might be should be under control of the Wikisource article - e.g. "edition or translation of" in the case of poems and "main subject" in the case of encyclopedic articles. I'm quite sure we'll find even more examples on closer investigation.

@Gymel What you are describing is related to a topic/subject, not so much a work. Each article would be articulated on the Q-item as "described by source" and links can be provided to each currently. For example for an author, you could call each "described by ..." by use of $1 to $9, etc. in a template to call back each value listed in the link.

As I see your commentary it relates to a work about a subject related to another work about the same subject. I don't see that if I have an article on a person in the (English) Dictionary of National Biography that the DNB article has a direct relationship/interest in a Russian encyclopaedia about the same person, and vice versa.

If that is not what you mean, then I am not getting the gist of your argument.

@Billinghurst: I'm trying to take the point of view of WS and *not* that of WD. When WD has a /clean/ modeling of the situation (separating items for persons from those for articles about that person), there will be not only translations but a much bigger class of WS pages where the corresponding WD item will only ever have exactly one sitelink, yet the WS pages would like to link to ~corresponding~ (related, comparable) pages in other languages.

From a WS user's perspective, an ADB article about a person on de:WS and a hu:WP article on the same person (and the BLKÖ article, which happens to be in the same language) are just encyclopedic articles about the same person that they might wish to consult. I don't see a fundamental difference from the translated poem situation: it's all about being able to /compare/ the current resource with some other.

srishakatux subscribed.

this is already part of the community wishlist survey!

Tpt subscribed.

I should close the other task, sorry

Change 553866 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] [WIP] Adds an hook for adding extra language links

https://gerrit.wikimedia.org/r/553866

I am not sure whether this is a good idea (it seems like there may be a few proposals), and I am against implementing this in Wikidata beyond how it already implements things (multiple linked records with one sitelink per wiki per item). That said, I see no issue with a client wiki such as a Wikisource implementing such a thing locally, e.g., by traversing linked Wikidata items via Scribunto Lua modules to find all the sitelinks of all edition items of the specific work item they are linked to, or by whatever similar arrangement it wants.

I recommend this Task be interpreted as a request for the development of a local Wikidata-based Lua Module to accomplish its goals.

Whether it is a good idea to collect and spam all such links into the toolbar à la interwiki links (or in some other arrangement), I leave as a problem for the individual wikis' user base and local solution designers/developers to decide and resolve.

On the contrary, the interwiki solution baked into MediaWiki makes assumptions about Wikidata's interwiki links that make sense on some wikis (like Wikipedia) and not on other wikis (like Wikisource). The interwiki system needs to be flexible enough to accommodate different data models used on Wikidata for its various interwiki links.

This is especially true if the WMF intends to introduce other wikis in the future, where we cannot predict how they will integrate with Wikidata at this point.

If Wikidata aspires to be anything more than a very expensive and overblown index for Wikipedia, this is a core functionality that it needs to be able to support.

@beleg_tal: I agree with your statements, especially "interwiki system needs to be flexible enough to accommodate different data models", however, I do not think this is an inherently Wikidata issue.

For example: Wiktionary has a very different model where it wants to cover every Lexeme in every language on every one of the language-based wikis in the project. That makes sense given what they are trying to accomplish, and as such they neither need nor want to use Wikidata sitelinks for their interwiki links: for any specific Lexeme they want to link to every one of their wikis, regardless of whether the pages constitute the same logical "thing" from a Wikidata perspective. This is not something that Wikidata or its sitelinks should attempt to solve.

The point is that Wikidata sitelinks are not a single tool for all WMF interwiki link needs, and as such we should not try to solve all such issues there. Wikidata sitelinks are great for linking to WMF wiki pages that represent a single "thing" from Wikidata's point of view. If that does not work for some WMF projects, those issues should be solved by the respective projects and not necessarily at Wikidata (though Wikidata can of course entertain changes that benefit all or many of the projects, but they are not a solution for everything).

@Uzume Okay, I see what you mean, and I think we are in agreement. The Wiktionary example in particular is very helpful.

The interwiki system is not really a Wikidata thing at all but rather a MediaWiki thing that is implemented everywhere by default but doesn't work everywhere. It doesn't work at Wiktionary, so Extension:Cognate was written to replace the default MediaWiki system on that wiki. Similarly, some sort of extension (or other solution) will need to be developed for Wikisource, and for any other wikis where the assumptions made by MediaWiki do not apply to the Wikidata modelling of pages on those wikis.

So the problem as formulated remains the same, but the solution is not to be provided by changes to Wikidata's behaviour, but rather the solution is to be provided by changes or extensions to the MediaWiki system that pulls the data from Wikidata and displays interwiki links on the relevant wikis.

It is important to note, however, that this does not only affect Wikisource, but all wikis.

Consider a Wikipedia editor who sees a Wikipedia article about a work, for which a Wikisource edition exists. This editor will be tempted to incorrectly set sitelinks from the WD item to both the WP article and the WS edition, because this will result in a correct interwiki link on the WP article. However, this would result in an incorrect model of the WS data in Wikidata.

The changes to the interwiki system that allow Wikisource pages to link to each other, should also allow Wikipedia articles to link to Wikisource pages in the same manner.

WP article <=> WD item for work <=> WD item for edition <=> WS edition

Thus a solution will need to keep in mind the whole ecosystem of wikis, and not just be localized to the one wiki that it affects most.

The issue becomes how to represent multiple edition links in MediaWiki toolbars across multiple WMF wikis and projects. Currently, as implemented via WD sitelinks, we only allow one link per wiki per project per WD item. This is in part owing to the limited space in the MediaWiki toolbars where such links are displayed. Even across wikis within a single project, when only a single link is allowed per wiki, there can sometimes be a *very* large number of links (there are many languages in Wikipedia alone, and there are already mechanisms that limit the number of sitelinks displayed in the toolbar by default).

Wikidata already has a method of linking works and their editions, as well as providing sitelinks to WMF wiki pages for each WD item. I am not sure it is a good idea to stuff all the links of all the editions of a work into the MW toolbar of a wiki, even if a good method of organizing the display of such links could be found. That said, I can see the value in using the Wikidata work and edition links to provide an overview of all the sitelinks across all editions of a work in some fashion, perhaps something like an autogenerated disambiguation page at the work level on Wikisource wikis, or a navigation template that can exist somewhere on the Wikisource edition pages. Such navigation could be stuffed into the toolbar if another "Wikisource editions" section was added, but I would maintain that I would not want that extraneous noise on wikis other than Wikisource wikis. At some point, too many navigation links presented all at once with limited organization become a detriment to navigation itself.

It should also be considered that it is currently expensive to access multiple Wikidata entities from within the render of a single MW page, and that there is definitely an upper limit to this sort of traversal to collect sitelinks. It would probably be a poor idea to push this across all the wikis of all the projects. Remember that this is not just an issue for Wikisource editions: Wikidata can and does have edition items for editions of works that do not, and probably never will, have pages at any WMF wiki, including Wikisource wikis (of course there won't be any sitelinks then either). The expensive overhead is incurred just to access the entities, whether or not the items have any sitelinks (e.g., to Wikisource). Also, the way Wikidata links works and editions is inverted relative to the way you want to traverse them (editions typically point to works, not the other way around). So there is no way to know all the editions of a work just by looking at statements about the work. One has to run a search query using something like haswbstatement, or generic SPARQL. Neither of these can currently be performed during a page render.
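To make the reverse lookup above concrete, here is a small sketch that only builds the two kinds of query and performs no network access. The helper names `editions_search_url` and `editions_sparql` are hypothetical. The `haswbstatement:` search keyword and the `wdt:`/`wd:` prefixes are standard on Wikidata, and Q22726 is the example item from the task description.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def editions_search_url(work_id: str) -> str:
    """Build a search API URL for items stating 'edition or translation of' work_id."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:P629={work_id}",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def editions_sparql(work_id: str) -> str:
    """Build the equivalent SPARQL query for the Wikidata Query Service."""
    return f"SELECT ?edition WHERE {{ ?edition wdt:P629 wd:{work_id} . }}"


url = editions_search_url("Q22726")
query = editions_sparql("Q22726")
```

Both queries run against external services, which is the reason given above for why they cannot be used during a page render.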

Change 594353 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikisource@master] Adds work sitelinks to the Other languages sidebar

https://gerrit.wikimedia.org/r/594353

Change 595256 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] Move LangLinkHandler to the Hooks namespace

https://gerrit.wikimedia.org/r/595256

Change 595257 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] Introduce LangLinkHandlerFactory

https://gerrit.wikimedia.org/r/595257

Change 595258 had a related patch set uploaded (by Tpt; owner: Tpt):
[mediawiki/extensions/Wikibase@master] Introduce SiteLinksForDisplayLookup

https://gerrit.wikimedia.org/r/595258

Change 595256 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Move LangLinkHandler to the Hooks namespace

https://gerrit.wikimedia.org/r/595256

Change 595257 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Introduce LangLinkHandlerFactory

https://gerrit.wikimedia.org/r/595257

Addshore subscribed.

All of the Wikibase patches are now reviewed, +2ed and/or merged (or merging).
So moving this to Done on the campsite board for now.
If you need any review in Wikisource then give me a poke, but I think you have that covered!

Change 595258 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Introduce SiteLinksForDisplayLookup

https://gerrit.wikimedia.org/r/595258

Change 553866 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add a new hook to display extra language links

https://gerrit.wikimedia.org/r/553866

Change 594353 merged by jenkins-bot:
[mediawiki/extensions/Wikisource@master] Adds work sitelinks to the Other languages sidebar

https://gerrit.wikimedia.org/r/594353

Since yesterday, if a Wikisource page is connected to a Wikidata item that states that it is an edition of the work using P629, the sitelinks of the work item are displayed on the page in the "In other languages" sidebar.
The next step is to look for the other editions on Wikidata using P747.

Great job!

@Tpt, do you also plan to do something with the old-style Wikisource sitelinks from the work item (linking to a text, not to a disambig-style work page in Wikisource)?
They are often placed there when:

  1. this is the only version of this text in a particular language Wikisource,
  2. there is no information about a particular source edition for this text (that is required to create an edition item).

Or do "dumb" edition items need to be created for them, so that interwiki links to them from other Wikisources are handled properly?

This may be considered a backward-compatibility issue; they are handled by the userspace tool used in pl/es.
Example page:
https://hr.wikisource.org/wiki/Hamlet

Also, translations made by Wikisource (accepted in some Wikisources) may need to be handled this way. They are generally accepted if no edition in a particular language exists.

Since yesterday, if a Wikisource page is connected to a Wikidata item that states that it is an edition of the work using P629, the sitelinks of the work item are displayed on the page in the "In other languages" sidebar.
The next step is to look for the other editions on Wikidata using P747.

Does that mean we should resolve this task? Or is the "other editions" bit also within this task? :)

I believe the "other editions" bit is also in this task. I plan to close it when that is done. However, this missing bit just requires a change in the Wikisource extension. So, you could consider it "done" on the Wikibase boards if you want.

@Tpt, do you also plan to do something with the old-style Wikisource sitelinks from the work item (linking to a text, not to a disambig-style work page in Wikisource)?

I plan to support them too. It's what I meant by "The next step is to look for the other editions on Wikidata using P747."
I believe that the approach would be:
For each language:

  1. If there is a local inter-language link to this language in the wikitext of the page, use it and stop the lookup.
  2. If there is a sitelink to this language in the Wikidata item of the page, use it and stop the lookup.
  3. If there is a sitelink to this language in a Wikidata item connected to the Wikidata item of the page using edition of (P629), use it and stop the lookup.
  4. If there is a sitelink to this language in a Wikidata item connected to the Wikidata item of the page using edition (P747), use it and stop the lookup.
  5. If there is a sitelink to this language in a Wikidata item connected to the Wikidata work item using edition (P747), the work item being connected to the Wikidata item of the page using edition of (P629), use it.

This way, I think we cover all the use cases. 1, 2 and now 3 are already implemented.
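The five steps above can be sketched as a single per-language lookup. This is only an illustration over an in-memory dictionary, not the code of the Wikibase or Wikisource extensions: the item IDs, page titles and the name `resolve_link` are made up, and picking the first match when several items qualify is an assumption that the steps themselves leave open.

```python
from typing import Dict, Optional

# Hypothetical data: a French edition page whose work has an English
# work page and a Polish edition.
ITEMS: Dict[str, dict] = {
    "Q_work": {"sitelinks": {"enwikisource": "Hamlet (Shakespeare)"},
               "P747": ["Q_ed_fr", "Q_ed_pl"]},
    "Q_ed_fr": {"sitelinks": {"frwikisource": "Hamlet/Traduction Hugo, 1865"},
                "P629": ["Q_work"]},
    "Q_ed_pl": {"sitelinks": {"plwikisource": "Hamlet (Paszkowski)"},
                "P629": ["Q_work"]},
}


def resolve_link(lang: str, local_links: Dict[str, str],
                 items: Dict[str, dict], page_item_id: str) -> Optional[str]:
    """Return the page title to link for lang, trying steps 1 to 5 in order."""
    site = f"{lang}wikisource"
    if lang in local_links:                                   # step 1
        return local_links[lang]
    page_item = items[page_item_id]
    if site in page_item.get("sitelinks", {}):                # step 2
        return page_item["sitelinks"][site]
    works = page_item.get("P629", [])
    siblings = [e for w in works for e in items[w].get("P747", [])
                if e != page_item_id]
    # Steps 3, 4 and 5 in priority order; the first hit wins (assumption).
    for candidate in works + page_item.get("P747", []) + siblings:
        title = items[candidate].get("sitelinks", {}).get(site)
        if title is not None:
            return title
    return None


english = resolve_link("en", {}, ITEMS, "Q_ed_fr")
polish = resolve_link("pl", {}, ITEMS, "Q_ed_fr")
```

With this data the English link resolves at step 3 (the work page) and the Polish link only at step 5 (a sibling edition).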

Thanks for clarifying. However, I have a few questions:

  1. If I understand properly, after finding (3) you do not intend to look for (5). E.g. for https://fr.wikisource.org/wiki/Hamlet/Traduction_Hugo,_1865/Le_premier_Hamlet and lang=en you will list an interwiki link to the work page https://en.wikisource.org/wiki/Hamlet_(Shakespeare), and not to the particular edition pages. Am I right? (Well, this significantly reduces the number of iw links, but makes tools like doublewiki useless.)
  2. Concerning (4): multiple edition (P747) items for a specific language may exist; do you intend to use only the first, random one?
  3. What do you think about handling pages with multiple edition of (P629) values: use the first of them (a random one), all of them, or the first with sitelinks? See the list. I am not sure whether they can all be considered misuse of the property.
  4. The procedure will handle most simple work/edition cases. However, some works already have a more complex structure in Wikidata, e.g. Bible (Q20821062). It has a multi-level structure of editions of translations, or editions of editions. In userspace tools this is handled by a recursive search for the root work item (with the recursion level limited, currently to 3). Do you think it will be possible to handle such structures in future? (I realize that an unhandled loop server-side is much more dangerous than user-side, where it would kill the browser in the worst case.)

BTW, if there is a better place to discuss such issues, please advise.
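The depth-limited root search mentioned in point 4 could look roughly like the sketch below. It is not the code of the userspace tools: the name `find_root_work` and the item IDs are hypothetical, following only the first P629 value is an assumption (see point 3), and the `seen` set is added here as one possible guard against loops.

```python
from typing import Dict

# Hypothetical multi-level chain: edition -> edition -> translation -> work.
ITEMS: Dict[str, dict] = {
    "Q_ed2": {"P629": ["Q_ed1"]},
    "Q_ed1": {"P629": ["Q_translation"]},
    "Q_translation": {"P629": ["Q_work"]},
    "Q_work": {},
    # A malformed pair of items pointing at each other.
    "Q_loop_a": {"P629": ["Q_loop_b"]},
    "Q_loop_b": {"P629": ["Q_loop_a"]},
}


def find_root_work(items: Dict[str, dict], item_id: str, max_depth: int = 3) -> str:
    """Follow P629 upwards at most max_depth times and return the item reached."""
    current = item_id
    seen = {current}
    for _ in range(max_depth):
        parents = items.get(current, {}).get("P629", [])
        if not parents:
            break                 # reached an item that is not an edition of anything
        parent = parents[0]       # assumption: only the first value is followed
        if parent in seen:
            break                 # loop detected, stop instead of cycling
        seen.add(parent)
        current = parent
    return current


root = find_root_work(ITEMS, "Q_ed2")
```

The depth limit alone already bounds the work done per page; the loop check just avoids spending the remaining steps going in circles.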

I think steps 3..5 would be better as:

  1. List all other editions of the work (P629+P747 for edition or P747 for work) with a sitelink in the target language
  2. If there is exactly one, use that one and stop
  3. Otherwise (zero or multiple), try to find a sitelink on a P629-linked item and use that one if it exists

Change 793867 had a related patch set uploaded (by Tpt; author: Tpt):

[mediawiki/extensions/Wikisource@master] Use also editions when filling sitelinks

https://gerrit.wikimedia.org/r/793867

I have written an implementation draft: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikisource/+/793867

It follows @Ltrlg's approach (thank you!) with a slight tweak to avoid having to guess the item kind (work vs edition).
The code first looks for the sibling editions (the editions of the current item's work item) using P629+P747. Then, if there is not exactly one sitelink for the language, it uses the work item's sitelinks (P629) and, failing that, the edition items' sitelinks (P747).
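The order described above can be illustrated with a short sketch. It is not the code of change 793867: it works on an in-memory dictionary, the item IDs, titles and the name `sitelink_for` are made up, and taking the first title in the two fallback steps is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical data: one work with an English work page, two English
# editions, one Polish edition and the French edition being viewed.
ITEMS: Dict[str, dict] = {
    "Q_work": {"sitelinks": {"enwikisource": "Hamlet (Shakespeare)"},
               "P747": ["Q_fr", "Q_en1", "Q_en2", "Q_pl"]},
    "Q_fr": {"sitelinks": {"frwikisource": "Hamlet (Hugo)"}, "P629": ["Q_work"]},
    "Q_en1": {"sitelinks": {"enwikisource": "Hamlet (1603)"}, "P629": ["Q_work"]},
    "Q_en2": {"sitelinks": {"enwikisource": "Hamlet (1623)"}, "P629": ["Q_work"]},
    "Q_pl": {"sitelinks": {"plwikisource": "Hamlet (Paszkowski)"}, "P629": ["Q_work"]},
}


def sitelink_for(site: str, items: Dict[str, dict], item_id: str) -> Optional[str]:
    """Pick one title for site without needing to know if item_id is a work or an edition."""
    def titles(ids: List[str]) -> List[str]:
        found = (items[i].get("sitelinks", {}).get(site) for i in ids)
        return [t for t in found if t is not None]

    item = items[item_id]
    works = item.get("P629", [])
    siblings = [e for w in works for e in items[w].get("P747", []) if e != item_id]
    sibling_titles = titles(siblings)
    if len(sibling_titles) == 1:          # exactly one sibling edition on that site
        return sibling_titles[0]
    work_titles = titles(works)           # fallback 1: the work item (P629)
    if work_titles:
        return work_titles[0]
    edition_titles = titles(item.get("P747", []))   # fallback 2: own editions (P747)
    return edition_titles[0] if edition_titles else None


polish = sitelink_for("plwikisource", ITEMS, "Q_fr")
english = sitelink_for("enwikisource", ITEMS, "Q_fr")
```

Here the single Polish sibling edition is linked directly, while the two competing English editions cause a fallback to the English work page.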