
Add Lexeme to Wikibase's ontology.owl
Closed, Resolved | Public | 3 Estimated Story Points

Description

We need to define the classes, properties and data types used in the Lexeme namespace in a separate OWL file.

The full list of terms we have to define is:

Classes:

wikibase:Lexeme
wikibase:Form
wikibase:Sense

Properties:

wikibase:lexicalCategory
wikibase:lemma
wikibase:grammaticalFeature

The loaded ontology.owl file should contain them, but in the WikibaseLexeme repo they should live in a separate file, since WikibaseLexeme is a separate extension.

The WikibaseLexeme ontology can be found @ https://github.com/wikimedia/mediawiki-extensions-WikibaseLexeme/blob/master/docs/ontology.owl

Acceptance Criteria

Event Timeline

@daniel Do you think that the ontology terms specific to lexemes should live in the same namespace and ontology definition file as the other Wikibase terms or should we create a new namespace?

I think they should not live in the same file, for practical reasons. It's much easier to deploy one static file per extension. This also makes it easier to use the ontology URL as the vocabulary URI. Note that the ontology URL / vocabulary URI will *not* vary between Wikibase instances. But some of these instances will have WikibaseLexeme installed, while others do not. That would be confusing, if at all workable.

I think they should use separate namespaces, for conceptual reasons: in time, there may be several extensions on top of Wikibase, some of which may have similar models, and may use overlapping vocabulary. To avoid name clashes, each such extension should use their own RDF namespace, as a matter of general hygiene - just like we use separate PHP namespaces.

In summary: each extension declares a new vocabulary, using its own namespace, hosted at a canonical URL it controls. For the lexeme extension, I propose http://wikiba.se/lexeme-1.0.owl
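The name-clash argument can be illustrated with a minimal sketch. Both namespace URIs and the local name `category` below are hypothetical placeholders, not values agreed in this task; the only point is that identical local names expand to distinct URIs when each extension uses its own namespace.

```python
# Hypothetical namespaces, for illustration only; nothing here is a settled choice.
CORE_NS = "http://wikiba.se/ontology#"
LEXEME_NS = "http://wikiba.se/lexeme#"


def term(namespace: str, local_name: str) -> str:
    """Expand a local name into a full URI inside the given namespace."""
    return namespace + local_name


# Two vocabularies define the same (hypothetical) local name.
core_term = term(CORE_NS, "category")
lexeme_term = term(LEXEME_NS, "category")

# Separate namespaces keep the two terms apart.
terms_clash = core_term == lexeme_term
print(core_term, lexeme_term, terms_clash)
```

With a single shared namespace, both calls would return the same URI and the two definitions would collide.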

Ok for the file. For the canonical namespace URI should we use something like "http://wikiba.se/ontology-lexeme#" to have something similar to "http://wikiba.se/ontology-beta#"? Or maybe "http://wikiba.se/lexeme/ontology#"? Should we also use the beta tag (that looks a lot like the FOAF version trap)?

Which prefix should we use, wikibase-lexeme: or something simpler? People writing SPARQL queries will interact a lot with it. The "-" is valid in URI prefixes.

why not just http://wikiba.se/ontology# ?

In any case, whatever we use, it should resolve to the actual file.

As to the BETA tag... I think we can remove that from the wikibase core ontology now (should make a ticket).
It's probably good to have it for Lexeme for now.

As to the prefix - in the context of wikibase, we could just use Wikidata Lexicographical data. But establishing a standard prefix usable elsewhere is probably a good idea. #wblex? Or is that too terse?

why not just http://wikiba.se/ontology# ?

If we do that it raises multiple concerns:

  1. It does not solve the name clash problem you mentioned earlier if two extensions want to create a class/property with the same name as an already existing one.
  2. Which file would it redirect to? http://wikiba.se/ontology-1.0.owl ? http://wikiba.se/lexeme-1.0.owl ?

As to the BETA tag... I think we can remove that from the wikibase core ontology now (should make a ticket).

Done: T195377

It's probably good to have it for Lexeme for now.

I would not do it, because it would mean introducing a breaking change afterward. W3C stopped doing this for that reason.

why not just http://wikiba.se/ontology# ?

Sorry, I created confusion by making a silly copy&paste mistake. I meant to say:

why not just http://wikiba.se/lexeme#?

I agree that having a separate OWL file would make sense. Predicates should probably still use the same wikibase namespace, to avoid confusing people with too many prefixes, of which we have about 9000 anyway. Looking at https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping - we have to define:

  • wikibase:Lexeme
  • wikibase:lexicalCategory
  • wikibase:lemma

The only thing we may want to use http://wikiba.se/lexeme# for is to refer to the ontology itself, since owl:Ontology requires some target to attach to. We may also want to declare owl:imports pointing to the base Wikibase ontology, since we probably want to declare the above in terms of Wikibase base types.
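A rough sketch of what such a file could contain, generated here with Python's standard library. The URIs are assumptions taken from this discussion (http://wikiba.se/lexeme# for the ontology itself, http://wikiba.se/ontology# for the wikibase: terms and the imported base ontology); only the header and the wikibase:Lexeme class are shown, and nothing here reflects the actual patch.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
# Assumed URIs, taken from the discussion; not confirmed as final.
WIKIBASE = "http://wikiba.se/ontology#"
LEXEME_ONTOLOGY = "http://wikiba.se/lexeme#"

# Use readable prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("owl", OWL)


def build_lexeme_ontology() -> str:
    """Return a minimal RDF/XML document for a separate lexeme ontology."""
    root = ET.Element(f"{{{RDF}}}RDF")

    # The ontology resource is the only thing living at the lexeme URI.
    ontology = ET.SubElement(
        root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": LEXEME_ONTOLOGY}
    )
    # owl:imports pulls in the base Wikibase ontology.
    ET.SubElement(ontology, f"{{{OWL}}}imports", {f"{{{RDF}}}resource": WIKIBASE})

    # The term itself stays in the shared wikibase: namespace.
    ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": WIKIBASE + "Lexeme"})

    return ET.tostring(root, encoding="unicode")


print(build_lexeme_ontology())
```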

The full list of terms we have to define is:

Classes:

  • wikibase:Lexeme
  • wikibase:Form
  • wikibase:Sense

These 3 classes are direct mappings of the entity types, so they should not cause name clashes with other Wikibase extensions.

Properties:

  • wikibase:lexicalCategory
  • wikibase:lemma
  • wikibase:grammaticalFeature

I believe that we all agree on having a specific OWL file for lexemes, importing the base ontology.

As for defining the terms with a different base URI, one pro and one con have been raised.

The pro: no risk of name clashes between different Wikibase extensions
The con: yet another prefix.

I agree with @Smalyshev about having just one prefix for Wikibase and all its "official" extensions, because the number of them is probably going to be quite low and it would definitely improve the usability of the query service. With different namespaces, lexemes would have a wikibase-lexeme:lexicalCategory but use statements whose rank could be wikibase:preferred, which looks quite confusing to me.

Two specific points:

  1. What about using http://wikiba.se/lexeme/ontology instead of http://wikiba.se/lexeme#? It seems a bit more consistent to me, it still allows wikiba.se to have an HTML page describing the lexeme extension at http://wikiba.se/lexeme, and it allows an easy redirect to the OWL definition file without content negotiation.
  2. If we use just one prefix for Wikibase and its Lexeme extension, we should probably redirect the http://wikiba.se/ontology# to something like the union of the base and lexeme ontologies, so that all terms in this namespace are defined. A simple way to implement this may be to have a small OWL file that just imports the base and lexeme ontologies.

using http://wikiba.se/lexeme/ontology instead of http://wikiba.se/lexeme#

I think making all redirects etc. work might be slightly easier with http://wikiba.se/lexeme but otherwise may be ok. Not sure if you are proposing prefix as http://wikiba.se/lexeme/ontology/ or as http://wikiba.se/lexeme/ontology#. I am for the latter option, unless there's a good reason to do the former (which I currently don't see).
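One practical difference between the two forms, shown as a small sketch: an HTTP client drops the fragment before sending a request, so every term in a `#` namespace resolves through the same document URL, while a `/` namespace gives each term its own URL to serve or redirect. The term URIs below are built from the prefixes under discussion and are illustrative only.

```python
from urllib.parse import urldefrag


def document_url(term_uri: str) -> str:
    """Return the URL left after the fragment is removed, i.e. what gets requested."""
    return urldefrag(term_uri).url


# Illustrative term URIs for the two candidate prefix styles.
hash_terms = [
    "http://wikiba.se/lexeme/ontology#Lexeme",
    "http://wikiba.se/lexeme/ontology#lemma",
]
slash_terms = [
    "http://wikiba.se/lexeme/ontology/Lexeme",
    "http://wikiba.se/lexeme/ontology/lemma",
]

hash_documents = {document_url(uri) for uri in hash_terms}
slash_documents = {document_url(uri) for uri in slash_terms}

print(hash_documents)   # one shared document URL
print(slash_documents)  # one URL per term
```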

we should probably redirect the http://wikiba.se/ontology# to something like the union of the base and lexeme ontologies

Well, since wikiba.se site is not related to any specific Wikidata install, there we can have the file that has both ontologies together. From what I understand, these files are different from the ones in the Wikibase repos?

Not sure if you are proposing prefix as http://wikiba.se/lexeme/ontology/ or as http://wikiba.se/lexeme/ontology#.

Sorry, I was also thinking of http://wikiba.se/lexeme/ontology# but forgot the #.

Well, since wikiba.se site is not related to any specific Wikidata install, there we can have the file that has both ontologies together.

Indeed. But I believe that it will increase the maintenance work for wikiba.se (a copy won't be enough to synchronize the two versions of the file anymore).

I have updated the draft of RDF implementation change: https://gerrit.wikimedia.org/r/c/433953
It uses the existing namespace for Lexeme-specific terms but provides a new ontology file whose URI is http://wikiba.se/lexeme/ontology# (but it could easily be changed).

Vvjjkkii renamed this task from Add Lexeme to Wikibase's ontology.owl to fgcaaaaaaa. (Jul 1 2018, 1:08 AM)
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from fgcaaaaaaa to Add Lexeme to Wikibase's ontology.owl. (Jul 2 2018, 4:40 AM)
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.
Lydia_Pintscher added a subscriber: Addshore.

@Addshore It'd be great if you could have a look and get this ready for pick-up so we can remove the final blockers for querying Lexemes asap.

Change 470663 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[wikibase/wikiba.se@master] Add WikibaseLexeme to ontology-1.0.owl

https://gerrit.wikimedia.org/r/470663

Ladsgroup subscribed.

According to the acceptance criteria, it should be part of the main file. Moving it to another file is easy and possible, but I'm not sure we want that.

Change 470663 merged by jenkins-bot:
[wikibase/wikiba.se@master] Add WikibaseLexeme to ontology-1.0.owl

https://gerrit.wikimedia.org/r/470663

We need to get someone to update the site....

@JeroenDeDauw ?

I really wanted to verify this today, but it's not on wikiba.se yet.
On the plus side, it meant that I looked into T99531 a bit and we made some progress there.
This can wait in the verification column until it is updated on the wikiba.se site.
Note, the mirror @ https://wikibase.wmflabs.org is already updated.

Senses aren't in source/ontology-1.0.owl yet, but the current version of the description mentions them.
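A check like this could be scripted. The sketch below assumes the ontology file is RDF/XML with full URIs in `rdf:about` and that the terms live under http://wikiba.se/ontology#; the `SAMPLE` document is a made-up stand-in, not the real content of source/ontology-1.0.owl.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
WIKIBASE = "http://wikiba.se/ontology#"  # assumed namespace for wikibase: terms

# Terms listed in the task description.
EXPECTED = ["Lexeme", "Form", "Sense", "lexicalCategory", "lemma", "grammaticalFeature"]


def missing_terms(owl_xml: str, expected=EXPECTED) -> list:
    """Return the expected local names that have no rdf:about entry in the file."""
    root = ET.fromstring(owl_xml)
    defined = {el.get(RDF_ABOUT) for el in root.iter() if el.get(RDF_ABOUT)}
    return [name for name in expected if WIKIBASE + name not in defined]


# Hypothetical stand-in document, for illustration only.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://wikiba.se/ontology#Lexeme"/>
  <owl:Class rdf:about="http://wikiba.se/ontology#Form"/>
</rdf:RDF>"""

print(missing_terms(SAMPLE))
```

Running it against the deployed file instead of `SAMPLE` would list whichever terms from the description are still absent.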