
Lexeme RDF export has labels repeated several times
Closed, Resolved · Public

Description

Looking at https://www.wikidata.org/wiki/Special:EntityData/L1.ttl?flavor=dump I notice:

wd:L1 a ontolex:LexicalEntry ;
	rdfs:label "ama"@mis-x-Q36790 ;
	skos:prefLabel "ama"@mis-x-Q36790 ;
	schema:name "ama"@mis-x-Q36790 ;
	wikibase:lemma "ama"@mis-x-Q36790 ;
	rdfs:label "𒂼"@mis-x-Q401 ;
	skos:prefLabel "𒂼"@mis-x-Q401 ;
	schema:name "𒂼"@mis-x-Q401 ;
	wikibase:lemma "𒂼"@mis-x-Q401 ;
	dct:language wd:Q36790 ;
	wikibase:lexicalCategory wd:Q1084 ;
	a ontolex:LexicalEntry ;
	rdfs:label "ama"@mis-x-Q36790 ;
	skos:prefLabel "ama"@mis-x-Q36790 ;
	schema:name "ama"@mis-x-Q36790 ;
	wikibase:lemma "ama"@mis-x-Q36790 ;
	rdfs:label "𒂼"@mis-x-Q401 ;
	skos:prefLabel "𒂼"@mis-x-Q401 ;
	schema:name "𒂼"@mis-x-Q401 ;
	wikibase:lemma "𒂼"@mis-x-Q401 ;
	dct:language wd:Q36790 ;
	wikibase:lexicalCategory wd:Q1084 ;
	a ontolex:LexicalEntry ;
	rdfs:label "ama"@mis-x-Q36790 ;
	skos:prefLabel "ama"@mis-x-Q36790 ;
	schema:name "ama"@mis-x-Q36790 ;
	wikibase:lemma "ama"@mis-x-Q36790 ;
	rdfs:label "𒂼"@mis-x-Q401 ;
	skos:prefLabel "𒂼"@mis-x-Q401 ;
	schema:name "𒂼"@mis-x-Q401 ;
	wikibase:lemma "𒂼"@mis-x-Q401 ;
	dct:language wd:Q36790 ;
	wikibase:lexicalCategory wd:Q1084 .

The same set of statements is emitted three times for wd:L1. It should be emitted only once.
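A quick way to confirm this kind of repetition in a dump is to count how often each predicate/object pair shows up for one subject. The following is only a minimal sketch: it uses a shortened copy of the output above (two repetitions instead of three), the helper name `count_statements` is made up for illustration, and the parsing is a naive line-based split rather than a real Turtle parser.

```python
from collections import Counter

# Shortened copy of the dump from the report: the same block of
# predicate/object pairs, repeated twice here to keep the example small.
TURTLE_BODY = """
wd:L1 a ontolex:LexicalEntry ;
	rdfs:label "ama"@mis-x-Q36790 ;
	dct:language wd:Q36790 ;
	a ontolex:LexicalEntry ;
	rdfs:label "ama"@mis-x-Q36790 ;
	dct:language wd:Q36790 .
"""


def count_statements(body: str, subject: str) -> Counter:
    """Count each predicate/object pair for one subject.

    Hypothetical helper: naive and line-based, it only handles the
    one-statement-per-line layout shown in the report.
    """
    counts = Counter()
    for raw in body.strip().splitlines():
        line = raw.strip()
        # The subject only appears on the first line of the block.
        if line.startswith(subject):
            line = line[len(subject):].strip()
        # Drop the trailing ';' or '.' statement terminator.
        line = line.rstrip(" ;.").strip()
        if line:
            counts[line] += 1
    return counts


counts = count_statements(TURTLE_BODY, "wd:L1")
repeated = {stmt: n for stmt, n in counts.items() if n > 1}
for stmt, n in sorted(repeated.items()):
    print(f"{n}x  {stmt}")
```

Since an RDF graph is a set of triples, a consumer that loads the dump ends up with each statement once either way; the repetition mostly costs output size and processing time.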

Event Timeline

It looks like $this->builders is not initialized correctly in RdfBuilder: LexemeRdfBuilder appears three times in that array. I'm not even sure why entity RDF builders and sub-entity builders like TermsRdfBuilder are stored in the same array under different keys...
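To illustrate the failure mode described here, the sketch below is a toy model in Python, not Wikibase's PHP code. Every name in it (`RecordingBuilder`, `export_shared`, `export_split`, the registry keys) is hypothetical, and the "split" variant only shows the general idea suggested by the patch title, not what the patch actually does.

```python
class RecordingBuilder:
    """Stand-in for an RDF builder; records one line per call."""

    def __init__(self, name):
        self.name = name

    def add_entity(self, entity_id, out):
        out.append(f"{self.name}:{entity_id}")


def export_shared(builders, entity_id):
    """Single registry: every key triggers a call, even for the same builder."""
    out = []
    for builder in builders.values():
        builder.add_entity(entity_id, out)
    return out


def export_split(entity_builders, generic_builders, entity_type, entity_id):
    """Separate registries: one entity builder per type, plus generic ones."""
    out = []
    # Exactly one entity builder is looked up for the entity's type.
    entity_builders[entity_type].add_entity(entity_id, out)
    # Generic builders run once each, independent of the entity type.
    for builder in generic_builders:
        builder.add_entity(entity_id, out)
    return out


lexeme = RecordingBuilder("lexeme")
generic = RecordingBuilder("generic")

# Assumed bug shape: the same builder registered under three different keys.
shared = {"key_a": lexeme, "key_b": lexeme, "key_c": lexeme}
shared_output = export_shared(shared, "L1")

split_output = export_split({"lexeme": lexeme}, [generic], "lexeme", "L1")

print(shared_output)
print(split_output)
```

In the shared registry the lexeme builder runs once per key it was registered under, which reproduces the repeated output; with separate registries it runs once per entity.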

Change 455251 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Split generic and entity builders

https://gerrit.wikimedia.org/r/455251

Change 455251 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Split generic and entity builders

https://gerrit.wikimedia.org/r/455251

Smalyshev claimed this task.
Smalyshev triaged this task as Medium priority.