Implement Lua access to Lexemes, Senses and Forms
Open, Needs Triage, Public

Description

Task to collect some preliminary work on T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites. This initial implementation will likely not feature fine-grained usage tracking yet, and parser functions are out of scope for now.

Details

Related Gerrit Patches:
mediawiki/extensions/WikibaseLexeme : master : Add mw.wikibase.lexeme.splitLexemeId function
mediawiki/extensions/WikibaseLexeme : master : Capitalize Lexeme more consistently
mediawiki/extensions/WikibaseLexeme : master : Add Lua module for Senses
mediawiki/extensions/WikibaseLexeme : master : Add Lua module for Forms
mediawiki/extensions/WikibaseLexeme : master : Change function declarations to Lua style
mediawiki/extensions/WikibaseLexeme : master : Make mw.wikibase.lexeme.entity.lexeme inherit mw.wikibase.entity
mediawiki/extensions/WikibaseLexeme : master : Add getLemmas function to Lua modules
mediawiki/extensions/WikibaseLexeme : master : Add all-usage for all subentities
mediawiki/extensions/WikibaseLexeme : master : Specify Lua module to be used for Lexeme entities
mediawiki/extensions/WikibaseLexeme : master : Add documentation for rudimentary Lua modules
mediawiki/extensions/WikibaseLexeme : master : Add rudimentary mw.wikibase.lexeme.entity.lexeme Lua module
mediawiki/extensions/WikibaseLexeme : master : Add rudimentary mw.wikibase.lexeme Lua module

Event Timeline

Change 544205 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add rudimentary mw.wikibase.lexeme Lua module

https://gerrit.wikimedia.org/r/544205

Change 544206 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add rudimentary mw.wikibase.lexeme.entity.lexeme Lua module

https://gerrit.wikimedia.org/r/544206

Change 544207 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Make mw.wikibase.lexeme.entity.lexeme inherit mw.wikibase.entity

https://gerrit.wikimedia.org/r/544207

Change 544208 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Specify Lua module to be used for Lexeme entities

https://gerrit.wikimedia.org/r/544208

Change 544234 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add documentation for rudimentary Lua modules

https://gerrit.wikimedia.org/r/544234

The patches linked above add support for code of the following sort:

mw.wikibase.lexeme.getLanguage( 'L1' )
mw.wikibase.getEntity( 'L2' ):getLexicalCategory()

Missing features:

  • Lua modules for Senses and Forms, likewise wired up with mw.wikibase.getEntity()
  • getSenses() and getForms() functions/methods in the Lexeme modules, returning “instances” of the corresponding modules

Also, lots of cleanup and testing is probably still needed.

Usage tracking is also going to be interesting. Currently, as far as I can see, it’s strictly entity-based (as opposed to page-based), both on the repo (wb_changes_subscription) and on the client (wbc_entity_usage). Does this mean that a Wiktionary page for one lexeme may end up with dozens, if not hundreds, of wbc_entity_usage rows, one per form (and aspect)? Or should we say that entity usage stops at subentities, and any usage of a lexeme implies usage of all of its forms? Or do we somehow group usages together, similar to what is done for other aspects, and turn form usages into one “all forms of this lexeme” usage once they exceed a certain threshold?
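
To make the third option more concrete, here is a rough sketch in plain Lua of what such grouping might look like. Everything in it is an assumption made up for illustration: the function name collapseFormUsages, the threshold value and the '#all-forms' key are hypothetical and are not part of Wikibase, WikibaseLexeme or the wbc_entity_usage schema.

```lua
-- Hypothetical sketch only: none of these names exist in Wikibase.
-- Assumed limit, chosen purely for illustration.
local FORM_USAGE_THRESHOLD = 3

-- Takes a list of used Form IDs such as 'L1-F2' and returns a list of usage
-- keys. A Lexeme with more than FORM_USAGE_THRESHOLD distinct Forms in use
-- is represented by a single '<lexemeId>#all-forms' key (a made-up format);
-- otherwise each distinct Form ID is kept as its own usage.
local function collapseFormUsages( formIds )
	local byLexeme = {} -- lexemeId -> { seen = set of Form IDs, forms = list }
	local lexemeOrder = {} -- Lexeme IDs in order of first appearance

	for _, formId in ipairs( formIds ) do
		-- '%-' escapes the hyphen, which is a magic character in Lua patterns.
		local lexemeId = string.match( formId, '^(L%d+)%-F%d+$' )
		if lexemeId == nil then
			error( 'not a Form ID: ' .. tostring( formId ) )
		end

		local group = byLexeme[lexemeId]
		if group == nil then
			group = { seen = {}, forms = {} }
			byLexeme[lexemeId] = group
			table.insert( lexemeOrder, lexemeId )
		end

		-- Count each Form only once, however often the page uses it.
		if not group.seen[formId] then
			group.seen[formId] = true
			table.insert( group.forms, formId )
		end
	end

	local usages = {}
	for _, lexemeId in ipairs( lexemeOrder ) do
		local forms = byLexeme[lexemeId].forms
		if #forms > FORM_USAGE_THRESHOLD then
			table.insert( usages, lexemeId .. '#all-forms' )
		else
			for _, formId in ipairs( forms ) do
				table.insert( usages, formId )
			end
		end
	end
	return usages
end
```

With the assumed threshold of 3, a page using four Forms of L1 and one Form of L2 would end up with two usage keys instead of five. Whether a real implementation should count per aspect, and where the threshold should sit, is exactly the open question above.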

Change 545377 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add all-usage for all subentities

https://gerrit.wikimedia.org/r/545377

Change 545378 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add getLemmas function to Lua modules

https://gerrit.wikimedia.org/r/545378

Change 545379 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add Lua module for Forms

https://gerrit.wikimedia.org/r/545379

Change 545537 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add Lua module for Senses

https://gerrit.wikimedia.org/r/545537

Change 544205 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add rudimentary mw.wikibase.lexeme Lua module

https://gerrit.wikimedia.org/r/544205

Change 544206 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add rudimentary mw.wikibase.lexeme.entity.lexeme Lua module

https://gerrit.wikimedia.org/r/544206

Change 544207 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Make mw.wikibase.lexeme.entity.lexeme inherit mw.wikibase.entity

https://gerrit.wikimedia.org/r/544207

Change 544208 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Specify Lua module to be used for Lexeme entities

https://gerrit.wikimedia.org/r/544208

Change 544234 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add documentation for rudimentary Lua modules

https://gerrit.wikimedia.org/r/544234

Change 545377 abandoned by Lucas Werkmeister (WMDE):
Add all-usage for all subentities

Reason:
not necessary after all

https://gerrit.wikimedia.org/r/545377

Change 545378 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add getLemmas function to Lua modules

https://gerrit.wikimedia.org/r/545378

Change 550662 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Change function declarations to Lua style

https://gerrit.wikimedia.org/r/550662

Change 554116 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Capitalize Lexeme more consistently

https://gerrit.wikimedia.org/r/554116

Change 554117 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseLexeme@master] Add mw.wikibase.lexeme.splitLexemeId function

https://gerrit.wikimedia.org/r/554117

Change 554116 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Capitalize Lexeme more consistently

https://gerrit.wikimedia.org/r/554116

Change 554117 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Add mw.wikibase.lexeme.splitLexemeId function

https://gerrit.wikimedia.org/r/554117