
mw.wikibase.lexeme is nil on Beta Wikidata, and mw.wikibase.mediainfo is nil on Commons (beta+prod)
Closed, Resolved (Public)

Description

On the beta cluster, Lexeme Lua access is enabled now, and the mw.wikibase.lexeme package is loaded, but mw.wikibase.lexeme is not available:

=package.loaded['mw.wikibase.lexeme']
table
=mw.wikibase.lexeme
nil

Similarly, on Commons, Lua access is enabled but the mw.wikibase.mediainfo global isn’t set properly:

=mw.wikibase.mediainfo
nil
=package.loaded['mw.wikibase.mediainfo']
table

I now think this is a Scribunto bug (T294545), happening because we register these modules before the main mw.wikibase module, due to how the extensions are loaded in mediawiki-config. The rest of the original description below, and the first comments on the task, are red herrings.


I think this is a load order problem. In the file mw.wikibase.lexeme.lua, setupInterface() has this code:

	mw = mw or {}
	mw.wikibase = mw.wikibase or {}
	mw.wikibase.lexeme = wikibaseLexeme
	package.loaded['mw.wikibase.lexeme'] = wikibaseLexeme
	wikibaseLexeme.setupInterface = nil

But in mw.wikibase.lua, it looks like this:

	mw = mw or {}
	mw.wikibase = wikibase
	package.loaded['mw.wikibase'] = wikibase
	wikibase.setupInterface = nil

It looks like the Lexeme module tries to make its registration work regardless of whether it or the Wikibase module is loaded first, whereas the Wikibase module is not so careful and overwrites any existing mw.wikibase value. So I suspect that, for some reason, the Lexeme setupInterface() gets called before the Wikibase one on Beta, and that’s (one reason?) why Lexeme access doesn’t work properly.
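To make the suspected overwrite concrete, here is a minimal Python sketch of the two registration snippets quoted above. It is only a toy model: the dict `env` stands in for the Lua global table, and the function names `setup_lexeme`, `setup_wikibase` and `lexeme_after` are made up for illustration, not taken from either extension.

```python
# Toy model of the two setupInterface() snippets; not the real Lua code.
# `env` plays the role of the Lua globals; all names are illustrative.

def setup_lexeme(env):
    mw = env.setdefault("mw", {})              # mw = mw or {}
    wikibase = mw.setdefault("wikibase", {})   # mw.wikibase = mw.wikibase or {}
    wikibase["lexeme"] = {"name": "lexeme"}    # mw.wikibase.lexeme = wikibaseLexeme


def setup_wikibase(env):
    mw = env.setdefault("mw", {})              # mw = mw or {}
    mw["wikibase"] = {"name": "wikibase"}      # mw.wikibase = wikibase (unconditional)


def lexeme_after(order):
    """Run the setup functions in the given order and return mw.wikibase.lexeme."""
    env = {}
    for setup in order:
        setup(env)
    return env["mw"]["wikibase"].get("lexeme")


print(lexeme_after([setup_wikibase, setup_lexeme]))  # lexeme table survives
print(lexeme_after([setup_lexeme, setup_wikibase]))  # None: it was overwritten
```

In this model the second ordering loses the lexeme entry, which matches the nil seen in the debug console.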

Event Timeline

I dimly remember that this Lua packages thing (which we interact with using package.loaded) supports some form of lazy-loading, so I think the best solution might be for the Lexeme module to lazy-load the Wikibase module if needed, rather than for the Wikibase module to not overwrite mw.wikibase. Needs some more looking into.

Change 734285 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseLexeme@master] Require mw.wikibase in mw.wikibase.lexeme.lua

https://gerrit.wikimedia.org/r/734285

I was going to test the fix by manually applying it on deployment-mediawiki11, but now I can’t even reproduce the error anymore: mw.wikibase.lexeme is already a table. That seems somewhat plausible if it’s indeed a load order problem, but I don’t know why the load order would change (and it seems to be fairly consistently working now).

Hm, but on Beta Wikidata I can still reproduce the bug, and it appears to persist even with the fix applied to /srv/mediawiki/php-master/extensions/WikibaseLexeme/src/MediaWiki/Scribunto/mw.wikibase.lexeme.lua.

Change 734285 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Require mw.wikibase in mw.wikibase.lexeme.lua

https://gerrit.wikimedia.org/r/734285

The bug still persists on Beta Wikidata (but not Beta English Wiktionary), though the above change is now definitely rolled out there. I don’t understand why, but there must be some other bug.

[Screenshot: Creating Module Blank - Wikidata.png]

Okay, with a healthy amount of debug logging on deployment-mediawiki11, I think I’ve figured out what’s wrong: by the time the WikibaseLexeme setupInterface() runs (where we assign to wikibase or mw.wikibase), the Wikibase setupInterface() Lua function actually hasn’t run yet, at least on Beta Wikidata. (On Beta English Wiktionary, it has, and I don’t know why.) So the assignment of mw.wikibase = wikibase hasn’t actually happened yet, and in general the module isn’t actually ready for “consumption”.

I’ll have to look into Scribunto to see if we’re supposed to do anything about this (call setupInterface() ourselves?) or if that’s supposed to be Scribunto’s responsibility (and constitutes a bug).

Apparently, on Beta Wikidata, the Lexeme Lua not only loads before the main Wikibase Lua, but the Wikibase Lua also loads twice?

["log"] = table#2 {
  "mw.wikibase.lexeme toplevel",
  "mw.wikibase.lexeme.setupInterface begin",
  "mw.wikibase toplevel",
  "mw.wikibase.setupInterface begin",
  "mw.wikibase.setupInterface end",
  "mw.wikibase.lexeme.setupInterface end",
  "mw.wikibase toplevel",
  "mw.wikibase.setupInterface begin",
  "mw.wikibase.setupInterface end",
},

On Beta English Wiktionary that doesn’t happen:

["log"] = table#2 {
  "mw.wikibase toplevel",
  "mw.wikibase.setupInterface begin",
  "mw.wikibase.setupInterface end",
  "mw.wikibase.lexeme toplevel",
  "mw.wikibase.lexeme.setupInterface begin",
  "mw.wikibase.lexeme.setupInterface end",
},

Some more logging, to include a full stack trace (and help me understand how mw.wikibase.lexeme is even able to require mw.wikibase before it’s been registered):

  ["log"] = table#2 {
    "mw.wikibase.lexeme toplevel\
stack traceback:\
	mw.wikibase.lexeme.lua:11: in main chunk\
	[C]: ?",
    "mw.wikibase.lexeme.setupInterface begin\
stack traceback:\
	mw.wikibase.lexeme.lua:14: in function <mw.wikibase.lexeme.lua:13>\
	[C]: ?",
    "mw.wikibase toplevel\
stack traceback:\
	mw.wikibase.lua:20: in main chunk\
	[C]: ?\
	[C]: in function 'loadPHPLibrary'\
	mw.lua:31: in function 'loader'\
	package.lua:75: in function 'load'\
	package.lua:99: in function 'require'\
	mw.wikibase.lexeme.lua:46: in function <mw.wikibase.lexeme.lua:13>\
	[C]: ?",
    "mw.wikibase.setupInterface begin\
stack traceback:\
	mw.wikibase.lua:77: in function <mw.wikibase.lua:76>\
	[C]: ?\
	[C]: in function 'loadPHPLibrary'\
	mw.lua:31: in function 'loader'\
	package.lua:75: in function 'load'\
	package.lua:99: in function 'require'\
	mw.wikibase.lexeme.lua:46: in function <mw.wikibase.lexeme.lua:13>\
	[C]: ?",
    "mw.wikibase.setupInterface end\
stack traceback:\
	mw.wikibase.lua:477: in function <mw.wikibase.lua:76>\
	[C]: ?\
	[C]: in function 'loadPHPLibrary'\
	mw.lua:31: in function 'loader'\
	package.lua:75: in function 'load'\
	package.lua:99: in function 'require'\
	mw.wikibase.lexeme.lua:46: in function <mw.wikibase.lexeme.lua:13>\
	[C]: ?",
    "mw.wikibase.lexeme.setupInterface end\
stack traceback:\
	mw.wikibase.lexeme.lua:54: in function <mw.wikibase.lexeme.lua:13>\
	[C]: ?",
    "mw.wikibase toplevel\
stack traceback:\
	mw.wikibase.lua:20: in main chunk\
	[C]: ?",
    "mw.wikibase.setupInterface begin\
stack traceback:\
	mw.wikibase.lua:77: in function <mw.wikibase.lua:76>\
	[C]: ?",
    "mw.wikibase.setupInterface end\
stack traceback:\
	mw.wikibase.lua:477: in function <mw.wikibase.lua:76>\
	[C]: ?",
  },

The magic happens here:

Scribunto_LuaEngine::load():
$this->availableLibraries = $this->getLibraries( 'lua', self::$libraryClasses );
foreach ( $this->availableLibraries as $name => $def ) {
	$this->instantiatePHPLibrary( $name, $def, false );
}

Scribunto_LuaEngine::loadPHPLibrary():
if ( isset( $this->availableLibraries[$name] ) ) {
	$ret = $this->instantiatePHPLibrary( $name, $this->availableLibraries[$name], true );
}

Scribunto assigns all the available libraries (which getLibraries() gets from the ScribuntoExternalLibraries hook). Then it iterates through each of them to instantiate the non-deferLoad ones, including mw.wikibase and mw.wikibase.lexeme. But if the available libraries happen to list mw.wikibase.lexeme before mw.wikibase, then Lexeme’s require 'mw.wikibase' will call loadPHPLibrary( 'mw.wikibase' ) before mw.wikibase was instantiated by the main loop in load(). This will instantiate mw.wikibase, but then, afterwards, load() will reach mw.wikibase, and instantiate it again. And I think that’s actually a bug in Scribunto.
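The sequence described above can be sketched as a small Python model. This is not Scribunto's code: `ToyEngine` and its methods are hypothetical stand-ins for load(), loadPHPLibrary() and instantiatePHPLibrary(), and the `env` dict stands in for the Lua globals. The only behaviour it assumes is what this comment describes: the main loop instantiates every listed library, and a require from inside one library can instantiate another one early without the main loop noticing.

```python
# Toy model of the double instantiation; names are illustrative only.

class ToyEngine:
    def __init__(self, available):
        self.available = list(available)  # order as produced by the hook handlers
        self.env = {}                     # stands in for the Lua globals
        self.log = []

    def load(self):
        # Main loop: instantiate every available library, in listed order.
        for name in self.available:
            self.instantiate(name)

    def load_php_library(self, name):
        # Reached via require(); does not record that the library is now loaded.
        if name in self.available:
            self.instantiate(name)

    def instantiate(self, name):
        self.log.append(name + " setupInterface begin")
        if name == "mw.wikibase":
            # mw.wikibase = wikibase: a fresh table replaces any earlier one.
            self.env["mw.wikibase"] = {}
        elif name == "mw.wikibase.lexeme":
            if "mw.wikibase" not in self.env:
                # require 'mw.wikibase' when it is not loaded yet
                self.load_php_library("mw.wikibase")
            self.env["mw.wikibase"]["lexeme"] = {}
        self.log.append(name + " setupInterface end")


def lexeme_visible(order):
    """Return (is mw.wikibase.lexeme set after load()?, log of setup calls)."""
    engine = ToyEngine(order)
    engine.load()
    return "lexeme" in engine.env["mw.wikibase"], engine.log


print(lexeme_visible(["mw.wikibase", "mw.wikibase.lexeme"])[0])  # True
print(lexeme_visible(["mw.wikibase.lexeme", "mw.wikibase"])[0])  # False
```

With the lexeme library listed first, the model's log has the same shape as the Beta Wikidata log above: the mw.wikibase setup runs once nested inside the lexeme setup and then a second time from the main loop, and that second run discards the table that held the lexeme entry.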

You can reproduce the bug locally by adding the following to LocalSettings.php:

$wgHooks['ScribuntoExternalLibraries'][] = function ( $engine, &$extraLibraries ) {
    $extraLibraries['mw.wikibase.lexeme'] = null;
    $extraLibraries['mw.wikibase'] = null;
};

This hook handler will run before the “real” hook handlers in Wikibase and WikibaseLexeme (because it’s added directly in LocalSettings.php, whereas the other hook handlers will only be registered when the wfLoadExtension() queue is processed), so it forces the mw.wikibase.lexeme library to be listed before the mw.wikibase one.

Ah, and the reason why it happens on Beta Wikidata but not Beta English Wiktionary will be:

wmf-config/Wikibase.php
// Load the Repo, and Repo extensions
if ( !empty( $wmgUseWikibaseRepo ) ) {
    wfLoadExtension( 'WikibaseRepository', "$IP/extensions/Wikibase/extension-repo.json" );
    // ...
    if ( !empty( $wmgUseWikibaseLexeme ) ) {
        wfLoadExtension( 'WikibaseLexeme' );
    }
    // ...
}

// Load the Client, and Client extensions
if ( !empty( $wmgUseWikibaseClient ) ) {
    wfLoadExtension( 'WikibaseClient', "$IP/extensions/Wikibase/extension-client.json" );
    // ...
    if ( !empty( $wmgUseWikibaseLexeme ) ) {
        wfLoadExtension( 'WikibaseLexeme' );
    }
}

So iff Wikibase Repo is enabled, then WikibaseLexeme gets loaded before WikibaseClient. This then causes the hook handlers to run in the wrong order.
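As a rough illustration of that ordering, the following Python sketch replays the wfLoadExtension() calls from the config excerpt for different flag combinations. It is a toy model, not mediawiki-config: the function `extension_load_order` is hypothetical, and it assumes that loading an already queued extension a second time keeps its original position in the queue.

```python
# Toy model of the wfLoadExtension() calls in the config excerpt above.
# Assumption: re-loading an extension that is already queued does not move it.

def extension_load_order(use_repo, use_client, use_lexeme):
    queue = []

    def load(name):
        if name not in queue:
            queue.append(name)

    # Load the Repo, and Repo extensions
    if use_repo:
        load("WikibaseRepository")
        if use_lexeme:
            load("WikibaseLexeme")

    # Load the Client, and Client extensions
    if use_client:
        load("WikibaseClient")
        if use_lexeme:
            load("WikibaseLexeme")

    return queue


# Repo + Client + Lexeme: Lexeme ends up ahead of Client.
print(extension_load_order(True, True, True))
# → ['WikibaseRepository', 'WikibaseLexeme', 'WikibaseClient']

# Client + Lexeme only: Client comes first.
print(extension_load_order(False, True, True))
# → ['WikibaseClient', 'WikibaseLexeme']
```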

Maybe we should just fix that in general, and expect that WikibaseClient is loaded before WikibaseLexeme?

Commons / MediaInfo has the exact same problem, by the way:

=mw.wikibase.mediainfo
nil
=type(require('mw.wikibase.mediainfo'))
table
=mw.wikibase.mediainfo
nil

Did this only break recently? Or has it always been broken? I can’t find any Phabricator task about it…

Change 735367 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Load Wikibase Client before other Wikibase extensions

https://gerrit.wikimedia.org/r/735367

Lucas_Werkmeister_WMDE renamed this task from mw.wikibase.lexeme is nil on beta cluster to mw.wikibase.lexeme is nil on Beta Wikidata, and mw.wikibase.mediainfo is nil on Commons (beta+prod). Oct 28 2021, 11:58 AM
Lucas_Werkmeister_WMDE updated the task description. (Show Details)

Change 735367 merged by jenkins-bot:

[operations/mediawiki-config@master] Load Wikibase Client before other Wikibase extensions

https://gerrit.wikimedia.org/r/735367

Mentioned in SAL (#wikimedia-operations) [2021-11-11T13:14:44Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/Wikibase.php: Config: [[gerrit:735367|Load Wikibase Client before other Wikibase extensions (T294224)]] (duration: 00m 55s)

Lucas_Werkmeister_WMDE claimed this task.

> Maybe we should just fix that in general, and expect that WikibaseClient is loaded before WikibaseLexeme?

Done, and that seems to have fixed this issue.

I wonder if this belongs in Tech News? TL;DR: mw.wikibase.mediainfo was nil on Wikimedia Commons, though it doesn’t look like anybody noticed it before; now it works, and functions like mw.wikibase.mediainfo.getCaption( 'M80857538' ) are available. (Specifically, the available functions are getEntityIdForTitle( pageTitle, globalSiteId ), getCaptionWithLang( id ), getCaption( id ), and getCaptionByLang( id, languageCode ).)

@Lucas_Werkmeister_WMDE Hi! Re: Tech News - I'm not sure if it belongs, because (as a non-developer) I don't understand what these functions do, nor who it affects (i.e. how many potential people, and at which wikis).
Is this something that dozens-to-hundreds of tool-owners or code-writers will want to learn about, and are they located at dozens-to-hundreds of wikis? (Tech News gets sent to ~1,000 separate wikipages).

If not, then it might be better as a targeted message just to some specific locations, such as the Wikidata and Commons technical noticeboards, and related mailing lists (IIUC). Or a MassMessage to the Technical_Village_Pumps_distribution_list

If yes, then we can include it! But I'll need help with a clear draft... E.g. If the specific details of these newly-working functions are already documented somewhere, then we could add a short summary and a link to that location (instead of including the complete list you wrote above).
Perhaps something like this (with my very-rough guesses at details/keywords that might be right!?!)

"Some globally available Lua functions for mw.wikibase.lexeme and mw.wikibase.mediainfo were not working properly. This has now been fixed. There is [[LINK | a complete list of available functions]]."

The affected people are Lua module authors/editors on Wikimedia Commons. No other wikis, no tools. Sounds like it’s not something for Tech News then :)

I left a quick note on the Commons technical village pump instead (permalink).