
Pages display Lua error in mw.wikibase.entity.lua
Closed, ResolvedPublic

Authored By
Johnuniq
Jul 8 2017, 7:50 AM
Referenced Files
F8983121: Screenshot_2017-08-08-03-59-04.png
Aug 8 2017, 1:05 AM
F8805562: New Picture (15).png
Jul 21 2017, 11:50 AM
F8771865: Capture_2017_07_15_15_48_00_78.jpg
Jul 15 2017, 1:53 PM
F8684904: New Picture (1).jpg
Jul 10 2017, 2:53 AM
F8658002: Screen Shot 2017-07-02 at 04.04.37.png
Jul 8 2017, 10:35 AM

Description

An intermittent problem occurs when a Lua module executes entity = mw.wikibase.getEntity() or its equivalent entity = mw.wikibase.getEntityObject().

The symptom is that a rendered page shows a big red error message saying:

Lua error in mw.wikibase.entity.lua at line 34: The entity data must be a table obtained via mw.wikibase.getEntityObject

and the page is added to Category:Pages_with_script_errors (or local equivalent).

The problem goes away when the page is purged, so a link to a demonstration only works for a limited period. Currently, these pages show the error:

Using Special:Search for mw.wikibase.getEntityObject shows lots of cached examples:

Google search also shows examples. For example, search for one of the following then view Google's cache:

"error in mw.wikibase.entity.lua" site:commons.wikimedia.org
"error in mw.wikibase.entity.lua" site:en.wikipedia.org
"error in mw.wikibase.entity.lua" site:zh.wikipedia.org

The problem occurs when a Lua module uses Wikidata. That calls mw.wikibase.getEntity, which calls php.getEntityId. I guess this sometimes times out when it tries to establish a network connection to the Wikidata database.

There was a discussion at enwiki.

Event Timeline


You forgot to add Lua!

MediaWiki-extensions-Lua has nothing to do with this task. That extension is a different approach that doesn't offer the rigorous sandboxing that Scribunto provides, and is not installed on Wikimedia wikis.

You can try to purge all the pages with the old message and check tomorrow whether there is a new one, or more, @Vachovec1.

OK, I purged/null edited everything. No positive search results now. We will see in 24 hours.

Interesting. So yesterday I purged/null edited about 180 pages at cs-wiki, found through a direct search for the specific error messages (about 140 pages with the "line 34" error message, 40 with the "line 27" error message). Mostly "true" errors, only about 20 percent false positives. 24 hours later, no "line 34" error messages are shown, but 6 new "line 37" error messages were found (4 "real" errors, 2 false positives). None of them show up in the related category (there are only 4 new false positives).

Summarizing:

  1. new cases of this bug are still appearing
  2. the "line 34" message is probably obsolete, it was replaced with new "line 37" error message (this is apparently an expected outcome of the patch above)
  3. the category supposed to catch the script errors is completely unreliable in this matter, it's much better to get a list of affected pages through a direct search for error message(s)

This is what I've got from digging through Logstash. Three types of error happen: (1) saving a page fails because the Lua engine returns false (one example, you find tons of them), (2) it gets propagated to ChangeProp (an example), and (3) to the refreshLinks job (an example). I think the first type of error is the source of the other two.

I put the whole backtrace of the first type for debugging:

t exception.file	/srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaSandbox/Engine.php:318
t exception.message	Scribunto_LuaSandboxInterpreter::callFunction: LuaSandboxFunction::call returned false
t exception.trace	#0 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(178): Scribunto_LuaSandboxInterpreter->callFunction(LuaSandboxFunction, array)
#1 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaCommon/SiteLibrary.php(91): Scribunto_LuaEngine->registerInterface(string, array, array)
#2 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(512): Scribunto_LuaSiteLibrary->register()
#3 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(149): Scribunto_LuaEngine->instantiatePHPLibrary(string, string, boolean)
#4 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/engines/LuaSandbox/Engine.php(37): Scribunto_LuaEngine->load()
#5 /srv/mediawiki/php-1.30.0-wmf.11/extensions/Scribunto/common/Hooks.php(125): Scribunto_LuaSandboxEngine->getResourceUsage(integer)
#6 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(3408): ScribuntoHooks::invokeHook(Parser, PPTemplateFrame_Hash, array)
#7 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(3133): Parser->callParserFunction(PPTemplateFrame_Hash, string, array)
#8 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Preprocessor_Hash.php(1071): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#9 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Preprocessor_Hash.php(1504): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#10 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(3284): PPTemplateFrame_Hash->cachedExpand(string, PPNode_Hash_Tree)
#11 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Preprocessor_Hash.php(1071): Parser->braceSubstitution(array, PPFrame_Hash)
#12 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(2948): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#13 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(1304): Parser->replaceVariables(string)
#14 /srv/mediawiki/php-1.30.0-wmf.11/includes/parser/Parser.php(451): Parser->internalParse(string)
#15 /srv/mediawiki/php-1.30.0-wmf.11/includes/content/WikitextContent.php(329): Parser->parse(string, Title, ParserOptions, boolean, boolean, NULL)
#16 /srv/mediawiki/php-1.30.0-wmf.11/includes/content/AbstractContent.php(497): WikitextContent->fillParserOutput(Title, NULL, ParserOptions, boolean, ParserOutput)
#17 /srv/mediawiki/php-1.30.0-wmf.11/includes/page/WikiPage.php(2078): AbstractContent->getParserOutput(Title, NULL, ParserOptions)
#18 /srv/mediawiki/php-1.30.0-wmf.11/includes/api/ApiStashEdit.php(200): WikiPage->prepareContentForEdit(WikitextContent, NULL, User, string, boolean)
#19 /srv/mediawiki/php-1.30.0-wmf.11/includes/api/ApiStashEdit.php(148): ApiStashEdit::parseAndStash(WikiPage, WikitextContent, User, string)
#20 /srv/mediawiki/php-1.30.0-wmf.11/includes/api/ApiMain.php(1583): ApiStashEdit->execute()
#21 /srv/mediawiki/php-1.30.0-wmf.11/includes/api/ApiMain.php(546): ApiMain->executeAction()
#22 /srv/mediawiki/php-1.30.0-wmf.11/includes/api/ApiMain.php(517): ApiMain->executeActionWithErrorHandling()
#23 /srv/mediawiki/php-1.30.0-wmf.11/api.php(94): ApiMain->execute()
#24 /srv/mediawiki/w/api.php(3): include(string)
#25 {main}

HTH

Using Special:Search for mw.wikibase.getEntityObject shows 26,913 pages in arwiki

I think the first type of error is the source of the other two.

I doubt errors in any of these places cause the errors in the other places. But the underlying bug in all three of those is T166348.

I doubt errors in any of these places cause the errors in the other places. But the underlying bug in all three of those is T166348.

Thanks for pointing that out. In that case either I couldn't find these errors in Logstash, or these two bugs are somehow related (just a wild guess, I have no idea how Lua works).

On Czech Wikipedia (cswiki) the error is not Lua error in mw.wikibase.entity.lua at line 34: The entity data must be a table obtained via mw.wikibase.getEntityObject, but instead Lua error in mw.wikibase.entity.lua at line 37: data.schemaVersion must be a number, got nil instead. The bug is on about 200 pages, you can see their list here (purged already). A null edit works.

Change 369323 had a related patch set uploaded (by Thiemo Mättig (WMDE); owner: Thiemo Mättig (WMDE)):
[mediawiki/extensions/Wikibase@master] Add mw.wikibase.entity.create() error message for empty tables

https://gerrit.wikimedia.org/r/369323

Change 369324 had a related patch set uploaded (by Thiemo Mättig (WMDE); owner: Thiemo Mättig (WMDE)):
[mediawiki/extensions/Wikibase@master] Skip cloning in mw.wikibase.entity.create() on first call

https://gerrit.wikimedia.org/r/369324

Change 369324 abandoned by Thiemo Mättig (WMDE):
Skip cloning in mw.wikibase.entity.create() on first call

Reason:
Whoops.

Being able to predict the future would be so cool. The code could then know if it is going to be called a second time.

https://gerrit.wikimedia.org/r/369324

Change 369612 had a related patch set uploaded (by Thiemo Mättig (WMDE); owner: Thiemo Mättig (WMDE)):
[mediawiki/extensions/Wikibase@master] Add more error messages when Lua code produces empty tables

https://gerrit.wikimedia.org/r/369612

Change 369323 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add mw.wikibase.entity.create() error message for empty tables

https://gerrit.wikimedia.org/r/369323

Today I see this at arwiki.
mw.wikibase.entity.lua on line 37 data.schemaVersion must be a number, got nil instead.

Screenshot_2017-08-08-03-59-04.png (1×1 px, 528 KB)

Using Special:Search for "data.schemaVersion must be a number" (with quotes) shows many of the mw.wikibase.entity.lua at line 37: data.schemaversion must be a number, got nil instead problems. That applies at enwiki (330 pages listed), but itwiki wins with over 5700 pages. Possibly the problem pages are noticed and purged more quickly at enwiki.

Today we detected inexplicable Lua/Wikidata errors. Maybe they have the same cause as the one described in this task.

  • In Rocky Mountain National Park we got the message that the object IDs "Q4735531, Q4878135" are not known. But these IDs are not used in the article, and they do exist at Wikidata.
  • In other cases like Chennai we got nil errors for the value of the property P856 (official website) although the value is set.

They are used indeed. See Q777183, P2872.

Today we detected inexplicable Lua/Wikidata errors. Maybe they have the same cause as the one described in this task.

  • In Rocky Mountain National Park we got the message that the object IDs "Q4735531, Q4878135" are not known. But these IDs are not used in the article, and they do exist at Wikidata.

This is a bug in some local Lua module: apparently something tries to load the entity "Q4735531, Q4878135" (these are not two errors, but one error for the comma-separated ID "Q4735531, Q4878135").

  • In other cases like Chennai we got nil errors for the value of the property P856 (official website) although the value is set.

This is a bug in your Modul:Wikidata2's getProperty function: it assumes wd.getLabel().label returns non-nil, but it doesn't always (as the label is sometimes taken from mw.wikibase.label, which can be nil).

This is unrelated to the issues mentioned here.

Thanks for your answer. We will check the scripts for the cause and find out why these errors occurred only today.

On it.wikipedia: 1900 pages in 24 hours in the category. Is the problem increasing? Users can now often see the red message on pages. I think this is starting to be a problem for readers as well.

I also believe it has become worse.

Perhaps you could use a bot to purge the articles on it.wikipedia if it's becoming a problem?

Is this the solution? Purging the pages? I already do it, but now we have an average of 80-90 pages per hour. Yesterday I emptied the category at 9:42 and now, 10 Aug at 19:45, the category has 3019 pages.

Of course it's not a final solution, only a temporary one. Finnish Wikipedia has the same problem, and there is a bot that purges pages containing "data.schemaVersion must be a number" once per hour to keep the number of these red error texts as low as possible.
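For reference, a purge bot of that kind could be sketched with Pywikibot roughly as follows. This is only an illustrative sketch, not the actual Finnish bot; it assumes a configured Pywikibot install for the target wiki, the search string is the one mentioned above, and the hourly loop and per-page purge are illustrative choices:

import time
import pywikibot
from pywikibot import pagegenerators

# Error text from this task; pages whose cached HTML still shows it turn up
# in the wiki's full-text search until they are reparsed.
SEARCH = '"data.schemaVersion must be a number"'

def purge_once(site):
    count = 0
    for page in pagegenerators.SearchPageGenerator(SEARCH, site=site):
        # forcelinkupdate makes the purge refresh link tables as well,
        # which has the same effect as a null edit.
        page.purge(forcelinkupdate=True)
        count += 1
    return count

def main():
    site = pywikibot.Site('fi', 'wikipedia')  # illustrative target wiki
    while True:
        purged = purge_once(site)
        pywikibot.output('Purged {} pages showing the cached error.'.format(purged))
        time.sleep(3600)  # once per hour, as described above

if __name__ == '__main__':
    main()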

I saw this mentioned on wikitech-l. To summarize my understanding of the current state of this task: The hope is that fixing T171166: Build and push a new hhvm-luasandbox package will fix this, so effectively this task is blocked on that one. If that's not correct, please say so. If it is, should we set up a phab subtask dependency to reflect that blocking relationship?

HHVM-Luasandbox has been upgraded to 2.0.13 in production. Please have a look at whether the problem persists.

@MoritzMuehlenhoff good job! Looks like it solved the issue. I tested it by purging the Lua errors tracking category to get to a clean state and resampling the category after a short while (~1 hr), and no weird pages were added there as before.

I suggest closing this bug in the next few days if there are no new comments regarding the issue, assuming the root cause was the Lua sandbox.

Sounds good, we can close this bug at the end of the week if nothing further gets reported.

Thanks indeed. Continuing from @eranroz: there are 240 uncategorized errors on the same wiki. I just finished running a null-edit bot, let's see in a couple of hours if there will be any new ones.
And there are 123,000 such errors on Commons. Somebody should purge them too.

matej_suchanek lowered the priority of this task from Unbreak Now! to Needs Triage.Aug 23 2017, 1:59 PM

Somebody should purge them too.

Yeah, I've been running a pywikibot touch bot on Category:Pages_with_script_errors for years now ;)

No, @zhuyifei1999, I'm talking about the uncategorized errors. What you're doing is not enough.

No, @zhuyifei1999, I'm talking about the uncategorized errors. What you're doing is not enough.

Do you have some kind of list/generator for such pages?

I asked about Commons because it's a common :-) resource. Surprisingly, Wikidata also has a couple of hundred errors, of both kinds. Can somebody fix them too, please?

We did purging (for both categorized and uncategorized errors) on cs-wikipedia project. Crossing fingers now.

Side note: the Category:Pages with script errors (cs-wiki version) contained a lot of false positives. We found similar false positives in Category:Pages where expansion depth is exceeded and Category:Pages with malformed coordinate tags (cs-wiki versions). Is this another bug, one that will hopefully also be solved by the deployment of the new HHVM-Luasandbox?

@Vachovec1: I already answered you about that in T170039#3467750. If a page is reparsed without updating link tables, a transient warning might be cleared while leaving the page in the error category until something does trigger a link table update.

Well, three hours later, both lists are empty. I think we hit the jackpot.
Should we publish in User-notice some recommendations for emptying each of the two lists? I believe we must do this. Guys? @Johan?

@Vachovec1: I already answered you about that in T170039#3467750. If a page is reparsed without updating link tables, a transient warning might be cleared while leaving the page in the error category until something does trigger a link table update.

No. What you described could happen (and it's probably something normal/expected). But you are missing the point. Most of these "alarms" should never happen/never be triggered.

Let's take, for example, the page Armas Taipale, which is currently listed in the en-wiki category Category:Pages where expansion depth is exceeded. There is absolutely nothing that could/should trigger the alarm. The last edit on the page is from 2016/10/15. So why is the page listed? My guess is some breakage of the Wikipedia↔Wikidata link, as we saw in this bug (I remind you that the real trigger for this bug still remains mysterious, even if the new HHVM-Luasandbox proves to be the fix).

Also on it.wikipedia the category is empty and a search for "mw.wikibase.entity.lua" returns zero results.

Well, three hours later, both lists are empty. I think we hit the jackpot.
Should we publish in User-notice some recommendations for emptying each of the two lists? I believe we must do this. Guys? @Johan?

How would you phrase it?

Something like
The bug ... that was mentioned in the Tech news issue ... has now been fixed. You are recommended to rebuild all damaged pages on your local wiki using [[PAGE|these instructions]]. T170039
And to write a small explanation page (or add a new chapter to an existing one), on meta or mw, with the bug description and the fixing recommendations:
You should purge, use pywikibot touch, run the script from T170039#3473755, or rebuild any other way (a sketch follows below):

  1. All pages in your local int:"scribunto-common-error-category" category.
  2. All pages that come from a Special:Search for "mw.wikibase.entity.lua" in all namespaces.
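To make the two-step recommendation above concrete, here is a rough Pywikibot sketch. It assumes a configured bot account on the affected wiki; the category name and search string are the ones named in this task, and everything else (target wiki, deduplication) is an illustrative choice:

import pywikibot
from pywikibot import pagegenerators

def cleanup(site):
    touched = set()

    # Step 1: null edit every page in the local script-error tracking category.
    category = pywikibot.Category(site, 'Category:Pages with script errors')
    for page in category.articles():
        page.touch()          # null edit: reparse and update link tables
        touched.add(page.title())

    # Step 2: null edit every page whose cached text still shows the error.
    for page in pagegenerators.SearchPageGenerator(
            '"mw.wikibase.entity.lua"', site=site):
        if page.title() not in touched:
            page.touch()

if __name__ == '__main__':
    cleanup(pywikibot.Site('en', 'wikipedia'))  # illustrative target wiki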

How would you phrase it?

I believe the Commons bot will finish its run by Monday, so all the searches will be local.
What do you think?

Something like this?
https://www.mediawiki.org/wiki/Mw.wikibase.entity.lua_bug_fix_instructions

Edit as necessary, and I'll mark it for translation when done.

Thanks, but I think it would still be better to spell out the explanation for the two lists explicitly, as I wrote above, because otherwise some people may miss that point.
And also, I do not understand the bug itself at all, so don't wait for me to confirm the description part.

@Johan : I would recommend something like this:

Lua code executing entity = mw.wikibase.getEntity() or entity = mw.wikibase.getEntityObject() caused problems resulting in big red error messages announcing a Lua error in mw.wikibase.entity.lua. This has now been fixed so we don't get new errors, but all affected pages still need to be fixed.

To do this, you can do one of the following things to all pages in your local {{int:scribunto-common-error-category}} category and all pages you find when you search for mw.wikibase.entity.lua in all namespaces (many affected pages are NOT properly categorized):

  • Purge all affected pages
  • Use Pywikibot touch on all affected pages
  • Run the script in T170039#347375599
  • Or rebuild any other way

A lot of affected pages were NOT listed in the script-error categories. Instead, the categories were/are full of false positives. Nevertheless, those pages need to be purged too, otherwise they will be stuck in the category until they are edited or their cache is renewed.

24 hours after purging affected pages on cs-wiki, everything looks good:

  • no new Lua errors (especially of type error in mw.wikibase.entity.lua)
  • no more "phantom" Lua errors (false positives) in the "script errors" category
  • no more "phantom" errors (false positives) elsewhere (f. e. in the "pages where expansion depth is exceeded" category)

I recommend closing this bug as resolved (or should we wait until @Johan sends the messages?).

I don't think closing this needs to wait for Tech News to be sent out.

Commons File: namespace is clear. The progress on Category: namespace is very slow due to T171392 & T173194

(Actually, everything besides Category: namespace on commons is clear)

(Actually, everything besides Category: namespace on commons is clear)

Great. What about the category?

MoritzMuehlenhoff claimed this task.

I recommend closing this bug as resolved (or should we wait until @Johan sends the messages?).

Yeah, let's close this. Notifying people for cleanups is orthogonal.

Great. What about the category?

It's running, but getting tons of 500s.

It's running, but getting tons of 500s.

You can rerun it, filtering namespace 14 (Category:) first.

You can rerun it, filtering namespace 14 (Category:) first.

(Actually, everything besides Category: namespace on commons is clear)

There are 2,500 pages still there. I tried to nulledit one - it was removed from the category.

You mean Category:Pages_with_script_errors? Most of them look irrelevant to me, and my last null edit run only removed a few dozen pages iirc. I just edited a broken template and started the bot. Let's see how many can be removed.

There are 2,500 pages still there. I tried to nulledit one - it was removed from the category.

Try one of Category:1xxx_in_Finland, e.g. Category:1883_in_Finland: a null-edit attempt ends with a 500.
I am afraid we cannot go on here until this problem is fixed.

There are 2,500 pages still there. I tried to nulledit one - it was removed from the category.

Try one of Category:1xxx_in_Finland, e.g. Category:1883_in_Finland: a null-edit attempt ends with a 500.
I am afraid we cannot go on here until this problem is fixed.

I can't even understand what you are talking about - the address you gave returns 404. That should not happen at all on a wiki.

Thanks, but I think it would still be better to spell out the explanation for the two lists explicitly, as I wrote above, because otherwise some people may miss that point.
And also, I do not understand the bug itself at all, so don't wait for me to confirm the description part.

@Johan?

I can't even understand what you are talking about - the address you gave returns 404. That should not happen at all on a wiki.

It is a blank 500 page, not a 404. See the tickets I linked.

I can't even understand what you are talking about - the address you gave returns 404. That should not happen at all on a wiki.

It is a blank 500 page, not a 404. See the tickets I linked.

It returned 404 for me.

@IKhitron I started editing but realised I'm not entirely sure I'd properly reflect what you mean. Could you edit the page? I can fix it up later.

@IKhitron I started editing but realised I'm not entirely sure I'd properly reflect what you mean. Could you edit the page? I can fix it up later.

Very well, done. Thank you.

There are 2,500 pages still there. I tried to nulledit one - it was removed from the category.

Try one of Category:1xxx_in_Finland, e.g. Category:1883_in_Finland: a null-edit attempt ends with a 500.
I am afraid we cannot go on here until this problem is fixed.

I can't even understand what you are talking about - the address you gave returns 404. That should not happen at all on a wiki.

The address https://commons.wikimedia.org/wiki/Category:1882_in_Finland works for me. But when I try to edit with the new wikitext editor I get an "empty server response", and when I use the old wikitext editor I get an HTTP 500 error (internal server error).

Trying to purge the page from the previous link through the API (with forcelinkupdate, which has the same effect as a null edit) also results in an Internal Server Error. I tried another year and that gives the same problem. I selected a random cat on my watchlist (https://commons.wikimedia.org/wiki/Category:Psychologists_from_the_Netherlands) and I can purge that one through the API. But I don't know if that error has anything to do with this ticket or whether it's a coincidence.
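For clarity, an API purge with a link-table update of the kind described above looks roughly like this. This is a minimal sketch using the standard action=purge module; the category title is the one from the comment above, and the endpoint and output handling are illustrative:

import requests

API = "https://commons.wikimedia.org/w/api.php"

# action=purge must be sent as a POST; forcelinkupdate makes it refresh the
# link tables as well, which is what gives it the same effect as a null edit.
resp = requests.post(API, data={
    "action": "purge",
    "titles": "Category:1882 in Finland",
    "forcelinkupdate": 1,
    "format": "json",
})

print(resp.status_code)   # the symptom described above was a 500 here
print(resp.text[:300])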

Random category that I found when looking for the Lua error in the description of this ticket: https://commons.wikimedia.org/wiki/Category:Saint_George_churches_in_France A null edit gives a white screen. An API purge results in Service Unavailable.
What they've got in common is that they're all using this template: {{Countries of Europe|prefix=:Category:Saint George churches in}}
(The value after prefix=: differs per category, the rest does not)
Could it be that there's something in that template that causes the issue with null edits/API purges? All hits for this search: https://commons.wikimedia.org/w/index.php?title=Special:Search&limit=500&offset=0&profile=default&search=Lua+error+in+mw.wikibase.entity.lua&searchToken=3fdwxwd5nyjdkkrj7ur5r81qo are country-based categories and probably all use the Countries of Europe template.

Random category that I found when looking for the Lua error in the description of this ticket: https://commons.wikimedia.org/wiki/Category:Saint_George_churches_in_France A null edit gives a white screen. An API purge results in Service Unavailable.
What they've got in common is that they're all using this template: {{Countries of Europe|prefix=:Category:Saint George churches in}}
(The value after prefix=: differs per category, the rest does not)
Could it be that there's something in that template that causes the issue with null edits/API purges? All hits for this search: https://commons.wikimedia.org/w/index.php?title=Special:Search&limit=500&offset=0&profile=default&search=Lua+error+in+mw.wikibase.entity.lua&searchToken=3fdwxwd5nyjdkkrj7ur5r81qo are country-based categories and probably all use the Countries of Europe template.

Ok. Just noticed T171392.

Trying to purge the page from the previous link through the API (with forcelinkupdate, which has the same effect as a null edit) also results in an Internal Server Error. I tried another year and that gives the same problem. I selected a random cat on my watchlist (https://commons.wikimedia.org/wiki/Category:Psychologists_from_the_Netherlands) and I can purge that one through the API. But I don't know if that error has anything to do with this ticket or whether it's a coincidence.

This has nothing to do with this bug, it's T171392 & T173194.

Change 369612 abandoned by Thiemo Mättig (WMDE):
Add more error messages when Lua code produces empty tables

Reason:
I think the messages in the mw.wikibase.entity.lua constructor are fine and shouldn't be removed.

https://gerrit.wikimedia.org/r/369612