Warning: array_key_exists() expects parameter 2 to be array, null given in /srv/mediawiki/php-1.26wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/changes/DiffChange.php on line 95
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T49930 [Epic] Allow accessing data from a Wikidata item not connected to the current page - arbitrary access
Invalid | | None | T108944 [Epic] Improve change dispatching
Resolved | | JanZerebecki | T108130 [Bug] "Warning: array_key_exists() expects parameter 2 to be array, null given" in the change dispatcher
Resolved | | None | T108246 [Task] Enlarge wb_changes.change_info
Event Timeline
This happens only on Zend. This error cannot be found in logstash, as it seems that warnings from Zend do not get sent there. So I assume this means it happened on terbium.
From my reading, this would happen if we get a value from the DB that is either null, the string "N;" ( = serialize(null) ), the string "null" ( = json_encode(null) ), some invalid JSON, or JSON nested deeper than the recursion limit (the last two being the cases where the docs say json_decode() returns null). From running a few queries there seem to be none of the first three cases in the DB. There is no easy way to test the latter two with just a mysql console.
Maybe a failed DB connection?
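For what it's worth, the failure mode is easy to reproduce in isolation. The snippet below is purely illustrative (it is not the actual DiffChange.php code): json_decode() returns null for truncated or otherwise invalid JSON, and passing that null to array_key_exists() yields exactly the warning from the task title on PHP 5.x.

```php
<?php
// Illustration only, not the real DiffChange.php code: show how a truncated
// change_info blob ends up producing the warning in the task title.

// JSON that was cut off mid-object, as a truncated BLOB value would be.
$changeInfo = '{"diff":{"type":"diff","isassoc":true,"operations":{';

// json_decode() returns null for invalid JSON (and for JSON nested deeper
// than the recursion limit).
$info = json_decode( $changeInfo, true );
var_dump( $info ); // NULL

// On PHP 5.x this emits:
// Warning: array_key_exists() expects parameter 2 to be array, null given
if ( array_key_exists( 'diff', $info ) ) {
	// Never reached for a truncated blob.
}
```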
It also happened in a dispatchChanges instance which I started with hhvm on mw1017.
> From my reading, this would happen if we get a value from the DB that is either null, the string "N;" ( = serialize(null) ), the string "null" ( = json_encode(null) ), some invalid JSON, or JSON nested deeper than the recursion limit (the last two being the cases where the docs say json_decode() returns null). From running a few queries there seem to be none of the first three cases in the DB. There is no easy way to test the latter two with just a mysql console.
> Maybe a failed DB connection?
DB connections don't fail silently, so the script would have died there.
I don't see this in logstash for mw1017, but do see other warnings. Perhaps that is because it is a manually started CLI hhvm process.
Should we try something like: for each row in wb_changes, run json_decode(change_info), batched by 500? It would only take 700-some iterations.
Also keep in mind that ORMRow has special handling for non-string fields. But despite that, yes, we should do that.
Unless you beat me to it, I will hack something up tomorrow and run it probably on tool labs.
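A rough sketch of what that scan could look like, assuming a plain PDO connection rather than the MediaWiki/Wikibase DB layer (the real script would of course use the latter, and would also have to account for the ORMRow handling mentioned above; connection details are placeholders):

```php
<?php
// Hypothetical batched scan over wb_changes, checking whether change_info
// still decodes as JSON.
$pdo = new PDO( 'mysql:host=localhost;dbname=wikidatawiki', 'user', 'password' );
$batchSize = 500;
$lastId = 0;

do {
	$stmt = $pdo->prepare(
		'SELECT change_id, change_info FROM wb_changes' .
		' WHERE change_id > ? ORDER BY change_id LIMIT ' . $batchSize
	);
	$stmt->execute( [ $lastId ] );
	$rows = $stmt->fetchAll( PDO::FETCH_ASSOC );

	foreach ( $rows as $row ) {
		$lastId = (int)$row['change_id'];
		if ( json_decode( $row['change_info'], true ) === null ) {
			echo "change_id {$row['change_id']}: change_info does not decode\n";
		}
	}
} while ( count( $rows ) === $batchSize );
```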
I investigated this a bit and I can tell that we have cut-off values in that table:
SELECT change_id FROM wb_changes WHERE change_info NOT LIKE "%}" LIMIT 5;
+-----------+
| change_id |
+-----------+
| 236998945 |
| 236999268 |
| 237059314 |
| 237059627 |
+-----------+
4 rows in set (0.76 sec)
I think there's more to this, but I couldn't yet nail anything else down.
[wikidatawiki]> select length(change_info) from wb_changes where change_id in (236998945, 236999268, 237059314, 237059627);
+---------------------+
| length(change_info) |
+---------------------+
|               65535 |
|               65535 |
|               65535 |
|               65535 |
+---------------------+
4 rows in set (0.00 sec)
"A BLOB column with a maximum length of 65,535 (216 - 1) bytes" - https://mariadb.com/kb/en/mariadb/blob/
At some point I checked what the then-current MySQL does when you insert longer values: with default settings it gave no error and just cut the value off. It seems that is what happened here.
Fix our MariaDB settings or add checks everywhere for length?
Move to MEDIUMBLOB?
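The truncation behaviour is easy to verify against a scratch database. A minimal illustration (assumes a throwaway MySQL/MariaDB instance; nothing Wikibase-specific, connection details are placeholders):

```php
<?php
// Demonstrates the silent truncation: under the default (non-strict) sql_mode
// an oversized value is cut to the 65535-byte BLOB limit with only a warning;
// under STRICT_TRANS_TABLES / STRICT_ALL_TABLES the INSERT fails instead.
$pdo = new PDO( 'mysql:host=localhost;dbname=test', 'user', 'password' );
$pdo->exec( 'CREATE TEMPORARY TABLE blob_test ( info BLOB )' );

// 70000 bytes, i.e. more than fits into a BLOB column.
$stmt = $pdo->prepare( 'INSERT INTO blob_test ( info ) VALUES ( ? )' );
$stmt->execute( [ str_repeat( 'x', 70000 ) ] );

$len = $pdo->query( 'SELECT LENGTH( info ) FROM blob_test' )->fetchColumn();
echo $len, "\n"; // 65535 with non-strict sql_mode
```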
Change 229741 had a related patch set uploaded (by JanZerebecki):
Log when we fail unserializeInfo
Perhaps make it LONGBLOB, so that it can hold more than the entity itself, which is stored in a MEDIUMBLOB.
I poked at this some more and iterated over all changes currently in the DB using code very similar to what we use in Wikibase; the change ids mentioned above are the only ones failing.
From quickly skimming the relevant code, my estimate is that we don't need to increase rc_params, because we explicitly avoid copying the content of change_info into it.
So that we know which items this affected:
[wikidatawiki]> select change_object_id, change_revision_id from wb_changes where change_id in (236998945, 236999268, 237059314, 237059627);
+------------------+--------------------+
| change_object_id | change_revision_id |
+------------------+--------------------+
| q18041912        |          238873453 |
| q18053093        |          238873799 |
| q18041912        |          238934794 |
| q18053093        |          238935101 |
+------------------+--------------------+
4 rows in set (0.01 sec)
It seems https://www.wikidata.org/wiki/User:ProteinBoxBot is updating the "retrieved" date on all the references of an item without changing anything else: https://www.wikidata.org/w/index.php?title=Q18053093&diff=prev&oldid=238935101
Change 230273 had a related patch set uploaded (by Hoo man):
Hack: Don't write change rows where LENGTH(change_info) > 65500
Change 230273 merged by jenkins-bot:
Hack: Don't write change rows where LENGTH(change_info) > 65500
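For context, such a guard boils down to something like the following (an illustration only, not the actual content of change 230273; the function name is hypothetical):

```php
<?php
// Illustration only, not the actual Gerrit change 230273. The idea of the
// hack: refuse to write a change row whose serialized info would exceed the
// BLOB column limit and therefore get silently truncated.

// Hypothetical helper; $changeInfoBlob is the already-serialized change_info.
function isChangeRowSaveable( $changeInfoBlob ) {
	if ( strlen( $changeInfoBlob ) > 65500 ) {
		trigger_error(
			'Not saving change row: change_info is too large for the BLOB column',
			E_USER_WARNING
		);
		return false;
	}
	return true;
}
```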
After looking at the code thoroughly, I think this warning is just another symptom, but not the cause of the troubles we see.
[wikidatawiki]> select change_id, change_object_id, change_revision_id from wb_changes where LENGTH(change_info) > 65500;
+-----------+------------------+--------------------+
| change_id | change_object_id | change_revision_id |
+-----------+------------------+--------------------+
| 236998945 | q18041912        |          238873453 |
| 236999268 | q18053093        |          238873799 |
| 237059314 | q18041912        |          238934794 |
| 237059627 | q18053093        |          238935101 |
| 237356029 | q18046547        |          239234465 |
| 237364868 | q18040739        |          239243336 |
| 237367810 | q18031289        |          239246278 |
| 237368198 | q18025144        |          239246669 |
| 237369087 | q18031463        |          239247560 |
| 237369294 | q18036434        |          239247768 |
| 237369539 | q18040217        |          239248014 |
| 237373627 | q18025214        |          239252120 |
+-----------+------------------+--------------------+
12 rows in set (5.96 sec)
Proposed fix (a rough sketch follows below):
- If the diff is too large, remove it from the info blob.
- When no stored diff is found, re-calculate it on the fly, based on revision ids.
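Roughly what that would look like (a sketch under the assumptions above; the helper names serializeChangeInfo(), getDiffForChange() and loadDiffFromRevisions() are made up for illustration and are not the actual Wikibase code paths):

```php
<?php
// Sketch of the proposed fix, not actual Wikibase code.

const MAX_INFO_LENGTH = 65500;

// On write: if the serialized info (including the diff) is too large for the
// BLOB column, store it without the diff so the row is never truncated.
function serializeChangeInfo( array $info ) {
	$blob = json_encode( $info );
	if ( strlen( $blob ) > MAX_INFO_LENGTH ) {
		unset( $info['diff'] );
		$blob = json_encode( $info );
	}
	return $blob;
}

// On read: if the stored info carries no diff, recompute it on the fly from
// the revision ids that are also stored on the change row.
function getDiffForChange( array $info, $oldRevisionId, $newRevisionId ) {
	if ( isset( $info['diff'] ) ) {
		return $info['diff'];
	}
	return loadDiffFromRevisions( $oldRevisionId, $newRevisionId );
}

// Hypothetical stand-in for whatever actually loads the two revisions and
// diffs them; omitted here.
function loadDiffFromRevisions( $oldRevisionId, $newRevisionId ) {
	return [];
}
```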
I'm not sure how this is related, but https://gerrit.wikimedia.org/r/#/c/232263/ (accidentally linked to T108246: [Task] Enlarge wb_changes.change_info) is already merged and should fix this.