
[Bug] "Warning: array_key_exists() expects parameter 2 to be array, null given" in the change dispatcher
Closed, Resolved · Public · 1 Story Point

Description

Warning: array_key_exists() expects parameter 2 to be array, null given in /srv/mediawiki/php-1.26wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/changes/DiffChange.php on line 95

Event Timeline

hoo created this task.Aug 5 2015, 10:47 PM
hoo raised the priority of this task from to High.
hoo updated the task description.
hoo added subscribers: hoo, JanZerebecki, daniel.
Restricted Application added a subscriber: Aklapper.Aug 5 2015, 10:47 PM
hoo added a subscriber: Addshore.Aug 5 2015, 10:49 PM

This happens only on Zend. This error cannot be found in logstash, as it seems that warnings from Zend do not get sent there, so I assume it happened on terbium.

From my reading, this would happen if we get a value from the DB that is either null, the string "N;" (= serialize(null)), the string "null" (= json_encode(null)), some invalid JSON, or JSON nested deeper than the recursion limit (the two cases where the docs say json_decode returns null). From running a few queries, there seem to be no occurrences of the first three cases in the DB. There is no easy way to test the latter two from just a MySQL console.

Maybe a failed DB connection?
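For reference, a minimal sketch of the inputs named above for which json_decode() yields null (so the later array_key_exists() call in DiffChange.php receives a non-array):

```php
<?php
// Inputs for which json_decode() returns null: the JSON literal null,
// invalid JSON (including the PHP-serialized "N;" and a truncated
// document), and JSON nested deeper than the default depth limit of 512.
var_dump(json_decode('null'));   // NULL – a legitimate JSON null
var_dump(json_decode('N;'));     // NULL – invalid JSON (serialize(null))
var_dump(json_decode('{"a":'));  // NULL – truncated/invalid JSON
var_dump(json_decode(str_repeat('[', 600) . str_repeat(']', 600)));
                                 // NULL – exceeds the recursion limit
```

Note that a truncated JSON document (the third case) is exactly what a silently cut-off BLOB column would produce.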

hoo added a comment.Aug 6 2015, 1:01 AM

> This happens only on Zend. This error cannot be found in logstash, as it seems that warnings from Zend do not get sent there, so I assume it happened on terbium.

It also happened in a dispatchChanges instance which I started with hhvm on mw1017.

> From my reading, this would happen if we get a value from the DB that is either null, the string "N;" (= serialize(null)), the string "null" (= json_encode(null)), some invalid JSON, or JSON nested deeper than the recursion limit (the two cases where the docs say json_decode returns null). From running a few queries, there seem to be no occurrences of the first three cases in the DB. There is no easy way to test the latter two from just a MySQL console.
> Maybe a failed DB connection?

DB connections don't fail silently, so the script would have died there.

I don't see this in logstash for mw1017, but I do see other warnings. Perhaps that is because it is a manually started CLI HHVM process.

Should we try something like: for each row in wb_changes, run json_decode(change_info), batched by 500? That would only take 700-some iterations.
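A sketch of that batched scan (the helper name and PDO wiring are hypothetical, not the actual Wikibase code): walk wb_changes in id order, 500 rows at a time, and collect the ids of rows whose change_info does not decode to an array.

```php
<?php
// Hypothetical batched scan over wb_changes: returns the change_ids
// whose change_info blob does not json_decode() to an array.
function findBadChangeInfo(PDO $db, int $batchSize = 500): array {
    $bad = [];
    $lastId = 0;
    $stmt = $db->prepare(
        'SELECT change_id, change_info FROM wb_changes' .
        ' WHERE change_id > ? ORDER BY change_id LIMIT ?'
    );
    do {
        $stmt->bindValue(1, $lastId, PDO::PARAM_INT);
        $stmt->bindValue(2, $batchSize, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        foreach ($rows as $row) {
            if (!is_array(json_decode($row['change_info'], true))) {
                $bad[] = (int)$row['change_id'];
            }
            $lastId = (int)$row['change_id'];
        }
    } while ($rows);
    return $bad;
}
```

Keyset pagination on change_id (rather than OFFSET) keeps each batch query cheap on a table this size.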

hoo added a comment.EditedAug 6 2015, 1:24 AM

> I don't see this in logstash for mw1017, but I do see other warnings. Perhaps that is because it is a manually started CLI HHVM process.
> Should we try something like: for each row in wb_changes, run json_decode(change_info), batched by 500? That would only take 700-some iterations.

Also keep in mind that ORMRow has special handling for non-string fields. But despite that, yes, we should do that.

Unless you beat me to it, I will hack something up tomorrow and run it probably on tool labs.

hoo added a comment.Aug 6 2015, 1:24 PM

I investigated this a bit and can tell that we have truncated values in that table:

SELECT change_id FROM wb_changes WHERE change_info NOT LIKE "%}" LIMIT 5;
+-----------+
| change_id |
+-----------+
| 236998945 |
| 236999268 |
| 237059314 |
| 237059627 |
+-----------+
4 rows in set (0.76 sec)

I think there's more to this, but I couldn't yet nail anything else down.

[wikidatawiki]> select length(change_info) from wb_changes where change_id in (236998945, 236999268, 237059314, 237059627) ;
+---------------------+
| length(change_info) |
+---------------------+
|               65535 |
|               65535 |
|               65535 |
|               65535 |
+---------------------+
4 rows in set (0.00 sec)

"A BLOB column with a maximum length of 65,535 (2^16 − 1) bytes" - https://mariadb.com/kb/en/mariadb/blob/
At some point I checked what the then-current MySQL does when you insert longer values: with default settings it gave no error and simply truncated the value. That seems to be what happened here.

Fix our MariaDB settings or add checks everywhere for length?
Move to MEDIUMBLOB?

Change 229741 had a related patch set uploaded (by JanZerebecki):
Log when we fail unserializeInfo

https://gerrit.wikimedia.org/r/229741

Perhaps make it LONGBLOB so that it can hold more than the entity that is stored in a MEDIUMBLOB.

Change 229741 merged by jenkins-bot:
Log when we fail unserializeInfo

https://gerrit.wikimedia.org/r/229741

hoo added a comment.Aug 6 2015, 7:26 PM

> Perhaps make it LONGBLOB so that it can hold more than the entity that is stored in a MEDIUMBLOB.

Mh... maybe. Also, do we need to resize the rc_params field as well?

hoo added a comment.Aug 6 2015, 7:27 PM

I poked at this some more and iterated over all changes currently in the DB using very similar code to what we use in Wikibase and the change ids mentioned above are the only ones failing.

From quickly skimming the relevant code, my estimate is that we don't need to increase rc_params, because we explicitly avoid copying the content of change_info into it.

So that we know which items this affected:

[wikidatawiki]> select change_object_id, change_revision_id from wb_changes where change_id in (236998945, 236999268, 237059314, 237059627) ;
+------------------+--------------------+
| change_object_id | change_revision_id |
+------------------+--------------------+
| q18041912        |          238873453 |
| q18053093        |          238873799 |
| q18041912        |          238934794 |
| q18053093        |          238935101 |
+------------------+--------------------+
4 rows in set (0.01 sec)

It seems https://www.wikidata.org/wiki/User:ProteinBoxBot is updating the "retrieved" date on all the references of an item without changing anything else: https://www.wikidata.org/w/index.php?title=Q18053093&diff=prev&oldid=238935101

hoo raised the priority of this task from High to Unbreak Now!.Aug 8 2015, 2:36 AM

Change 230273 had a related patch set uploaded (by Hoo man):
Hack: Don't write change rows where LENGTH(change_info) > 65500

https://gerrit.wikimedia.org/r/230273

Change 230273 merged by jenkins-bot:
Hack: Don't write change rows where LENGTH(change_info) > 65500

https://gerrit.wikimedia.org/r/230273
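The stop-gap presumably boils down to a byte-length guard applied before the insert; a sketch under that assumption (constant and function names hypothetical):

```php
<?php
// Sketch of the stop-gap from change 230273 (names hypothetical):
// refuse to write a change row whose serialized info blob would be
// silently truncated by the 65,535-byte BLOB column, with headroom.
const MAX_CHANGE_INFO_LENGTH = 65500;

function fitsChangeInfoColumn(array $info): bool {
    // strlen() counts bytes, matching MySQL's LENGTH().
    return strlen(json_encode($info)) <= MAX_CHANGE_INFO_LENGTH;
}
```

This avoids writing rows that would come back as invalid JSON, at the cost of silently dropping those changes from dispatching.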

hoo added a comment.Aug 8 2015, 2:54 AM

After looking at the code thoroughly, I think this warning is just another symptom, not the cause of the trouble we are seeing.

JanZerebecki added a comment.EditedAug 8 2015, 3:04 AM
[wikidatawiki]> select change_id, change_object_id, change_revision_id from wb_changes where LENGTH(change_info) > 65500 ;
+-----------+------------------+--------------------+
| change_id | change_object_id | change_revision_id |
+-----------+------------------+--------------------+
| 236998945 | q18041912        |          238873453 |
| 236999268 | q18053093        |          238873799 |
| 237059314 | q18041912        |          238934794 |
| 237059627 | q18053093        |          238935101 |
| 237356029 | q18046547        |          239234465 |
| 237364868 | q18040739        |          239243336 |
| 237367810 | q18031289        |          239246278 |
| 237368198 | q18025144        |          239246669 |
| 237369087 | q18031463        |          239247560 |
| 237369294 | q18036434        |          239247768 |
| 237369539 | q18040217        |          239248014 |
| 237373627 | q18025214        |          239252120 |
+-----------+------------------+--------------------+
12 rows in set (5.96 sec)
hoo lowered the priority of this task from Unbreak Now! to High.Aug 8 2015, 10:46 AM

We hit these problems again, even though the workaround for this bug is in place.

Jonas renamed this task from "Warning: array_key_exists() expects parameter 2 to be array, null given" in the change dispatcher to [Bug] "Warning: array_key_exists() expects parameter 2 to be array, null given" in the change dispatcher.Aug 13 2015, 1:57 PM

proposed fix:

  • If the diff is too large, remove it from the info blob.
  • When no stored diff is found, re-calculate it on the fly, based on revision ids.
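A sketch of that two-part proposal (function names hypothetical): drop the diff from the info blob when the row would not fit, and treat a missing diff on read as a signal to recompute it from the revision ids stored in the change row.

```php
<?php
// Sketch of the proposed fix (names hypothetical). On write, remove
// the diff from the info blob when the serialized row would exceed
// the column limit; on read, a missing diff means it must be
// recalculated on the fly from the revision ids.
function stripOversizedDiff(array $info, int $maxBytes = 65500): array {
    if (isset($info['diff']) && strlen(json_encode($info)) > $maxBytes) {
        unset($info['diff']);
    }
    return $info;
}

function needsDiffRecalculation(array $info): bool {
    return !array_key_exists('diff', $info);
}
```

Unlike the stop-gap guard, this keeps every change dispatchable; only the (recomputable) diff payload is sacrificed.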

Are you suggesting that instead of increasing the column size?

I'm not sure how this is related, but https://gerrit.wikimedia.org/r/#/c/232263/ (accidentally linked to T108246: [Task] Enlarge wb_changes.change_info) is already merged and should fix this.

JanZerebecki closed this task as Resolved.Aug 21 2015, 3:45 PM
JanZerebecki claimed this task.
JanZerebecki moved this task from Doing to Done on the Wikidata-Sprint-2015-08-18 board.