
Latest wikidata JSON dump contains unexpected sql warning
Closed, Resolved · Public

Description

Line 21232819 of the latest dump contains an unexpected SQL warning after the definition of item Q22251178:

[Mon Jun 20 13:54:32 2016] [hphp] [24609:7f90e5f5b100:0:000001] [] SlowTimer [14607ms] at runtime/ext_mysql: slow query: SELECT /* Wikibase\Lib\Store\Sql\WikiPageEntityMetaDataLookup::selectRevisionInformationMultiple */ rev_id,rev_content_format,rev_timestamp,page_latest,page_is_redirect,old_id,old_text,old_flags,page_title FROM page INNER JOIN revision ON ((page_latest=rev_id)) INNER JOIN text ON ((old_id=rev_text_id)) WHERE (('Q22251189'=page_title) AND (0=page_namespace)) OR (('Q22251195'=page_title) AND (0=page_namespace)) OR (('Q22251204'=page_title) AND (0=page_namespace)) OR (('Q22251206'=page_title) AND (0=page_namespace)) OR (('Q22251207'=page_title) AND (0=page_namespace)) OR (('Q22251208'=page_title) AND (0=page_namespace)) OR (('Q22251211'=page_title) AND (0=page_namespace)) OR (('Q22251216'=page_title) AND (0=page_namespace)) OR (('Q22251218'=page_title) AND (0=page_namespace)) OR (('Q22251220'=page_title) AND (0=page_namespace)) OR (('Q22251223'=page_title) AND (0=page_namespace)) OR (('Q22251228'=page_title) AND (0=page_namespace)) OR (('Q22251233'=page_title) AND (0=page_namespace)) OR (('Q22251245'=page_title) AND (0=page_namespace)) OR (('Q22251251'=page_title) AND (0=page_namespace)) OR (('Q22251258'=page_title) AND (0=page_namespace)) OR (('Q22251262'=page_title) AND (0=page_namespace)) OR (('Q22251263'=page_title) AND (0=page_namespace)) OR (('Q22251264'=page_title) AND (0=page_namespace)) OR (('Q22251267'=page_title) AND (0=page_namespace)) OR (('Q22251269'=page_title) AND (0=page_namespace)) OR (('Q22251271'=page_title) AND (0=page_namespace))

Steps to Reproduce:

  1. Get the latest JSON dump, wikidata-20160620-all.json (from https://dumps.wikimedia.org/wikidatawiki/entities/)
  2. Run this Unix command:

split -n l/21207/21814 wikidata-20160620-all.json | grep -A1 Q22251178

Expected Results:
<JSON definition of Q22251178>,
<JSON definition of another item>,

Actual Results:
<JSON definition of Q22251178><above SQL warning>
,
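
A consumer of the dump could detect this kind of corruption before loading it. The following is only a minimal sketch, assuming the dump is a JSON array with one entity per line, each followed by a comma, as the expected results above suggest. The function name find_corrupt_lines and the sample lines are hypothetical and are not part of any Wikidata tooling.

```python
import json
from typing import Iterable, Iterator, Tuple


def find_corrupt_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, reason) for every line that is not a single entity.

    Assumes one JSON entity per line, optionally followed by a comma,
    with "[" and "]" on lines of their own.
    """
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if line in ("[", "]"):
            continue  # array delimiters, not entities
        if line.endswith(","):
            line = line[:-1]  # drop the separator between entities
        try:
            entity = json.loads(line)
        except json.JSONDecodeError as exc:
            yield number, f"not valid JSON: {exc.msg}"
            continue
        if not isinstance(entity, dict) or "id" not in entity:
            yield number, "valid JSON but not an entity object"


if __name__ == "__main__":
    # Hypothetical miniature of the broken dump: a log message is appended
    # to one entity and the separating comma ends up on the next line.
    sample = [
        "[\n",
        '{"id": "Q1"},\n',
        '{"id": "Q22251178"}[Mon Jun 20 13:54:32 2016] [hphp] SlowTimer\n',
        ",\n",
        '{"id": "Q2"}\n',
        "]\n",
    ]
    for number, reason in find_corrupt_lines(sample):
        print(number, reason)
```

For a real dump, an open file object can be passed in place of the list, so the file is read line by line instead of being loaded into memory.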

Event Timeline

Restricted Application added a subscriber: Zppix. · Jun 21 2016, 12:08 PM
hoo added a subscriber: hoo. · Jun 21 2016, 12:25 PM

Thanks for the report, this looks interesting.

We create the dump by piping the stdout of the dump creation script into gzip. Not sure why said warning ended up there, will have a look.

Mentioned in SAL [2016-06-21T12:33:21Z] <hoo> Removed Wikidata json dumps from 20160620 (inconsistent, per T138291).

Mentioned in SAL [2016-06-21T12:36:29Z] <hoo> Started a new JSON dump creation on snapshot1003 (after the last one was inconsistent, per T138291)

Probably related to T138208; I would set that one to Unbreak Now priority.

Change 295554 had a related patch set uploaded (by Hoo man):
Log PHP/HHVM errors in CLI mode to stderr, not stdout

https://gerrit.wikimedia.org/r/295554

Change 295554 merged by jenkins-bot:
Log PHP/HHVM errors in CLI mode to stderr, not stdout

https://gerrit.wikimedia.org/r/295554

Mentioned in SAL [2016-06-23T20:04:29Z] <jzerebecki@tin> Synchronized wmf-config/CommonSettings.php: Log PHP/HHVM errors in CLI mode to stderr, not stdout T138291 (duration: 00m 28s)

Now running echo 'error_reporting(-1); echo $foo;' | mwscript maintenance/eval.php testwikidatawiki 2>/dev/null no longer prints the notice.

hoo closed this task as Resolved. · Jun 24 2016, 8:38 AM
hoo claimed this task.
hoo removed a project: Patch-For-Review.

Given we no longer log errors to stdout, this should not happen again.
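
Why this helps can be illustrated with a small sketch. It is not the actual dump pipeline: the child process below is a hypothetical stand-in for the dump script, and the names CHILD and dump_to_gzip are made up for the example. The point is only that a pipe carries stdout, so diagnostics written to stderr never reach the compressed output.

```python
import gzip
import subprocess
import sys
from typing import Tuple

# Hypothetical stand-in for the dump script: data goes to stdout,
# the slow-query warning goes to stderr.
CHILD = (
    "import sys\n"
    "print('{\"id\": \"Q1\"}')\n"
    "print('SlowTimer [14607ms] slow query', file=sys.stderr)\n"
)


def dump_to_gzip(child_code: str) -> Tuple[bytes, bytes]:
    """Run the child and return (gzip-compressed stdout, raw stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,  # only this stream is compressed
        stderr=subprocess.PIPE,  # diagnostics are kept separate
        check=True,
    )
    return gzip.compress(result.stdout), result.stderr


if __name__ == "__main__":
    compressed, diagnostics = dump_to_gzip(CHILD)
    print(gzip.decompress(compressed).decode().strip())
    print(diagnostics.decode().strip(), file=sys.stderr)
```

If the child wrote the warning to stdout instead, it would land in the compressed data, which matches the symptom reported in this task.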

ArielGlenn moved this task from Backlog to Done on the Dumps-Generation board. · Jul 12 2016, 6:37 AM