memcached error: A BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE
Closed, ResolvedPublic

Description

Spotted some errors in logstash, easily found via https://logstash.wikimedia.org/app/kibana#/dashboard/memcached

Memcached error for key "{memcached-key}" on server "{memcached-server}":
A BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE

That seems related to 1.33.0-wmf.4 (T206658). Some related to jobrunner.discovery.wmnet / /rpc/RunSingleJob.php , but I had at least one from an api call.

The memcached key seems to be an empty string which would explain the error. I could not find stacktraces in logstash :/

hashar created this task.Nov 13 2018, 10:56 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 13 2018, 10:56 PM
hashar updated the task description. (Show Details)Nov 13 2018, 10:58 PM

Does not seem to harmful, I will attempt to find the root cause tomorrow.

elukey added a subscriber: elukey.Nov 14 2018, 2:36 PM
zeljkofilipin triaged this task as Unbreak Now! priority.Nov 14 2018, 3:36 PM
zeljkofilipin added a subscriber: zeljkofilipin.

Train blockers are UBN.

Restricted Application added subscribers: Liuxinyu970226, TerraCodes. · View Herald TranscriptNov 14 2018, 3:36 PM
Anomie added a subscriber: Anomie.Nov 14 2018, 4:05 PM
This comment was removed by Anomie.

Change 473551 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@wmf/1.33.0-wmf.4] Add trace to debug memcached bad key error

https://gerrit.wikimedia.org/r/473551

Change 473551 merged by jenkins-bot:
[mediawiki/core@wmf/1.33.0-wmf.4] Add trace to debug memcached bad key error

https://gerrit.wikimedia.org/r/473551

Mentioned in SAL (#wikimedia-operations) [2018-11-14T17:58:57Z] <hashar@deploy1001> Started scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429

Mentioned in SAL (#wikimedia-operations) [2018-11-14T18:33:04Z] <hashar@deploy1001> Finished scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429 (duration: 34m 07s)

Looks like it's probably something to do with rMW51945dbca359: Use DB domain in JobQueueGroup and make WikiMap domain ID methods stricter. I'm continuing to investigate.

A representative trace:

2018-11-14 18:31:06 [W@xp6QpAAEAAAIGhifMAAACC] mw1269 mediawikiwiki 1.33.0-wmf.4 memcached ERROR: Memcached error for key "" on server ":": A BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE {"memcached-server":":","memcached-key":null} 
[Exception Exception] (/srv/mediawiki/php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php:234) A BAD KEY WAS PROVIDED/CHARACTERS OUT OF RANGE
  #0 /srv/mediawiki/php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php(167): MemcachedPeclBagOStuff->checkResult(NULL, boolean)
  #1 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(491): MemcachedPeclBagOStuff->set(NULL, string, integer)
  #2 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(464): JobQueue->doDeduplicateRootJob(HTMLCacheUpdateJob)
  #3 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(345): JobQueue->deduplicateRootJob(HTMLCacheUpdateJob)
  #4 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(310): JobQueue->batchPush(array, integer)
  #5 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueueGroup.php(158): JobQueue->push(array)
  #6 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/JobQueueEnqueueUpdate.php(57): JobQueueGroup->push(array)
  #7 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(270): JobQueueEnqueueUpdate->doUpdate()
  #8 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(216): DeferredUpdates::runUpdate(JobQueueEnqueueUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
  #9 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(140): DeferredUpdates::execute(array, string, integer)
  #10 /srv/mediawiki/php-1.33.0-wmf.4/includes/MediaWiki.php(905): DeferredUpdates::doUpdates(string)
  #11 /srv/mediawiki/php-1.33.0-wmf.4/includes/MediaWiki.php(728): MediaWiki->restInPeace(string, boolean)
  #12 [internal function]: Closure$MediaWiki::doPostOutputShutdown()
  #13 {main}

All the traces so far seem to go through JobQueue::deduplicateRootJob().

I ran a full sync for deployment, hence the delay :(

We have traces now, an example:

#0 /srv/mediawiki/php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php(156): MemcachedPeclBagOStuff->checkResult(NULL, boolean)
#1 /srv/mediawiki/php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedBagOStuff.php(56): MemcachedPeclBagOStuff->getWithToken(NULL, NULL, integer)
#2 /srv/mediawiki/php-1.33.0-wmf.4/includes/libs/objectcache/BagOStuff.php(193): MemcachedBagOStuff->doGet(NULL, integer)
#3 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(485): BagOStuff->get(NULL)
#4 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(464): JobQueue->doDeduplicateRootJob(RefreshLinksJob)
#5 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(345): JobQueue->deduplicateRootJob(RefreshLinksJob)
#6 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php(310): JobQueue->batchPush(array, integer)
#7 /srv/mediawiki/php-1.33.0-wmf.4/includes/jobqueue/JobQueueGroup.php(158): JobQueue->push(array)
#8 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/LinksUpdate.php(363): JobQueueGroup->push(array)
#9 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/LinksUpdate.php(316): LinksUpdate::queueRecursiveJobsForTable(Title, string, string, string)
#10 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/LinksUpdate.php(299): LinksUpdate->queueRecursiveJobs()
#11 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/LinksUpdate.php(175): LinksUpdate->doIncrementalUpdate()
#12 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(270): LinksUpdate->doUpdate()
#13 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(216): DeferredUpdates::runUpdate(LinksUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
#14 /srv/mediawiki/php-1.33.0-wmf.4/includes/deferred/DeferredUpdates.php(140): DeferredUpdates::execute(array, string, integer)
#15 /srv/mediawiki/php-1.33.0-wmf.4/includes/MediaWiki.php(905): DeferredUpdates::doUpdates(string)
#16 /srv/mediawiki/php-1.33.0-wmf.4/includes/MediaWiki.php(728): MediaWiki->restInPeace(string, boolean)
#17 [internal function]: Closure$MediaWiki::doPostOutputShutdown()
#18 {main}

Change 473578 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] JobQueue: Actually return the value from getRootJobCacheKey()

https://gerrit.wikimedia.org/r/473578

Change 473579 had a related patch set uploaded (by Hashar; owner: Anomie):
[mediawiki/core@wmf/1.33.0-wmf.4] JobQueue: Actually return the value from getRootJobCacheKey()

https://gerrit.wikimedia.org/r/473579

Change 473578 merged by jenkins-bot:
[mediawiki/core@master] JobQueue: Actually return the value from getRootJobCacheKey()

https://gerrit.wikimedia.org/r/473578

Change 473579 merged by jenkins-bot:
[mediawiki/core@wmf/1.33.0-wmf.4] JobQueue: Actually return the value from getRootJobCacheKey()

https://gerrit.wikimedia.org/r/473579

Mentioned in SAL (#wikimedia-operations) [2018-11-14T19:14:23Z] <hashar@deploy1001> Synchronized php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php: Actually return the value from getRootJobCacheKey() - T209429 (duration: 00m 53s)

hashar assigned this task to Anomie.Nov 14 2018, 7:21 PM
hashar closed this task as Resolved.

Thank you Brad. There is no more errors in logstash.

Sorry for the long time it took to add the stack trace to the log context, I ran a full scap instead of syncing just the file.

Sorry for the long time it took to add the stack trace to the log context, I ran a full scap instead of syncing just the file.

No problem, things happen sometimes.