Page MenuHomePhabricator

"Key contains invalid characters" when using MultiWriteBagOStuff
Closed, ResolvedPublicPRODUCTION ERROR

Description

This may not be related to the parent task (wmf.4 rollout), but I spotted it then so I'm filing it as such. On Commons (so far, could be elsewhere), something is generating links of the following sort:

https://commons.wikimedia.org/wiki/File:Map_of_USA_OR.svg?uselang=⧼Lang⧽

The actual error appears in logstash as:

Key contains invalid characters: commonswiki:pcache:idhash:7398609-0!userlang=⧼lang⧽

#0 /srv/mediawiki/php-1.30.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php(145): MemcachedBagOStuff->validateKeyEncoding(string)
#1 /srv/mediawiki/php-1.30.0-wmf.4/includes/libs/objectcache/MemcachedBagOStuff.php(56): MemcachedPeclBagOStuff->getWithToken(string, NULL, integer)
#2 /srv/mediawiki/php-1.30.0-wmf.4/includes/libs/objectcache/BagOStuff.php(185): MemcachedBagOStuff->doGet(string, integer)
#3 /srv/mediawiki/php-1.30.0-wmf.4/includes/libs/objectcache/MultiWriteBagOStuff.php(112): BagOStuff->get(string, integer)
#4 /srv/mediawiki/php-1.30.0-wmf.4/includes/libs/objectcache/BagOStuff.php(185): MultiWriteBagOStuff->doGet(string, integer)
#5 /srv/mediawiki/php-1.30.0-wmf.4/includes/parser/ParserCache.php(230): BagOStuff->get(string, integer, integer)
#6 /srv/mediawiki/php-1.30.0-wmf.4/includes/page/Article.php(528): ParserCache->get(WikiFilePage, ParserOptions)
#7 /srv/mediawiki/php-1.30.0-wmf.4/includes/page/ImagePage.php(162): Article->view()
#8 /srv/mediawiki/php-1.30.0-wmf.4/includes/actions/ViewAction.php(68): ImagePage->view()
#9 /srv/mediawiki/php-1.30.0-wmf.4/includes/MediaWiki.php(499): ViewAction->show()
#10 /srv/mediawiki/php-1.30.0-wmf.4/includes/MediaWiki.php(293): MediaWiki->performAction(ImagePage, Title)
#11 /srv/mediawiki/php-1.30.0-wmf.4/includes/MediaWiki.php(862): MediaWiki->performRequest()
#12 /srv/mediawiki/php-1.30.0-wmf.4/includes/MediaWiki.php(523): MediaWiki->main()
#13 /srv/mediawiki/php-1.30.0-wmf.4/index.php(43): MediaWiki->run()
#14 /srv/mediawiki/w/index.php(3): include(string)
#15 {main}

The image varies depending on request--and there's not a ton of these requests to begin with. I'm curious about two things:

  1. Can we stop this from throwing an exception? I imagine we could fail much nicer on bogus uselang values--and we definitely shouldn't be letting it get so far down the stack into the parser cache code...
  2. What's generating these bogus links? My guess is some on-wiki template or whatever, but I dunno.

Event Timeline

Note this doesn't look image-specific; I get exception thrown if I try on a regular page that exists too, such as https://commons.wikimedia.org/wiki/Main_Page?uselang=%E2%A7%BCLang%E2%A7%BD BUT - I don't see the equivalent error on enwiki. https://en.wikipedia.org/wiki/Main_Page?uselang=%E2%A7%BCLang%E2%A7%BD does not error out.

The 'uselang' on ResourceContext and later goes through into $wgUserLang etc & should be pre-sanitized, so I'm not sure where the unsanitized one is coming from exactly... Some other extension maybe? (/me looks askance at ULS but doesn't see an obvious bug yet)

Per IRC, it's looking like these queries are coming from a malformed Bingbot request. So removed as blocker, although this is still pretty nasty and needs fixing.

Per irc, note that Wikibase's splitting everything on user lang is why the userlang appears in pcache and thus the bug is visible on any page on Commons, given the URL param, but on meta or enwiki it needs to be a page that uses user-lang stuff specifically.

I can force a similar scenario on a one-off page in MediaWiki-Vagrant:

  1. On Main Page, add {{int:mainpage}} or something else that uses a user-language UI message.
  2. go to http://dev.wiki.local.wmftest.net:8080/wiki/Main_Page?uselang=%E2%A7%BCLang%E2%A7%BD
  3. be surprised that the cache key is: "wiki:pcache:idhash:1-0!userlang=%E2%A7%BClang%E2%A7%BD"

Because something's URL-encoding here it doesn't trip the error that was seen in production, but it's still mostly-unsanitized data going into the key that shouldn't be.

Turns out Language::isValidCode() lets a lot through that I didn't expect for compatibility with weird hacks. It should prevent actual < and > and path-traversals etc, but is letting the non-ASCIIs through directly.

If that's expected behavior (hah!), then the main question is why is it coming up in the pcache key URL-encoded on my vagrant setup but not URL-encoded in production? URL-encoding should "solve" it insofar as making the cache keys work.

Ok, my vagrant install is using MemcachedBagOStuff which has a makeKeyInternal() which uses rawurlencode() on input strings, but in production we're using MultiWriteBagOStuff which does not override the naive base implementation from BagOStuff.

As a result, the keys created via MultiWriteBagOStuff->makeKey() are not encoded, and MemcachedBagOStuff sees non-ASCII chars and rejects them.

Recommended fix: move the sanitization in MemcacheBagOStuff->makeKeyInternal up to BagOStuff->makeKeyInternal and just apply it evenly everywhere so no surprises.

Ooh, that's interesting. +cc @Krinkle . This is probably caused by rMWff8a0c788bc8: parser: Avoid deprecated wfMemcKey() then. wfMemcKey() calls $wgMemc->makeKey so it would use the MemcachedBagOStuff implementation, but switching to the actual MultiWriteBagOStuff->makeKey() leaves it without the special memcache handling.

MultiWriteBagOStuff should probably unmake keys into arrays and passing them to the makeKey method of the bag it is proxying to, before calling that bag's get/set method.

Krinkle renamed this task from "Key contains invalid characters" on bogus uselang input to "Key contains invalid characters" when using MultiWriteBagOStuff.Jun 9 2017, 11:08 PM
Krinkle added a project: Performance-Team.
Krinkle moved this task from Library to libs/objectcache on the MediaWiki-libs-ObjectCache board.
Krinkle added a subscriber: aaron.

Recommended fix: move the sanitization in MemcacheBagOStuff->makeKeyInternal up to BagOStuff->makeKeyInternal and just apply it evenly everywhere so no surprises.

I'm not sure we can assume a single encoding will (or should) work on every backend. I'd rather keep this generic.

I'd recommend we apply the same logic in MultiWriteBackOStuff as we do already in WANObjectCache and CachedBagOStuff - which is to simply create stub methods that call the backend's method for makeKey and makeGlobalKey.

Change 358999 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] objectcache: Forward MultiWriteBagOStuff::makeKey to primary backend

https://gerrit.wikimedia.org/r/358999

Change 358999 merged by jenkins-bot:
[mediawiki/core@master] objectcache: Forward MultiWriteBagOStuff::makeKey to primary backend

https://gerrit.wikimedia.org/r/358999

Recommended fix: move the sanitization in MemcacheBagOStuff->makeKeyInternal up to BagOStuff->makeKeyInternal and just apply it evenly everywhere so no surprises.

I'm not sure we can assume a single encoding will (or should) work on every backend. I'd rather keep this generic.

Note that MemcachedBagOStuff seems to be the only subclass enforcing any validation or non-default key encoding at present... SqlBagOStuff probably should because it has a length limit on objectcache.keyname column, but doesn't seem to check for it.

I'd recommend we apply the same logic in MultiWriteBackOStuff as we do already in WANObjectCache and CachedBagOStuff - which is to simply create stub methods that call the backend's method for makeKey and makeGlobalKey.

Sounds good for now, I've +2'd the patch.

Change 359144 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@wmf/1.30.0-wmf.5] objectcache: Forward MultiWriteBagOStuff::makeKey to primary backend

https://gerrit.wikimedia.org/r/359144

Change 359144 merged by jenkins-bot:
[mediawiki/core@wmf/1.30.0-wmf.5] objectcache: Forward MultiWriteBagOStuff::makeKey to primary backend

https://gerrit.wikimedia.org/r/359144

Mentioned in SAL (#wikimedia-operations) [2017-06-15T18:16:50Z] <demon@tin> Synchronized php-1.30.0-wmf.5/includes/libs/objectcache/MultiWriteBagOStuff.php: T167465 (duration: 00m 44s)

Fixed. Error count for this dropped to 0 on 2017-06-15 when wmf.5 rolled out to all wikis:

Screen Shot 2017-06-21 at 19.53.33.png (550×2 px, 104 KB)

Thanks everyone for your help on this! Was quite a rabbit hole from the initial bug report :)

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:10 PM