MWOAuthDataStore::lookup_nonce() calls memcached and does not differentiate between a successful lookup (which, for the OAuth protocol, means a nonce reuse error) and a memcached error. This means that the client receives an mwoauth-invalid-authorization error ("The authorization headers in your request are not valid: Nonce already used: XXX") even if the actual error is some memcached server dropping connections. This is very confusing for the client and should be improved. Unfortunately this behavior comes from the external OAuth library so that might not be easy (but maybe we can just throw an exception?), but at a minimum we should differentiate in our internal logging and set up an alert via T100735.
Description
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Invalid | None | T229661 Core REST API in MediaWiki | |||
Resolved | • eprodromou | T234665 Add OAuth 2.0 support to MediaWiki REST API | |||
Open | None | T245477 OAuth server should provide clear and useful feedback about client errors | |||
Resolved | Krinkle | T106066 Don't show "Nonce already used" error on memcache failure |
Event Timeline
One possibility is double-posting of some kind. The other is a memcached error; the extension uses MWOAuthDataStore::lookup_nonce() to store nonces for 5 mins in memcached and check uniqueness, and it returns false both when the nonce is already used and when a memcached operation fails. Given that accepting a nonce without a duplication check opens us to replay attacks, I don't think much can be done about that (although we could improve logging to differentiate between the two cases).
I'm pretty sure that it isn't double-posting, the first instance of this error I can find is from 15 Jul 2015 20:19:29 -0700 (PDT), everything worked fine in the last weeks.
As a test 100 API requests resulted in 5 errors: (the nonce is definitely used for fist time here by me, there were no other API requests and the last 10 digits are the unix time)
import requests from requests_oauthlib import OAuth1 auth1 = OAuth1(consumer_token['key'], client_secret=consumer_token['secret'], resource_owner_key=access_token['key'], resource_owner_secret=access_token['secret']) for i in range(100): response = requests.get( "https://meta.wikimedia.org/w/api.php", params={ 'action': "query", 'meta': "userinfo", 'format': "json" }, auth=auth1 ) if 'error' in response.json(): print response.json() time.sleep(3) {u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 8100470808494508661437080839', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}} {u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 91229791883386003641437080945', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}} {u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 19195888973674011721437080983', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}} {u'servedby': u'mw1139', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 141774547467664134131437081014', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}} {u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 85106567837463406581437081030', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}
I've also just seen mw1134 as servedby, unfortunately I don't have the servedby information in my logfile for the previous errors.
Seems to have been fixed
[23:26:17] <ori> !log bounced nutcracker on mw1128 and mw1134
[23:26:21] <hashar> mw1139 seems to have the same issue :-(
[23:26:22] <morebots> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[23:27:08] <grrrit-wm> (PS5) Lokal Profil: Add DCAT-AP for Wikibase [puppet] - https://gerrit.wikimedia.org/r/219800 (https://phabricator.wikimedia.org/T103087)
[23:27:11] <ori> !log bounced nutcracker on mw1139 as well. hashar noticed flood of errors from these hosts on https://logstash.wikimedia.org/#/dashboard/elasticsearch/mediawiki-errors . lack of monitoring / alerts is troubling.
@ori any idea if nutcracker failure was related to T105131: intermittent nutcracker failures ?
Regarding Memcached sometimes failing to store (e.g. due to being expired within one second of storing, or due to switching a shard to a new host, or due to resharding, or due to a host briefly not responding),