Page MenuHomePhabricator

Don't show "Nonce already used" error on memcache failure
Open, LowPublic

Description

MWOAuthDataStore::lookup_nonce() calls memcached and does not differentiate between a successful lookup (which, for the OAuth protocol, means a nonce reuse error) and a memcached error. This means that the client receives an mwoauth-invalid-authorization error ("The authorization headers in your request are not valid: Nonce already used: XXX") even if the actual error is some memcached server dropping connections. This is very confusing for the client and should be improved. Unfortunately this behavior comes from the external OAuth library so that might not be easy (but maybe we can just throw an exception?), but at a minimum we should differentiate in our internal logging and set up an alert via T100735.

Related Objects

Event Timeline

Sitic created this task.Jul 16 2015, 5:46 PM
Sitic raised the priority of this task from to Needs Triage.
Sitic updated the task description. (Show Details)
Sitic added a subscriber: Sitic.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 16 2015, 5:46 PM
Tgr added a subscriber: Tgr.Jul 16 2015, 7:35 PM

One possibility is double-posting of some kind. The other is a memcached error; the extension uses MWOAuthDataStore::lookup_nonce() to store nonces for 5 mins in memcached and check uniqueness, and it returns false both when the nonce is already used and when a memcached operation fails. Given that accepting a nonce without a duplication check opens us to replay attacks, I don't think much can be done about that (although we could improve logging to differentiate between the two cases).

Sitic added a comment.Jul 16 2015, 9:30 PM

I'm pretty sure that it isn't double-posting, the first instance of this error I can find is from 15 Jul 2015 20:19:29 -0700 (PDT), everything worked fine in the last weeks.
As a test 100 API requests resulted in 5 errors: (the nonce is definitely used for fist time here by me, there were no other API requests and the last 10 digits are the unix time)

import requests
from requests_oauthlib import OAuth1

auth1 = OAuth1(consumer_token['key'],
               client_secret=consumer_token['secret'],
               resource_owner_key=access_token['key'],
               resource_owner_secret=access_token['secret'])
for i in range(100):
    response = requests.get(
        "https://meta.wikimedia.org/w/api.php",
        params={
            'action': "query",
            'meta': "userinfo",
            'format': "json"  
        },
        auth=auth1
    )
    if 'error' in response.json():
        print response.json()
    time.sleep(3)     


{u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 8100470808494508661437080839', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}
{u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 91229791883386003641437080945', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}
{u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 19195888973674011721437080983', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}
{u'servedby': u'mw1139', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 141774547467664134131437081014', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}
{u'servedby': u'mw1128', u'error': {u'info': u'The authorization headers in your request are not valid: Nonce already used: 85106567837463406581437081030', u'*': u'See https://meta.wikimedia.org/w/api.php for API usage', u'code': u'mwoauth-invalid-authorization'}}

I've also just seen mw1134 as servedby, unfortunately I don't have the servedby information in my logfile for the previous errors.

Sitic added a comment.Jul 16 2015, 9:59 PM

Seems to have been fixed

[23:26:17] <ori> !log bounced nutcracker on mw1128 and mw1134
[23:26:21] <hashar> mw1139 seems to have the same issue :-(
[23:26:22] <morebots> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[23:27:08] <grrrit-wm> (PS5) Lokal Profil: Add DCAT-AP for Wikibase [puppet] - https://gerrit.wikimedia.org/r/219800 (https://phabricator.wikimedia.org/T103087)
[23:27:11] <ori> !log bounced nutcracker on mw1139 as well. hashar noticed flood of errors from these hosts on https://logstash.wikimedia.org/#/dashboard/elasticsearch/mediawiki-errors . lack of monitoring / alerts is troubling.

Tgr renamed this task from Some OAuth API requests return "Nonce already used" errors to Don't show "Nonce already used" error on memcache failure.Jul 16 2015, 10:06 PM
Tgr updated the task description. (Show Details)
Tgr set Security to None.
Nemo_bis added a subscriber: Nemo_bis.

This sounds related to T102199.

Seems to have been fixed
[23:26:17] <ori> !log bounced nutcracker on mw1128 and mw1134
[23:26:21] <hashar> mw1139 seems to have the same issue :-(
[23:26:22] <morebots> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[23:27:08] <grrrit-wm> (PS5) Lokal Profil: Add DCAT-AP for Wikibase [puppet] - https://gerrit.wikimedia.org/r/219800 (https://phabricator.wikimedia.org/T103087)
[23:27:11] <ori> !log bounced nutcracker on mw1139 as well. hashar noticed flood of errors from these hosts on https://logstash.wikimedia.org/#/dashboard/elasticsearch/mediawiki-errors . lack of monitoring / alerts is troubling.

@ori any idea if nutcracker failure was related to T105131: intermittent nutcracker failures ?

Tgr triaged this task as Low priority.Mar 7 2017, 4:10 AM