site_tests.py fails for TestUserContribsWithoutUser with TimeoutError
Closed, Resolved (Public)

Description

C:\pwb\GIT\core>pwb.py tests/site_tests -v TestUserContribsWithoutUser
tests: max_retries reduced from 25 to 1
test_invalid_range (__main__.TestUserContribsWithoutUser)
Test the site.usercontribs() method with invalid parameters. ... ok
test_user_prefix (__main__.TestUserContribsWithoutUser)
Test the site.usercontribs() method with userprefix. ... ok
test_user_prefix_range (__main__.TestUserContribsWithoutUser)
Test the site.usercontribs() method. ... ERROR: Traceback (most recent call last):
  File "C:\pwb\GIT\core\pywikibot\data\api.py", line 1953, in submit
    body=body, headers=headers)
  File "C:\pwb\GIT\core\pywikibot\tools\__init__.py", line 1414, in wrapper
    return obj(*__args, **__kw)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 279, in request
    r = fetch(baseuri, method, body, headers, **kwargs)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 434, in fetch
    error_handling_callback(request)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 352, in error_handling_callback
    raise request.data
ReadTimeout: HTTPSConnectionPool(host='de.wikipedia.org', port=443): Read timed out. (read timeout=30)

WARNING: Waiting 5 seconds before retrying.
ERROR: Traceback (most recent call last):
  File "C:\pwb\GIT\core\pywikibot\data\api.py", line 1953, in submit
    body=body, headers=headers)
  File "C:\pwb\GIT\core\pywikibot\tools\__init__.py", line 1414, in wrapper
    return obj(*__args, **__kw)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 279, in request
    r = fetch(baseuri, method, body, headers, **kwargs)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 434, in fetch
    error_handling_callback(request)
  File "C:\pwb\GIT\core\pywikibot\comms\http.py", line 352, in error_handling_ca
llback
    raise request.data
ReadTimeout: HTTPSConnectionPool(host='de.wikipedia.org', port=443): Read timed
out. (read timeout=30)

ERROR
 65.887s test_user_prefix_reverse (__main__.TestUserContribsWithoutUser)
Test the site.usercontribs() method with range reversed. ...  20.991s ok

======================================================================
ERROR: test_user_prefix_range (__main__.TestUserContribsWithoutUser)
Test the site.usercontribs() method.
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".\tests\site_tests.py", line 1494, in test_user_prefix_range
    total=5):
  File "C:\pwb\GIT\core\pywikibot\data\api.py", line 2739, in __iter__
    self.data = self.request.submit()
  File "C:\pwb\GIT\core\pywikibot\data\api.py", line 1977, in submit
    self.wait()
  File "C:\pwb\GIT\core\pywikibot\data\api.py", line 2191, in wait
    raise TimeoutError("Maximum retries attempted without success.")
TimeoutError: Maximum retries attempted without success.

----------------------------------------------------------------------
Ran 4 tests in 89.061s

FAILED (errors=1)

C:\pwb\GIT\core>

Sometimes it also fails for test_user_prefix_reverse,
see https://travis-ci.org/wikimedia/pywikibot-core/jobs/128367609

Event Timeline

Change 287403 had a related patch set uploaded (by Xqt):
[TEST] Test solution for T134640

https://gerrit.wikimedia.org/r/287403

Change 287403 merged by jenkins-bot:
[TEST] Test solution for T134640

https://gerrit.wikimedia.org/r/287403

It looks like this request is heavy for MediaWiki, so processing it takes more than 30 seconds, which is our standard timeout setting (see: 1).

There are three possible solutions:

  1. we can increase the default socket_timeout globally to some arbitrary number,
  2. we can override socket_timeout globally in our CI environment,
  3. we can implement a mechanism that passes a timeout value per request (see the sketch below).
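
A minimal sketch of what the third option could look like, assuming a hypothetical fetch wrapper and DEFAULT_TIMEOUT constant (pywikibot's real fetch lives in pywikibot/comms/http.py); the point is only that a per-request value falls back to the global default:

import requests

DEFAULT_TIMEOUT = 30  # seconds; mirrors the current config.py default


def fetch(uri, method='GET', timeout=None, **kwargs):
    """Issue an HTTP request, falling back to the global default timeout."""
    # Callers that know their query is heavy can pass a larger timeout;
    # everyone else keeps the 30-second default.
    return requests.request(method, uri,
                            timeout=timeout or DEFAULT_TIMEOUT, **kwargs)


# A heavy API call could then opt in to a longer wait:
# fetch('https://de.wikipedia.org/w/api.php', timeout=120)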

I favor the first option, because if we have a timeout problem, our users could have it too. We just need to choose a number here.

What do you think? @Xqt, @jayvdb, @valhallasw, @Dalba, @Lokal_Profil?

Change 323549 had a related patch set uploaded (by Magul):
Increase defualt request timeout

https://gerrit.wikimedia.org/r/323549

I think a timeout of 5 minutes for a single request is too high. This could be misinterpreted as a hanging script. Skipping the test would be appropriate, or increasing the timeout for CI tests only.
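
For the skip-on-CI variant, a minimal sketch keyed on the TRAVIS environment variable that Travis CI sets (the decorator placement and skip message are illustrative, not the actual patch):

import os
import unittest


class TestUserContribsWithoutUser(unittest.TestCase):

    @unittest.skipIf(os.environ.get('TRAVIS') == 'true',
                     'slow usercontribs query times out on CI')
    def test_user_prefix_range(self):
        ...  # existing test body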

@MtDu CI == continuous integration. In this situation we are talking about Travis CI, AppVeyor CI and, after implementation, Wikimedia CI.

@Xqt maybe we should find some number between 30 and 300 that makes sense? Almost all requests take a few seconds at most to get a response, so a timeout change will not affect them at all, and if you make a heavy request you need to be prepared to wait. I believe it's still better to wait and get a response than to wait and get a TimeoutError. Otherwise it looks like we are releasing software that we know is broken but don't want to fix, just because somebody might mistake a long wait for a frozen script.

We are facing a ReadTimeout. Instead of increasing both the connect and read timeouts we could increase only the latter. But that requires requests 2.4.0+, which I'm not sure we can rely on.
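
For reference, requests 2.4.0+ accepts a (connect, read) tuple for the timeout argument, so only the read timeout would change; the values below are illustrative:

import requests

# 6.05 s to establish the connection, 120 s to wait for data to be read.
r = requests.get('https://de.wikipedia.org/w/api.php',
                 params={'action': 'query', 'format': 'json'},
                 timeout=(6.05, 120))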

The config.py setting is a default for all bot users, and I guess most bot tasks do not exceed the given limit. For these problems while testing, I think it would be better to use a changed value in the private user-config.py, or to set that config.py value inside the test setup.
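
A minimal sketch of that idea: socket_timeout is the actual config.py setting, and overriding it in the private user-config.py (or in a test fixture) leaves the shipped default untouched; the 120-second value is only an example:

# In the private user-config.py, read by pywikibot at startup:
socket_timeout = 120  # seconds; applies only to this user's runs

# Or programmatically, e.g. in the tests' setup code:
from pywikibot import config
config.socket_timeout = 120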

I have just pushed an update to my patch that changes the default timeout to 1 minute instead of 5 (and fixes the commit message typo).

@Dalba I'm not certain what advantage we would gain by distinguishing the read timeout from the connect timeout.

@Xqt I understand your hesitation about increasing the default timeout, but we should think of it as the maximum time we are willing to wait. I know that lots of bots don't make heavy requests, but then they will not be affected by this change, and if they are, an increase of the default timeout value is the least of their problems.

@Magul Well, I think it would be better if we increased the timeouts only as much as necessary, especially since in multi-threaded environments these timeouts make the application completely unresponsive. When there is actually a problem with the connection somewhere, an early timeout can be helpful. Users may actually depend on these timeouts, trying to catch and skip them, or just relying on the built-in retry mechanism. Increasing the timeouts could greatly increase the overall retry times. An early timeout when there is actually no more hope can be a blessing...
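
A back-of-the-envelope illustration of that concern, using the default max_retries of 25 seen in the test output above and ignoring the growing backoff between attempts:

max_retries = 25     # pywikibot's default; reduced to 1 only for tests
read_timeout = 300   # seconds; the proposed 5-minute value
retry_wait = 5       # seconds; the "Waiting 5 seconds" pause from the log

worst_case = max_retries * (read_timeout + retry_wait)
print(worst_case / 3600)  # roughly 2.1 hours before TimeoutError surfaces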

On the other hand, I'm not sure that increasing the timeout will resolve the problem here. I tried to reproduce the issue (many times) and only a few of my attempts were successful, all of which ended up with a 503 error instead of finishing the request:

WARNING: Http response status 503
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set uclimit = ['2']
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 503
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set uclimit = ['1']
 246.508s 
Error
Traceback (most recent call last):
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 1987, in submit
    result = json.loads(rawdata)
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\programming\python\wiki\pywikibot-core\tests\site_tests.py", line 1493, in test_user_prefix_range
    total=5):
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2753, in __iter__
    self.data = self.request.submit()
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2005, in submit
    self.wait()
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2197, in wait
    raise TimeoutError("Maximum retries attempted without success.")
pywikibot.data.api.TimeoutError: Maximum retries attempted without success.

WARNING: Http response status 503
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set uclimit = ['2']
WARNING: Waiting 5 seconds before retrying.
 190.370s 
Process finished with exit code 0

(It seems that some caching mechanism is involved: you can only get these errors on the first attempt, and then you have to wait a while, because all subsequent retries pass without error.)

tests: max_retries reduced from 25 to 1
WARNING: Http response status 503
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set uclimit = ['2']
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 503
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set uclimit = ['1']
 250.990s 
Error
Traceback (most recent call last):
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 1987, in submit
    result = json.loads(rawdata)
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\a\AppData\Local\Programs\Python\Python35\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\programming\python\wiki\pywikibot-core\tests\site_tests.py", line 1516, in test_user_prefix_reverse
    reverse=True, total=5):
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2753, in __iter__
    self.data = self.request.submit()
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2005, in submit
    self.wait()
  File "G:\programming\python\wiki\pywikibot-core\pywikibot\data\api.py", line 2197, in wait
    raise TimeoutError("Maximum retries attempted without success.")
pywikibot.data.api.TimeoutError: Maximum retries attempted without success.


Process finished with exit code 0

(The timeout was set to 300 seconds in both cases.)

After some tinkering with different configurations I think I have found the pattern in which this error occurs. It seems to be mostly due to MediaWiki not being able to handle short userprefix values.

The following test uses ASCII letters as the user prefix, is very slow, and almost always fails for me on enwiki:

def test_user_prefix_reverse(self):
    """Test the site.usercontribs() method with range reversed."""
    import string

    mysite = self.get_site()
    # Try every single-letter prefix; the shorter the prefix, the more
    # users it matches and the heavier the query is for MediaWiki.
    for userprefix in string.ascii_uppercase:
        print(userprefix)
        for contrib in mysite.usercontribs(
                userprefix=userprefix,
                start=pywikibot.Timestamp.fromISOformat("2008-10-11T06:00:01Z"),
                end=pywikibot.Timestamp.fromISOformat("2008-10-11T23:59:59Z"),
                reverse=True, total=5):
            self.assertTrue(
                "2008-10-11T06:00:01Z"
                <= contrib['timestamp']
                <= "2008-10-11T23:59:59Z")

The longer the userprefix, the higher the chance of passing the test, which makes sense: a longer prefix matches fewer users, so the query is cheaper.
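
As a hedged illustration (the 'Jane' prefix is arbitrary), a longer prefix narrows the set of matching user names, so MediaWiki scans far fewer contributions:

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
# A four-letter prefix matches only users whose names start with 'Jane',
# whereas a single letter would match every user starting with 'J'.
for contrib in site.usercontribs(userprefix='Jane', total=5):
    print(contrib['user'], contrib['timestamp'])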

I'll upload a patch according to the observations above.

Dalba added a subscriber: Magul.

Change 335403 had a related patch set uploaded (by Dalba):
TestUserContribsWithoutUser: Avoid too short prefixes

https://gerrit.wikimedia.org/r/335403

Change 335403 merged by jenkins-bot:
TestUserContribsWithoutUser: Avoid too short prefixes

https://gerrit.wikimedia.org/r/335403

Change 323549 abandoned by Xqt:
Increase default request timeout

Reason:
Already fixed with https://gerrit.wikimedia.org/r/335403

https://gerrit.wikimedia.org/r/323549