
test_allpages_pagesize and test_allpages_protection failures
Closed, Resolved · Public

Description

About 23 hours ago, shortly after pywikibot commit 9fccfed, there were two unusual timeouts with allpages on English Wikipedia. They are probably intermittent, as subsequent builds have not seen similar problems.

======================================================================
ERROR: test_allpages_pagesize (tests.site_tests.TestSiteGenerators)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/tests/site_tests.py", line 584, in test_allpages_pagesize
    if (len(page.text.encode(mysite.encoding())) > 200 and
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/page.py", line 500, in text
    self._text = self.get(get_redirect=True)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/tools/__init__.py", line 1327, in wrapper
    return obj(*__args, **__kw)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/page.py", line 367, in get
    self._getInternals(sysop)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/page.py", line 397, in _getInternals
    self.site.loadrevisions(self, getText=True, sysop=sysop)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/site.py", line 3713, in loadrevisions
    for pagedata in rvgen:
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2719, in __iter__
    self.data = self.request.submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2320, in submit
    self._data = super(CachedRequest, self).submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 1965, in submit
    self.wait()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2180, in wait
    raise TimeoutError("Maximum retries attempted without success.")
TimeoutError: Maximum retries attempted without success.

======================================================================
ERROR: test_allpages_protection (tests.site_tests.TestSiteGenerators)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/tests/site_tests.py", line 599, in test_allpages_protection
    protect_level="sysop", total=5):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2719, in __iter__
    self.data = self.request.submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2320, in submit
    self._data = super(CachedRequest, self).submit()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 1992, in submit
    self.wait()
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 2180, in wait
    raise TimeoutError("Maximum retries attempted without success.")
TimeoutError: Maximum retries attempted without success.

https://travis-ci.org/wikimedia/pywikibot-core/jobs/83796554

The failing calls are allpages(minsize=100, total=5) and allpages(protect_type="edit", protect_level="sysop", total=5) respectively; the parameters map predictably onto the corresponding API parameters, as sketched below.
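
For reference, the two calls look like this in pywikibot (a minimal sketch; the apminsize/apprtype/apprlevel names below are the standard MediaWiki list=allpages parameters these options are expected to map onto):

import pywikibot

site = pywikibot.Site('en', 'wikipedia')

# The two failing generator calls from the tests. pywikibot translates
# the keyword arguments into list=allpages API parameters, roughly:
#   minsize=100           -> apminsize=100
#   protect_type='edit'   -> apprtype=edit
#   protect_level='sysop' -> apprlevel=sysop
# while total=5 caps how many results the generator yields.
for page in site.allpages(minsize=100, total=5):
    print(page.title())

for page in site.allpages(protect_type='edit', protect_level='sysop', total=5):
    print(page.title())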

As far as I can recall, and given that no tasks have been filed about it, these two generator calls have never previously caused a timeout that the pywikibot tests noticed.

Feel free to close this if it isn't worth chasing down; we'll re-open it if it starts happening with any regularity, or perhaps it can easily be correlated with other problems that are known to have been fixed.

Event Timeline

jayvdb created this task. Oct 6 2015, 10:26 PM
jayvdb raised the priority of this task to Needs Triage.
jayvdb updated the task description.
jayvdb added a subscriber: jayvdb.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper. Oct 6 2015, 10:26 PM
greg set Security to None.
Anomie closed this task as Resolved. Oct 7 2015, 3:31 PM
Anomie claimed this task.
Anomie added a subscriber: Anomie.

It would be helpful if your test runner included more timestamps. Since a later timestamp in the log (line 2120) is 2015-10-05 23:39:33, this is very likely https://wikitech.wikimedia.org/wiki/Incident_documentation/20151005-MediaWiki, so I'm going to resolve this as being due to that incident.

BTW, somewhat earlier in your log I see this (line 1685):

test_allpages_pagesize (tests.site_tests.TestSiteGenerators) ... WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
WARNING: Waiting 10 seconds before retrying.
ERROR: Traceback (most recent call last):
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/data/api.py", line 1941, in submit
    body=body, headers=headers)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/tools/__init__.py", line 1327, in wrapper
    return obj(*__args, **__kw)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/comms/http.py", line 247, in request
    r = fetch(baseuri, method, body, headers, **kwargs)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/comms/http.py", line 402, in fetch
    error_handling_callback(request)
  File "/home/travis/build/wikimedia/pywikibot-core/pywikibot/comms/http.py", line 320, in error_handling_callback
    raise request.data
ReadTimeout: HTTPSConnectionPool(host='en.wikipedia.org', port=443): Read timed out. (read timeout=30)
ERROR
 45.561s test_allpages_protection (tests.site_tests.TestSiteGenerators) ... WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set gaplimit = ['2']
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set gaplimit = ['1']
WARNING: Waiting 10 seconds before retrying.
WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set gaplimit = ['0']
ERROR
 15.439s test_allusers (tests.site_tests.TestSiteGenerators)
Test the site.allusers() method. ... WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set aulimit = ['5']
WARNING: Waiting 5 seconds before retrying.
WARNING: Http response status 500
WARNING: Non-JSON response received from server wikipedia:en; the server may be down.
Set aulimit = ['2']
WARNING: Waiting 10 seconds before retrying.
 15.149s ok
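
(For context, the retry pattern visible in this log, waiting 5 seconds, then 10, and halving the query limit each time before giving up, looks roughly like the following sketch. The submit_with_retries helper is hypothetical, an illustration of the behaviour rather than pywikibot's actual implementation.)

import time

def submit_with_retries(make_request, limit, max_retries=2, initial_wait=5):
    """Illustrative sketch of the pattern in the log above: on each
    failed request, halve the query limit, wait 5 seconds, then 10,
    and raise TimeoutError once the retries run out."""
    wait = initial_wait
    for attempt in range(max_retries + 1):
        try:
            return make_request(limit)
        except IOError:  # stand-in for the HTTP 500 / non-JSON responses
            if attempt == max_retries:
                raise TimeoutError('Maximum retries attempted without success.')
            limit = max(limit // 2, 0)  # e.g. gaplimit 5 -> 2 -> 1 -> 0
            print('WARNING: Waiting %d seconds before retrying.' % wait)
            time.sleep(wait)
            wait *= 2  # 5 seconds, then 10, as in the log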

HTTP 500 errors from the API are usually a PHP fatal error or an error from some other layer (e.g. Varnish). The fact that the next test, test_allusers (tests.site_tests.TestSiteGenerators), was failing with the same symptoms and then suddenly recovered also points to a non-API cause.

I've put up a patch to add timestamps to more of our jobs: https://gerrit.wikimedia.org/r/#/c/244951. Thanks for looking into this and pointing to the likely cause.

Aklapper removed a subscriber: Anomie. Oct 16 2020, 5:39 PM
Restricted Application added a project: Pywikibot. Oct 16 2020, 5:39 PM