"The write operation timed out" with Pywikibot
Open, Needs Triage | Public | PRODUCTION ERROR

Description

Error

While uploading files with Pywikibot ( https://commons.wikimedia.org/wiki/Special:Contributions/YannBot ), I get the following:

ERROR: An error occurred for uri https://commons.wikimedia.org/w/api.php
ERROR: Traceback (most recent call last):
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/data/api/_requests.py", line 681, in _http_request
    response = http.request(self.site, uri=uri,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 282, in request
    r = fetch(baseuri, headers=headers, **kwargs)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 448, in fetch
    callback(response)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise response from None
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 439, in fetch
    response = session.request(method, uri,
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', timeout('The write operation timed out'))
Impact

The file failed to upload after repeated attempts.

Event Timeline

It looks like this exception is raised from the urllib3 site package and is based on SocketError or HTTPException; see urllib3.connectionpool.py line 784 (urllib3==1.26.15), within the HTTPConnectionPool.urlopen() method, when requesting a connection from the queue. From Pywikibot's side this is an upstream issue. Still, you could try increasing the timeout value in your user-config.py. The default is socket_timeout = (6.05, 45), and the first value of this tuple is the important one; see https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.config.html#http-settings

@Yann: Did you increase the timeout settings and get T338969 instead? What was your script command?

I replaced socket_timeout = (6.05, 45) by socket_timeout = (12, 45), but it apparently didn't change anything.
It is not clear to me to what these values correspond. There is not much detail in the doc about them.

https://requests.readthedocs.io/en/stable/user/advanced/#timeouts
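For reference, the requests documentation linked above describes a tuple timeout as (connect, read), both in seconds, and the later traceback in this task reports "read timeout=90" after the second value was set to 90. A minimal user-config.py fragment under that reading, using the values tried in this thread (the comments are an interpretation, not Pywikibot documentation):

```python
# user-config.py (fragment)
# First value: seconds allowed for establishing the connection.
# Second value: seconds to wait for the server to send data once connected.
socket_timeout = (6.05, 90)
```

Raising only the second value lengthens the wait for a response; it does not necessarily affect timeouts that occur while the request body is still being sent.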

OK, thanks for the link. Now I am trying with socket_timeout = (6.05, 90), but some small files still fail.
https://commons.wikimedia.org/wiki/File:Masterandman_07_tolstoy.mp3 failed on the first try (6.05 MB).
Bigger files fail more often.

An example of failure now:

WARNING: API error http-curl-error: Error fetching URL: Received HTTP code 403 from proxy after CONNECT
ERROR: Upload error:
Traceback (most recent call last):
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/specialbots/_upload.py", line 408, in upload_file
    success = imagepage.upload(file_url,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/page/_filepage.py", line 267, in upload
    return self.site.upload(self, source_filename=filename, source_url=url,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_decorators.py", line 92, in callee
    return fn(self, *args, **kwargs)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_apisite.py", line 2853, in upload
    return Uploader(self, filepage, **kwargs).upload()
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_upload.py", line 133, in upload
    return self._upload(self.ignore_warnings, self.report_success)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_upload.py", line 435, in _upload
    return self.submit(final_request, result, data.get('result'),
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_upload.py", line 455, in submit
    result = request.submit()['upload']
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/data/api/_requests.py", line 1088, in submit
    raise pywikibot.exceptions.APIError(**args)
pywikibot.exceptions.APIError: http-curl-error: Error fetching URL: Received HTTP code 403 from proxy after CONNECT
[param: action=upload&filename=Possessed+24+dostoyevsky+128kb.mp3&url=https%3A%2F%2Fwww.archive.org%2Fdownload%2Fpossessed_1404_librivox%2Fpossessed_24_dostoyevsky_128kb.mp3&comment=%3D%3D+%7B%7Bint%3Afiledesc%7D%7D+%3D%3D%0A%7B%7BInformation%0A%7CDescription%3D%7B%7Ben%7C%27%27The+Possessed%27%27%2C+by+Fyodor+Dostoyevsky%2C+in+English.+Translated+by+Constance+Garnett.+LibriVox+recording+by+Expatriate.+24+-+Pt.+1%2C+Ch.+05%3A+The+Subtle+Serpent+%28Sec.+5%29.%7D%7D+%0A%7B%7Bfr%7C%27%27The+Possessed%27%27%2C+par+Fiodor+Dosto%C3%AFevski%2C+en+anglais.+Traduction+par+Constance+Garnett.+Enregistrement+LibriVox+par+Expatriate.+24+-+Pt.+1%2C+Ch.+05%3A+The+Subtle+Serpent+%28Sec.+5%29.%7D%7D%0A%7CSource%3Dhttps%3A%2F%2Farchive.org%2Fdetails%2Fpossessed_1404_librivox+https%3A%2F%2Farchive.org%2Fdetails%2Fpossessed_1404_librivox%0A%7CDate%3D2014-04-26%0A%7CAuthor%3D%7B%7BCreator%3AFyodor+Dostoyevsky%7D%7D%0A%7B%7BCreator%3AConstance+Garnett%7D%7D%0A%7CPermission%3D%0A%7Cother_versions%3D%0A%7D%7D%0A%0A%3D%3D+%7B%7Bint%3Alicense-header%7D%7D+%3D%3D%0ARecording%0A%7B%7BLibriVox+public+domain%7D%7D%0AOriginal+%0A%7B%7BPD-old-100-expired%7D%7D%0A%0A%5B%5BCategory%3ALibriVox+-+The+Possessed%2C+by+Fyodor+Dostoyevsky%5D%5D%0A%5B%5BCategory%3AFiles+uploaded+by+Yann+Forget%5D%5D%0A&text=%3D%3D+%7B%7Bint%3Afiledesc%7D%7D+%3D%3D%0A%7B%7BInformation%0A%7CDescription%3D%7B%7Ben%7C%27%27The+Possessed%27%27%2C+by+Fyodor+Dostoyevsky%2C+in+English.+Translated+by+Constance+Garnett.+LibriVox+recording+by+Expatriate.+24+-+Pt.+1%2C+Ch.+05%3A+The+Subtle+Serpent+%28Sec.+5%29.%7D%7D+%0A%7B%7Bfr%7C%27%27The+Possessed%27%27%2C+par+Fiodor+Dosto%C3%AFevski%2C+en+anglais.+Traduction+par+Constance+Garnett.+Enregistrement+LibriVox+par+Expatriate.+24+-+Pt.+1%2C+Ch.+05%3A+The+Subtle+Serpent+%28Sec.+5%29.%7D%7D%0A%7CSource%3Dhttps%3A%2F%2Farchive.org%2Fdetails%2Fpossessed_1404_librivox+https%3A%2F%2Farchive.org%2Fdetails%2Fpossessed_1404_librivox%0A%7CDate%3D2014-04-26%0A%7CAuthor%3D%7B%7BCreator%3AFyodor+Dost
oyevsky%7D%7D%0A%7B%7BCreator%3AConstance+Garnett%7D%7D%0A%7CPermission%3D%0A%7Cother_versions%3D%0A%7D%7D%0A%0A%3D%3D+%7B%7Bint%3Alicense-header%7D%7D+%3D%3D%0ARecording%0A%7B%7BLibriVox+public+domain%7D%7D%0AOriginal+%0A%7B%7BPD-old-100-expired%7D%7D%0A%0A%5B%5BCategory%3ALibriVox+-+The+Possessed%2C+by+Fyodor+Dostoyevsky%5D%5D%0A%5B%5BCategory%3AFiles+uploaded+by+Yann+Forget%5D%5D%0A&assert=user&maxlag=5&format=json&token=a357894a5d3e285d10847ce9e8d8e0af64970e41%2B%5C;
 servedby: mw1493;
 help: See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.]
1 read operation
Script terminated successfully.

With a new script, I now get for some files:

File: landlady_04_dostoyevsky.mp3
Title: 04 - Part II, Chapter I
https://www.archive.org/download/landlady_2205_librivox/landlady_04_dostoyevsky.mp3
Sleeping for 8.1 seconds, 2023-06-28 17:58:56
WARNING: Waiting 5.0 seconds before retrying.
WARNING: Waiting 10.0 seconds before retrying.
WARNING: Waiting 20.0 seconds before retrying.
WARNING: Waiting 40.0 seconds before retrying.
WARNING: Waiting 80.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
Upload failed. Retrying...
WARNING: Waiting 5.0 seconds before retrying.
WARNING: Waiting 10.0 seconds before retrying.
WARNING: Waiting 20.0 seconds before retrying.
WARNING: Waiting 40.0 seconds before retrying.
WARNING: Waiting 80.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
Upload failed. Retrying...
WARNING: Waiting 5.0 seconds before retrying.
WARNING: Waiting 10.0 seconds before retrying.
WARNING: Waiting 20.0 seconds before retrying.
WARNING: Waiting 40.0 seconds before retrying.
WARNING: Waiting 80.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
ERROR: An error occurred for uri https://commons.wikimedia.org/w/api.php
ERROR: Traceback (most recent call last):
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/data/api/_requests.py", line 681, in _http_request
    response = http.request(self.site, uri=uri,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 282, in request
    r = fetch(baseuri, headers=headers, **kwargs)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 448, in fetch
    callback(response)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise response from None
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 439, in fetch
    response = session.request(method, uri,
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='commons.wikimedia.org', port=443): Read timed out. (read timeout=90)

WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
ERROR: An error occurred for uri https://commons.wikimedia.org/w/api.php
ERROR: Traceback (most recent call last):
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/data/api/_requests.py", line 681, in _http_request
    response = http.request(self.site, uri=uri,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 282, in request
    r = fetch(baseuri, headers=headers, **kwargs)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 448, in fetch
    callback(response)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise response from None
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 439, in fetch
    response = session.request(method, uri,
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
WARNING: Waiting 120.0 seconds before retrying.
Upload failed. Retrying...
ERROR: An error occurred for uri https://commons.wikimedia.org/w/api.php
ERROR: Traceback (most recent call last):
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/data/api/_requests.py", line 681, in _http_request
    response = http.request(self.site, uri=uri,
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 282, in request
    r = fetch(baseuri, headers=headers, **kwargs)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 448, in fetch
    callback(response)
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 333, in error_handling_callback
    raise response from None
  File "/cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/comms/http.py", line 439, in fetch
    response = session.request(method, uri,
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='commons.wikimedia.org', port=443): Read timed out. (read timeout=90)

The file uploaded after many retries:
https://commons.wikimedia.org/wiki/File:Landlady_04_dostoyevsky.mp3

I guess you have modified a script (don't know which one) with the following (seen in T338969):

retry_count = 3  # Number of upload retries
success = False

while retry_count > 0 and not success:
    try:
        filepage = pywikibot.FilePage(site, key)
        filepage.upload(
            source=filename,
            comment=desc_content,
            text=desc_content,
            chunk_size = "5M",
            ignore_warnings=False,
        )
        success = True
    except Exception as e:
        print("Upload failed. Retrying...")
        retry_count -= 1

if not success:
    print("Upload failed after retries.")

Pywikibot already retries uploading 15 times by default. Your change multiplies this by 3, which means 45 retries for timeouts. You have already increased the socket timeouts, and I have no clue what else to do on the Pywikibot side; it seems to be a server-side problem in the Commons API.
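To make the arithmetic concrete, here is a small sketch of the wait schedule visible in the log above (5, 10, 20, 40, 80, then 120 seconds repeatedly). It assumes the wait doubles after each retry and is capped at a maximum; the parameter names mirror the Pywikibot settings retry_wait, retry_max and max_retries, but the function itself is only an illustration inferred from the log output, not Pywikibot code.

```python
def backoff_schedule(retry_wait=5.0, retry_max=120.0, max_retries=15):
    """Return the assumed list of waits (in seconds) for one upload call."""
    waits = []
    wait = retry_wait
    for _ in range(max_retries):
        waits.append(wait)
        # Assumption: the wait doubles each time, capped at retry_max.
        wait = min(wait * 2, retry_max)
    return waits


inner = backoff_schedule()
outer_attempts = 3  # the extra retry loop in the user's script

total_retries = len(inner) * outer_attempts
total_wait_minutes = sum(inner) * outer_attempts / 60

print(inner[:7])           # [5.0, 10.0, 20.0, 40.0, 80.0, 120.0, 120.0]
print(total_retries)       # 45
print(total_wait_minutes)  # 67.75
```

Under these assumptions, the wrapper loop spends roughly 68 minutes just waiting before it gives up on a single file, not counting the time spent in the requests themselves.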

The Python script I use now which produces the errors above:

while retry_count > 0 and not success:
    try:
        filepage = pywikibot.FilePage(site, key)
        filepage.upload(
            source=filename,
            comment=desc_content,
            text=desc_content,
            chunk_size = "5M",
            ignore_warnings=False,
        )
        success = True
    except Exception as e:
        print("Upload failed. Retrying...")
        retry_count -= 1

if not success:
    print("Upload failed after retries.")

Hi, I removed the retry_count, and I now get another issue:

Code
except exceptions.UploadError as e:
    # https://doc.wikimedia.org/pywikibot/master/api_ref/exceptions.html#exceptions.UploadError
    print(f"Upload failed with error code: {e.code}")
    print(f"Error message: {e.message}")
    print(f"File key: {e.file_key}")
    print(f"Offset: {e.offset}")
Error
File: workspoevol4_05_poe.mp3
Title: 05 - The System of Doctor Tarr and Professor Fether
https://www.archive.org/download/poeraven_vol4_1308_librivox/workspoevol4_05_poe.mp3
Sleeping for 3.3 seconds, 2023-07-01 21:43:47
WARNING: Waiting 5.0 seconds before retrying.
WARNING: Waiting 10.0 seconds before retrying.
WARNING: /cygdrive/c/Users/yannf/OneDrive/Documents/core_stable/pywikibot/site/_upload.py:133: UserWarning: The upload returned 2 warnings: exists, nochange
  return self._upload(self.ignore_warnings, self.report_success)

Upload failed with error code: exists
Error message: File Workspoevol4_05_poe.mp3 already exists.
File key: None
Offset: False
Upload failed after retries.

The file uploaded successfully before the bot retried uploading it: https://commons.wikimedia.org/wiki/File:Workspoevol4_05_poe_128kb.mp3 (37.69 MB)
It seems the server didn't send a successful upload message, so the bot retried uploading the file even though the upload had already succeeded.
This is not a single occurrence of this behavior. I have many examples of such cases.
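One possible way to guard a custom retry loop against this case is to check whether the target already exists before each new attempt. The sketch below is hypothetical and uses two injected callables rather than real Pywikibot calls; in a bot they might wrap FilePage.exists() and FilePage.upload(), with the caveat that cached state on the page object may need to be refreshed first and that an existing page does not prove the file content is identical.

```python
from typing import Callable


def upload_with_exists_check(
    exists: Callable[[], bool],
    upload: Callable[[], None],
    attempts: int = 3,
) -> str:
    """Try an upload, but re-check the target before every retry.

    Both callables are hypothetical stand-ins supplied by the caller.
    """
    for attempt in range(1, attempts + 1):
        if exists():
            # Present before the first try: nothing to do.
            # Present before a retry: an earlier attempt probably landed.
            return "already-present" if attempt == 1 else "uploaded-unconfirmed"
        try:
            upload()
        except Exception as error:
            print(f"Attempt {attempt} raised {error!r}; re-checking the target")
            continue
        return "uploaded"
    # The last attempt also raised; check one final time.
    return "uploaded-unconfirmed" if exists() else "failed"
```

The "uploaded-unconfirmed" result covers the situation reported here: the request raised an error on the client, yet the file is present on the server afterwards.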

This still happens from time to time. Any possible fix? https://commons.wikimedia.org/wiki/File:Atlas_von_Asien_(15177030).jpg was never uploaded by the bot after many attempts, while it uploaded fine manually. All other files in the same category uploaded fine: https://commons.wikimedia.org/wiki/Category:Atlas_von_Asien

I have no idea what to do here on the Pywikibot side except using wait cycles, which are already implemented. You may increase the maximum time to wait before resubmitting a failed API request (retry_max), which is two minutes by default. You can do this in your user-config.py or as a global option, e.g. pwb -retry_max:3600 for one hour, which retries for over 6 hours until the script fails. OK, this looks very inefficient, and the right way would be to use a queue for retrying. (T342147)
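As an illustration of the queue idea at the script level, the following sketch moves a failed item to the back of a queue so that the remaining files are processed before it is tried again. This is not what T342147 implements; the function name and the upload callable are hypothetical and would have to be supplied by the bot script.

```python
from collections import deque
from typing import Callable, Iterable, List, Tuple


def process_with_requeue(
    items: Iterable[str],
    upload: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Upload items; a failed item goes to the back of the queue."""
    queue = deque((item, 1) for item in items)
    done: List[str] = []
    failed: List[str] = []
    while queue:
        item, attempt = queue.popleft()
        try:
            upload(item)
        except Exception:
            if attempt < max_attempts:
                # Retry later, after the other queued items.
                queue.append((item, attempt + 1))
            else:
                failed.append(item)
        else:
            done.append(item)
    return done, failed
```

Compared with sleeping in place, the time between two attempts on the same file is spent on other uploads, so one problematic file does not block the whole run.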