Site.purgepages() does not handle rate limit
Closed, Resolved (Public)

Description

A rate limit is imposed on anyone but sysops and bureaucrats when using the purge function. The default value is max 30 pages per 60 seconds.

Right now there is nothing in Site.purgepages() that prevents it from sending too many requests, nor is there anything in data/api.py or comms/http.py that handles the API response other than raising it as a warning:
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again

Event Timeline

As a workaround:

import time
from pywikibot import config as pwb_config

batch_size = 30
rate_limit = 65  # seconds per batch; default limit is 30 purges per 60 seconds
max_timeout = 300

# bump timeout
old_timeout = pwb_config.socket_timeout
pwb_config.socket_timeout = max_timeout

try:
    while pages:
        batch = pages[:batch_size]
        pages = pages[batch_size:]
        pre_timepoint = time.time()
        result = site.purgepages(batch, **requestparams)

        if pages:
            # wait out the rest of the window before sending the next batch
            duration = time.time() - pre_timepoint
            time.sleep(max(0, rate_limit - duration))
finally:
    # reset timeout, even if a purge request raised
    pwb_config.socket_timeout = old_timeout

Here pages is the list of pages you wish to purge on the given site, and requestparams is a dict of any parameters (such as forcelinkupdate) that you want to pass on to site.purgepages().

Even this still triggers the rate limit on rare occasions, but overall it was quite stable.
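
The same pacing idea can be written as a reusable helper that also accepts generators. This is only a sketch: purge_paced, batched, and the clock and sleep parameters are hypothetical names introduced here, not part of Pywikibot, and the 65-second period is simply the value used in the workaround above.

```python
import time
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def purge_paced(purge, pages, batch_size=30, period=65.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call purge(batch) per batch, starting at most one batch per period.

    `purge` is any callable taking a list of pages, for example
    lambda batch: site.purgepages(batch, **requestparams).
    """
    results = []
    batches = list(batched(pages, batch_size))
    for index, batch in enumerate(batches):
        started = clock()
        results.append(purge(batch))
        if index < len(batches) - 1:  # no need to wait after the last batch
            sleep(max(0.0, period - (clock() - started)))
    return results
```

Passing the clock and sleep functions in makes the pacing testable without waiting. Like the workaround, this only spaces out requests and does not react to a rate limit warning once it has been returned.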

Xqt subscribed.

Cannot reproduce it:

import pywikibot
s = pywikibot.Site()
p = pywikibot.Page(s, 'Alan Smithee')
p.purge()
>>>
Sleeping for 4.4 seconds, 2022-03-27 17:33:00
<<<
True

The throttle object is called during APISite.purgepages():

print('>>>')
result = req.submit()
print('<<<')

Ah, I see: There is a warning but no error:

self._handle_warnings(result)

if 'error' not in result:
    return result

The ratelimit handler is called for the 'ratelimited' error code only:

if code == 'ratelimited':
    self._ratelimited()
    continue

Isn't this a MediaWiki issue then?
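
To illustrate the gap: since the response contains a warning and no 'error' key, a caller would have to inspect the warnings itself. A minimal sketch of such a check follows. It assumes the decoded response is a dict shaped like {'warnings': {'purge': {'*': '...'}}}; that shape, the helper name has_ratelimit_warning, and matching on the English message text are all assumptions made for illustration, not Pywikibot code.

```python
# Phrase taken from the warning text quoted in the task description.
RATE_LIMIT_PHRASE = 'exceeded your rate limit'


def has_ratelimit_warning(result, module='purge'):
    """Return True if `result` carries a rate limit warning for `module`.

    Assumes a decoded JSON response shaped like
    {'warnings': {'purge': {'*': '...message...'}}}. The inner key name
    is an assumption, so every string value under the module is checked.
    """
    warning = result.get('warnings', {}).get(module, {})
    if isinstance(warning, str):
        texts = [warning]
    else:
        texts = [v for v in warning.values() if isinstance(v, str)]
    return any(RATE_LIMIT_PHRASE in text for text in texts)
```

Matching on message text is fragile, for example if the wiki returns localized messages, which is one reason a distinct 'ratelimited' error code is easier to handle.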

Xqt triaged this task as Medium priority.

Got this message with:
pwb touch -purge -start:! -pt:0

Change 774015 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [IMPR] Handle ratelimit with purgepages()

https://gerrit.wikimedia.org/r/774015

Behaviour before this change:

C:\pwb\GIT\core>pwb touch -start:! -purge -pt:0
Retrieving 50 pages from wikipedia:de.
Page [[de:!!!]] purged
Page [[de:!Women Art Revolution]] purged
Page [[de:!distain]] purged
Page [[de:"]] purged
Page [[de:"Over the beach"-Fähigkeit]] purged
Page [[de:$]] purged
Page [[de:$100,000 Nanjing 2015]] purged
Page [[de:$100,000 Shenzhen 2016]] purged
Page [[de:$100,000 Shenzhen 2017]] purged
Page [[de:$100,000 Shenzhen 2018]] purged
Page [[de:$50,000 Liuzhou 2016]] purged
Page [[de:$50,000 Suzhou 2016]] purged
Page [[de:$50,000 Tianjin 2016]] purged
Page [[de:$50,000 Waco Showdown 2016]] purged
Page [[de:$50,000 Zhengzhou 2016]] purged
Page [[de:$50,000 Zhuhai 2016]] purged
Page [[de:$50SAT]] purged
Page [[de:$80,000 Waco Showdown 2017]] purged
Page [[de:$ick]] purged
Page [[de:&Me]] purged
Page [[de:&RQ]] purged
Page [[de:& Radieschen]] purged
Page [[de:(((echo)))]] purged
Page [[de:(.)p(...)nin]] purged
Page [[de:(030) Magazin]] purged
Page [[de:(1) Ceres]] purged
Page [[de:(1,5-Cyclooctadien)(1,3,5-cyclooctatrien)ruthenium]] purged
Page [[de:(10) Hygiea]] purged
Page [[de:(100) Hekate]] purged
Page [[de:(1000) Piazzia]] purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10000) Myriostos]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100000) Astronautica]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10001) Palermo]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100027) Hannaharendt]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100033) Taizé]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10007) Malytheatre]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(1001) Gaussia]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10010) Rudruna]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100133) Demosthenes]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10015) Valenlebedev]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(1002) Olbersia]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(10026) Sophiexeon]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100267) JAXA]] not purged
WARNING: API warning (purge): You've exceeded your rate limit. Please wait some time and try again.
Page [[de:(100268) Rosenthal]] not purged
...

Behaviour after this change:

C:\pwb\GIT\core>pwb touch -start:! -purge -pt:0
Retrieving 50 pages from wikipedia:de.
Page [[de:!!!]] purged
Page [[de:!Women Art Revolution]] purged
Page [[de:!distain]] purged
Page [[de:"]] purged
Page [[de:"Over the beach"-Fähigkeit]] purged
Page [[de:$]] purged
Page [[de:$100,000 Nanjing 2015]] purged
Page [[de:$100,000 Shenzhen 2016]] purged
Page [[de:$100,000 Shenzhen 2017]] purged
Page [[de:$100,000 Shenzhen 2018]] purged
Page [[de:$50,000 Liuzhou 2016]] purged
Page [[de:$50,000 Suzhou 2016]] purged
Page [[de:$50,000 Tianjin 2016]] purged
Page [[de:$50,000 Waco Showdown 2016]] purged
Page [[de:$50,000 Zhengzhou 2016]] purged
Page [[de:$50,000 Zhuhai 2016]] purged
Page [[de:$50SAT]] purged
Page [[de:$80,000 Waco Showdown 2017]] purged
Page [[de:$ick]] purged
Page [[de:&Me]] purged
Page [[de:&RQ]] purged
Page [[de:& Radieschen]] purged
Page [[de:(((echo)))]] purged
Page [[de:(.)p(...)nin]] purged
Page [[de:(030) Magazin]] purged
Page [[de:(1) Ceres]] purged
Page [[de:(1,5-Cyclooctadien)(1,3,5-cyclooctatrien)ruthenium]] purged
Page [[de:(10) Hygiea]] purged
Page [[de:(100) Hekate]] purged
Page [[de:(1000) Piazzia]] purged
WARNING: You've exceeded your rate limit.
WARNING: Waiting 3.0 seconds before retrying.
WARNING: You've exceeded your rate limit.
WARNING: Waiting 6.0 seconds before retrying.
WARNING: You've exceeded your rate limit.
WARNING: Waiting 12.0 seconds before retrying.
WARNING: You've exceeded your rate limit.
WARNING: Waiting 24.0 seconds before retrying.
WARNING: You've exceeded your rate limit.
WARNING: Waiting 48.0 seconds before retrying.
Page [[de:(10000) Myriostos]] purged
Page [[de:(100000) Astronautica]] purged
Page [[de:(10001) Palermo]] purged
Page [[de:(100027) Hannaharendt]] purged
Page [[de:(100033) Taizé]] purged
Page [[de:(10007) Malytheatre]] purged
Page [[de:(1001) Gaussia]] purged
Page [[de:(10010) Rudruna]] purged
...
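
The waits in the log above (3, 6, 12, 24, 48 seconds) double after each failed attempt. As a rough sketch of that retry pattern, and not the code from the patch: retry_with_backoff and its parameters are hypothetical names, and attempt stands for any callable that returns True once the purge succeeded.

```python
import time


def retry_with_backoff(attempt, initial=3.0, factor=2.0, max_retries=5,
                       sleep=time.sleep):
    """Call attempt() until it returns True, multiplying the wait each time.

    Waits `initial` seconds after the first failure and `factor` times
    longer after each further one. Returns True on success, or False if
    the attempt still fails after `max_retries` waits.
    """
    delay = initial
    for _ in range(max_retries):
        if attempt():
            return True
        sleep(delay)
        delay *= factor
    return bool(attempt())
```

With the defaults, five failed attempts produce the same sequence of waits as in the log before the final try.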
Xqt removed projects: Patch-For-Review, TestMe.

Change 774015 merged by jenkins-bot:

[pywikibot/core@master] [IMPR] Handle ratelimit with purgepages()

https://gerrit.wikimedia.org/r/774015