When I start weblinkchecker in pywikibot-core
python pwb.py weblinkchecker -start:!
with MediaWiki 1.23.7 and Python 2.7.8,
I get the following warning multiple times:
"WARNING: API warning (query): Too many values supplied for parameter 'pageids': the limit is 50"
and only a subset of the pages is scanned.
This refers to commit 9660f18689130835a27eb67d90aad71157520bd3 of https://gerrit.wikimedia.org/r/pywikibot/core.git
The MediaWiki user is in the bot group, which has the following permission:
$wgGroupPermissions['bot']['apihighlimits'] = true;
The only way to get the script working as expected was the following patch:
diff --git a/scripts/weblinkchecker.py b/scripts/weblinkchecker.py
index d4f511b..87ecebb 100644
--- a/scripts/weblinkchecker.py
+++ b/scripts/weblinkchecker.py
@@ -902,9 +902,8 @@ def main(*args):
     if gen:
         if namespaces != []:
             gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
-        # fetch at least 240 pages simultaneously from the wiki, but more if
-        # a high thread number is set.
-        pageNumber = max(240, config.max_external_links * 2)
+        # fetch at 50 pages simultaneously from the wiki
+        pageNumber = 50
         gen = pagegenerators.PreloadingGenerator(gen, step=pageNumber)
         gen = pagegenerators.RedirectFilterPageGenerator(gen)
         bot = WeblinkCheckerRobot(gen, HTTPignore, day)
Of course, this is not a patch I recommend applying; it is only a workaround that makes weblinkchecker usable for me.
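
One possible direction for a cleaner fix might be to cap the requested batch size at whatever limit the API reports, instead of hard-coding it. The following is only a rough sketch in plain Python and does not use pywikibot at all: the names API_LIMIT, clamp_step and batched are hypothetical, and the value 50 is simply the limit quoted in the warning above, not something queried from the wiki.

```python
from itertools import islice

# Assumption: 50 is the limit quoted in the API warning; a real fix
# would presumably obtain the effective limit from the wiki instead.
API_LIMIT = 50


def clamp_step(requested, api_limit=API_LIMIT):
    """Hypothetical helper: never ask for more pages than the API accepts."""
    return min(requested, api_limit)


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for page ids; 240 mirrors the default batch size in the script.
page_ids = range(1, 241)
step = clamp_step(240)
for batch in batched(page_ids, step):
    # Each batch now fits within the limit, so no ids would be dropped.
    print(len(batch))
```

With a clamp like this, a request for 240 pages would be split into several smaller requests rather than being truncated by the API.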