invalid CSRF token error shown with each block
Closed, Resolved | Public | Bug Report

Description

I am using a bot to block open proxies; the bot works without any problem (blocks are successfully done) but after each block I see this warning on terminal:

WARNING: API error badtoken: Invalid CSRF token

The bot's code can be seen here.

Event Timeline

Xqt triaged this task as High priority.Jul 30 2019, 1:10 AM
Xqt changed the subtype of this task from "Task" to "Bug Report".
Xqt added a subscriber: Framawiki.

I examined this further. My bot issues several blocks each time I run it. The first block never causes the warning to show up; all subsequent blocks do. I will try to investigate more, but thought sharing this here could help others who may also be investigating.

Can't reproduce with a loop blocking five accounts on a dev wiki.
@Huji are you sure that the warning comes from the line where the block is made?

That is my best guess. Here is an excerpt of the relevant portions of the output of my bot in one of its recent runs:

...
Checking 5.113.58.172
Checking 38.91.100.235
Logging in to wikipedia:fa as HujiBot
WARNING: API warning (login): Main-account login via "action=login" is deprecated and may stop working without warning. To continue login with "action=login", see [[Special:BotPasswords]]. To safely continue using main-account login, see "action=clientlogin".
WARNING: API warning (main): Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes. Use [[Special:ApiFeatureUsage]] to see usage of deprecated features by your application.
WARNING: API error badtoken: Invalid CSRF token.
Sleeping for 4.8 seconds, 2019-08-14 00:05:33
Checking 185.176.57.25
...

38.91.100.235 was the fifth IP it blocked in that run. You can see that the CSRF warning is shown right around when the block is done.

Huji renamed this task from invalid CSRF token error shown after each block to invalid CSRF token error shown with each block.Aug 14 2019, 12:59 AM

Could you give BotPasswords or OAuth a try, please, and see whether the same Invalid CSRF token error occurs?

Sure, I will give OAuth a try. But first, I need someone to add my bot to the "confirmed" group on Meta.

Alright, I tried OAuth and when I was creating the consumer on meta, I made sure to check "Block and unblock users". However, when the bot gets to the point that it tries to block an IP I get this error message:

...
Checking 109.169.72.36
Traceback (most recent call last):
  File "pwb.py", line 297, in <module>
    if not main():
  File "pwb.py", line 292, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 96, in run_python_file
    main_mod.__dict__)
  File "./scripts/userscripts/findproxy.py", line 236, in <module>
    robot.find_proxies()
  File "./scripts/userscripts/findproxy.py", line 209, in find_proxies
    anononly=False, allowusertalk=True)
  File "/home/huji/bot/pywikibot/site.py", line 1320, in callee
    self.login(True)
  File "/home/huji/bot/pywikibot/site.py", line 2069, in login
    raise NoUsername('No sysop is permitted with OAuth')
pywikibot.exceptions.NoUsername: No sysop is permitted with OAuth

Looking at site.py, it seems the code actually disallows using OAuth for sysop accounts. (I believe it is because OAuth does not provide a way to ascertain the username associated with a consumer ... but should it matter?)

Anyway, unless I am doing something overtly wrong, OAuth is not going to be the answer here. Let me see if I can get BotPasswords to work.

With BotPasswords, the CSRF error was not shown when the first block was done. When a second block was attempted, I got a Login failed error. Here are some relevant portions of the log:

...
Checking 5.250.46.252
Checking 109.169.72.36
Logging in to wikipedia:fa as HujiBot@HujiBot
Checking 188.229.7.134
Checking 5.106.138.33
...
Checking 2A02:A03F:52DB:3500:51F8:75AD:3798:726C
Checking 204.14.73.69
Logging in to wikipedia:fa as HujiBot@HujiBot
ERROR: Login failed (Aborted).
Password for user HujiBot@HujiBot on wikipedia:fa (no characters will be shown):

Note that my bot successfully blocked 109.169.72.36 (that is where the first login attempt happened, and it was successful); later, it tried to block 204.14.73.69 and, because each block must be preceded by a new login, it tried to log in again and failed. This is all using BotPasswords.

I believe it is because OAuth does not provide a way to ascertain the username associated with a consumer .. but should it matter?

Sounds like T142303 (which, btw, I didn't know about, sorry about that).

I cannot reproduce. The following script worked fine for me:

import pywikibot
from pywikibot import Site

proxies = ('198.16.74.205', '204.14.73.69', '185.217.117.2')

class FindProxyBot():

    def __init__(self):
        self.site = Site('fa', 'wpbeta')
        self.target = 'ویکی‌پدیا:گزارش دیتابیس/کشف پروکسی'
        self.summary = 'روزآمدسازی نتایج (وظیفه ۲۲)'
        self.blocksummary = '{{پروکسی باز}}'

    def find_proxies(self):

        for ip in proxies:
            pywikibot.output('Checking %s' % ip)
            target = pywikibot.User(self.site, ip)
            if not target.isBlocked():
                self.site.blockuser(
                    target, '1 year', self.blocksummary,
                    anononly=False, allowusertalk=True)


robot = FindProxyBot()
robot.find_proxies()

Result:

Logging in to wpbeta:fa as Dalba@Dalba
Checking 198.16.74.205
Checking 204.14.73.69
Checking 185.217.117.2

I wonder why you're being prompted for typing your password again, maybe your user-config.py/password_file is not configured properly?

I have the following lines in mine:

user-config.py:

usernames['*']['*'] = 'Dalba'
sysopnames['*']['*'] = 'Dalba'
password_file = 'user-password.py'

user-password.py:

('wpbeta', 'Dalba', BotPassword('Dalba', '<redacted>'))

I wonder why you're being prompted for typing your password again, maybe your user-config.py/password_file is not configured properly?

You should delete the pywikibot.lwp file or, better, download or clone a fresh copy of Pywikibot into a new folder.

I wonder why you're being prompted for typing your password again, maybe your user-config.py/password_file is not configured properly?

I have the following lines in mine:

user-config.py:

usernames['*']['*'] = 'Dalba'
sysopnames['*']['*'] = 'Dalba'
password_file = 'user-password.py'

Mine is like this:

usernames['wikipedia']['fa'] = 'HujiBot'
sysopnames['wikipedia']['fa'] = 'HujiBot'
password_file = "/home/huji/bot/user-password.py"

user-password.py:

('wpbeta', 'Dalba', BotPassword('Dalba', '<redacted>'))

Mine is like this:

('HujiBot', BotPassword('HujiBot', 'REDACTED'))

I deleted the file (which essentially clears the cookies for Pywikibot). But when I re-ran the bot I still ran into the same issue (of it asking me to log in again). This is based on a fresh copy of Pywikibot as of 8b35e732403fb7.

@Dalba I just ran this simplified bot on fawiki and ran into the same issue (of it asking me to log in again):

import pywikibot
from pywikibot import Site

proxies = ('198.16.74.205', '204.14.73.69', '185.217.117.2')

class FindProxyBot():

    def __init__(self):
        self.site = Site('fa', 'wikipedia')
        self.target = 'ویکی‌پدیا:گزارش دیتابیس/کشف پروکسی'
        self.summary = 'روزآمدسازی نتایج (وظیفه ۲۲)'
        self.blocksummary = '{{پروکسی باز}}'

    def find_proxies(self):

        for ip in proxies:
            pywikibot.output('Checking %s' % ip)
            target = pywikibot.User(self.site, ip)
            if target.isBlocked():
                pywikibot.output('Unblocking %s' % ip)
                self.site.unblockuser(target)
            pywikibot.output('Blocking %s' % ip)
            self.site.blockuser(
                target, '1 year', self.blocksummary,
                anononly=False, allowusertalk=True)


robot = FindProxyBot()
robot.find_proxies()

Of note, the bot first unblocked the previously blocked IP, and when it tried to block it again, it failed. See Special:Logs/HujiBot

@Huji, are you running multiple pywikibot processes on the same machine? Maybe this is the result of a race condition between them. If so, try it with a single process per user.

No. Only one instance, run in solitude.

Mine is like this:

('HujiBot', BotPassword('HujiBot', 'REDACTED'))

Unless your wiki's family name is HujiBot, it should be rewritten like this:

('wikipedia', BotPassword('HujiBot', 'REDACTED'))

https://www.mediawiki.org/wiki/Manual:Pywikibot/BotPasswords#password_file_entries_format explains the meaning of each item in those entries.

Correct. Here is how I have it now:

('fa', 'wikipedia', 'HujiBot', BotPassword('HujiBot', 'REDACTED'))
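The entries shown in this thread have two, three, and four items. As a rough sketch of how such tuples can be read, assuming the forms (username, password), (family, username, password) and (code, family, username, password) described in the Pywikibot documentation: `describe_entry` is a hypothetical helper written for this illustration, not a Pywikibot function, and the `BotPassword` below is only a stand-in so the snippet runs without Pywikibot installed.

```python
from collections import namedtuple

# Stand-in for pywikibot's BotPassword so this sketch is self-contained.
BotPassword = namedtuple('BotPassword', ['suffix', 'password'])


def describe_entry(entry):
    """Return (code, family, username, password); '*' means "any".

    Hypothetical helper: it only mirrors the documented tuple shapes.
    """
    if len(entry) == 4:
        code, family, username, password = entry
    elif len(entry) == 3:
        code = '*'
        family, username, password = entry
    elif len(entry) == 2:
        code = family = '*'
        username, password = entry
    else:
        raise ValueError('expected 2, 3 or 4 items, got %d' % len(entry))
    return code, family, username, password


# The four-item form pins the entry to one language code and one family.
entry = ('fa', 'wikipedia', 'HujiBot', BotPassword('HujiBot', 'REDACTED'))
code, family, username, password = describe_entry(entry)
print(code, family, username)
```

Under this reading, the first item of a two-item entry is the username, while the family only appears in the three-item and four-item forms.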

I verified that login.py can be run with this configuration without any issues. I have also updated the bot script so that it explicitly says when it is trying to block an IP. When I ran my bot again, it did one block successfully but asked for my password when attempting a second block. Here are the logs:

...
Checking 5.210.61.111
Checking 136.244.84.19
Blocking 136.244.84.19
Logging in to wikipedia:fa as HujiBot@HujiBot
Checking 89.198.39.196
Checking 83.123.81.153
Checking 5.209.53.251
Checking 5.235.14.22
Checking 31.56.92.175
Checking 31.56.98.144
Checking 31.2.244.214
Checking 5.127.70.237
Checking 5.117.132.109
Checking 86.55.175.176
Checking 5.122.72.65
Checking 83.123.54.83
Checking 85.93.89.107
Blocking 85.93.89.107
Logging in to wikipedia:fa as HujiBot@HujiBot
ERROR: Login failed (Aborted).
Password for user HujiBot@HujiBot on wikipedia:fa (no characters will be shown):

I'm curious to know if e.info contains any more details. Adding the following print statement to login.py should print it:

pywikibot-core[master]
$ git diff
diff --git a/pywikibot/login.py b/pywikibot/login.py
index 3bc76619..b9bee2b2 100644
--- a/pywikibot/login.py
+++ b/pywikibot/login.py
@@ -304,6 +304,7 @@ class LoginManager(object):
         except pywikibot.data.api.APIError as e:
             error_code = e.code
             pywikibot.error('Login failed ({}).'.format(error_code))
+            print(e.info, self.login_name, self.password)
             if error_code in self._api_error:
                 error_msg = 'Username "{}" {} on {}'.format(
                     self.login_name, self._api_error[error_code], self.site)

(also check username and password)

First of all, I found out something really interesting: when I run the bot against test.wikipedia.org it works without any issues. When I run it against fa.wikipedia.org I get that Login Failed error followed by the script asking for my password.

This made me remember that on fawiki we do not assign admin bots to the "sysop" group; instead, we assign them to the "botadmin" group. I added my bot to the "sysop" group as well and re-ran it, and it did not show any error messages. So whatever the issue is, it has to do with the "botadmin" group (or more specifically, with the differences between its rights and those of the "sysop" group).

Comparing the rights of the two groups, the only thing that stood out was that "sysop" has the "noratelimit" right but "botadmin" does not. But my bot is both in the "botadmin" and the "bot" group and the latter has the "noratelimit" right. To further make sure this is *not* the cause of the problem, I modified the script so that it would wait 20 seconds before each blocking and it still showed the Login Failed error.

With that knowledge, I also did what you asked me, @Dalba, and here is the output:

Cannot log in when using MediaWiki\Session\BotPasswordSessionProvider sessions. HujiBot@HujiBot REDACTED

The value of the REDACTED is the same as my BotPasswords key, obviously. I am not sure how to interpret that error message.

This can potentially be a MediaWiki bug (such as a hardcoded "sysop" value somewhere in the API code), so I am going to add a MW tag as well.

And here is a comparison of the rights of the "sysop" group to those my bot holds by being in both the "bot" and "botadmin" groups:

right | sysop, bot or botadmin
abusefilter-log | X
abusefilter-log-detail | X
abusefilter-log-private | X
abusefilter-modify | X
abusefilter-modify-restricted | X
abusefilter-revert | X
abusefilter-view | X
abusefilter-view-private | X
apihighlimits | XX
autoconfirmed | XX
autopatrol | XX
autoreview | X
block | XX
blockemail | X
bot | X
browsearchive | X
createaccount | X
delete | XX
deletechangetags | X
deletedhistory | X
deletedtext | X
deletelogentry | X
deleterevision | X
editcontentmodel | X
editinterface | X
editprotected | XX
editsemiprotected | XX
editsitejson | X
edituserjson | X
extendedconfirmed | XX
flow-delete | X
flow-edit-post | X
flow-lock | X
globalblock-whitelist | X
import | X
ipblock-exempt | XX
managechangetags | X
markbotedits | X
massmessage | X
mergehistory | XX
move | X
move-categorypages | XX
move-rootuserpages | X
move-subpages | X
movefile | X
movestable | X
nominornewtalk | X
noratelimit | XX
nuke | X
oathauth-enable | XX
oathauth-view-log | X
override-antispoof | X
patrol | X
protect | XX
reupload | X
reupload-shared | X
review | X
rollback | X
skipcaptcha | XX
stablesettings | X
suppressredirect | XX
tboverride | X
titleblacklistlog | X
transcode-reset | X
transcode-status | X
undelete | XX
unwatchedpages | X
upload | X
urlshortener-create-url | X
validate | X
writeapi | X

I think this is pywikibot's @must_be decorator attempting to log the bot in as a sysop, which your bot couldn't do without being in the sysop group.
T71283: dualism between user and sysop needs to be overtaken
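To make that hypothesis concrete, here is a deliberately simplified toy model, not Pywikibot's actual `must_be` implementation. It assumes the decorator forces a login whenever the current user is not in the required group, and that the server refuses a login from a session that is already logged in. `ToySite` and `LoginRefused` are invented for this sketch.

```python
import functools


class LoginRefused(Exception):
    """Toy stand-in for the server rejecting a second login."""


def must_be(group):
    """Toy decorator: force a login unless the user is in `group`."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # The group test, not the user's rights, decides whether to log in.
            if group not in self.groups:
                self.login()
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class ToySite:
    """Hypothetical site object holding only what the sketch needs."""

    def __init__(self, groups):
        self.groups = set(groups)
        self.logged_in = False

    def login(self):
        if self.logged_in:
            raise LoginRefused('cannot log in again within this session')
        self.logged_in = True

    @must_be('sysop')
    def blockuser(self, target):
        return 'blocked %s' % target


site = ToySite({'bot', 'botadmin'})
print(site.blockuser('198.16.74.205'))      # first call logs in, then blocks
try:
    site.blockuser('204.14.73.69')          # second call tries to log in again
except LoginRefused as error:
    print('second block failed:', error)
```

In this model the first block succeeds and the second one fails at the login step, which is at least consistent with the logs above; a user whose groups include "sysop" never reaches the login branch at all.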

Huji removed a project: MediaWiki-User-management.

I think I found the problem. Removing the MW tag, because this is indeed a Pywikibot bug. And my guess was on point: hard-coded "sysop" values are the cause. You can find them here and here. Essentially, Pywikibot presumes that *only* sysops can block. This is wrong; instead of checking the user's groups, it should check the user's rights. I will submit a patch shortly, which fixes my problem and also avoids other similar problems in the future.
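The difference between the two checks can be sketched in a few lines. This is an illustration only, not the submitted patch: the function names are hypothetical, and the sample groups and rights are taken from the comparison earlier in this task (the bot is in "bot" and "botadmin", and holds the "block" right).

```python
def may_block_by_group(user_groups):
    """Group-based check: assumes only members of 'sysop' can block."""
    return 'sysop' in user_groups


def may_block_by_right(user_rights):
    """Rights-based check: asks for the 'block' right itself."""
    return 'block' in user_rights


# An admin bot on a wiki that uses a separate "botadmin" group.
groups = {'bot', 'botadmin'}
rights = {'block', 'delete', 'protect', 'noratelimit'}

print('group check:', may_block_by_group(groups))
print('rights check:', may_block_by_right(rights))
```

The group check rejects this account even though it holds the "block" right, while the rights check accepts it regardless of what the wiki calls its admin groups.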

Change 531589 had a related patch set uploaded (by Huji; owner: Huji):
[pywikibot/core@master] Check a user's rights, not groups, to ascertain permissions

https://gerrit.wikimedia.org/r/531589

I just confirmed that the patch above fixes the issue both using the traditional username and password based user configuration, as well as using the BotPasswords configuration.

The OAuth approach does not work, but that is not due to the issue discussed in this task, so I will open a separate task for it.

I think this is pywikibot's @must_be decorator attempting to log the bot in as a sysop, which your bot couldn't do without being in the sysop group.
T71283: dualism between user and sysop needs to be overtaken

Did not mean to ignore you here.

The patch I submitted will do away with checking membership in the "sysop" group. Group membership is essentially irrelevant. Rights matter, groups don't. Now that we can check a user's rights, we should have a much easier time doing away with the user/sysop dualism.

I think the bot shouldn't check user rights either. Do what the bot owner has instructed it to do, and if you lack the permissions, the API will return a relevant error.
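A minimal sketch of that "just try it" approach, under stated assumptions: `fake_api_block` is a toy stand-in for the server, `APIError` is a local class rather than Pywikibot's, and the "permissiondenied" code is used purely for illustration.

```python
class APIError(Exception):
    """Local stand-in for an API error carrying a code and a message."""

    def __init__(self, code, info):
        super().__init__('%s: %s' % (code, info))
        self.code = code
        self.info = info


def fake_api_block(user_rights, target):
    """Toy server: it alone decides whether the block is permitted."""
    if 'block' not in user_rights:
        raise APIError('permissiondenied', 'You cannot block users.')
    return {'block': {'user': target}}


def try_block(user_rights, target):
    """Client side: no pre-checks, just attempt the action and report."""
    try:
        return fake_api_block(user_rights, target)
    except APIError as error:
        print('Block of %s failed (%s)' % (target, error.code))
        return None


try_block({'edit'}, '198.16.74.205')
print(try_block({'block'}, '198.16.74.205'))
```

With this design the client carries no knowledge of groups or rights; the cost is that a missing permission is only discovered once the request has been sent.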

@Ciencia_Al_Poder I agree with you in essence. All of these checks are a side effect of the fact that we have historically allowed a user to run *one* bot script with a configuration that includes *more than one* user account (one normal account, one sysop account). This is archaic, and if it was up to me, I would immediately drop it. And I am not alone in that point of view; see T71283#1040612

But until we solve that throughout the code, I would like to at least solve this very specific case through a minimal change in the patch I posted above.

Yes, the functionality is archaic, and for today's needs of sysops (one sysop account, one bot account) it does not work well anyway (Pywikibot should support some easy switching between two accounts in the future). I also think the ability to set a different account for each language is a little archaic; yes, there are some exceptions, but most wikis use SUL nowadays.

Change 531589 merged by jenkins-bot:
[pywikibot/core@master] Check a user's rights before checking its group memberships

https://gerrit.wikimedia.org/r/531589