On wikidata.org, API login returns wrong lgtoken (U+FFFD replacement characters) in response body; correct in HTTP header
Closed, ResolvedPublic

Description

The lgtoken attribute (which is needed to create a valid cookie for future requests) is invalid after a successful login.

request (POST):
action=login&lgpassword=xxx&lgname=MerlIwBot

response:
<api><login result="NeedToken"
token="087cb80705ed9564d18c3309fae1ec"
cookieprefix="wikidatawiki"
sessionid="9b4c867c526e06f0e55ee395287ff0"
/></api>

request (POST):
action=login&lgtoken=087cb80705ed9564d18c3309fae1ec&lgpassword=xxx&lgname=MerlIwBot

response:
<api><login result="Success"
lguserid="3280"
lgusername="MerlIwBot"
lgtoken="��������������������������������"
cookieprefix="wikidatawiki"
sessionid="9b4c867c526e06af0e55ee39f287ff0"
/></api>

The password, sessionid and first token are slightly modified for this paste, of course. The problem is that the lgtoken is invalid because it contains only the byte sequence EF BF BD (hex), repeated, which is the UTF-8 encoding of U+FFFD, the "replacement character" according to the Unicode table.
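To confirm this from the shell, one can look at the raw bytes of the attribute value. The following is only a minimal sketch and was not part of the original report: `token_is_garbled` is a made-up helper name, and it assumes a shell with `od` and `tr` available.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report).
# Succeeds (exit status 0) if the argument contains the bytes EF BF BD,
# i.e. the UTF-8 encoding of U+FFFD.
token_is_garbled() {
    local hex
    # Dump every byte as two hex digits, each preceded by a space, so the
    # match below is always aligned to byte boundaries.
    hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -s ' \n' ' ')
    case "$hex" in
        *" ef bf bd"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Called with the lgtoken value from the failing response, the function should succeed; called with a normal hex token such as the one from the NeedToken step, it should fail.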

Login on all other WMF wikis and on wikidata-test-repo.wikimedia.de works for me.


Version: unspecified
Severity: normal

Details

Reference
bz41586
bzimport set Reference to bz41586.
Merl created this task. Oct 31 2012, 2:04 PM
Reedy added a comment. Oct 31 2012, 4:32 PM

WFM

"<?xml version=\"1.0\"?><api><login result=\"Success\" lguserid=\"2\" lgusername=\"Reedy\" lgtoken=\"3e07eb6f2406cf44c5f222884aa0ac83\" cookieprefix=\"wikidatawiki\" sessionid=\"09cdced66371121d024c7082f8a0c0f4\" /></api>"

Merl added a comment. Oct 31 2012, 6:09 PM

If I additionally use the centralauth cookie (read from the HTTP header instead of built from the API response), it works for me, too. But it does not work without the centralauth cookie, as described at http://www.mediawiki.org/wiki/API:Login#How_to_log_in
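For reference, building the cookie from the API response could look roughly like the sketch below. The cookie names (`<prefix>_session`, `<prefix>UserName`, `<prefix>UserID`, `<prefix>Token`) are my reading of the API:Login page at the time and should be treated as an assumption; `build_login_cookie` is a made-up helper name.

```shell
#!/bin/bash
# Hypothetical helper: prints a Cookie request header assembled from the
# fields of a successful login response.
# Arguments: cookieprefix sessionid lgusername lguserid lgtoken
# The cookie names are an assumption based on the API:Login documentation.
build_login_cookie() {
    printf 'Cookie: %s_session=%s; %sUserName=%s; %sUserID=%s; %sToken=%s\n' \
        "$1" "$2" "$1" "$3" "$1" "$4" "$1" "$5"
}
```

With a garbled lgtoken, the `Token` cookie built this way is garbage. That would explain why only clients that assemble cookies from the response body are affected, while clients that reuse the Set-Cookie headers are not.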

Using the API sandbox, I can request a token without any problems:

http://www.wikidata.org/wiki/Special:ApiSandbox#action=login&format=json&lgname=Foo&lgpassword=Bar

(no, that's not my password. It will generate a valid token anyway).

Closing WFM. Please supply a minimal test case to reproduce the issue if you want to reopen it.

I can reproduce the problem now for wikidata.org. The token is returned correctly as an HTTP header it seems, but not in the response body. The same request works fine on en.wikipedia.org and also with my local install with wikibase enabled.
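To compare the two, the token can be pulled out of the response headers. This is a rough sketch only: `token_from_headers` is a made-up helper, and the cookie name `<prefix>Token` is an assumption based on the cookie naming used by the login documentation.

```shell
#!/bin/bash
# Hypothetical helper: reads HTTP response headers on stdin and prints the
# value of the "<prefix>Token" cookie if a Set-Cookie header carries one.
# Argument: cookieprefix (e.g. wikidatawiki)
token_from_headers() {
    # Leading whitespace is tolerated because some tools indent the headers
    # they print; trailing carriage returns are stripped afterwards.
    sed -n "s/^[[:space:]]*[Ss]et-[Cc]ookie: *$1Token=\([^;]*\).*/\1/p" | tr -d '\r'
}
```

Feeding it the saved headers of the second login request and comparing the result with the lgtoken attribute in the body should show whether the two disagree.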

There must be something odd going on during token creation, but I have no idea what.

Merl added a comment. Dec 14 2012, 11:42 AM

Test script:

#!/bin/bash
#FILENAME: login.sh
#USAGE: login.sh USERNAME PASSWORD DOMAIN
USERNAME="$1"
PASSWD="$2"
DOMAIN="${3:-www.wikidata.org}"

# First request: the API answers with result="NeedToken" plus a login token and a session id.
RESP1=$(wget -qO- --no-cookies --user-agent=LOGIN --post-data "action=login&lgname=${USERNAME}&lgpassword=${PASSWD}&format=xml" "http://${DOMAIN}/w/api.php")
echo "${RESP1}"
# Extract the attributes needed for the second request from the <login> element.
TOKEN=$(echo "${RESP1}" | sed -ne '/<login/ { s/.*token="\([^"]*\)".*/\1/; p }')
SESSIONID=$(echo "${RESP1}" | sed -ne '/<login/ { s/.*sessionid="\([^"]*\)".*/\1/; p }')
COOKIEPREFIX=$(echo "${RESP1}" | sed -ne '/<login/ { s/.*cookieprefix="\([^"]*\)".*/\1/; p }')

# Second request: repeat the login with the token and send the session cookie back.
wget -qO- --no-cookies --user-agent=LOGIN --post-data "action=login&lgname=${USERNAME}&lgpassword=${PASSWD}&lgtoken=${TOKEN}&format=xml" --header "Cookie: ${COOKIEPREFIX}_session=${SESSIONID}" "http://${DOMAIN}/w/api.php"


works on dewiki:
$ bash login.sh MerlIwBot XXX de.wikipedia.org
<?xml version="1.0"?><api><login result="NeedToken" token="1b0e04d8ec9cd7f28210e2fe8c1cad86" cookieprefix="dewiki" sessionid="67c2c8150e3fa887e86d3e9b19722449" /></api>
<?xml version="1.0"?><api><login result="Success" lguserid="1234" lgusername="MerlIwBot" lgtoken="7874ac3a1851e9f93af3912bf4b00d6b" cookieprefix="dewiki" sessionid="67c2c8150e3fa887e86d3e9b19722449" /></api>

invalid lgtoken on wikidata.org:
$ bash login.sh MerlIwBot XXX www.wikidata.org
<?xml version="1.0"?><api><login result="NeedToken" token="da90161c74826a5629029fc857970660" cookieprefix="wikidatawiki" sessionid="92b3e5f5e6cdd7145516732b79ca5845" /></api>
<?xml version="1.0"?><api><login result="Success" lguserid="1234" lgusername="MerlIwBot" lgtoken="��������������������������������" cookieprefix="wikidatawiki" sessionid="92b3e5f5e6cdd7145516732b79ca5845" /></api>

@andre: assigning this to wikidata-bugs is not helpful, since it doesn't seem to be a problem with the Wikibase software. This issue is not present on any other system running Wikibase, and Wikibase doesn't mess with the login process or tokens. This seems to be a configuration issue, or some oddity related to having a wiki with no subdomain.

In any case, we at the very least need help from someone with shell access for debugging this. So I think it makes sense to keep this on wikibugs-l.

Daniel: I didn't assign it on purpose, it's just the default assignee.
If it's a config issue I'll move it to "Site requests".

I wonder if we in the beginning asked for a new token as we used a special itemtoken. That would reset the other tokens too. I think the code is gone now as we use the edittoken, but I'm not completely sure.

After checking the code I'm pretty sure it's gone.

(In reply to comment #9)

I wonder if we in the beginning asked for a new token as we used a special
itemtoken.

Even then, the token would change, but it would not suddenly consist of invalid characters.

(In reply to comment #8)

Daniel: I didn't assign it on purpose, it's just the default assignee.

Ah, you changed the component, now I see.

If it's a config issue I'll move it to "Site requests".

*If* it's a config issue - but I'm afraid we'll need some live-site debugging to find that out.

daniel added a comment. Jan 4 2013, 2:01 PM

Created attachment 11586
bash script for testing the API response

Provide any name and password as first and second parameters to check the token.

attachment bug-41586.sh ignored as obsolete

daniel added a comment. Jan 4 2013, 2:03 PM

I have attached the test script provided by merl at https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Archive/2012/12#API:_Login_doesn.27t_work.3F

Running it now, it seems to work:

bash ~/src/bug-41586.sh foo bar
<?xml version="1.0"?><api><login result="NeedToken" token="cd3f7d845bef68f23fd4f9f37f1c1fc7" cookieprefix="wikidatawiki" sessionid="8181ac1523df3d24ec18d1be4fec8153" /></api>

<?xml version="1.0"?><api><login result="NotExists" /></api>

Please check whether it works for you now. Otherwise, we can close the bug. Would be nice to understand what was going on, though...

Merl:
Please check whether it works for you now.

Decreasing priority for the time being, as it's currently not reproducible as per comment 9 and 13.

Sorry, I was wrong, I *can* reproduce it: if I run the test script with a valid user name and password, I get back a broken lgtoken.

WFM (with Daniel's script)...

(In reply to comment #17)

WFM (with Daniel's script)...

with a valid name & password? can you post the output, please?

Reedy: Could you answer comment 18?

Comment in the RT ticket is:
"I think this rather needs attention from wikidata dev people, which would be Wikimedia Deutschland people almost entirely, afaik"

I get the same results as in comment 6, using the Bash script in https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Archive/2012/12#API:_Login_doesn.27t_work.3F

$:andre\> ./login.sh Malyacko xxxxxxxx de.wikipedia.org
<?xml version="1.0"?><api><login result="NeedToken" token="48dc7df07df9d9b6529c7d56f33c2eb5" cookieprefix="dewiki" sessionid="392cc0b55fe5b7594d3fd41536421c89" /></api>

<?xml version="1.0"?><api><login result="Success" lguserid="1220085" lgusername="Malyacko" lgtoken="f409b84cd9049810ecf8db70eac8e9f4" cookieprefix="dewiki" sessionid="392cc0b55fe5b7594d3fd41536421c89" /></api>

$:andre\> ./login.sh Malyacko xxxxxxxx www.wikidata.org
<?xml version="1.0"?><api><login result="NeedToken" token="aa8e747286dc98018d750cac5ee62b6b" cookieprefix="wikidatawiki" sessionid="b54e252498f7c97280965e4736c4035b" /></api>

<?xml version="1.0"?><api><login result="Success" lguserid="37514" lgusername="Malyacko" lgtoken="��������������������������������" cookieprefix="wikidatawiki" sessionid="b54e252498f7c97280965e4736c4035b" /></api>

(In reply to comment #20)

Comment in the RT ticket is:
"I think this rather needs attention from wikidata dev people, which would be
Wikimedia Deutschland people almost entirely, afaik"

I do not see how the Wikidata team can do anything about this.

  • We do not deal with login code or token generation at all. The problem occurs with a core API call that has nothing to do with the Wikibase extension.
  • The problem is exclusive to wikidata.org. We can not reproduce it on any of our test systems.
  • Nobody in the Wikidata team has the access level necessary to investigate the issue.

So, what do you suggest the Wikidata team can do about the problem?

An extended debugging session by Aaron and Tim unearthed bad tokens appearing in the database due to issues with a schema change. Fixes are on gerrit, see I92f1645d4 and I3529fe8af

FIXED?

https://wikitech.wikimedia.org/w/index.php?title=Server_Admin_Log&diff=61872&oldid=61871 says:

  • 21:31 Tim: running resetUserTokens.php on all wikis, in screen on hume, to fix user_token field corruption ([[bugzilla:41586|bug 41586]])
daniel added a comment. Mar 8 2013, 9:05 PM

I am still getting the broken token.

daniel added a comment. Mar 8 2013, 9:09 PM

Created attachment 11899
bash script for testing the API response

Improved test script: no password on the command line, runs output through hexdump.
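The attachment itself is not reproduced in this thread, so the following is only a sketch of the two ideas and not the attached script. The helper names are made up, `read -s` assumes bash, and `od` is used for the byte dump (`hexdump -C` would serve the same purpose).

```shell
#!/bin/bash
# Hypothetical helpers illustrating the two changes; not the attached script.

# Ask for the password interactively so it does not end up in the shell
# history or the process list. The result is stored in PASSWD.
read_password() {
    # -r: keep backslashes literal, -s: do not echo typed characters (bash)
    read -r -s -p 'Password: ' PASSWD
    echo >&2
}

# Print stdin as hex bytes, so garbled characters such as EF BF BD are
# visible regardless of the terminal encoding.
dump_bytes() {
    od -An -v -tx1
}
```

The API response would then be piped through `dump_bytes` instead of being printed directly.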

Attached: login-test.sh

daniel added a comment. Mar 8 2013, 9:13 PM

Still open. Corrupt tokens are still in the database:

[22:07] <AaronSchulz> Tim was running a script but it stalled out and needed to be batched
[22:07] <AaronSchulz> I don't think he ever rewrote it, so the rows are probably still broken

Tim / Aaron: How can we get some progress here in order to get this fixed?

Works for me now with the script in comment 25, token seems to be correct.

Daniel Kinzler: Can you confirm?

Works for me now with the script in comment 25, token seems to be correct.

Daniel Kinzler: Can you confirm?

No answer => Closing.

I do not know whether this is fixed for all users. The issue is rooted in a corrupt entry in the user table (ask Aaron or Tim for details). So, this issue would affect some users but not others, until all user entries in the database have been fixed. I do not have a way to check this.

Reopening as per last comment.

I'll take a look at making resetUserTokens.php more efficient, and then we can find someone to run it for the affected wikis.

Aaron, could you find out which wikis had the wrong column definition?

aaron added a comment. Jun 27 2013, 4:38 PM

(In reply to comment #33)

I'll take a look at making resetUserTokens.php more efficient, and then we can
find someone to run it for the affected wikis.

Aaron, could you find out which wikis had the wrong column definition?

It really should just be run on all wikis.

The script can be made faster by:
(a) Batching the SELECT by user_id
(b) Avoiding the User function to update the token and doing it directly
(c) Batching these token updates
(d) A script option can be added to skip rows with valid tokens (the bad ones are all \0, good ones should have the right length and hex chars)
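The check in (d) could be as simple as the sketch below. This is shell rather than the PHP the maintenance script is written in, `token_looks_valid` is a made-up name, and the length of 32 lowercase hex characters is an assumption taken from the good tokens quoted in this thread.

```shell
#!/bin/bash
# Hypothetical check: a token "looks valid" if it consists of exactly 32
# lowercase hex characters (length assumed from tokens seen in this thread).
token_looks_valid() {
    case "$1" in
        *[!0123456789abcdef]*) return 1 ;;   # contains a non-hex character
    esac
    [ "${#1}" -eq 32 ]
}
```

A shell variable cannot hold NUL bytes, so an all-\0 token would typically arrive here as an empty string and fail the length test. Rows that pass the check could be skipped; everything else gets a new token.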

Change 77136 had a related patch set uploaded by CSteipp:
Efficiently reset null user tokens

https://gerrit.wikimedia.org/r/77136

Change 77136 merged by jenkins-bot:
Efficiently reset null user tokens

https://gerrit.wikimedia.org/r/77136

Patch has been merged. Reedy, can you run maintenance/resetUserTokens.php on the cluster at some point?

Reedy, can you run maintenance/resetUserTokens.php on the cluster at some point?

greg added a comment. Oct 18 2013, 10:31 PM

This should actually be done, as a consequence of the fun potential data breach response (all tokens were reset on affected wikis, and wikidata was one).
