Author: damian
I have 2 bots - ClueBot NG and ClueBot III - both using the same PHP class. These previously ran fine in pmtpa tools and were running fine in eqiad tools.
Since earlier today, API requests using a session id seem to fail intermittently - CBNG is missing maybe 20% of edits, III about 80%.
Tracing out the cURL requests - the session is kept, a successful login is done, but tokens are missing/the current user looks logged out.
A sample request:
- About to connect() to port 80 (#0)
- Trying * connected
POST /w/api.php?action=login&format=php HTTP/1.1
User-Agent: ClueBot/2.0
Accept: */*
Accept-Encoding: deflate, gzip
Content-Length: 268
Content-Type: multipart/form-data; boundary=----------------------------0882348d6018
< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2852251556, 1052575718
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 162
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:10 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1066 miss (0), cp1068 frontend miss (0)
- Added cookie enwikiSession="< REMOVED >" for domain, path /, expire 0
< Set-Cookie: enwikiSession=< REMOVED >; path=/; HttpOnly; GeoIP=::::v4; path=/
- Closing connection #0
POST: (0.065814018249512 s) (194 b)
[login] => Array ( [result] => NeedToken [token] => < REMOVED > [cookieprefix] => enwiki [sessionid] => < REMOVED > )
- About to connect() to port 80 (#0)
- Trying * connected
POST /w/api.php?action=login&format=php HTTP/1.1
User-Agent: ClueBot/2.0
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: enwikiSession=< REMOVED >
Content-Length: 396
Content-Type: multipart/form-data; boundary=----------------------------ce92f93b040d
< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< P3P: CP="This is not a P3P policy! See for more info."
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2852251622, 1518781136
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 205
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:10 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1066 miss (0), cp1067 frontend miss (0)
- Added cookie centralauth_User="ClueBot+III" for domain, path /, expire 1396654030
< Set-Cookie: centralauth_User=ClueBot+III; expires=Fri, 04-Apr-2014 23:27:10 GMT; path=/;; httponly; GeoIP=::::v4; path=/
- Closing connection #0
POST: (0.11309099197388 s) (256 b)
[login] => Array ( [result] => Success [lguserid] => < REMOVED > [lgusername] => ClueBot III [lgtoken] => < REMOVED > [cookieprefix] => enwiki [sessionid] => < REMOVED > )
- About to connect() to port 80 (#0)
- Trying * connected
GET /w/api.php?action=query&prop=revisions&titles=User%3ADamianZaremba&rvlimit=1&rvprop=timestamp|ids|user|comment&format=php&meta=userinfo&rvdir=older HTTP/1.1
User-Agent: ClueBot/2.0
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: centralauth_User=ClueBot+III; enwikiSession=< REMOVED >
< HTTP/1.1 200 OK
< Server: Apache
< X-Powered-By: PHP/5.3.10-1ubuntu3.9+wmf1
< X-Content-Type-Options: nosniff
< Cache-control: private
< X-Frame-Options: SAMEORIGIN
< Content-Encoding: gzip
< Vary: Accept-Encoding
< X-Vary-Options: Accept-Encoding;list-contains=gzip
< Content-Type: application/vnd.php.serialized; charset=utf-8
< X-Varnish: 2479625599, 1518781362
< Via: 1.1 varnish, 1.1 varnish
< Content-Length: 277
< Accept-Ranges: bytes
< Date: Wed, 05 Mar 2014 23:27:11 GMT
< Age: 0
< Connection: keep-alive
< X-Cache: cp1053 miss (0), cp1067 frontend miss (0)
- Added cookie GeoIP="::::v4" for domain, path /, expire 0
< Set-Cookie: GeoIP=::::v4; path=/
- Closing connection #0
GET:|ids|user|comment&format=php&meta=userinfo&rvdir=older (0.083457946777344 s) (475 b) (200 code)
[0] => Array ( [revid] => 510098058 [parentid] => 509798527 [user] => DamianZaremba [timestamp] => 2012-08-31T11:46:49Z [comment] => ) [ns] => 2 [title] => User:DamianZaremba [currentuser] => [continue] => [pageid] => 29562889
All the values and the return look good - but the current user is anon, and any actions that use tokens or need privileges fail.
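To catch the anonymous state right after login rather than on a later failed edit, a check along these lines could be used. This is only a minimal Python sketch, not the ClueBot PHP class: `is_logged_in` is a hypothetical helper, and it assumes the reply was requested with `meta=userinfo` in the legacy `format=json` layout, where an anonymous session carries an `anon` key and a user id of 0. The payload values are illustrative.

```python
def is_logged_in(response):
    """Return True if a decoded meta=userinfo reply describes a logged-in user.

    Hypothetical helper: assumes the legacy JSON layout, where anonymous
    sessions have an "anon" key and a user id of 0.
    """
    userinfo = response.get("query", {}).get("userinfo", {})
    if "anon" in userinfo:
        return False
    return userinfo.get("id", 0) != 0


# Example payloads shaped like the API output (values are made up).
anon_reply = {"query": {"userinfo": {"id": 0, "name": "10.0.0.1", "anon": ""}}}
bot_reply = {"query": {"userinfo": {"id": 12345, "name": "ExampleBot"}}}

print(is_logged_in(anon_reply))  # False
print(is_logged_in(bot_reply))   # True
```

A bot could run this after each login and log in again (or back off) when it returns False.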
This is the same across multiple bot users, all hitting the enwiki api from tools in eqiad.
Also tested from pmtpa to rule out migration issues and this issue is still present - it was not present last week.
Possibly related: the internal labs IPs are now the anon IPs, rather than the external ones - I believe it used to be the external, but apparently it shouldn't matter.
The Varnish headers suggest the requests are getting past the cache OK and fail 'randomly', so I'm having a hard time tracking down the issue.
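One way to look for a pattern in the 'random' failures might be to record which cache hosts handled each request and compare good and bad responses. The sketch below is Python and only an assumption about how that could be done: `parse_x_cache` is a hypothetical helper, and the pattern is based solely on the `X-Cache` values in the trace above (`host [frontend] status (hits)`).

```python
import re

# Matches entries such as "cp1066 miss (0)" or "cp1067 frontend miss (0)".
# The shape is inferred from the trace above, not from any specification.
_ENTRY = re.compile(
    r"^(?P<host>\S+)\s+(?P<frontend>frontend\s+)?(?P<status>\w+)\s+\((?P<hits>\d+)\)$"
)


def parse_x_cache(value):
    """Split an X-Cache header value into one dict per cache host."""
    entries = []
    for part in value.split(","):
        match = _ENTRY.match(part.strip())
        if match is None:
            continue  # ignore anything that is not in the expected shape
        entries.append({
            "host": match.group("host"),
            "frontend": match.group("frontend") is not None,
            "status": match.group("status"),
            "hits": int(match.group("hits")),
        })
    return entries


result = parse_x_cache("cp1066 miss (0), cp1067 frontend miss (0)")
for entry in result:
    print(entry["host"], entry["frontend"], entry["status"], entry["hits"])
```

Logging these entries next to whether the response came back as anon would show whether the failures cluster on particular hosts.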
Others have mentioned it as per and other 'lost' sessions.
Maybe some migration related issue outstanding?
Hopefully others can add more clarity.
Version: wmf-deployment
Severity: major