
LiveRCPageGenerator using socketIO_client stall the builds regularly
Closed, Resolved (Public)

Description

LiveRCPageGenerator stalls the Travis builds. The only common part of the logs is:

No handlers could be found for logger "socketIO_client"

That message doesn't appear on successful runs of this test. My guess is that "socketIO_client" wants to report a problem (possibly a socket timeout?), but can't due to a bug in their library or in our use of their library.

ar.wp - 2.6 (only failed build for that commit)
https://travis-ci.org/wikimedia/pywikibot-core/jobs/45687549

ar.wp - 2.7 & 2.6
https://travis-ci.org/wikimedia/pywikibot-core/jobs/45743553
https://travis-ci.org/wikimedia/pywikibot-core/jobs/45743561

It could be triggered by a low amount of data in the rc stream.

Event Timeline

jayvdb raised the priority of this task from to Unbreak Now!.
jayvdb updated the task description.
jayvdb added a project: Pywikibot.
jayvdb added a project: Pywikibot-tests.
jayvdb added subscribers: Aklapper, Unknown Object (MLST), jayvdb.
jayvdb renamed this task from LiveRCPageGenerator stall the builds regularly to LiveRCPageGenerator using socketIO_client stall the builds regularly. Jan 11 2015, 11:50 PM
jayvdb set Security to None.

After installing a logger "socketIO_client" on , we see the error is:

WARNING:socketIO_client:[connection error] connection closed ()
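
For context, "No handlers could be found for logger ..." is what Python 2's logging module prints when a library logs to a logger that has no handler attached. Below is a minimal sketch of how such messages can be surfaced, using only the standard logging module. The logger name is taken from the log above; `attach_handler` is a hypothetical helper, and the warning emitted at the end is a stand-in for what the library logs, not a call into socketIO_client.

```python
import io
import logging


def attach_handler(logger_name, stream):
    """Attach a stream handler so the named logger's records become visible."""
    handler = logging.StreamHandler(stream)
    # Same layout that logging.basicConfig() uses by default.
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger = logging.getLogger(logger_name)
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger


# Stand-in for the library: emit a warning on the same logger name.
captured = io.StringIO()
log = attach_handler("socketIO_client", captured)
log.warning("[connection error] connection closed ()")
print(captured.getvalue().strip())
```

In a test run, passing `sys.stderr` instead of the in-memory buffer would put the message in the build log.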

Change 184285 had a related patch set uploaded (by John Vandenberg):
Prevent hang in LiveRCPageGenerator

https://gerrit.wikimedia.org/r/184285

Patch-For-Review

Change 184285 merged by jenkins-bot:
Prevent hang in LiveRCPageGenerator

https://gerrit.wikimedia.org/r/184285
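
The content of the patch is not reproduced in this task. As a general illustration only, here is one way a consumer of a stream can avoid blocking forever when little or no data arrives: read from a queue with a timeout and stop when it expires. `iter_with_timeout` and its parameters are hypothetical names for this sketch and are not pywikibot's code.

```python
import queue


def iter_with_timeout(q, timeout, total=None):
    """Yield items from q; stop instead of blocking forever if it stays empty."""
    count = 0
    while total is None or count < total:
        try:
            item = q.get(timeout=timeout)
        except queue.Empty:
            # No data arrived within the timeout: give up rather than hang.
            return
        yield item
        count += 1


q = queue.Queue()
for change in ("edit-1", "edit-2"):
    q.put(change)

# Asks for up to 5 items, but only 2 are available; returns after the timeout.
items = list(iter_with_timeout(q, timeout=0.1, total=5))
print(items)
```

The trade-off is that a quiet stream ends the iteration early, so a test built on this should treat "fewer items than requested" as a possible outcome rather than a failure.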

There seem to be two issues here:

Change 185200 had a related patch set uploaded (by Merlijn van Deen):
RCStream: return heartbeats and handle on_reconnect

https://gerrit.wikimedia.org/r/185200

Patch-For-Review

The reason on_connect is not called is the following. Compare:

DEBUG:socketIO_client:[transports available] websocket xhr-multipart htmlfile jsonp-polling flashsocket xhr-polling
DEBUG:socketIO_client.transports:[transport selected] websocket
DEBUG:socketIO_client.transports:[packet sent] 1::/rc:
DEBUG:socketIO_client.transports:[packet received] 1::
DEBUG:socketIO_client: [connect]
DEBUG:socketIO_client.transports:[packet received] 1::/rc

to

DEBUG:socketIO_client:[transports available] websocket xhr-multipart htmlfile jsonp-polling flashsocket xhr-polling
DEBUG:socketIO_client.transports:[packet received] 1::
DEBUG:socketIO_client: [connect]
(... nothing)

This means socketIO_client doesn't actually register in the /rc namespace, and thus our code is not called.
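
The difference between the two logs can be checked mechanically. The sketch below assumes the debug log format shown above and treats the `[packet sent] 1::/rc` line as the sign that the client asked to join the namespace; `joined_namespace` is a hypothetical helper, not part of socketIO_client or pywikibot.

```python
def joined_namespace(log_lines, namespace="/rc"):
    """Return True if the log shows a connect packet sent for the namespace."""
    marker = "[packet sent] 1::" + namespace
    return any(marker in line for line in log_lines)


# Log of a working run, copied from the comparison above.
good_log = [
    "DEBUG:socketIO_client.transports:[transport selected] websocket",
    "DEBUG:socketIO_client.transports:[packet sent] 1::/rc:",
    "DEBUG:socketIO_client.transports:[packet received] 1::",
    "DEBUG:socketIO_client: [connect]",
    "DEBUG:socketIO_client.transports:[packet received] 1::/rc",
]

# Log of a stalled run: no connect packet for /rc is ever sent.
bad_log = [
    "DEBUG:socketIO_client.transports:[packet received] 1::",
    "DEBUG:socketIO_client: [connect]",
]

print(joined_namespace(good_log), joined_namespace(bad_log))
```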

Change 185200 merged by jenkins-bot:
RCStream: return heartbeats and handle on_reconnect

https://gerrit.wikimedia.org/r/185200

jayvdb claimed this task.

This appears to be fixed.

Okay, since this test (which was 7 days ago), all Python 3.3 tests which used setup.py have stalled (the nosetests ran fine). It seems that all of them errored the same way. Maybe this is unrelated to this bug, but I'm not familiar with socketIO so I can't really determine that. If it is unrelated, a new task should be opened.

Interestingly, since then Travis has been installing requests-2.5.3, and socketIO-client uses that library, so the two could be connected.

Regarding my comment 9 days ago: that is fixed, and it was caused by T91393#1104004.