Description
Reported on IRC by j-mo, not sure how long it's been happening.
Related Objects
- Mentioned In
- T143493: Paws display 504 - Bad gateway time-out
Event Timeline
Looked into the hub logs and saw a stream of these errors. Not sure why this is happening, or why it takes the hub down, but restarting the hub seemed to restore normalcy, at least for now.
[E 2017-02-21 19:25:14.212 JupyterHub ioloop:629] Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x7fc700464510>, <tornado.concurrent.Future object at 0x7fc700494ba8>)
    Traceback (most recent call last):
      File "/usr/local/lib/python3.4/dist-packages/tornado/ioloop.py", line 600, in _run_callback
        ret = callback()
      File "/usr/local/lib/python3.4/dist-packages/tornado/stack_context.py", line 275, in null_wrapper
        return fn(*args, **kwargs)
      File "/usr/local/lib/python3.4/dist-packages/tornado/ioloop.py", line 615, in <lambda>
        self.add_future(ret, lambda f: f.result())
      File "/usr/local/lib/python3.4/dist-packages/jupyterhub/spawner.py", line 333, in poll_and_notify
        status = yield self.poll()
      File "/usr/local/lib/python3.4/dist-packages/kubespawner/spawner.py", line 232, in poll
        data = yield self.get_pod_info(self.pod_name)
      File "/usr/local/lib/python3.4/dist-packages/kubespawner/spawner.py", line 228, in pod_name
        return self._expand_user_properties(self.pod_name_template)
      File "/usr/local/lib/python3.4/dist-packages/kubespawner/spawner.py", line 119, in _expand_user_properties
        safe_username = ''.join([s if s in safe_chars else '-' for s in self.user.name.lower()])
      File "/usr/local/lib/python3.4/dist-packages/jupyterhub/user.py", line 133, in __getattr__
        if hasattr(self.orm_user, attr):
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/attributes.py", line 237, in __get__
        return self.impl.get(instance_state(instance), dict_)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/attributes.py", line 578, in get
        value = state._load_expired(state, passive)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/state.py", line 474, in _load_expired
        self.manager.deferred_scalar_loader(self, toload)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/loading.py", line 664, in load_scalar_attributes
        only_load_props=attribute_names)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/loading.py", line 219, in load_on_ident
        return q.one()
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/query.py", line 2718, in one
        ret = list(self)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/query.py", line 2761, in __iter__
        return self._execute_and_instances(context)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/query.py", line 2774, in _execute_and_instances
        close_with_result=True)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/query.py", line 2765, in _connection_from_session
        **kw)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/session.py", line 893, in connection
        execution_options=execution_options)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/session.py", line 898, in _connection_for_bind
        engine, execution_options)
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/session.py", line 313, in _connection_for_bind
        self._assert_active()
      File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/orm/session.py", line 214, in _assert_active
        % self._rollback_exception
    sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: New instance <User at 0x7fc7342caf60> with identity key (<class 'jupyterhub.orm.User'>, (49450018,)) conflicts with persistent instance <User at 0x7fc7368d1588>
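For context on the error pair in that traceback, the "conflicts with persistent instance" / "transaction has been rolled back" combination can be reproduced in isolation with SQLAlchemy: a flush fails, the session is left in a rolled-back state, and every later query on the same session raises InvalidRequestError until Session.rollback() is issued. This is only a minimal sketch with a hypothetical model and an in-memory SQLite engine, not the hub's actual schema or session handling:

```python
# Minimal sketch (hypothetical User model, in-memory engine) of the failure
# mode seen above; not the hub's actual code.
from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):                     # stand-in for jupyterhub.orm.User
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

session = Session(engine)
session.add(User(id=49450018))
session.commit()                      # first instance is now persistent

try:
    session.add(User(id=49450018))    # new instance with the same identity key
    session.flush()                   # flush fails: "conflicts with persistent instance"
except Exception as exc:
    print('flush failed:', exc)

try:
    session.query(User).one()         # any further use of the session now fails
except Exception as exc:
    print(exc)                        # "...first issue Session.rollback()..."

session.rollback()                    # after an explicit rollback the session works again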
We know globaluser id 49450018 is 'Info-farmerBioBot', and there's a jupyter-info-farmerbiobot-49450018 pod on paws (1/1, Running, 1 restart, 1d old).
I had a hunch there might be two distinct users with different capitalization, but that doesn't seem to be the case. There /are/ a whole bunch of these usernames: Info-farmer, Info-farmer-bio-bot, Info-farmerBioBot, Info-farmerBiologyBot, Info-farmerBot, Info-farmerLifeSciBot. Maybe Info-farmer-bio-bot vs Info-farmerBioBot is still triggering some sort of unique constraint?
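For reference, the normalization step visible in the traceback (_expand_user_properties in kubespawner/spawner.py) lowercases the username and replaces anything outside safe_chars with '-'. A minimal sketch of what the names listed above collapse to, assuming safe_chars is lowercase ASCII letters plus digits (an assumption for illustration, not taken from the deployed config):

```python
# Sketch of the kubespawner username normalization from the traceback above.
# safe_chars here is an assumption (lowercase ASCII letters and digits).
import string

safe_chars = set(string.ascii_lowercase + string.digits)

def safe_username(name):
    return ''.join(c if c in safe_chars else '-' for c in name.lower())

for name in ('Info-farmer', 'Info-farmer-bio-bot', 'Info-farmerBioBot',
             'Info-farmerBiologyBot', 'Info-farmerBot', 'Info-farmerLifeSciBot'):
    print(name, '->', safe_username(name))
# Info-farmer          -> info-farmer
# Info-farmer-bio-bot  -> info-farmer-bio-bot
# Info-farmerBioBot    -> info-farmerbiobot
# ...
```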
The hub has been up since then. For now I am resolving this, to be revisited if we can narrow down the trigger and/or mitigate it.
Today I get a 502 nginx error when I click the blue "refresh" button, while the message says the server is starting, after logging in as Framawiki or Framabot.
I've tried clearing the cookies, disconnecting my MW account, using a clean browser... No way to use the tool.
I fixed this late last night, and hopefully will have a more long-term fix coming in the next few weeks.