quarry down
Closed, Resolved (Public)

Description

Quarry is giving "Internal Server Error".

Event Timeline

Looks like none of the web pods are running. Logs give:

[2024-05-21 15:28:04 +0000] [11] [ERROR] Error handling request /
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 135, in handle
    self.handle_request(listener, req, client, addr)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 178, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2464, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2450, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1867, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/quarry/web/app.py", line 82, in index
    stats_count_users=global_conn.session.query(User).count(),
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3091, in count
    return self._from_self(col).enable_eagerloads(False).scalar()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2832, in scalar
    ret = self.one()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2809, in one
    return self._iter().one()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 2850, in _iter
    execution_options={"_sa_orm_load_options": self.load_options},
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1689, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1583, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 324, in _execute_on_connection
    self, multiparams, params, execution_options
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1462, in _execute_clauseelement
    cache_hit=cache_hit,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1669, in _execute_context
    conn = self._revalidate_connection()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 560, in _revalidate_connection
    self._invalid_transaction()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 540, in _invalid_transaction
    code="8s2b",
sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back. (Background on this error at: https://sqlalche.me/e/14/8s2b)
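
For context on the error above: SQLAlchemy raises PendingRollbackError when a session's transaction has been invalidated (for example, the database connection dropped mid-transaction) and nothing has called rollback() on it yet. Below is a minimal sketch of the usual recovery pattern, not Quarry's actual code or fix. The helper name `query_with_rollback` and its `retries` parameter are hypothetical, and `session` is only assumed to expose `rollback()`, as a SQLAlchemy Session does.

```python
from typing import Any, Callable


def query_with_rollback(
    session: Any, run_query: Callable[[Any], Any], retries: int = 1
) -> Any:
    """Run run_query(session); on failure, roll back and retry.

    Hypothetical helper. `session` is assumed to be a SQLAlchemy-style
    session object that provides a rollback() method.
    """
    attempt = 0
    while True:
        try:
            return run_query(session)
        except Exception:
            # Catching Exception keeps this sketch dependency-free; real code
            # would likely narrow this to sqlalchemy.exc.SQLAlchemyError.
            # Rolling back clears the invalid transaction so the next
            # statement can start on a fresh one.
            session.rollback()
            if attempt >= retries:
                raise
            attempt += 1
```

Under these assumptions, the failing call in the traceback could be wrapped as `query_with_rollback(global_conn.session, lambda s: s.query(User).count())`. This only clears the stale transaction; it would not help if the database itself is unreachable or rejecting credentials.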

Still appears to be down. The web page is loading, but queries are failing with:
Access denied for user 'quarry'@'172.16.2.72' (using password: NO)

Restarting the services seems to have gotten things connecting again:
kubectl rollout restart deployment.apps/redis deployment.apps/web deployment.apps/worker