
ORES Beta startup errors not being routed to our app logging.
Open, Low, Public

Description

ORES in beta doesn't seem to want to fork worker processes.

$ ps aux | grep -vP "defunct|prometheus" | grep -P "uwsgi|celery"
www-data 11025  4.4 21.7 2417344 1782160 ?     Ss   15:54   0:22 /srv/deployment/ores/deploy-cache/revs/5d977f4607a1c8e351c4817b100dc7f882a2290d/venv/bin/python3 /srv/deployment/ores/deploy/venv/bin/celery worker --app ores_celery.application --loglevel ERROR
www-data 11861  3.0  4.5 998748 375796 ?       Ss   15:58   0:06 /usr/bin/uwsgi --die-on-term --ini /etc/uwsgi/apps-enabled/ores.ini
halfak   12070  0.0  0.0  12848   992 pts/1    S+   16:02   0:00 grep --color=auto -P uwsgi|celery

No errors are reported in /srv/log/ores/app.log or /srv/log/ores/main.log

Event Timeline

I tried running sudo service ores-celery-workers restart and sudo service uwsgi-ores restart. Everything proceeded as expected except that workers didn't start up.

Aha! I found this in the syslog:

Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]: Traceback (most recent call last):
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     self.run()
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     self._target(*self._args, **self._kwargs)
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "/srv/deployment/ores/deploy-cache/revs/5d977f4607a1c8e351c4817b100dc7f882a2290d/ores/scoring_context.py", line 278, in load_model_and_queue
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     model = Model.from_config(config, key)
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "/srv/deployment/ores/deploy-cache/revs/5d977f4607a1c8e351c4817b100dc7f882a2290d/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 131, in from_config
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     return Class.load(stream)
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "/srv/deployment/ores/deploy-cache/revs/5d977f4607a1c8e351c4817b100dc7f882a2290d/venv/lib/python3.5/site-packages/revscoring/scoring/models/model.py", line 104, in load
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     model = pickle.load(f)
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:   File "./articlequality/feature_lists/ptwiki.py", line 95, in <module>
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]:     image_templates_str = wikitext.revision.datasources.templates_str_matching(
Apr 20 16:08:56 deployment-ores01 celery-ores-worker[12127]: AttributeError: 'Revision' object has no attribute 'templates_str_matching'
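
The traceback above only reached syslog because the child process printed it to stderr. One possible way to get such startup failures into the application log as well is to wrap the child process target so that any exception goes through Python's `logging` module before it is re-raised. This is a minimal sketch, not the ORES implementation: the logger name `ores.example` and the helper `logged_target` are hypothetical, and it assumes the logging handlers that write to app.log are already configured in the child process.

```python
import logging

# Hypothetical logger name; the real application may use a different one.
logger = logging.getLogger("ores.example")


def logged_target(target, *args, **kwargs):
    """Call target(*args, **kwargs), logging any exception before re-raising.

    logger.exception() records the message at ERROR level together with the
    full traceback, so the failure reaches whatever handlers are attached to
    the logger instead of only the process's stderr.
    """
    try:
        return target(*args, **kwargs)
    except Exception:
        logger.exception("Worker process failed during startup")
        raise
```

A wrapper like this could be passed as the `target` of a `multiprocessing.Process`, with the real function as the first entry of `args`. The exception is re-raised, so the process still exits with an error and the traceback still reaches stderr (and syslog) as it does today.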
Halfak renamed this task from "ORES beta won't start workers" to "ORES Beta startup errors not being routed to our app logging." Apr 20 2020, 4:10 PM
Halfak triaged this task as Low priority. May 4 2020, 5:02 PM
Halfak moved this task from Unsorted to Maintenance/cleanup on the Machine-Learning-Team board.