Page MenuHomePhabricator

Book Uploader Bot (BUB) queue has stalled
Open, MediumPublic

Description

On Thu, 20 Jul 2017, at 10:02 PM, John Beard wrote:

Hi!

The BUB tool looks great for gettings works from Google or Hathi onto
Commons via IA, which is a very common and time consuming task.
However, it doesn't seem to have uploaded anything to IA since Sept
2016, and the queue is around 500 books long.

Is there anything I can do to help get this useful tool back on the road?

Cheers,

John

The full queue can be viewed at http://tools.wmflabs.org/bub/queue/1000/

Formerly reported at: https://github.com/rohit-dua/BUB/issues/53

The main backend process is worker.py. Launching it with qstat has no effect. Running worker.py directly outputs:

~/public_html/BUB/bot$ python worker.py
Traceback (most recent call last):
  File "worker.py", line 22, in <module>
    import MySQLdb
  File "build/bdist.linux-x86_64/egg/MySQLdb/__init__.py", line 19, in <module>

  File "build/bdist.linux-x86_64/egg/_mysql.py", line 7, in <module>
  File "build/bdist.linux-x86_64/egg/_mysql.py", line 3, in __bootstrap__
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 36, in <module>
    import email.parser
  File "/usr/lib/python2.7/email/parser.py", line 12, in <module>
    from email.feedparser import FeedParser
  File "/usr/lib/python2.7/email/feedparser.py", line 27, in <module>
    from email import message
  File "/usr/lib/python2.7/email/message.py", line 16, in <module>
    import email.charset
  File "/usr/lib/python2.7/email/charset.py", line 13, in <module>
    import email.base64mime
  File "/usr/lib/python2.7/email/base64mime.py", line 40, in <module>
    from email.utils import fix_eols
  File "/usr/lib/python2.7/email/utils.py", line 27, in <module>
    import random
  File "/usr/lib/python2.7/random.py", line 49, in <module>
    import hashlib as _hashlib
  File "build/bdist.linux-x86_64/egg/hashlib.py", line 115, in <module>
    """
TypeError: 'frozenset' object is not callable
/data/project/bub/public_html/BUB/flask/bin/python --version
Python 2.7.6

The mass_worker.log contains (from the last time a restart of the queue processing was attempted, probably):

2017-02-23 21:26:10.651983  func:manager  args:(<__main__.QueueHandler object at 0x2b207522b510>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 1 seconds...
  :<class '_mysql_exceptions.OperationalError'>, Retry in 2 seconds... object at 0x2b2b04395510>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 1 seconds...
  :<class '_mysql_exceptions.OperationalError'>, Retry in 2 seconds...  :<class '_mysql_exceptions.OperationalError'>, Retry in 4 seconds...--------------------
ERROR: <class '_mysql_exceptions.OperationalError'>  func:manager args:(<__main__.QueueHandler object at 0x2ab669bed490>,){}
nc:manager args:(<__main__.QueueHandler object at 0x2b0e2bfa0510>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 4 seconds...--------------------
ERROR: <class '_mysql_exceptions.OperationalError'>  func:manager args:(<__main__.QueueHandler object at 0x2b21d8752510>,){}
2017-03-18 18:01:47.189082  Started Mass Worker #1
2017-03-18 18:01:47.199334  Started wait_and_add_to_queue
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/mass_worker.py", line 392, in wait_and_add_to_queue
    info = json.loads(q_bulk_order.pop_and_remove())
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Unterminated string starting at: line 1 column 2 (char 1)
2017-03-18 18:03:42.202126  Started Mass Worker #3
orker3-2' already active
2017-03-18 18:05:03.356338  Started Mass Worker #3
2017-03-18 18:05:45.160743  Started Mass Worker #3
2017-03-24 14:02:26.740201  func:manager  args:(<__main__.QueueHandler object at 0x2ae930bbc490>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 1 seconds...
  :<class '_mysql_exceptions.OperationalError'>, Retry in 2 seconds...  :<class '_mysql_exceptions.OperationalError'>, Retry in 4 seconds...--------------------
ERROR: <class '_mysql_exceptions.OperationalError'>  func:manager args:(<__main__.QueueHandler object at 0x2ae930bbc490>,){}
2018-02-24 15:12:06.920545  Started Mass Worker #3
2018-03-08 04:10:57.095180  func:manager  args:(<__main__.QueueHandler object at 0x2addb1b4d490>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 1 seconds...
2018-04-05 14:35:38.878704  func:manager  args:(<__main__.QueueHandler object at 0x2b4653ec1490>,){}
  :<class '_mysql_exceptions.OperationalError'>, Retry in 1 seconds...

And from upload_checker (the queue which handles the upload part after the images have been downloaded):

2017-10-26 14:24:15.042495  upload-checker.py started
Traceback (most recent call last):
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/upload_checker.py", line 192, in <module>
    main()
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/upload_checker.py", line 188, in main
    check_if_upload_ready()
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/upload_checker.py", line 157, in check_if_upload_ready
    r = get_ia_metadata(ia_identifier)
  File "../utils/retry.py", line 127, in f_retry
    return f(*args, **kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/upload_checker.py", line 122, in get_ia_metadata
    r = requests.get('http://archive.org/metadata/%s' %(ia_identifier) ).json()
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/adapters.py", line 376, in send
    timeout=timeout
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
    body=body, headers=headers)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 353, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 826, in send
    self.connect()
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/packages/urllib3/connection.py", line 162, in connect
    conn = self._new_conn()
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/packages/urllib3/connection.py", line 137, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/flask/local/lib/python2.7/site-packages/requests/packages/urllib3/util/connection.py", line 81, in create_connection
    sock.connect(sa)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)

Related Objects

Event Timeline

Nemo_bis triaged this task as Medium priority.Apr 16 2018, 6:11 PM
Nemo_bis updated the task description. (Show Details)

The jsub error was

[Mon Apr 16 18:24:43 2018] /mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/worker.log: not an executable file

After fixing permissions, at least the process is running:

tools.bub@tools-bastion-03:~/public_html/BUB/bot$ jsub -once -continuous -l release=trusty -N worker1 /data/project/bub/public_html/BUB/bot/worker.log -e /data/project/bub/public_html/BUB/bot/worker.log -cwd -m 1000m ./worker.py
Your job 3995903 ("worker1") has been submitted

The jsub error was

[Mon Apr 16 18:24:43 2018] /mnt/nfs/labstore-secondary-tools-project/bub/public_html/BUB/bot/worker.log: not an executable file

It's executing a log file?

It's executing a log file?

No, I saw later that it was a typo in my jsub command (missing -o).

Meanwhile, I don't even understand if the redis host still exists. /etc/hosts mentions 10.68.22.56 tools-redis tools-redis.eqiad.wmflabs.

Two out of three keys from settings.json produce an exception:

>>> import redis
>>> r = redis.Redis(host='tools-redis', port=6379)
>>> r
Redis<ConnectionPool<Connection<host=tools-redis,port=6379,db=0>>>
>>> r.get('bub_...')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 807, in get
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 529, in execute_command
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 541, in parse_response
  File "build/bdist.linux-x86_64/egg/redis/connection.py", line 550, in read_response
redis.exceptions.ResponseError: WRONGTYPE Operation against a key holding the wrong kind of value

redis.exceptions.ResponseError: WRONGTYPE Operation against a key holding the wrong kind of value

You can't run GET on a value that is not a string (eg. a set / list).

And yes, the shared redis server is still alive:

03:07:03 0 ✓ zhuyifei1999@tools-bastion-02: ~$ redis-cli -h tools-redis
tools-redis:6379> info
# Server
redis_version:2.8.17
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:4c1d5710660b9479
[...]

Running the TYPE command, the values of the first two keys have zset as their type, and the third has none

redis.exceptions.ResponseError: WRONGTYPE Operation against a key holding the wrong kind of value

You can't run GET on a value that is not a string (eg. a set / list).

So what's the alternative, HGET? https://redis.io/commands/hget https://redis-py.readthedocs.io/en/latest/_modules/redis/client.html#StrictRedis.hget
(I didn't yet look into BUB's Redis structure.)

So what's the alternative, HGET? https://redis.io/commands/hget https://redis-py.readthedocs.io/en/latest/_modules/redis/client.html#StrictRedis.hget
(I didn't yet look into BUB's Redis structure.)

HGET is for getting a single value from a hash with a specified field. If you want all fields from the hash you would want HGETALL.

However, in this case they are not hashes, but zsets (a.k.a. sorted sets). To operated on zsets you would want a command that starts with Z, such as ZRANGE. A command like ZRANGE bub_[REDACTED] 0 -1 would be able to get all values from the set.