
Commons deletion notification bot failing since June 6, 2023
Open, High | Public | BUG REPORT

Description

I had to disable the bot as the tools-db queries were timing out. I did a manual bin/first-run and got:

WARNING: API warning (result): This result was truncated because it would otherwise be larger than the limit of 12,582,912 bytes.
649218 pages found for discussion deletion
Traceback (most recent call last):
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/connections.py", line 1043, in _write_bytes
    self._sock.sendall(data)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./make-list.py", line 64, in <module>
    make_list('discussion', ['Deletion_requests'], depth=2, delay=60 * 60)
  File "./make-list.py", line 53, in make_list
    states = store.refresh_state(files, type)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/commonsbot/state.py", line 108, in refresh_state
    (present, missing) = self.load_state(files, type)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/commonsbot/state.py", line 164, in load_state
    present.update(self._state_batch(c, type))
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/commonsbot/state.py", line 186, in _state_batch
    rows = mysql.query(self.conn, sql, files)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/commonsbot/mysql.py", line 60, in query
    cursor.execute(sql, params)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/cursors.py", line 165, in execute
    result = self._query(query)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/cursors.py", line 321, in _query
    conn.query(q)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/connections.py", line 859, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/connections.py", line 1096, in _execute_command
    self._write_bytes(packet)
  File "/mnt/nfs/labstore-secondary-tools-project/commtech-commons/bot/virtualenv/lib/python3.7/site-packages/pymysql/connections.py", line 1048, in _write_bytes
    "MySQL server has gone away (%r)" % (e,))
pymysql.err.OperationalError: (2006, "MySQL server has gone away (ConnectionResetError(104, 'Connection reset by peer'))")
CRITICAL: Exiting due to uncaught exception <class 'pymysql.err.OperationalError'>

649,218 pages is far more than we normally have. My guess is that the schema we're using doesn't accommodate datasets this large. Alternatively, whatever API request triggered "WARNING: API warning (result): This result was truncated" may have caused the rest of the script to malfunction.
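If the first guess is right, one possible mitigation is to cap how many titles go into any single query, so that no one statement grows with the size of the page list. The following is only a sketch under assumptions: the traceback shows that commonsbot/state.py already has a _state_batch step, but its batch size and SQL are not shown here, so the table name, column names, batch size, and the run_query callable below are all hypothetical stand-ins rather than the bot's actual code.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[str, str]


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_state_in_batches(
    files: Iterable[str],
    run_query: Callable[[str, Sequence[str]], Iterable[Row]],
    batch_size: int = 1000,
) -> Dict[str, str]:
    """Look up the stored state for each title, one bounded query per batch.

    `run_query(sql, params)` is a hypothetical wrapper around
    cursor.execute() that returns (title, state) rows.
    """
    present: Dict[str, str] = {}
    for batch in chunked(files, batch_size):
        # One %s placeholder per title keeps the query parameterized.
        placeholders = ", ".join(["%s"] * len(batch))
        # Hypothetical table and column names, for illustration only.
        sql = (
            "SELECT title, state FROM commons_state "
            "WHERE title IN (" + placeholders + ")"
        )
        for title, state in run_query(sql, batch):
            present[title] = state
    return present
```

With 649,218 titles and a batch size of 1,000, this would issue 650 small queries instead of a few very large ones. Whether that actually avoids the connection reset depends on what is causing it, which this report does not establish.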

Event Timeline

Noting that the various bash scripts are currently written for the grid, so we also need to adjust them to work with k8s.

We should probably make time for this soon, given how long the outage has been. I'm going to try to look into it this week as a broadly related MediaWiki CodeJam Dec 2023 project.

I was concerned that simply no one cared about this bot anymore, but I just uncovered https://commons.wikimedia.org/wiki/Commons:Requests_for_comment/Technical_needs_survey/Bots

Let's get this prioritized. I would have fixed this many moons ago if I had the Python skills. I'll take another stab at it next week if no one beats me to it.

Aklapper renamed this task from "Commons deletion notification bot failing since June 6" to "Commons deletion notification bot failing since June 6, 2023". Mon, Apr 22, 8:30 AM

@taavi Thank you! You're a rock star! :D

There are a few remaining to-dos before we can fully revive the bot, which I've partially taken care of. Maybe if you're interested, we can finish it up at the Hackathon :)

works for me, see you in a day or two :-)