Page MenuHomePhabricator

Querying ProcseeBot's user logs results in internal_api_error_DBQueryError
Closed, DuplicatePublicPRODUCTION ERROR

Description

Querying ProcseeBot's user logs via the API often results in internal_api_error_DBQueryError.

Example request params:

{
   format: 'json',
   formatversion: '2',
   maxlag: 5,
   assert: 'bot',
   action: 'query',
   list: 'logevents',
   leprop: 'title|comment|timestamp',
   leend: '2021-02-25T21:20:28.572Z',
   lelimit: 'max',
   letype: 'block',
   leuser: 'ProcseeBot',
   lestart: '2021-12-09T09:20:14.605Z'
 }

SDZeroBot does this to maintain WP:Bot activity monitor. Starting from Dec 16, the error has occurred 172 times as of 2 Jan.

Bot logs can be seen here: https://sdzerobot.toolforge.org/logs?type=out&log=bot-monitor&lines=10000 Use Ctrl+F for "DBQueryError" to find full response headers and request IDs.

Event Timeline

SD0001 changed the subtype of this task from "Bug Report" to "Production Error".Dec 19 2021, 3:08 PM

I went through logstash to find a requestid from your logs, it looks like the error on the backend (for one of your errors anyway) is Error 1969: Query execution was interrupted (max_statement_time exceeded) (db1184) this may be related to this email sent to wikitech-l https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/IPJNO75HYAQWIGTHI5LJHTDVLVOC4LJP/

Just to note: I am not sure if it is good to include tokens in your public log (header for oauth). Your timestamp must not include microseconds ('2021-12-09T09:20:14.605Z' => '2021-12-09T09:20:14Z')

The url from the logs is https://en.wikipedia.org/w/api.php?action=query&list=logevents&leuser=ProcseeBot&lelimit=100&leprop=title|comment|timestamp&leend=2022-03-25T05:20:30Z&letype=block which also timeout right know

{
    "error": {
        "code": "internal_api_error_DBQueryError",
        "info": "[fd3370e8-7024-46af-9197-4f4e24a774de] Caught exception of type Wikimedia\\Rdbms\\DBQueryError",
        "errorclass": "Wikimedia\\Rdbms\\DBQueryError"
    },
    "servedby": "mw1404"
}

When looking at the GUI part without timestamp - https://en.wikipedia.org/wiki/Special:Log?type=block&user=ProcseeBot&page=&wpdate=&tagfilter=&subtype= it seems the ProcseeBot has stopped its work on enwiki and that makes searches for nothing extremly expensive for the database or the database is using the wrong index to determine the fact, there are no rows. Maybe the request could be reduced to avoid problems with the database servers

I disabled the check for ProcseeBot's activity.