Currently we reject any query that asks for a user with over 350,000 edits. This is a simple but probably not great solution, since the edit count isn't necessarily proportional to how expensive the query is (e.g. we also query the logging table). So in some cases, a query for a user with only 100,000 edits could run very slowly, while one for a user with 500,000 edits could actually finish within 15 minutes (our current max query time). Using EXPLAIN, we can go by the estimated row count and abort if it is something unreasonable. This should help us stay within our connection quota, while also allowing people to query for some users with over 350,000 edits.
Would this be based on the sum of rows? We've got quite a lot of queries; working out exactly which query will be executed how many times, given the caching, may be a bit annoying.
Or should we go one by one and, on each query, abort if we get >1M rows? i.e. adding something to executeQuery.
Yeah, adding something to Repository::executeProjectsQuery() would be most ideal. We can't do an EXPLAIN directly on the query with the normal Toolforge DB credentials, unfortunately. Instead we'll have to do it like the SQL Optimizer and run SHOW EXPLAIN on the process ID. This requires an additional connection, which might mean we need a higher DB quota, since we already hit our connection limit fairly often. On the other hand, if we prevent long-running queries in the first place, we'll likely have more connections to work with. So this task would be somewhat experimental.
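For reference, the SHOW EXPLAIN approach involves two connections; the statements below are standard MariaDB syntax, with an illustrative thread ID:

```sql
-- On the connection that will run the heavy query, grab its thread ID:
SELECT CONNECTION_ID();  -- e.g. 12345

-- While that query is running, from a *second* connection:
SHOW EXPLAIN FOR 12345;  -- returns the plan, including estimated row counts
```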
The biggest hurdle here, I think, is running the queries asynchronously – one connection for the actual query, and one to run SHOW EXPLAIN on the MariaDB process. This isn't as straightforward in PHP, but I think it can be done.
I envision it working like:
- Run SET max_statement_time = 1 (probably no need to inject a SLEEP like SQL Optimizer does)
- In a new process, run SHOW EXPLAIN for {process ID}. This could perhaps be done via a Symfony Command.
- Have it catch the exception for a timeout (after 1 second), in which case it should return null, or return the actual query result if it finishes in time
- If we get a result we can continue; if we don't, we throw an AccessDeniedHttpException (this will force the user to log in)
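As a rough sketch of the decision logic only (the function name is hypothetical, not actual XTools code; the >1M threshold comes from the comment above; the async plumbing around the Symfony Command is omitted):

```php
<?php
// Hypothetical helper: given the rows returned by SHOW EXPLAIN FOR <pid>
// (or null if no plan was captured, e.g. the query already finished
// quickly), decide whether the query should be aborted.
function isQueryTooExpensive(?array $explainRows, int $maxEstimatedRows = 1000000): bool
{
    if ($explainRows === null) {
        // No plan available; assume the query was fast enough.
        return false;
    }
    foreach ($explainRows as $row) {
        // MariaDB's EXPLAIN output has a 'rows' estimate per table step.
        if ((int) ($row['rows'] ?? 0) > $maxEstimatedRows) {
            return true;
        }
    }
    return false;
}
```

The caller would then throw the AccessDeniedHttpException whenever this returns true, and proceed normally otherwise.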
Most queries are quite fast as-is, so I think executeProjectsQuery() should take an opt-in boolean $isHeavy, so that we only incur the extra overhead of running EXPLAIN when we explicitly ask for it.
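Sketched, the opt-in could look something like this (signature and parameter names are hypothetical, not the current executeProjectsQuery() API):

```php
<?php
// Hypothetical wrapper: only heavy queries pay the EXPLAIN overhead.
function executeProjectsQuery(callable $runQuery, bool $isHeavy = false)
{
    if ($isHeavy) {
        // Here we would spawn the second process that runs SHOW EXPLAIN
        // and abort if the estimated row count is unreasonable.
    }
    return $runQuery();
}

// Cheap queries skip the safeguard entirely:
$result = executeProjectsQuery(function () {
    return ['some' => 'rows'];
});
```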
There are other complications I'm sure I'm overlooking.
This task would be just another performance safeguard in the vast array of tactics we employ to protect XTools from bots. So I think if the bot problem is solved, this isn't as important.
I'm probably being thick again. Would this mean limiting all queries to 1 second? If so, that'll mean excluding a lot of stuff.