Context (skippable)
In the performance guidelines there's an example using the user_properties table that recommends:
When saving a change, it may be tempting to delete all existing rows for the user ID, and insert everything you know about the new state. But, this would cause database contention. What MediaWiki does instead is to only make changes to the table by the unique primary key for each row.
Thus far, this always seemed to make sense. Today, I found a query in our code that doesn't make full use of an index for a similar deletion, so I thought I'd fix it. My first idea was to simply remove the conditional at line 104; this way it would fully use a unique index (not the primary key, but I figured it should be roughly the same). I then wanted to verify that my fix actually avoided potential locking, but found pretty much the opposite. I pasted some raw debug info in P71715, and below is a more verbose description of what's going on.
Setting
We need a table with a unique composite index. In my example I also have a surrogate primary key. Not all of these elements might be necessary, but I haven't tried reducing the test case.
CREATE TABLE t ( t_pk BIGINT UNSIGNED AUTO_INCREMENT NOT NULL, t_1 BIGINT UNSIGNED NOT NULL, t_2 BIGINT UNSIGNED NOT NULL, UNIQUE INDEX t_1_2 (t_1, t_2), PRIMARY KEY(t_pk) );
Next, we will insert a few records into our table:
insert into t (t_1,t_2) values (1,1), (1,10), (1,100), (2,2), (2,20), (2,200), (1,1000), (1,5000), (1,10000);
I have played around with the values here, and immediately noticed something: if you remove the last row from each line (so you're left with a total of 6 rows), the bug is no longer reproducible. Let's ignore this for now.
Test case
We will now simulate two clients writing to our table. For this, I used two separate shells called "A" and "B". Note: from now on, I will rollback all the transactions. I'm omitting the rollback commands below for simplicity; just rollback when the snippet ends.
First, let's try with a query that should, in theory, use the first indexed column only.
A> begin; B> begin; A> delete from t where t_1=1; -- ok B> insert into t (t_1,t_2) values (42, 69); -- ok
So far so good. Now, let's try refining the query so that it uses both columns:
A> begin; B> begin; A> delete from t where t_1=1 and t_2 in (1,10,100,1000,5000,10000); -- ok B> insert into t (t_1,t_2) values (42, 69); -- blocked!
B's query hangs, evidently because A is holding a lock. But why, and what lock? I pasted an extract from show engine innodb status in P71715 for both delete queries. I won't try to pretend that I understand everything there, but the first thing that stood out is that the second query is locking the "supremum" pseudo-record. This seems to be a next-key lock and it basically affects the whole table, given that it's almost empty.
Trying to make sense of it
The first thing I want to get out of the way is that I'm only deleting existing records. So, this is not like the problem I recently investigated in T381622.
Next, I came across MySQL bug #112106, where I learned two interesting things: the MySQL team's reticence to provide links to back up their statements, and the fact that MySQL needs to lock the supremum when a statement modifies the last index position. This sounds reasonable, so I added a new record at the end of the index and tried again:
insert into t (t_1,t_2) values (100,100);
A> begin; B> begin; A> delete from t where t_1=1 and t_2 in (1,10,100,1000,5000,10000); -- ok B> insert into t (t_1,t_2) values (42, 69); -- still blocked.
show engine innodb status after the delete shows that there's still a lock on the supremum. So, seems like this isn't it? Or maybe it is. I might have misunderstood what the MySQL team meant in that bug report. Their statements even seem to contradict each other (wrt locking of infimum record), so I'm really not sure.
A step back
At this point I focused my attention on something else: record locks are based on the existence of an index, whose entries can be locked as needed. And my table has a composite unique index on two columns. So far so good. But then I wondered: is MariaDB actually using my index for the delete queries? I removed the last-added record and checked:
> delete from t where t_1 = 100; > explain delete from t where t_1=1; +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+ | 1 | SIMPLE | t | range | t_1_2 | t_1_2 | 8 | NULL | 6 | Using where | +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+ 1 row in set (0.001 sec) > explain delete from t where t_1=1 and t_2 in (1,10,100,1000,5000,10000); +------+-------------+-------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | t | ALL | t_1_2 | NULL | NULL | NULL | 9 | Using where | +------+-------------+-------+------+---------------+------+---------+------+------+-------------+ 1 row in set (0.001 sec)
Ha! Apparently it isn't! At this point I thought I would retry the original test with the inclusion of a FORCE INDEX. According to the docs, the only way to do that is to switch to the multi-table delete syntax. And here I noticed something even more bizarre:
> explain delete t.* from t where t_1=1 and t_2 in (1,10,100,1000,5000,10000); +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+ | 1 | SIMPLE | t | range | t_1_2 | t_1_2 | 16 | NULL | 6 | Using where | +------+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
So, huh, changing the syntax to the multi-table version (added t.*) is enough to let it use the index?! Mkay... Let's double-check:
A> begin; A> delete t.* from t where t_1=1 and t_2 in (1,10,100,1000,5000,10000); -- ok B> begin; B> insert into t (t_1,t_2) values (42, 69); -- it works!
As a final test, I added 1000 random records to the table and tried again. Sure enough, with more data to sift through, the index gets used even for the original delete query.
Conclusions
I guess the key takeaway here is that a subpar query plan / index choice can lead to unnecessary (dead)locks. My main questions now are:
- Is anything of the above a bug? For example, the fact that switching a delete query to the multi-table syntax can change the query plan seems really surprising. Same for not using the index in a query where it would seem a natural choice. On the other hand, l'm guessing the optimizer might have good reasons to do this that are far beyond my comprehension. For example, the docs say: "Indexes are less important for [...] small tables [...] When a query needs to access most of the rows, reading sequentially is faster than working through an index."
- Is anything of the above worth documenting somewhere? For example, our transaction best practices could explicitly state that unique indices are not necessarily good replacements for the primary key when it comes to avoiding lock contention, regardless of performance. On the other hand, I don't know if this would also happen with real-world data; and for all I know, the optimizer may also choose to ignore the primary key in similar queries.