The original implementation of conflict detection in the block_target table was faulty, with a race condition that we are hitting in production, causing multiple block_target rows for some users.
The query sequence was UPDATE/SELECT/INSERT. The update query acquires a shared lock and so is not serialized with another update query. However, the sequence 1.update, 2.update, 1.insert, 2.insert fails safely with a deadlock error.
The insert acquires an exclusive lock. If the first thread does both an update and insert before the second thread does its update, the second update will wait for the first thread to commit. The sequence is then 1.update, 1.insert, 2.update(wait), 1.commit, 2.update(complete), 2.select, 2.insert. The 2.update does return affectedRows=1 however the insert was conditional on the result of the select.
Unfortunately, the select was non-locking. A locking select always returns the updated row, however a non-locking select occasionally returns an empty result set despite the thread having waited for the new row to be committed.
The [[https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1134635|committed fix]] uses GET_LOCK() to synchronise externally, and this does help to avoid deadlock errors. That commit also added FOR UPDATE to the select, which would have been enough by itself to avoid duplicate rows.
Adding a unique index to the block_target table and using upsert is not feasible since an upsert on a table with multiple unique indexes is unsafe for replication. Also, nullable fields make it difficult to force the rows to collide, since any null value causes an aggregate index like (bt_user, bt_auto, bt_address) to compare non-equal.