MySQL死锁

MySQL死锁

MySQL死锁是指多个事务之间,由于每个事务持有另一个事务所需的锁而无法继续执行的情况。因所有事务都在等待相同的资源变得可用,而没有一个事务释放它所持有的锁。

多个事务以相反的顺序锁定多个表中的行(通过诸如UPDATE或SELECT ... FOR UPDATE之类的语句),就可能发生死锁。

死锁也可能发生在这些语句锁定索引记录和间隙范围时,每个事务获取一些锁但由于时间问题而没有获取其他锁。

以下是一个死锁的栗子:

使用SET GLOBAL innodb_print_all_deadlocks = ON命令,可以查看死锁信息。

首先客户端A,开启事务,并获取animals表的相关锁:

 1mysql> set global innodb_print_all_deadlocks = on;
 2Query OK, 0 rows affected (0.01 sec)
 3
 4mysql> create table animals (name varchar(10) primary key, value int) engine = InnoDB;
 5Query OK, 0 rows affected (0.03 sec)
 6
 7mysql> create table birds (name varchar(10) primary key, value int ) engine = InnoDB;
 8Query OK, 0 rows affected (0.03 sec)
 9
10mysql> insert into animals(name, value) values ('cow', 1);
11Query OK, 1 row affected (0.01 sec)
12
13mysql> insert into birds(name, value) values ('pigeon', 1);
14Query OK, 1 row affected (0.00 sec)
15
16mysql> start transaction;
17Query OK, 0 rows affected (0.00 sec)
18
19mysql> select value from animals where name = 'cow' for share;
20+-------+
21| value |
22+-------+
23|     1 |
24+-------+
251 row in set (0.00 sec)

接着,客户端B开启事务,并且获取birds表相关锁,并视图更新animals表事务A获得锁的内容:

 1mysql> start transaction;
 2Query OK, 0 rows affected (0.00 sec)
 3
 4mysql> select value from birds where name = 'pigeon' for share;
 5+-------+
 6| value |
 7+-------+
 8|     1 |
 9+-------+
101 row in set (0.00 sec)
11mysql> update animals set value = 2 where name = 'cow';

这时候可以看到更新操作是被阻塞的,它在等待客户端A释放锁。

再打开一个新客户端,查看InnoDB的事务和锁信息:

 1mysql> SELECT ENGINE_TRANSACTION_ID as Trx_Id,
 2    ->               OBJECT_NAME as `Table`,
 3    ->               INDEX_NAME as `Index`,
 4    ->               LOCK_DATA as Data,
 5    ->               LOCK_MODE as Mode,
 6    ->               LOCK_STATUS as Status,
 7    ->               LOCK_TYPE as Type
 8    ->         FROM performance_schema.data_locks;
 9+-----------------+---------+---------+----------+---------------+---------+--------+
10| Trx_Id          | Table   | Index   | Data     | Mode          | Status  | Type   |
11+-----------------+---------+---------+----------+---------------+---------+--------+
12| 422089830973440 | birds   | NULL    | NULL     | IS            | GRANTED | TABLE  |
13| 422089830973440 | birds   | PRIMARY | 'pigeon' | S,GAP         | GRANTED | RECORD |
14| 422089830973440 | birds   | PRIMARY | 'pigeon' | S,REC_NOT_GAP | GRANTED | RECORD |
15| 422089830972632 | animals | NULL    | NULL     | IS            | GRANTED | TABLE  |
16| 422089830972632 | animals | PRIMARY | 'cow'    | S,REC_NOT_GAP | GRANTED | RECORD |
17+-----------------+---------+---------+----------+---------------+---------+--------+
185 rows in set (0.01 sec)
19
20mysql> SELECT REQUESTING_ENGINE_LOCK_ID as Req_Lock_Id,
21    ->               REQUESTING_ENGINE_TRANSACTION_ID as Req_Trx_Id,
22    ->               BLOCKING_ENGINE_LOCK_ID as Blk_Lock_Id,
23    ->               BLOCKING_ENGINE_TRANSACTION_ID as Blk_Trx_Id
24    ->         FROM performance_schema.data_lock_waits;
25+----------------------------------------+------------+----------------------------------------+-----------------+
26| Req_Lock_Id                            | Req_Trx_Id | Blk_Lock_Id                            | Blk_Trx_Id      |
27+----------------------------------------+------------+----------------------------------------+-----------------+
28| 140614854262784:69:4:2:140614746300912 |       6700 | 140614854261976:69:4:2:140614746294048 | 422089830972632 |
29+----------------------------------------+------------+----------------------------------------+-----------------+
301 row in set (0.01 sec)

performance_schema.data_lock_waits表可以看到,此时有2个事务,422089830973440422089830972632,分别代表客户端Bbirds表的锁和客户端Aanimals表的锁。

从等待锁的信息来看,等待锁的事务Id是6700,而等待释放锁的事务Id是422089830972632,正是客户端A开启的事务锁占用的animals表的锁。

若此时再次查看事务信息:

 1mysql> SELECT ENGINE_LOCK_ID as Lock_Id,
 2    ->               ENGINE_TRANSACTION_ID as Trx_id,
 3    ->               OBJECT_NAME as `Table`,
 4    ->               INDEX_NAME as `Index`,
 5    ->               LOCK_DATA as Data,
 6    ->               LOCK_MODE as Mode,
 7    ->               LOCK_STATUS as Status,
 8    ->               LOCK_TYPE as Type
 9    ->         FROM performance_schema.data_locks;
10+----------------------------------------+-----------------+---------+---------+----------+---------------+---------+--------+
11| Lock_Id                                | Trx_id          | Table   | Index   | Data     | Mode          | Status  | Type   |
12+----------------------------------------+-----------------+---------+---------+----------+---------------+---------+--------+
13| 140614854262784:1135:140614746303224   |            6700 | animals | NULL    | NULL     | IX            | GRANTED | TABLE  |
14| 140614854262784:1136:140614746303136   |            6700 | birds   | NULL    | NULL     | IS            | GRANTED | TABLE  |
15| 140614854262784:70:4:2:140614746300224 |            6700 | birds   | PRIMARY | 'pigeon' | S,GAP         | GRANTED | RECORD |
16| 140614854262784:70:4:2:140614746300568 |            6700 | birds   | PRIMARY | 'pigeon' | S,REC_NOT_GAP | GRANTED | RECORD |
17| 140614854261976:1135:140614746297040   | 422089830972632 | animals | NULL    | NULL     | IS            | GRANTED | TABLE  |
18| 140614854261976:69:4:2:140614746294048 | 422089830972632 | animals | PRIMARY | 'cow'    | S,REC_NOT_GAP | GRANTED | RECORD |
19+----------------------------------------+-----------------+---------+---------+----------+---------------+---------+--------+
206 rows in set (0.00 sec)

可以发现,事务(客户端B)相关的birds表的事务Id从422089830973440变成了6700

这是因为,当事务(客户端B)尝试修改数据库时,InnoDB使用序列事务Id,这就会改变事务(客户端B)的Id。

所以,这和上面的信息对应。实际上就是客户端B等待客户端A释放animals表的锁。

这时,如果在客户端A尝试修改birds表的内容,那么就会发生死锁:

1mysql> update birds set value = 2 where name = 'pigeon';
2ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

InnoDB的死锁是立刻发生的,并且InnoDB回滚了造成死锁的事务,也就是上述客户端A的更新操作。

因此,客户端B的更新操作可以继续执行了。

1mysql> update animals set value = 2 where name = 'cow';
2Query OK, 1 row affected (19.42 sec)
3Rows matched: 1  Changed: 1  Warnings: 0

此时,事务A确实被回滚了。可以通过查询performance_schema.data_locks表得到验证:

 1mysql> SELECT ENGINE_LOCK_ID as Lock_Id,
 2    ->               ENGINE_TRANSACTION_ID as Trx_id,
 3    ->               OBJECT_NAME as `Table`,
 4    ->               INDEX_NAME as `Index`,
 5    ->               LOCK_DATA as Data,
 6    ->               LOCK_MODE as Mode,
 7    ->               LOCK_STATUS as Status,
 8    ->               LOCK_TYPE as Type
 9    ->         FROM performance_schema.data_locks;
10+----------------------------------------+--------+---------+---------+----------+---------------+---------+--------+
11| Lock_Id                                | Trx_id | Table   | Index   | Data     | Mode          | Status  | Type   |
12+----------------------------------------+--------+---------+---------+----------+---------------+---------+--------+
13| 140614854262784:1135:140614746303224   |   6700 | animals | NULL    | NULL     | IX            | GRANTED | TABLE  |
14| 140614854262784:1136:140614746303136   |   6700 | birds   | NULL    | NULL     | IS            | GRANTED | TABLE  |
15| 140614854262784:70:4:2:140614746300224 |   6700 | birds   | PRIMARY | 'pigeon' | S,GAP         | GRANTED | RECORD |
16| 140614854262784:70:4:2:140614746300568 |   6700 | birds   | PRIMARY | 'pigeon' | S,REC_NOT_GAP | GRANTED | RECORD |
17| 140614854262784:69:4:2:140614746301256 |   6700 | animals | PRIMARY | 'cow'    | X,REC_NOT_GAP | GRANTED | RECORD |
18+----------------------------------------+--------+---------+---------+----------+---------------+---------+--------+
195 rows in set (0.00 sec)

可以看到,事务ID全部是6700,说明是客户端B的事务,已经没有客户端A相关的信息了。

虽然死锁发生后很快被InnoDB处理,我们还是可以查看到死锁的信息:

通过Information Schema可以查看死锁的数量:

1mysql> SELECT `count` FROM INFORMATION_SCHEMA.INNODB_METRICS
2          WHERE NAME="lock_deadlocks";
3+-------+
4| count |
5+-------+
6|     1 |
7+-------+
81 row in set (0.00 sec)

此外,通过SHOW ENGINE INNODB STATUS 命令可以查看存储引擎的相关信息,里面包含了死锁的信息:

 1mysql> show engine innodb status ;
 2------------------------
 3LATEST DETECTED DEADLOCK
 4------------------------
 52024-08-05 13:51:17 140614374082112
 6*** (1) TRANSACTION:
 7TRANSACTION 6700, ACTIVE 608 sec starting index read
 8mysql tables in use 1, locked 1
 9LOCK WAIT 5 lock struct(s), heap size 1128, 4 row lock(s)
10MySQL thread id 10, OS thread handle 140614683272768, query id 70 localhost root updating
11update animals set value = 2 where name = 'cow'
12
13*** (1) HOLDS THE LOCK(S):
14RECORD LOCKS space id 70 page no 4 n bits 72 index PRIMARY of table `foobar`.`birds` trx id 6700 lock mode S locks rec but not gap
15Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
16 0: len 6; hex 706967656f6e; asc pigeon;;
17 1: len 6; hex 000000001a2b; asc      +;;
18 2: len 7; hex 81000001220110; asc     "  ;;
19 3: len 4; hex 80000001; asc     ;;
20
21
22*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
23RECORD LOCKS space id 69 page no 4 n bits 72 index PRIMARY of table `foobar`.`animals` trx id 6700 lock_mode X locks rec but not gap waiting
24Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
25 0: len 3; hex 636f77; asc cow;;
26 1: len 6; hex 000000001a29; asc      );;
27 2: len 7; hex 810000011b0110; asc        ;;
28 3: len 4; hex 80000001; asc     ;;
29
30
31*** (2) TRANSACTION:
32TRANSACTION 6701, ACTIVE 651 sec starting index read
33mysql tables in use 1, locked 1
34LOCK WAIT 4 lock struct(s), heap size 1128, 2 row lock(s)
35MySQL thread id 8, OS thread handle 140614684329536, query id 71 localhost root updating
36update birds set value = 2 where name = 'pigeon'
37
38*** (2) HOLDS THE LOCK(S):
39RECORD LOCKS space id 69 page no 4 n bits 72 index PRIMARY of table `foobar`.`animals` trx id 6701 lock mode S locks rec but not gap
40Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
41 0: len 3; hex 636f77; asc cow;;
42 1: len 6; hex 000000001a29; asc      );;
43 2: len 7; hex 810000011b0110; asc        ;;
44 3: len 4; hex 80000001; asc     ;;
45
46
47*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
48RECORD LOCKS space id 70 page no 4 n bits 72 index PRIMARY of table `foobar`.`birds` trx id 6701 lock_mode X locks rec but not gap waiting
49Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
50 0: len 6; hex 706967656f6e; asc pigeon;;
51 1: len 6; hex 000000001a2b; asc      +;;
52 2: len 7; hex 81000001220110; asc     "  ;;
53 3: len 4; hex 80000001; asc     ;;
54
55*** WE ROLL BACK TRANSACTION (2)

TODO:

  • 死锁的检测和处理
  • 如何避免死锁

References #