Row locks: an important reason for InnoDB to replace MyISAM

行鎖:InnoDB 替代 MyISAM 的重要原因

Posted by MySQL on 2020-01-30 07:00:00

The default storage engine prior to MySQL 5.5 was MyISAM, which was changed to InnoDB after 5.5. The main reasons behind InnoDB's rise were:

InnoDB supports transactions: suitable for scenarios that require consistent data under concurrent conditions.

InnoDB supports row locks: effectively reduce locks caused by deletes or updates.

This section discusses row locks in InnoDB.

Before explaining row locking, let's first look at the two-phase locking protocol.

1 Two-phase lock

One principle of traditional relational database locking is: the two-phase locking principle.

Two-phase lock: The lock operation is divided into two phases, the lock phase and the unlock phase, and the lock phase and the unlock phase are guaranteed to disjoin. A lock is added when the statement is executed, but the lock is not released immediately after the statement is executed, but is not released until the transaction ends.

We can understand the two-phase lock from the following table:

2 InnoDB row lock mode

InnoDB implements the following two types of row locks:

Shared lock (S): allows one transaction to read a row, preventing other transactions from acquiring exclusive locks on the same data set.

Exclusive lock (X): Allows transactions that acquire exclusive locks to update data, preventing other transactions from acquiring shared read locks and exclusive write locks for the same data set.

For ordinary select statements, InnoDB does not add any locks, and the transaction can explicitly add shared or exclusive locks to the recordset with the following statement:

共享锁(S):select * from table_name where ... lock in share mode;

排他锁(X):select * from tablbe_name where ... for update;

3 InnoDB row lock algorithm

Three algorithms for InnoDB row locks:

Record Lock: Index lock on a single record

Gap Lock: Gap lock, which locks the gap between index entries, but does not include the record itself

Next-Key Lock: Gap Lock + Record Lock, lock a range, and lock the record itself.

The characteristics of InnoDB row lock implementation means that if you do not retrieve data through index conditions, InnoDB will lock all records in the table, the actual effect is the same as table lock.

4 transaction isolation levels

Row locks corresponding to different transaction isolation levels are also different, so before explaining the locking scope of row locks, briefly talk about transaction isolation levels. A detailed introduction to transaction isolation levels is in the next chapter.

4 isolation levels for MySQL:

Read uncommitted: At this isolation level, all transactions can see the execution results of other uncommitted transactions. Dirty reads may occur.

Read Committed (Read Committed: RC): A transaction can only see the changes that have been committed to the transaction. Because other instances of the same transaction may have new commits during that instance's processing, phantom reads may occur.

Repeatable Read (RR): This is MySQL's default transaction isolation level, which ensures that multiple instances of the same transaction will see the same data rows when reading data concurrently. Dirty and non-repeatable reads are eliminated, and phantom reads do not occur by default.

Serializable (serial): This is the highest isolation level, it solves the phantom read problem by forcing transaction ordering to make it impossible to conflict with each other.

5 Row lock experiment at RC isolation level

Sometimes we may think about how a certain statement (similar to select * from table_name where a =… for update) is locked?

If you want to analyze how a certain SQL locks, if you don't know other information, then you can score several cases, and the locks are different in different situations. Some common situations are as follows:

RC isolation level, field a is not indexed.

RC isolation level, a field has a unique index.

RC isolation level, a field has a non-unique index.

RR isolation level, field a is not indexed.

RR isolation level, a field has a unique index.

RR isolation level, field a has a non-unique index.


let's start the experiment!

5.1 querying through non-indexed fields

Let's first look at an example where the condition field does not use an index:

 

 

On the surface, it seems that session1 only added an exclusive lock to the b = 1 line, but session2 had a lock wait while requesting an exclusive lock on another line. See the figure below:

 

Because the b field has no index, it can only take a clustered index for a full table scan. As can be seen from the figure above, there is one record that meets the conditions, but all records on the clustered index are added with X locks.

Why not just lock the records that meet the conditions?

This is because in MySQL, if a condition cannot be quickly filtered by the index, the storage engine level will lock all records and return it, and then the server layer will filter it. Therefore, all records are locked.

Of course, MySQL has some improvements here. After the server layer filters out the data that does not meet the conditions, it will lock the records that do not meet the conditions. It is guaranteed that only locks that meet the conditions will be held in the end, but the locking operation of each record will not be omitted.

Summary: In the absence of an index, InnoDB's current read locks all records. So you should pay special attention to InnoDB in your work, otherwise a lot of lock conflicts may occur.

5.2 Querying by Unique Index
Let's look at an example where the condition field has a unique index:

 

Session1 added an exclusive lock to the a = 1 line. When an exclusive lock on another line is requested in session2, no wait occurs; but when an exclusive lock on the a = 1 line is requested in session2, wait occurs. See the figure below:

Because a is a unique index, the select * from t16 where a = 1 for update; (hereinafter referred to as SQL2) statement will select the index of column a for conditional filtering. After finding the record of a = 1, the unique index will be A = 1 adds an X lock to the index record. At the same time, it returns to the clustered index according to the read id column, and then adds an X lock to the clustered index corresponding to id = 1.

Why are records on clustered indexes locked?

For example, a concurrent SQL is updated through the primary key index: update t16 set b = 10 where id = 1; If SQL2 does not lock the record on the primary key index, then the concurrent update does not know that SQL2 is executing, so if When update is executed, it violates the constraint that the update or deletion on the same record requires serial execution.

Summary: If the condition of the query is a unique index, SQL needs to lock on the unique index that meets the condition, and it will lock on the corresponding clustered index.

5.3 Querying by Non-Unique Index
Let's look at an example where the condition field has a non-unique index:

 

 

We have added an exclusive lock to the data that satisfies the condition c = 3. As the result above, it is rows 3 and 4. Therefore, the data of lines 1 and 2 are not locked, and the data of lines 3 and 4 are locked. As shown below:

As can be seen from the above figure, on the non-unique index of the a field, all records satisfying c = 3 are locked. At the same time, the records on the corresponding primary key indexes are also locked. Compared to the case of querying through a unique index, a unique index query has at most one row of records locked, while a non-unique index will lock all records that meet the condition.

Summary: If the query condition is a non-unique index, then SQL needs to lock on the non-unique indexes that meet the condition, and it will lock on their corresponding clustered index.

6 Summary
Today we talked about InnoDB row locks. This is an important reason why InnoDB replaces MyISAM (only table locks are supported).

At the beginning of the manuscript, I explained to you two-stage locking, row locking mode, row locking algorithm, and transaction isolation level.

We did row lock experiments in different scenarios under the RC isolation level. The more important point is that when updating the data, if the condition field is not indexed, all records in the table will be X-locked. So you should let the query go to the index as much as possible in your work.

This section explains the RC isolation level lock experiment. In the next section, we will explain the row lock situation under the RR isolation level, so as to understand the meaning of the gap lock.

7 The problem
is to take the test table t16 in this section and perform the following experiments:

Result1 and Result are the same or different, you can experiment and analyze why this happens? Welcome to share your thoughts and discuss in the comment area. I will also put the result verification and cause analysis in the next section.

8 References
He Dengcheng's github: https://github.com/hedengcheng/tech/tree/master/database/MySQL

"Explore MySQL" (2nd Edition): 20.3.4 InnoDB row lock implementation

The column knowledge map is as follows:



InnoDB 支持事务:适合在并发条件下要求数据一致的场景。

InnoDB 支持行锁:有效降低由于删除或者更新导致的锁定。

本节就一起来探讨 InnoDB 的行锁。

在讲解行锁之前,我们首先来看一下两阶段锁协议。

1 两阶段锁

传统的关系型数据库加锁的一个原则是:两阶段锁原则。

两阶段锁:锁操作分为两个阶段,加锁阶段和解锁阶段,并且保证加锁阶段和解锁阶段不相交。在执行语句的时候加上锁,但并不是语句执行完就立刻释放锁,而是要等到事务结束时才释放。

我们可以通过下面这张表理解两阶段锁:

2 InnoDB 行锁模式

InnoDB 实现了以下两种类型的行锁:

共享锁(S):允许一个事务去读一行,阻止其他事务获得相同数据集的排他锁。

排他锁(X):允许获得排他锁的事务更新数据,阻止其他事务取得相同数据集的共享读锁和排他写锁。

对于普通 select 语句,InnoDB 不会加任何锁,事务可以通过以下语句显式给记录集加共享锁或排他锁:

共享锁(S):select * from table_name where ... lock in share mode;

排他锁(X):select * from tablbe_name where ... for update;

3 InnoDB 行锁算法

InnoDB 行锁的三种算法:

Record Lock:单个记录上的索引加锁

Gap Lock:间隙锁,对索引项之间的间隙加锁,但不包括记录本身

Next-Key Lock:Gap Lock + Record Lock,锁定一个范围,并且锁定记录本身。

InnoDB 行锁实现特点意味着:如果不通过索引条件检索数据,那么 InnoDB 将对表中所有记录加锁,实际效果跟表锁一样。

4 事务隔离级别

不同事务隔离级别对应的行锁也是不一样的,因此在讲解行锁的锁定范围之前,先简单聊聊事务隔离级别。事务隔离级别的详细介绍放在下一章。

MySQL 的 4 种隔离级别:

Read uncommitted(读未提交): 在该隔离级别,所有事务都可以看到其他未提交的事务的执行结果。可能会出现脏读。

Read Committed(读已提交,简称:RC):一个事务只能看见已经提交事务所做的改变。因为同一事务的其他实例在该实例处理期间可能会有新的 commit,所以可能出现幻读。

Repeatable Read(可重复读,简称:RR):这是 MySQL 的默认事务隔离级别,它确保同一事务的多个实例在并发读取数据时,会看到同样的数据行。消除了脏读、不可重复读,默认也不会出现幻读。

Serializable(串行):这是最高的隔离级别,它通过强制事务排序,使之不可能相互冲突,从而解决幻读问题。

5 RC 隔离级别下的行锁实验

有时我们可能会思考,某条语句(类似 select * from table_name where a=… for update)是怎么加锁的?

要想分析某条 SQL 是怎么加锁的,如果其他信息都不知道,那就得分几种情况了,不同情况加锁的方式也各不一样,比较常见的一些情况如下:

RC 隔离级别,a 字段没索引。

RC 隔离级别,a 字段有唯一索引。

RC 隔离级别,a 字段有非唯一索引。

RR 隔离级别,a 字段没索引。

RR 隔离级别,a 字段有唯一索引。

RR 隔离级别,a 字段有非唯一索引。

……
让我们开始实验吧!

5.1 通过非索引字段查询

我们首先来看一下条件字段不使用索引的例子:

 

 

表面看起来 session1 只给了 b=1 这一行加了排他锁,但 session2 在请求其他行的排他锁时,却出现了锁等待。看下图:

 

由于 b 字段没有索引,因此只能走聚簇索引,进行全表扫描。从上图中可以看到,满足条件的记录有一条,但是聚簇索引上的所有记录,都被加上了 X 锁。

为什么不是只在满足条件的记录上加锁呢?

这是因为在 MySQL 中,如果一个条件无法通过索引快速过滤,那么存储引擎层面就会将所有记录加锁后返回,然后由 server 层进行过滤。因此也就把所有记录都锁上了。

当然 MySQL 在这里有一些改进的,在 server 层过滤掉不满足条件的数据后,会把不满足条件的记录放锁。保证了最后只会持有满足条件的锁,但是每条记录的加锁操作还是不会省略。

总结:没有索引的情况下,InnoDB 的当前读会对所有记录都加锁。所以在工作中应该特别注意 InnoDB 这一特性,否则可能会产生大量的锁冲突。

5.2 通过唯一索引查询
我们再来看一下条件字段有唯一索引的例子:

 

session1 给了 a=1 这一行加了排他锁,在 session2 中请求其他行的排他锁时,不会发生等待;但是在 session2 中请求 a=1 这一行的排他锁时,会发生等待。看下图:

由于 a 是唯一索引,因此 select * from t16 where a=1 for update;(后面称为 SQL2) 语句会选择走 a 列的索引进行条件过滤,在找到 a=1 的记录后,会将唯一索引上 a=1 索引记录上加 X 锁,同时,会根据读取到的 id 列,回到聚簇索引,然后将 id=1 对应的聚簇索引项加 X 锁。

为什么聚簇索引上的记录也要加锁呢?

比如,并发的一条 SQL,是通过主键索引来更新:update t16 set b=10 where id =1; 如果 SQL2 没有将主键索引上的记录加锁,那么并发的 update 并不知道 SQL2 在执行,所以如果 update 执行了,就违背了同一记录上的更新或者删除需要串行执行的约束。

總結:如果查詢的條件是唯一索引,那麼SQL 需要在滿足條件的唯一索引上加鎖,並且會在對應的聚簇索引上加鎖。

5.3通過非唯一索引查詢
我們再來看一下條件字段有非唯一索引的例子:

 

 

我們在滿足條件c=3 的數據上加了排他鎖,如上面結果,就是第3、4 行。因此第1、2 行的數據沒被鎖,而3、4 行的數據被鎖了。如下圖:

通過上圖可以看到,在a 字段的非唯一索引上,滿足c=3 的所有記錄,都被加了鎖。同時,對應的主鍵索引上的記錄也都加上了鎖。與通過唯一索引查詢的情況相比,唯一索引查詢最多有一行記錄被鎖,而非唯一索引將會把滿足條件的所有記錄都加上鎖。

總結:如果查詢的條件是非唯一索引,那麼SQL 需要在滿足條件的非唯一索引上都加上鎖,並且會在它們對應的聚簇索引上加鎖。

6總結
今天我們聊了一下InnoDB行鎖,這是InnoDB替代MyISAM(只支持表鎖)的一個比較重要的原因。

在文稿的開始,跟大家講解了兩階段鎖、行鎖模式、行鎖算法以及事務隔離級別等。

我們做了RC 隔離級別下不同場景的行鎖實驗,比較重要的一點是:在更新數據時,如果條件字段沒索引,則表中所有記錄都會被加上X 鎖。所以在工作中應該盡可能的讓查詢走索引。

本節講解了RC 隔離級別的鎖實驗,在下節會講解RR 隔離級別下的行鎖情況,從而理解間隙鎖的意義。

7問題
還是拿本節的測試表t16,進行如下實驗:

Result1和Result結果是相同的還是不同的,你可以實驗一下,然後分析為什麼會出現這種情況?歡迎把你的思路分享在留言區一起討論,我也會把結果驗證和原因分析放在下一節。

8參考資料
何登成的github:https://github.com/hedengcheng/tech/tree/master/database/MySQL

《深入淺出MySQL》(第2 版):20.3.4 InnoDB 行鎖實現方式

專欄知識圖如下:

 

Ref:https://mp.weixin.qq.com/s/Wg0EsCSxW5bU2bT3njZc8g