[Repost] Analyzing and Resolving the enq: TX - row lock contention Wait Event

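On June 30, the database experienced a large number of lock waits. The problem lasted about an hour, and the number of blocked sessions kept growing. It was finally resolved when the business team stopped the application and the affected sessions were killed.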

A few days later, the CPU usage chart in EM (Enterprise Manager) showed that sessions had spent a significant amount of time waiting.

This was mainly caused by the enq: TX - row lock contention wait event.

Wait event: enq: TX - row lock contention

An enqueue (enq) is a locking mechanism that protects shared resources. Waiters are queued and served first in, first out (FIFO).

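TX enqueue waits usually have one of the following causes:
1. Different sessions update or delete the same row.
2. Different sessions insert the same key value into a unique index.
3. Different sessions update rows covered by the same bitmap index fragment.
4. Different sessions update the same data block at the same time and no free ITL slot is left (reported as enq: TX - allocate ITL entry).
5. A session waits for an index block split to finish (reported as enq: TX - index contention).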
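
The first two causes are easy to reproduce with two SQL*Plus sessions. The following is a minimal sketch, assuming a scratch schema; the table `tx_demo` and its columns are hypothetical names introduced only for illustration.

```sql
-- Hypothetical demo table, not part of the original incident.
CREATE TABLE tx_demo (id NUMBER PRIMARY KEY, val VARCHAR2(10));
INSERT INTO tx_demo VALUES (1, 'a');
COMMIT;

-- Cause 1, session 1: lock the row and leave the transaction open.
UPDATE tx_demo SET val = 'b' WHERE id = 1;

-- Cause 1, session 2: this statement hangs on
-- "enq: TX - row lock contention" (TX requested in mode 6)
-- until session 1 commits or rolls back.
UPDATE tx_demo SET val = 'c' WHERE id = 1;

-- Cause 2, session 1: insert a new key without committing.
INSERT INTO tx_demo VALUES (2, 'x');

-- Cause 2, session 2: inserting the same key waits (TX requested in mode 4)
-- to learn whether ORA-00001 has to be raised.
INSERT INTO tx_demo VALUES (2, 'y');

-- Session 1: ending the transaction releases every waiter.
COMMIT;
```

While session 2 is blocked, it shows up in v$session_wait with the event name discussed in this article.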

The official Oracle documentation says the following about TX - row lock contention:

10.3.7.2.4 TX enqueue
These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT or ROLLBACK.
Waits for TX in mode 6: occurs when a session is waiting for a row level lock that is already held by another session. This occurs when one user is updating or deleting a row, which another session wishes to update or delete. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
Waits for TX in mode 4 can occur if the session is waiting for an ITL (interested transaction list) slot in a block. This happens when the session wants to lock a row in the block but one or more other sessions have rows locked in the same block, and there is no free ITL slot in the block. Usually, Oracle dynamically adds another ITL slot. This may not be possible if there is insufficient free space in the block to add an ITL. If so, the session waits for a slot with a TX enqueue in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - allocate ITL entry.
The solution is to increase the number of ITLs available, either by changing the INITRANS or MAXTRANS for the table (either by using an ALTER statement, or by re-creating the table with the higher values).
Waits for TX in mode 4 can also occur if a session is waiting due to potential duplicates in UNIQUE index. If two sessions try to insert the same key value the second session has to wait to see if an ORA-0001 should be raised or not. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.
Waits for TX in mode 4 is also possible if the session is waiting due to shared bitmap index fragment. Bitmap indexes index key values and a range of ROWIDs. Each 'entry' in a bitmap index can cover many rows in the actual table. If two sessions want to update rows covered by the same bitmap index fragment, then the second session waits for the first transaction to either COMMIT or ROLLBACK by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
Waits for TX in Mode 4 can also occur waiting for a PREPARED transaction.
Waits for TX in mode 4 also occur when a transaction inserting a row in an index has to wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event enq: TX - index contention.
10.3.7 enqueue (enq:) waits
Enqueues are locks that coordinate access to database resources. This event indicates that the session is waiting for a lock that is held by another session.
The name of the enqueue is included as part of the wait event name, in the form enq: enqueue_type - related_details. In some cases, the same enqueue type can be held for different purposes, such as the following related TX types:
enq: TX - allocate ITL entry
enq: TX - contention
enq: TX - index contention
enq: TX - row lock contention
The V$EVENT_NAME view provides a complete list of all the enq: wait events.
You can check the following V$SESSION_WAIT parameter columns for additional information:
P1 - Lock TYPE (or name) and MODE
P2 - Resource identifier ID1 for the lock
P3 - Resource identifier ID2 for the lock
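
The P1 value packs the lock type into its two high-order bytes and the requested mode into its low-order 16 bits, so it can be decoded directly in SQL. The query below is a sketch of that decoding; the column aliases are made up for readability, and the split of P2 into undo segment number and slot reflects the usual layout of a TX resource identifier, which is worth confirming against your own version.

```sql
-- Decode P1/P2/P3 for sessions currently waiting on a TX enqueue.
-- Example: p1 = 1415053318 (hex 54580006) decodes to type 'TX', mode 6.
SELECT sid,
       event,
       CHR(BITAND(p1, 4278190080) / 16777216) ||
       CHR(BITAND(p1, 16711680)   / 65536)     AS lock_type,
       BITAND(p1, 65535)                       AS requested_mode,
       TRUNC(p2 / 65536)                       AS undo_segment_no,  -- assumed TX layout
       MOD(p2, 65536)                          AS undo_slot,        -- assumed TX layout
       p3                                      AS undo_sequence
FROM   v$session_wait
WHERE  event LIKE 'enq: TX%';
```

A requested mode of 6 points to a plain row lock conflict, while mode 4 points to one of the other cases listed in the documentation excerpt (ITL shortage, unique key, bitmap index, index split).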

Examining the AWR report:

SQL> @?/rdbms/admin/awrrpt.sql

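1. An AWR report was generated for the two-hour window covering the incident. The report header shows a DB Time of 2176 minutes, so DB Time / Elapsed = 18, meaning that about 18 sessions were active on average. This ratio is on the high side.
2. The Top 10 events section lists enq: TX - row lock contention first, which means the system suffered fairly severe row-level lock waits during this window.

3. The segment statistics section also shows that the row lock waits occurred on a table.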

Investigating with queries

So which operation caused the enq: TX - row lock contention wait event? The first step is to check which sessions are currently waiting on it.

The locks have already been released, so the query returns nothing:

SQL> select event,sid,p1,p2,p3 from v$session_wait where event='enq: TX - row lock contention';
no rows selected

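If the contention is still in progress, follow the steps below, which are based on the article "enq: TX - row lock contention wait event" listed in the references.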

Which object are the current sessions waiting on?

Find the object number (object_id) behind the enq: TX - row lock contention waits:

SQL> select ROW_WAIT_OBJ#,ROW_WAIT_FILE#,ROW_WAIT_BLOCK#,ROW_WAIT_ROW#  from v$session  where event='enq: TX - row lock contention';

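So which object does the ROW_WAIT_OBJ# value refer to?

Look up the object name for ROW_WAIT_OBJ# (SQL*Plus prompts for the value returned by the previous query):

SQL> select object_name,object_id from dba_objects where object_id=&row_wait_obj;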

Look up the object's owner and type by its name:

SQL> select OWNER,OBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID,OBJECT_TYPE from dba_objects where object_name='&object_name';

The object found this way should match the segment statistics in the AWR report.
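
The separate lookups above can also be combined into one statement. The following is a sketch under the assumption that the contention is still in progress and that the querying user can read gv$session and dba_objects; it adds the blocking session columns that gv$session exposes, which the original steps do not use.

```sql
-- One row per waiting session: who is waiting, on which object,
-- and which session (on which instance) is blocking it.
SELECT s.inst_id,
       s.sid,
       s.serial#,
       s.username,
       s.sql_id,
       s.blocking_instance,
       s.blocking_session,
       s.seconds_in_wait,
       o.owner,
       o.object_name,
       o.object_type
FROM   gv$session s
       LEFT JOIN dba_objects o
              ON o.object_id = s.row_wait_obj#
WHERE  s.event = 'enq: TX - row lock contention'
ORDER  BY s.seconds_in_wait DESC;
```

The LEFT JOIN keeps waiters whose ROW_WAIT_OBJ# does not resolve to an object (for example -1), so no waiting session is silently dropped from the result.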

Check the current wait events through gv$session:

SQL> select event,count(*) from gv$session group by event;
EVENT                                                     COUNT(*)
--------------------------------------------------------  --------
SQL*Net message from client                                   3205
wait for unread message on broadcast channel                     5
Streams AQ: waiting for messages in the queue                    4
ASM background timer                                             2
ges remote message                                               2
gcs remote message                                               8
LNS ASYNC end of log                                             2
pmon timer                                                       2
rdbms ipc message                                               60
jobq slave wait                                                  4
smon timer                                                       2
gc cr request                                                    1
Streams AQ: qmn slave idle wait                                  2
class slave wait                                                12
SQL*Net message to client                                        1
Space Manager: slave idle wait                                   9
GCR sleep                                                        2
VKTM Logical Idle Wait                                           2
Streams AQ: waiting for time management or cleanup tasks         2
Streams AQ: qmn coordinator idle wait                            2
PX Deq: Execution Msg                                            1
PX Deq Credit: send blkd                                         1
DIAG idle wait                                                   4
PING                                                             2
PX Deq: Execute Reply                                            1
25 rows selected.

Because the wait event is no longer occurring, query GV_$ACTIVE_SESSION_HISTORY instead.

SQL> select SAMPLE_TIME,SESSION_ID,USER_ID,SQL_ID,EVENT,CURRENT_OBJ#,CURRENT_FILE#,CURRENT_BLOCK#  from GV_$ACTIVE_SESSION_HISTORY
where event like 'enq: TX%' and  SAMPLE_TIME like '30-6月%' and module='JDBC Thin Client' and rownum<=500;

The result contains many enq: TX - row lock contention wait events.
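
The filter `SAMPLE_TIME like '30-6月%'` in the query above depends on the session's NLS settings through an implicit conversion, so it may return nothing in a session with a different date language. A more portable sketch uses an explicit timestamp range and aggregates the samples so the most affected statements stand out. The year 2020 is an assumption taken from the LAST_DDL_TIME shown later in this article; adjust it to the real incident date.

```sql
-- Summarize ASH samples for the incident day by statement, blocker and object.
SELECT sql_id,
       blocking_session,
       current_obj#,
       COUNT(*) AS samples
FROM   gv$active_session_history
WHERE  event = 'enq: TX - row lock contention'
AND    sample_time >= TIMESTAMP '2020-06-30 00:00:00'   -- assumed year
AND    sample_time <  TIMESTAMP '2020-07-01 00:00:00'
GROUP  BY sql_id, blocking_session, current_obj#
ORDER  BY samples DESC;
```

The in-memory ASH buffer only keeps recent samples. If the incident has already aged out, the same columns are available in dba_hist_active_sess_history, which keeps a sampled subset of the history.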

Locating the specific SQL:

Look up the statement text by the sql_id column:

SQL> select INST_ID,SQL_TEXT from GV_$SQL where sql_id='f9kdn3mdv252a';
SQL> select * from dba_hist_sqltext where sql_id='f9kdn3mdv252a';

Find out which user ran the SQL
GV_$ACTIVE_SESSION_HISTORY also has a USER_ID column, which here is 54:

SQL> select USERNAME,USER_ID,CREATED from dba_users where USER_ID=54;

Find out which table the lock waits occurred on

SQL> select * from dba_objects where object_id=183598;
OWNER OBJECT_NAME              SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE CREATED              LAST_DDL_TIME        TIMESTAMP           STATUS T G S NAMESPACE EDITION_NAME
----- ------------------------ -------------- --------- -------------- ----------- -------------------- -------------------- ------------------- ------ - - - --------- ------------
XX    NODETITLEREPORT_TASK_PRJ                   183598         183598 TABLE       25-4月 -2014 12:23:53 30-6月 -2020 15:42:53 2014-04-25:12:23:53 VALID  N N N         1

Solutions

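If sessions are still waiting:
1. Use v$lock to find the sessions with BLOCK=1 (the blockers) and ask the users who own them to commit or roll back their transactions.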

SQL> select SID,TYPE,ID1,ID2,LMODE,REQUEST,CTIME,BLOCK from V$lock where block=1 or request<>0;
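
The query above lists blockers and waiters in one flat result. To see which waiter is stuck behind which blocker, the rows can be paired on the lock identifiers. This is a sketch for a single instance, since v$lock only covers the local instance; the column aliases are illustrative.

```sql
-- Pair each waiting session with the session holding the same TX resource.
SELECT b.sid     AS blocker_sid,
       w.sid     AS waiter_sid,
       w.type,
       w.id1,
       w.id2,
       b.lmode   AS held_mode,
       w.request AS requested_mode,
       w.ctime   AS seconds_waiting
FROM   v$lock b
       JOIN v$lock w
         ON  w.type = b.type
         AND w.id1  = b.id1
         AND w.id2  = b.id2
WHERE  b.block   = 1
AND    w.request > 0
ORDER  BY w.ctime DESC;
```

A blocker that appears with many waiters is usually the session to contact or terminate first.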

2. Use the sid to find the corresponding pid and kill that process.
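
A possible way to carry out this step is sketched below. The SID 123 and serial# 4567 are hypothetical placeholders; substitute the values of the blocking session found in step 1.

```sql
-- Map the blocker's SID to its serial# and operating system process id (spid).
SELECT s.sid, s.serial#, s.username, s.program, p.spid
FROM   v$session s
       JOIN v$process p
         ON p.addr = s.paddr
WHERE  s.sid = 123;                                -- hypothetical blocker SID

-- Terminate the session from inside the database ('sid,serial#').
-- Its open transaction is rolled back, which releases the row locks.
ALTER SYSTEM KILL SESSION '123,4567' IMMEDIATE;    -- hypothetical values
```

Killing the session with ALTER SYSTEM is generally gentler than killing the operating system process identified by spid, which is better kept as a fallback when the session does not go away.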

3. Change the SQL statement to use NOWAIT:

SQL> SELECT * FROM QRTZ_LOCKS WHERE LOCK_NAME = :1 FOR UPDATE NOWAIT;

With NOWAIT, the statement either acquires the lock immediately or fails at once with ORA-00054 instead of waiting.
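
If failing immediately is too strict for the application, Oracle also accepts a bounded wait or skipping locked rows. The variants below are a sketch using the same QRTZ_LOCKS statement; the 5-second timeout is an arbitrary example value, and whether either variant is acceptable depends on how the application handles the error or a missing row.

```sql
-- Wait at most 5 seconds for the row lock, then fail with ORA-30006.
SELECT * FROM QRTZ_LOCKS WHERE LOCK_NAME = :1 FOR UPDATE WAIT 5;

-- Return only rows that are not locked by another transaction;
-- a locked row is silently left out of the result.
SELECT * FROM QRTZ_LOCKS WHERE LOCK_NAME = :1 FOR UPDATE SKIP LOCKED;
```

In every variant the application has to handle the "lock not obtained" outcome explicitly, for example by retrying later.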

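In general, if a large number of problems like this show up in production and human error has been ruled out, the application itself needs to be reviewed.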

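References:
First encounter with a wait event: enq: TX - row lock contention (original title: 初次遇见等待事件：enq;tx-row lock contention)

enq: TX - row lock contention wait event (original title: enq: TX - row lock contention 等待事件)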

</article>