http://kernelmaker.github.io/MySQL_8_core
最近对比了MySQL 5.6和8.0在8核环境下oltp_write_only的性能,发现8.0写入性能(QPS 6-7万)反而低于5.6版本的(QPS 14万),所以进一步测试分析了下redo log这里可能导致性能降低的原因,整理记录一下
sysbench –mysql-host=IP –mysql-port=PORT –mysql-user=mysql –mysql-password=PASSWD –tables=250 –table_size=25000 –db_driver=mysql –threads=128 –report-interval=5 –rand-type=uniform prepare
sysbench –mysql-host=IP –mysql-port=PORT –mysql-user=mysql –mysql-password=PASSWD –tables=250 –table_size=25000 –db_driver=mysql –threads=128 –report-interval=5 –rand-type=uniform –max-time=360 –max-requests=3000000 run
sysbench –mysql-host=IP –mysql-port=PORT –mysql-user=mysql –mysql-password=PASSWD –table_size=25000 –db_driver=mysql –threads=128 cleanup
SQL statistics:
queries performed:
read: 0
write: 12000000
other: 6000000
total: 18000000
transactions: 3000000 (10792.71 per sec.)
queries: 18000000 (64756.24 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 277.9637s
total number of events: 3000000
CPU:
SQL statistics:
queries performed:
read: 0
write: 12000000
other: 6000000
total: 18000000
transactions: 3000000 (10705.28 per sec.)
queries: 18000000 (64231.71 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 280.2336s
total number of events: 3000000
CPU:
SQL statistics:
queries performed:
read: 0
write: 12000000
other: 6000000
total: 18000000
transactions: 3000000 (12860.01 per sec.)
queries: 18000000 (77160.08 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 233.2794s
total number of events: 3000000
CPU:
SQL statistics:
queries performed:
read: 0
write: 12000000
other: 6000000
total: 18000000
transactions: 3000000 (15305.69 per sec.)
queries: 18000000 (91834.11 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 196.0038s
total number of events: 3000000
CPU:
从上面的结果可以看出来,
综上,关于upstream 8.0写入性能较低的原因推测是log_writer和log_flusher由于和大量用户线程共享CPU核心,得不到充分调度成为瓶颈,影响整体写入性能,并且log_flusher更为严重。
为了进一步验证上面的推测,我使用sysbench –max-requests=3000000分别对如下4中场景:
a.不隔离
b.同时隔离log_writer及log_flusher
c.单独隔离log_writer
d.单独隔离log_flusher
进行等量压测,每次都是300万次请求,另外为了LSN的可比较性,关掉了undo purge,基于此,我在sysbench run阶段对InnoDB log module进行打点记录,下面分组来看具体性能和打点信息:
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
QPS | 78739.83 | 93705.47 | 84082.08 | 81546.58 |
TIME | 228.5992s | 192.0894s | 214.0747s | 220.7310s |
由于关掉了undo purge,除场景b外,其他场景QPS相较于之前的测试有所提升是符合预期的
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_lsn_current | 5736265204 | 5736254809 | 5736543934 | 5736474256 |
同样由于关掉了undo purge,对于sysbench –max-requests=3000000,lsn差不多
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_waits | 0 | 0 | 0 | 0 |
log_on_buffer_space_no_waits | 0 | 0 | 0 | 0 |
log_on_buffer_space_waits | 0 | 0 | 0 | 0 |
log_on_buffer_space_waits_loops | 0 | 0 | 0 | 0 |
上面的这4个监控信息都是
mtr_t::commit()->
mtr_t::Command::execute()->
log_buffer_reserve()->
log_wait_for_space_after_reserving()->
log_wait_for_space_in_log_buf()->
log_write_up_to(flush_to_disk=false)->
log_wait_for_write()
里的打点,它们全为0,也就是说明配置使用的64M大小log buffer是够用的,无论隔离与否,都不会出现因为log buffer不够用而进行回收时的wait
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_on_recent_written_wait_loops | 0 | 0 | 0 | 0 |
log_on_recent_closed_wait_loops | 0 | 0 | 0 | 0 |
mtr_t::commit()->
mtr_t::Command::execute()->
mtr_write_log_t::()->
log_buffer_write_completed()
里如果由于recent_written不足等待回收则进行的打点
mtr_t::commit()->
mtr_t::Command::execute()->
mtr_write_log_t::()->
log_wait_for_space_in_log_recent_closed()
里如果由于recent_closed不足等待回收则进行的打点
它们都为0说明recent_written和recent_closed当前配置够用,并且log_writer和log_closer对其回收及时。
重点关注下面的打点信息,首先先说下打点位置
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_write_requests | 53837519 | 53772294 | 53987892 | 54287129 |
log_writes | 24891146 | 22661060 | 40653491 | 4036257 |
mtr_t::commit()->
mtr_t::Command::execute()->
log_buffer_reserve()
里打点,可以看到mtr的个数也差不多
log_writer()->
log_writer_write_buffer()->
log_files_write_buffer()
里log_writer将log写进table cache并更新log_sys->write_lsn之后的打点
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_writer_no_waits | 24110198 | 21737096 | 38915995 | 3395741 |
log_writer_waits | 706 | 525 | 341 | 233 |
log_writer_wait_loops | 4505 | 4110 | 3084 | 1467 |
log_writer_on_file_space_waits | 0 | 0 | 0 | 0 |
log_write_notifier_no_waits | 96650 | 117586 | 141001 | 42684 |
log_write_notifier_waits | 6364591 | 2184418 | 2405656 | 2168416 |
log_write_notifier_wait_loops | 6702652 | 2520352 | 2439346 | 2846572 |
log_on_write_no_waits | 0 | 0 | 0 | 0 |
log_on_write_waits | 0 | 0 | 0 | 0 |
log_on_write_wait_loops | 0 | 0 | 0 | 0 |
log_writer()->
os_event_wait_for(log.writer_event)
后的打点,log_writer_no_waits是在os_event_wait_for()时发现有连续可写的内容不需wait的次数,log_writer_waits是在os_event_wait_for()时发现没有连续可写的内容需要wait的次数
log_write_notifier()->
os_event_wait_for(log.write_notifier_event)
后的打点,log_writer_no_waits是在os_event_wait_for()时发现log.write_lsn向前推进所以不需wait的次数,log_write_notifier_waits是os_event_wait_for()时发现log.write_lsn没有向前推进所以需要wait的次数,待分析
XXX->
log_write_up_to()->
log_wait_for_write()->
os_event_wait_for(log.write_events[])
后的打点,可以看到都为0,说明并不存在用户线程在等待log_writer更新log.write_lsn到指定lsn的时候。
a.不隔离 | b.隔离writer&flusher | c.隔离writer | d.隔离flusher | |
---|---|---|---|---|
log_flusher_no_waits | 219898 | 4639077 | 353433 | 843696 |
log_flusher_waits | 11126 | 201415 | 215 | 59647 |
log_flusher_wait_loops | 205948 | 224423 | 2083 | 2797221 |
log_flush_notifier_no_waits | 7049 | 121679 | 2944 | 62289 |
log_flush_notifier_waits | 128905 | 950814 | 231335 | 218895 |
log_flush_notifier_wait_loops | 980173 | 1091132 | 1059829 | 958259 |
log_on_flush_no_waits | 760 | 8934 | 972 | 1506 |
log_on_flush_waits | 2993605 | 2975193 | 2993935 | 2991368 |
log_on_flush_wait_loops | 8616066 | 3011060 | 6091019 | 6954663 |
log_flush_lsn_avg_rate | 358652 | 11395 | 234238 | 94717 |
log_flush_total_time | 181411112us (181s) | 137893867us (138s) | 209647279us (210s) | 29879924us (30s) |
log_flush_avg_time | 683us | 27us | 642us | 25us |
log_flusher()->
os_event_wait_for(log.flusher_event)
后的打点,log_flusher_no_waits是在os_event_wait_for()时发现log.write_lsn向前推进所以不需要wait的次数,
log_flusher_waits是在os_event_wait_for()时发现log.write_lsn没有向前推进所以需要wait的次数,log_flusher_wait_loops则是在wait里被唤醒或timeout之后发现condition还不满足而继续wait的loop次数
log_flush_notifier()->
os_event_wait_for(log.flush_notifier_event)
后的打点,log_flush_notifier_no_waits是在os_event_wait_for()时发现log.flushed_to_disk_lsn更新所以不需要wait的次数,log_flush_notifier_waits是在os_event_wait_for()时发现log.flushed_to_disk_lsn没更新所以需要wait的次数
XXX->
log_write_up_to(flush_to_disk=true)->
log_wait_for_flush()
后的打点,这里是用户线程在(innodb_flush_log_at_trx_commit = 1)等待flush_to_disk_lsn更新到自己对应的LSN,log_on_flush_no_waits是当用户线程对应的LSN的redo log已被log_flusher刷盘,无需wait的次数,log_on_flush_waits是当用户线程对应LSN的暂未不刷盘,需要wait的次数,log_on_flush_wait_loops是在wait中被唤醒或timeout后发现condition还不满足继续wait的loop次数
基于以上信息,分析如下,以a.不隔离作为基准
MySQL 8.0将写redo log拆分成多个线程异步来做的方式,可能并不是理想的优化,
所以目前我能想到的改进方案有如下2种: