本章节包含以下内容:
KingbaseES使用的是基于成本的优化器。优化器会估计SQL语句的每个可能的执行计划的成本,然后选择成本最低的执行计划来执行。因为优化器不计算数据的某些属性,比如列之间的相关性,优化器有时选择的计划并不一定是最优的。
Hint的作用就是通过使用特殊形式的注释中的hint短语来指定执行SQL语句所用的执行计划。Hint为用户提供了直接影响执行计划生成的手段,用户可以通过指定join顺序,join、scan方法,指定结果行数等多个手段来进行执行计划的调优,以提升查询的性能。
从V8R6C4开始使用HINT功能,V8R6C5对V8R6C4进行了更新,增加了新的HINT功能,并对其中的一些功能进行了增强。
版本 |
功能 | |
---|---|---|
V8R6C4 |
表扫描 |
SeqScan(table) TidScan(table) IndexScan(table[ index...]) IndexOnlyScan(table[ index...]) BitmapScan(table[ index...]) NoSeqScan(table) NoTidScan(table) NoIndexScan(table) NoIndexOnlyScan(table) NoBitmapScan(table) IndexScanRegexp (table[regexp...]) BitmapScanRegexp (table[regexp...]) IndexOnlyScanRegexp(table[regexp...]) ForceSeqScan(table) ForceTidScan(table) ForceIndexScan(table[ index...]) ForceIndexOnlyScan(table[ index...]) ForceBitmapScan(table[ index...]) |
连接方式 |
NestLoop(table table[ table...]) HashJoin(table table[ table...]) MergeJoin(table table[ table...]) NoNestLoop(table table[ table...]) NoHashJoin(table table[ table...]) NoMergeJoin(table table[ table...]) ForceNestLoop(table table[ table...]) ForceHashJoin(table table[ table...]) ForceMergeJoin(table table[ table...]) |
|
连接顺序 |
leading(join_table_list) leading((outer_table inner_table)) |
|
行数更正 |
rows(table_list #|+|-|* const) |
|
并行执行 |
Parallel(table_name workers) |
|
设置GUC参数 |
Set(Param_Name Param_Value) |
|
V8R6C5 |
块命名 |
blockname(subquery_name) |
Nestloop内表物化 |
materialize(inner_table_list) |
|
Nestloop内表使用索引 |
use_nl_with_index(inner_table) |
|
表连接顺序 |
ordered |
|
聚集 |
Hashagg Groupagg |
HINT通过在目标SQL语句SELECT之后给出的特殊形式的注释来读取HINT注释。注释的形式以字符序列‘/ +’开头以‘ /’结尾,例如/ +SeqScan(tablename) /,其具体使用方法如下:
在kingbase.conf配置文件中,配置:
enable_hint = on
启动数据库
使用Hint的注释使Hint生效:
Scan类型的hint:指明单表扫描时在目标表上使用特定的扫描方法。它只会在普通表、继承表、UNLOGGED表、临时表和系统目录上生效。外部表、表函数、VALUES子句、CTE、视图和子查询不受影响。其括号内的值只能是一个表名。但可以在一个查询中,对其中的多个目标表进行不同方式的干预。
HINT类型
描述
SeqScan(table)
优先在表上使用顺序扫描。
TidScan(table)
优先在表上使用Tid扫描。
IndexScan(table[ index...])
优先在表上使 用索引扫描,只限制指定的索引。
IndexOnlyScan(table[ index...])
优先在表上使用Index only scan,限制特定的索引。当Index only s can不可用时,可以使用索引扫描。
BitmapScan(table[ index...])
优先在表上使用位图扫描。
NoSeqScan(table)
优先不在表上使用顺序扫描。
NoTidScan(table)
优先不在表上使用Tid扫描。
NoIndexScan(table)
优先不在表上使用索引扫描和index only scan。
NoIndexOnlyScan(table)
优先不在表上使用index only scan。
NoBitmapScan(table)
优先不在表上使用位图索引。
IndexScanRegexp (table[regexp...])
优先在表上使用索引扫描。 索引名要满足指定的正则表达式。
BitmapScanRegexp (table[regexp...])
优先在表上使用位图扫描。 索引名要满足指定的正则表达式。
IndexO nlyScanRegexp(table[regexp...])
优先在表上使用Index only scan。 索引名要满足指定的正则表达式。
ForceSeqScan(table)
强制在表上使用顺序扫描。
ForceTidScan(table)
强制在表上使用Tid扫描。
ForceIndexScan(table[ index...])
强制在表上使 用索引扫描,只限制指定的索引。
ForceIndexOnlyScan(table[ index...])
强制在表上使用Index only scan,限制特定的索引。当Index only s can不可用时,可以使用索引扫描。
ForceBitmapScan(table[ index...])
强制在表上使用 位图扫描。当指定索引名之后,位图 扫描会使用指定的索引进行扫描。
并行执行类型的hint:在扫描时强制执行或禁止并行执行。它可以对普通表,继承父项,UNLOGGED表和系统目录产生影响。外部表,表函数,值子句,CTE,视图和子查询不受影响:
Parallel(table_name workers)
ROWS类型的hint:指明中间结果集的大小,支持绝对值和相对值。该Hint可以修正因优化器的限制而连接操作后错估的行数。该hint支持指定多个表的连接预估行数,也可以指定简单表,继承或分区表,外表,子查询,CTE的行数:
rows(table_list #|+|-|* const)
Join类型的hint:在指定的表进行连接时,强制使用特定的连接方法。它可以影响普通表、继承表、UNLOGGED表、临时表、外部表、系统目录、表函数、VALUES命令结果和CTE上的连接。但是视图上的连接和子查询不受影响:
HINT类型
描述
NestLoop(table table[ table...])
在对指定的表 进行连接时,使用循环嵌套连接。
HashJoin(table table[ table...])
在对指定 的表进行连接时,使用散列连接。
MergeJoin(table table[ table...])
在对指定的表 进行连接时,使用排序合并连接。
NoNestLoop(table table[ table...])
在对指定的表进 行连接时,不使用循环嵌套连接。
NoHashJoin(table table[ table...])
在对指定的 表进行连接时,不使用散列连接。
NoMergeJoin(table table[ table...])
在对指定的表进 行连接时,不使用排序合并连接。
Leading类型的hint:指定在两个或多个表进行连接时的连接顺序,包括内外表顺序。一个查询块上只能有一个leading类型的hint生效,ordered hint具有最高的优先级,当没有ordered而有多个leading时,最后一个起作用:
仅指定join顺序,不指定内外表顺序:
leading(join_table_list)
同时指定join顺序和内外表顺序:
leading((outer_table inner_table))
按照SQL中表出现的顺序指定连接的顺序:
ordered
SET类型的hint:可以对查询语句块设置用户级别的配置参数,即通过SET语句可以修改的参数。参数影响的为该语句块及其子查询涉及的表的执行计划生成:
Set(Param_Name Param_Value)
Blockname类型的hint:为子查询或子链接命名。Hint支持跨查询块的对象引用。即一个查询块中的hint可以通过”块名.表名的”的形式控制其子查询块中的表。当有些子查询块没有名字标识时(例如子链接),可以使用该hint来进行命名:
blockname(subquery_name)
Materialized类型的hint:可多个表进行连接时,强制使用nestloop连接且内表为inner_table_list指定的一个或多个表,并且对内表进行物化:
materialize(inner_table_list)
Use_nl_with_index类型的hint:可多个表进行连接时,强制使用nestloop连接且内表为inner_table指定的表,并且内表使用连接条件上的列使用索引扫描:
use_nl_with_index(inner_table)
Aggregate类型的hint:KingbaseES的聚合运算的算法有两种:排序(GroupAgggregate)和哈希(HashAggregate)。通过hint可以影响优化器选择哪种聚合运算:
Hashagg
Groupagg
参数名
描述
默认值
enable_hint
启用HINT
off
hint_debug_print
是否打印HINT的debug信息
off
hint_message_level
指定de bug打印的信息级别。可用的值为error, warning, notice, info, log, debug1, debug2, debug3, debug4,debug5
log
参数enable_hint默认关闭,当此参数关闭后,SQL查询语句中的HINT注释将不会对执行计划产生影响。
当参数hint_debug_print设置为on,并且hint_message_level设置为log后,在一个带有格式正确的HINT注释的SQL语句运行时,会一同输出hint的调试信息,内容包含used hint、not used hint、duplication hint、error hint。
环境准备:
create table t1(id int, val int, name varchar(64)); create table t2(id int, val int); create table t3(id int, val int); create index t1_idx on t1(id); create index t2_idx on t2(id); create index t3_idx on t3(id); insert into t1 select i, i%5000, 'Kingbase'||(i%5) from generate_series(1,3000000) as x(i); insert into t2 select i, i%5000 from generate_series(1,1000000) as x(i); insert into t3 select i, i%5 from generate_series(1,100) as x(i); analyze t1; analyze t2; analyze t3;
SeqScan HINT例子:
explain select * from t1 where id=20; QUERY PLAN ------------------------------------------------------------------ -> Index Scan using t1_idx on t1 (cost=0.43..8.45 rows=1 width=18) Index Cond: (id = 20) (2 rows) explain select/*+seqscan(t1)*/ * from t1 where id=20; QUERY PLAN ------------------------------------------------------------------ -> Seq Scan on t1 (cost=0.00..56731.00 rows=1 width=18) Filter: (id = 20) (2 rows)
IndexScan HINT例子:
explain select * from t1 where id > 10 and id < 20000000; QUERY PLAN ------------------------------------------------------------------ -> Seq Scan on t1 (cost=0.00..64231.00 rows=2999990 width=18) Filter: ((id > 10) AND (id < 20000000)) (2 rows) explain select/*+IndexScan(t1)*/ * from t1 where id > 10 and id < 20000000; QUERY PLAN ------------------------------------------------------------------ -> Index Scan using t1_idx on t1 (cost=0.43..112326.23 rows=2999990 width=18 Index Cond: ((id > 10) AND (id < 20000000)) (2 rows)
IndexOnlyScan HINT例子:
explain select count(0) from t1; QUERY PLAN ------------------------------------------------------------------ Aggregate (cost=56731.00..56731.01 rows=1 width=8) -> Seq Scan on t1 (cost=0.00..49231.00 rows=3000000 width=0 (2 rows) explain select/*+IndexOnlyScan(t1)*/ count(0) from t1; QUERY PLAN ------------------------------------------------------------------ Aggregate (cost=104826.43..104826.44 rows=1 width=8) -> Index Only Scan using t1_idx on t1 (cost=0.43..97326.43 rows=3000000 width=0) (2 rows)
Bitmapscan HINT例子:
explain select * from t1 where id > 30; QUERY PLAN ------------------------------------------------------------- -> Seq Scan on t1 (cost=0.00..56731.00 rows=2999970 width=18) Filter: (id > 30) (2 rows) explain select/*+bitmapscan(t1)*/ * from t1 where id > 30;** QUERY PLAN ------------------------------------------------------------------ Bitmap Heap Scan on t1 (cost=56342.20..113072.82 rows=2999970 width=18) Recheck Cond: (id > 30) Bitmap Index Scan on t1_idx (cost=0.00..55592.20 rows=2999970 width=0) Index Cond: (id > 30) (4 rows)
使用并行例子
explain analyze select/*+Parallel(t2 2)*/ t2.id from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=5.35..7.15 rows=100 width=4) (actual time=22.341..24.182 rows=4 loops=1) Group Key: t2.id -> Merge Join (cost=5.35..6.90 rows=100 width=4) (actual time=22.338..24.142 rows=80 loops=1) Merge Cond: (t2.id = t3.val) -> Gather Merge (cost=0.02..10424.84 rows=1000000 width=4)(actual time=22.100..23.800 rows=5 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Index Only Scan using t2_idx on t2 (cost=0.00..0.00 rows=416667 width=4) (actual time=0.051..0.328 rows=421 loops=3) Heap Fetches: 1262 -> Sort (cost=5.32..5.57 rows=100 width=4) (actual time=0.210..0.250 rows=100 loops=1) Sort Key: t3.val Sort Method: quicksort Memory: 30kB -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (actual time=0.042..0.110 rows=100 loops=1) Planning Time: 0.857 ms Execution Time: 24.359 ms (15 rows)
测试示例原始计划,例子中t2表的基数估计值为1467,而实际执行行数是3,差距较大。
explain analyze select t2.id from t2,t3 where t2.id=t3.id and t2.id>5 and t2.val<8 and t3.id<9 group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=2.77..27.00 rows=1 width=4) (actual time=0.221..6.299 rows=2 loops=1) Group Key: t2.id -> Merge Join (cost=2.77..27.00 rows=1 width=4) (actual time=0.218..6.290 rows=2 loops=1) Merge Cond: (t2.id = t3.id) -> Index Scan using t2_idx on t2 (cost=0.42..35468.31 rows=1467 width=4) (actual time=0.104..6.166 rows=3 loops=1) Index Cond: (id > 5) Filter: (val < 8) Rows Removed by Filter: 4992 -> Sort (cost=2.35..2.37 rows=7 width=4) (actual time=0.100..0.104 rows=8 loops=1) Sort Key: t3.id Sort Method: quicksort Memory: 25kB -> Seq Scan on t3 (cost=0.00..2.25 rows=7 width=4) (actual time=0.029..0.078 rows=8 loops=1) Filter: (id < 9) Rows Removed by Filter: 92 Planning Time: 1.033 ms Execution Time: 6.431 ms (16 rows)
ROWS HINT修正基数估计,修正后优化器选择了最优的连接方式NestLoop,而不是最原始的MergeJoin。
explain analyze select/*+Rows(t2 #3)*/t2.id from t2,t3 where t2.id=t3.id and t2.id>5 and t2.val<8 and t3.id<9 group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=61.46..61.47 rows=1 width=4) (actual time=0.206..0.215 rows=2 loops=1) Group Key: t2.id -> Sort (cost=61.46..61.47 rows=1 width=4) (actual time=0.201..0.204 rows=2 loops=1) Sort Key: t2.id Sort Method: quicksort Memory: 25kB -> Nested Loop (cost=0.42..61.45 rows=1 width=4) (actual time=0.098..0.186 rows=2 loops=1) -> Seq Scan on t3 (cost=0.00..2.25 rows=7 width=4) (actual time=0.037..0.086 rows=8 loops=1) Filter: (id < 9) Rows Removed by Filter: 92 -> Index Scan using t2_idx on t2 (cost=0.42..8.45 rows=1 width=4) (actual time=0.009..0.010 rows=0 loops=8) Index Cond: ((id = t3.id) AND (id > 5)) Filter: (val < 8) Rows Removed by Filter: 0 Planning Time: 1.079 ms Execution Time: 0.337 ms (15 rows)
本章节以 select t2.id from t2,t3 where t2.id=t3.val and t2.id>5 group by t2.id; 语句为例子进行演示三种常用Join类型的HINT。
原始计划使用MergeJoin连接:
explain select t2.id from t2,t3 where t2.id=t3.val and t2.id>5 group by t2.id; QUERY PLAN -------------------------------------------------------------------- Group (cost=5.75..7.64 rows=100 width=4) Group Key: t2.id -> Merge Join (cost=5.75..7.39 rows=100 width=4) Merge Cond: (t2.id = t3.val -> Index Only Scan using t2_idx on t2 (cost=0.42..32968.32 rows=999994 width=4) Index Cond: (id > 5) -> Sort (cost=5.32..5.57 rows=100 width=4) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (9 rows)
NestLoop连接HINT:
explain select/*+Nestloop(t2 t3)*/ t2.id from t2,t3 where t2.id=t3.val and t2.id>5 group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=842.82..843.32 rows=100 width=4) Group Key: t2.id -> Sort (cost=842.82..843.07 rows=100 width=4) Sort Key: t2.id -> Nested Loop (cost=0.42..839.50 rows=100 width=4) -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) -> Index Only Scan using t2_idx on t2 (cost=0.42..8.37 rows=1 width=4) Index Cond: ((id = t3.val) AND (id > 5)) (8 rows)
通过HashJoin连接HINT进行更改:
explain select/*+Hashjoin(t2 t3)*/ t2.id from t2,t3 where t2.id=t3.val and t2.id>5 group by t2.id; QUERY PLAN -------------------------------------------------------------------- Group (cost=29431.62..29432.12 rows=100 width=4) Group Key: t2.id -> Sort (cost=29431.62..29431.87 rows=100 width=4) Sort Key: t2.id Hash Join (cost=29424.93..29428.30 rows=100 width=4) Hash Cond: (t3.val = t2.id) -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) -> Hash (cost=16925.00..16925.00 rows=999994 width=4) -> Seq Scan on t2 (cost=0.00..16925.00 rows=999994 width=4) Filter: (id > 5) (10 rows)
下面例子,执行计划中两表使用Hashjoin,使用MergeJoin连接HINT进行更改:
explain select t2.id from t2,t1 where t2.id=t1.val group by t2.id; QUERY PLAN -------------------------------------------------------------------- HashAggregate (cost=124906.00..134906.00 rows=1000000 width=4) Group Key: t2.id Hash Join (cost=26925.00..117406.00 rows=3000000 width=4) Hash Cond: (t1.val = t2.id) -> Seq Scan on t1 (cost=0.00..49231.00 rows=3000000 width=4) -> Hash (cost=14425.00..14425.00 rows=1000000 width=4) -> Seq Scan on t2 (cost=0.00..14425.00 rows=1000000 width=4) (7 rows) explain select/*+Mergejoin(t1 t2)*/ t2.id from t2,t1 where t2.id=t1.val group by t2.id; QUERY PLAN -------------------------------------------------------------------- Group (cost=372004.30..424667.70 rows=1000000 width=4) Group Key: t2.id -> Merge Join (cost=372004.30..417167.70 rows=3000000 width=4) Merge Cond: (t2.id = t1.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=371978.97..379478.97 rows=3000000 width=4) Sort Key: t1.val -> Seq Scan on t1 (cost=0.00..49231.00 rows=3000000 width=4) (8 rows)
Leading类型的HINT种类1:仅指定join顺序,不指定内外表顺序:
下面例子中各表的连接顺序为(t3 t1 t2),使用leading HINT更改连接顺序,但不指定内外表。
explain select t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=10.72..10.87 rows=1 width=4) Group Key: t2.id -> Merge Join (cost=10.72..10.87 rows=1 width=4) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=10.29..10.30 rows=1 width=4) Sort Key: t3.val -> Merge Join (cost=5.92..10.28 rows=1 width=4) Merge Cond: (t1.id = t3.id) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.id -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) (14 rows) explain select/*+leading(t3 t2 t1)*/ t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN -------------------------------------------------------------------- Group (cost=15.67..15.68 rows=1 width=4) Group Key: t2.id -> Sort (cost=15.67..15.68 rows=1 width=4) Sort Key: t2.id -> Merge Join (cost=11.30..15.66 rows=1 width=4) Merge Cond: (t3.id = t1.id) -> Sort (cost=10.70..10.95 rows=100 width=8) Sort Key: t3.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) (16 rows)
Leading类型的HINT种类2:同时指定join顺序和内外表顺序:
对上面的例子指定t1为外表,t3为内表进行连接,两表连接的结果作为外表与t2作为内表进行连接:
explain select/*+leading(((t1 t3) t2)*/ t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=10.72..10.87 rows=1 width=4) Group Key: t2.id -> Merge Join (cost=10.72..10.87 rows=1 width=4) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=10.29..10.30 rows=1 width=4) Sort Key: t3.val -> Merge Join (cost=5.92..10.28 rows=1 width=4) Merge Cond: (t1.id = t3.id) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.id -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) (14 rows)
下面的SQL语句执行计划使用MergeJoin,通过SET HINT,设置 enable_mergejoin off,使用Nestloop。
explain select t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN -------------------------------------------------------------------- GroupAggregate (cost=5.75..8.88 rows=100 width=12) Group Key: t2.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=4) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (8 rows) explain select/*+set(enable_mergejoin off)*/ t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN -------------------------------------------------------------------- GroupAggregate (cost=842.57..844.32 rows=100 width=12) Group Key: t2.id -> Sort (cost=842.57..842.82 rows=100 width=8) Sort Key: t2.id -> Nested Loop (cost=0.42..839.25 rows=100 width=8) -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) -> Index Only Scan using t2_idx on t2 (cost=0.42..8.36 rows=1 width=4) Index Cond: (id = t3.val) (8 rows)
在下面的例子中,父查询和子查询都是用表t2,通过名字无法区分两个表,有了blockname HINT之后,可以使用blockname进行区分,更改如下:
explain select t2.id from t2 where t2.id in (select id from t2 where t2.val < 9); QUERY PLAN ------------------------------------------------------------------ Nested Loop (cost=16929.55..28344.71 rows=1651 width=4) -> HashAggregate (cost=16929.13..16945.64 rows=1651 width=4) Group Key: t2_1.id -> Seq Scan on t2 t2_1 (cost=0.00..16925.00 rows=1651 width=4) Filter: (val < 9) -> Index Only Scan using t2_idx on t2 (cost=0.42..6.89 rows=1 width=4) Index Cond: (id = t2_1.id) (7 rows) explain select/*+HashJoin(t2 blk.t2)*/ t2.id from t2 where t2.id in (select/*+blockname(blk)*/ id from t2 where t2.val < 9); QUERY PLAN ------------------------------------------------------------------ Hash Semi Join (cost=16945.64..34014.00 rows=1651 width=4) Hash Cond: (t2.id = t2_1.id) -> Seq Scan on t2 (cost=0.00..14425.00 rows=1000000 width=4) -> Hash (cost=16925.00..16925.00 rows=1651 width=4) -> Seq Scan on t2 t2_1 (cost=0.00..16925.00 rows=1651 width=4) Filter: (val < 9) (6 rows)
下面的例子中将(t2 t3)连接的中间结果进行物化处理:
explain select t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=10.72..10.87 rows=1 width=4) Group Key: t2.id -> Merge Join (cost=10.72..10.87 rows=1 width=4) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=10.29..10.30 rows=1 width=4) Sort Key: t3.val -> Merge Join (cost=5.92..10.28 rows=1 width=4) Merge Cond: (t1.id = t3.id) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.id -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) (14 rows) explain select/*+materialize(t2 t3)*/t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=19.10..19.11 rows=1 width=4) Group Key: t2.id -> Sort (cost=19.10..19.11 rows=1 width=4) Sort Key: t2.id -> Nested Loop (cost=6.18..19.09 rows=1 width=4) Join Filter: (t3.id = t1.id) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) -> Materialize (cost=5.75..7.88 rows=100 width=8) -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) (15 rows)
下面的SQL语句中,对表t2使用参数化路径:
explain select t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN ------------------------------------------------------------------ GroupAggregate (cost=5.75..8.88 rows=100 width=12) Group Key: t2.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=4) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (8 rows) explain select/*+use_nl_with_index(t2)*/t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN ------------------------------------------------------------------ GroupAggregate (cost=842.57..844.32 rows=100 width=12) Group Key: t2.id -> Sort (cost=842.57..842.82 rows=100 width=8) Sort Key: t2.id -> Nested Loop (cost=0.42..839.25 rows=100 width=8) -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) -> Index Only Scan using t2_idx on t2 (cost=0.42..8.36 rows=1 width=4) Index Cond: (id = t3.val) (8 rows)
使用hashagg HINT:
explain select t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN ------------------------------------------------------------------ GroupAggregate (cost=5.75..8.88 rows=100 width=12) Group Key: t2.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=4) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (8 rows) explain select/*+hashagg*/t2.id, count(t3.val) from t2,t3 where t2.id=t3.val group by t2.id; QUERY PLAN ------------------------------------------------------------------ HashAggregate (cost=7.88..8.88 rows=100 width=12) Group Key: t2.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=4) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=4) (8 rows)
下面的例子中表的连接顺序是t1,t3,t2,加入ordered HINT后,执行计划按照表出现的顺序进行连接:
explain select t2.id from t2,t3,t1 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=10.72..10.87 rows=1 width=4) Group Key: t2.id -> Merge Join (cost=10.72..10.87 rows=1 width=4) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=10.29..10.30 rows=1 width=4) Sort Key: t3.val -> Merge Join (cost=5.92..10.28 rows=1 width=4) Merge Cond: (t3.id = t1.id) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.id -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) (14 rows) explain select/*+ordered*/t2.id from t2,t3,t1 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ Group (cost=15.67..15.68 rows=1 width=4) Group Key: t2.id -> Sort (cost=15.67..15.68 rows=1 width=4) Sort Key: t2.id -> Merge Join (cost=11.30..15.66 rows=1 width=4) Merge Cond: (t3.id = t1.id) -> Sort (cost=10.70..10.95 rows=100 width=8) Sort Key: t3.id -> Merge Join (cost=5.75..7.38 rows=100 width=8) Merge Cond: (t2.id = t3.val) -> Index Only Scan using t2_idx on t2 (cost=0.42..30468.42 rows=1000000 width=4) -> Sort (cost=5.32..5.57 rows=100 width=8) Sort Key: t3.val -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) (16 rows)
通常SQL调优中会使用多种HINT,如下例子中使用了连接顺序leading HINT、连接方式NestLoop HINT以及聚集HINT hashagg。
explain select/*+leading(((t2 t3) t1)) Nestloop(t2 t3) hashagg*/ t2.id from t1,t2,t3 where t1.id=t3.id and t1.id<3 and t3.val=t2.id group by t2.id; QUERY PLAN ------------------------------------------------------------------ HashAggregate (cost=1514438.72..1514438.73 rows=1 width=4) Group Key: t2.id -> Nested Loop (cost=0.43..1514438.72 rows=1 width=4) Join Filter: (t3.id = t1.id) -> Nested Loop (cost=0.00..1514427.25 rows=100 width=8) Join Filter: (t2.id = t3.val) -> Seq Scan on t2 (cost=0.00..14425.00 rows=1000000 width=4) Materialize (cost=0.00..2.50 rows=100 width=8) -> Seq Scan on t3 (cost=0.00..2.00 rows=100 width=8) Materialize (cost=0.43..8.47 rows=2 width=4) -> Index Only Scan using t1_idx on t1 (cost=0.43..8.46 rows=2 width=4) Index Cond: (id < 3) (12 rows)
关于并行查询的GUC参数max_parallel_workers和max_parallel_workers_per_gather,当在Hint中指定的max_parallel_workers_per_gather的值比kingbase.conf中max_parallel_workers设置的值大时,会在系统日志中记录这个事件,并行Hint不会生效。
Set类型的Hint只支持用户级可变的优化器相关的参数,对其他参数(例如enable_upper_colname)不支持。
当带有Hint的SQL语句中包含多个“values”表达式时,比如:
explain (costs on) select /*+rows(*VALUES * #1000)*/ * from(values('A'),('B'),('C')) as s(n) , (values('C'),('D'),('E')) as t(n) where s.n=t.n;
由于对于HINT来说,通过一个 VALUES 名字无法区分对应的是哪一个具体的values表达式,因此Hint不支持包含多个values表达式的SQL语句。