数仓性能优化:倾斜优化-表达式计算倾斜的hint优化

性能,优化,倾斜,表达式,计算,hint · 浏览次数 : 18

小编点评

**内容生成指南** **1. 标题** * 内容的标题应简单明了,包含主要内容。 * 示例:"华为云新鲜技术:智能物件分类与材料采购" **2. 排版** * 内容应按排版方式排列,以使内容易于阅读。 * 示例: * 标题 * 正文 * 正文 * 正文 **3. 内容** * 内容应清晰明了,包含主要内容。 * 示例: * 标题 * 正文 * 正文 * 正文 **4. 排版** * 正文内容应按排版方式排列,以使内容易于阅读。 * 示例: * 标题 * 正文 * 正文 * 正文 **5. 标题** * 正文标题应与内容相关,以使内容易于找到。 * 示例:"华为云新鲜技术:智能物件分类与材料采购" **6. 正文** * 正文内容应与标题相关,以使内容易于找到。 * 示例: * 标题 * 正文 * 正文 * 正文 **7. 正文** * 正文内容应与内容相关,以使内容易于找到。 * 示例: * 标题 * 正文 * 正文 * 正文 **8. 排版** * 正文内容应按排版方式排列,以使内容易于阅读。 * 示例: * 标题 * 正文 * 正文 * 正文 **9. 标题** * 正文标题应与内容相关,以使内容易于找到。 * 示例:"华为云新鲜技术:智能物件分类与材料采购" **10. 附件** * 附件应与内容相关,以使内容易于找到。 * 示例: * 标题 * 正文 * 正文 * 正文 * 正文

正文

本文分享自华为云社区《GaussDB(DWS)性能调优:倾斜优化-表达式计算倾斜的hint优化》,作者: 譡里个檔 。

1.原始SQL

SELECT

TMP4.TAX_AMT,

CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY,

MATE.ITEM_CODE AS PRODUCT_CODE,

INVEN.INVENTORY_ORG_NAME,

TMP4.INVOICE_WITHHOLDING_TAX_GROUP,

TMP4.PAYMENT_WITHHOLDING_TAX_GROUP,

TMP4.PO_CHARGE_ACCOUNT_CODE,

TMP4.CFS_INVOICE_NUMBER,

APR.TAX_INVOICE_DATE

FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4,

DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE,

DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE,

DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN,

DWTAXDI.DWI_AP_INVOICE_I AP,

DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR

WHERE 1 = 1

AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+)

AND CATE.DEL_FLAG(+) = 'N'

AND TMP4.ITEM_ID = MATE.ITEM_ID(+)

AND MATE.DEL_FLAG(+) = 'N'

AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+)

AND INVEN.DEL_FLAG(+) = 'N'

AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+)

AND 6600 || AP.ATTRIBUTE1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))

执行performance,查询具体执行情况和SQL自诊断信息(详细见附件case-step1-原始执行信息.txt)

id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs

----+------------------------------------------------------------------------------------------------------+------------------------+------------+------------+------------+----------------+----------------+-----------+---------+-------------

1 | -> Row Adapter | 69922.773 | 69237018 | 69237018 | | 87KB | | | 573 | 15160857.61

2 | -> Vector Streaming (type: GATHER) | 65581.989 | 69237018 | 69237018 | | 536KB | | | 573 | 15160857.61

3 | -> Vector Hash Right Join (4, 6) | [61186.201, 73129.055] | 69237018 | 69237018 | | [306MB, 682MB] | 1113MB(9990MB) | | 573 | 15159431.83

4 | -> Vector Streaming(type: BROADCAST ng: LC_DL1->LC_DW1) | [554.217, 21008.078] | 1382000544 | 1381572384 | 282184 | [4MB, 4MB] | 3MB | | 16 | 7056095.88

5 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.354, 11.617] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18

6 | -> Vector Hash Left Join (7, 19) | [1728.008, 2017.488] | 69237018 | 69237018 | 79721 | [834KB, 834KB] | 16MB | [229,252] | 578 | 1832322.90

7 | -> Vector Hash Left Join (8, 17) | [1428.799, 1925.653] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(8901MB) | | 576 | 1817105.07

8 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [996.780, 1635.826] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 570 | 1788113.85

9 | -> Vector Hash Left Join (10, 14) | [1086.903, 1780.641] | 69237018 | 69237018 | | [173MB, 174MB] | 227MB(9067MB) | | 570 | 1304897.12

10 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [153.628, 891.680] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 567 | 847160.16

11 | -> Vector Hash Left Join (12, 13) | [367.155, 465.821] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(8896MB) | | 567 | 363943.43

12 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [150.676, 178.827] | 69237018 | 69237018 | 526 | [4MB, 4MB] | 1MB | | 553 | 340168.44

13 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [14.549, 24.399] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99

14 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [315.926, 339.782] | 117191217 | 117191170 | 2441483 | [1MB, 1MB] | 3MB | [47,47] | 22 | 406136.10

15 | -> Vector Partition Iterator | [118.307, 151.248] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 300641.93

16 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [86.557, 111.947] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93

17 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [60.429, 99.381] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19

18 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [19.779, 33.206] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02

19 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.383, 0.739] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85

SQL Diagnostic Information

--------------------------------------------------------------------------------------------

Execute diagnostic information

PlanNode[4] Large Table in Broadcast "Vector Streaming(type: BROADCAST ng: LC_DL1->LC_DW1)"

Predicate Information (identified by plan id)

------------------------------------------------------------------------------------------------------------------------------

3 --Vector Hash Right Join (4, 6)

Hash Cond: (((numeric_out(s.ap_invoice_regstn_id))::character varying)::text = ('6600'::text || (s.attribute1)::text))

6 --Vector Hash Left Join (7, 19)

Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)

7 --Vector Hash Left Join (8, 17)

Hash Cond: (tmp4.item_id = mate.item_id)

Skew Join Optimized by Statistic

8 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))

9 --Vector Hash Left Join (10, 14)

Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)

Skew Join Optimized by Statistic

10 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)

11 --Vector Hash Left Join (12, 13)

Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)

13 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate

Filter: ((cate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)

14 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)

Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)

15 --Vector Partition Iterator

Iterations: 147

16 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s

Partitions Selected by Static Prune: 1..147

17 --Vector Streaming(type: PART LOCAL PART BROADCAST)

Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)

18 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate

Filter: ((mate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)

19 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven

Filter: ((inven.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)

2.禁止大表广播

如上小节显示确实是id=4的这一步是一个大的结果集(2879w条)做了broadcast,并且紧接着的id=5的HashJoin耗时很长。因此通过增加hint方式禁止dwifin.dwi_ap_invoice_regstn走广播。分析发现表dwifin.dwi_ap_invoice_regstn是视图apr展开出现的,因此增加如下hint信息,其中

1. no merge (apr)是防止视图apr中的语句提升,导致的hint信息失效

2. no broadcast(apr)表示禁止apr走broadcast

EXPLAIN performance

SELECT /*+ no merge (apr) no broadcast(apr) */

TMP4.TAX_AMT,

CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY,

MATE.ITEM_CODE AS PRODUCT_CODE,

INVEN.INVENTORY_ORG_NAME,

TMP4.INVOICE_WITHHOLDING_TAX_GROUP,

TMP4.PAYMENT_WITHHOLDING_TAX_GROUP,

TMP4.PO_CHARGE_ACCOUNT_CODE,

TMP4.CFS_INVOICE_NUMBER,

APR.TAX_INVOICE_DATE

FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4,

DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE,

DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE,

DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN,

DWTAXDI.DWI_AP_INVOICE_I AP,

DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR

WHERE 1 = 1

AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+)

AND CATE.DEL_FLAG(+) = 'N'

AND TMP4.ITEM_ID = MATE.ITEM_ID(+)

AND MATE.DEL_FLAG(+) = 'N'

AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+)

AND INVEN.DEL_FLAG(+) = 'N'

AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+)

AND 6600 || AP.ATTRIBUTE1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))

获取如上语句的performance信息(详细见附件 case-step2-禁止大表广播.txt)

id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs

----+---------------------------------------------------------------------------------------------------------+------------------------+-----------+-----------+------------+----------------+----------------+-----------+---------+-------------

1 | -> Row Adapter | 15685.781 | 69237018 | 69237018 | | 87KB | | | 573 | 33341721.22

2 | -> Vector Streaming (type: GATHER) | 11361.740 | 69237018 | 69237018 | | 536KB | | | 573 | 33341721.22

3 | -> Vector Hash Left Join (4, 19) | [15269.267, 18985.791] | 69237018 | 69237018 | | [74MB, 74MB] | 101MB(9984MB) | | 573 | 33340295.43

4 | -> Vector Streaming(type: REDISTRIBUTE) | [4743.867, 18632.182] | 69237018 | 69237018 | 79721 | [1MB, 2MB] | 2MB | | 578 | 29821930.76

5 | -> Vector Hash Left Join (6, 18) | [1473.990, 15359.055] | 69237018 | 69237018 | | [866KB, 898KB] | 16MB | | 578 | 1832322.90

6 | -> Vector Hash Left Join (7, 16) | [1130.814, 15223.646] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(9923MB) | | 576 | 1817105.07

7 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [681.709, 14909.424] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 570 | 1788113.85

8 | -> Vector Hash Left Join (9, 13) | [1049.201, 12602.796] | 69237018 | 69237018 | | [173MB, 174MB] | 227MB(10089MB) | | 570 | 1304897.12

9 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [128.704, 11737.099] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 567 | 847160.16

10 | -> Vector Hash Left Join (11, 12) | [368.537, 443.623] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(9918MB) | | 567 | 363943.43

11 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [148.366, 175.347] | 69237018 | 69237018 | 526 | [4MB, 4MB] | 1MB | | 553 | 340168.44

12 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [13.319, 24.442] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99

13 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [242.053, 294.233] | 117191217 | 117191170 | 2441483 | [1MB, 1MB] | 3MB | [47,47] | 22 | 406136.10

14 | -> Vector Partition Iterator | [118.124, 154.954] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 300641.93

15 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [86.942, 105.441] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93

16 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [83.793, 117.853] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19

17 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [21.898, 35.895] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02

18 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.389, 0.661] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85

19 | -> Vector Streaming(type: REDISTRIBUTE ng: LC_DL1->LC_DW1) | [30.667, 49.474] | 28791678 | 28782758 | 599641 | [2MB, 2MB] | 3MB | [75,75] | 16 | 56030.49

20 | -> Vector Subquery Scan on apr | [42.087, 61.734] | 28791678 | 28782758 | | [376KB, 376KB] | 1MB | | 16 | 30826.02

21 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.177, 8.049] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18

SQL Diagnostic Information

----------------------------------------------------------------------------------------------------------

Execute diagnostic information

PlanNode[4] DataSkew:"Vector Streaming(type: REDISTRIBUTE)", min_dn_tuples:257082, max_dn_tuples:47206637

Predicate Information (identified by plan id)

----------------------------------------------------------------------------------------------------------------------------------

3 --Vector Hash Left Join (4, 19)

Hash Cond: ((('6600'::text || (s.attribute1)::text)) = ((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text)

5 --Vector Hash Left Join (6, 18)

Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)

6 --Vector Hash Left Join (7, 16)

Hash Cond: (tmp4.item_id = mate.item_id)

Skew Join Optimized by Statistic

7 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))

8 --Vector Hash Left Join (9, 13)

Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)

Skew Join Optimized by Statistic

9 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)

10 --Vector Hash Left Join (11, 12)

Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)

12 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate

Filter: ((cate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)

13 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)

Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)

14 --Vector Partition Iterator

Iterations: 147

15 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s

Partitions Selected by Static Prune: 1..147

16 --Vector Streaming(type: PART LOCAL PART BROADCAST)

Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)

17 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate

Filter: ((mate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)

18 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven

Filter: ((inven.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)

3.表达式倾斜的hint

发现自诊断信息中倾斜告警

cke_176.png

而Plan ID为4的算子是

cke_177.png

其中是s是视图dwtaxdi.dwi_ap_invoice_i展开后的表dwifin.dwi_ap_invoice,查询此表的列attribute1的统计信息如下,发现在NULL值上存在严重倾斜

cke_178.png

因为重分布列是一个表达式6600 || AP.ATTRIBUTE1,当前DWS的倾斜的hint不支持表达式,因为我们做如下变通实现表达式的值倾斜的hint

SELECT /*+ no merge (apr) no broadcast(apr) no merge(ap) skew(ap (attr1) ('6600')) */

TMP4.TAX_AMT,

CATE.L1_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L2_PUR_ITEM_CATG_CN_NAME || '-' ||

CATE.L3_PUR_ITEM_CATG_CN_NAME AS PRODUCT_CATEGORY,

MATE.ITEM_CODE AS PRODUCT_CODE,

INVEN.INVENTORY_ORG_NAME,

TMP4.INVOICE_WITHHOLDING_TAX_GROUP,

TMP4.PAYMENT_WITHHOLDING_TAX_GROUP,

TMP4.PO_CHARGE_ACCOUNT_CODE,

TMP4.CFS_INVOICE_NUMBER,

APR.TAX_INVOICE_DATE

FROM DWLTAX.DWL_TAX_TAXDP_ERP_AP_INVOICE_TMP5 TMP4,

DWRDIM_DW1.DWR_DIM_PUR_ITEM_CATEGORY_D CATE,

DWRDIM_DW1.DWR_DIM_MATERIAL_CODE_D MATE,

DWRDIM_DW1.DWR_DIM_INVENTORY_ORG_D INVEN,

(SELECT *, 6600 || AP.ATTRIBUTE1 AS ATTR1 FROM DWTAXDI.DWI_AP_INVOICE_I AP) AP,

DWTAXDI.DWI_AP_INVOICE_REGSTN_I APR

WHERE 1 = 1

AND TMP4.ITEM_CATEGORY_KEY = CATE.PUR_ITEM_CATG_KEY(+)

AND CATE.DEL_FLAG(+) = 'N'

AND TMP4.ITEM_ID = MATE.ITEM_ID(+)

AND MATE.DEL_FLAG(+) = 'N'

AND TMP4.PO_SHIPMENT_TARGET_INV_ORG_KEY = INVEN.INVENTORY_ORG_KEY(+)

AND INVEN.DEL_FLAG(+) = 'N'

AND TMP4.AP_INVOICE_ID = AP.AP_INVOICE_ID(+)

AND ATTR1 = TO_CHAR(APR.AP_INVOICE_REGSTN_ID(+))

其中构建了子查询 AP

SELECT *, 6600 || AP.ATTRIBUTE1 AS ATTR1 FROM DWTAXDI.DWI_AP_INVOICE_I AP

在把原始的关联列表达式放到子查询里面,然后把 6600 || AP.ATTRIBUTE1 命名为attr1。

在父查询中首先禁止AP这个子查询提升。然后在父查询中通过hint 子查询AP这个结果集的列attr1存在倾斜值'6600' 。这个倾斜值是计算出来的(NULL || 6600 = ‘6600’),并且在原始关联计算中关联表达式是如下,即 6600 || AP.ATTRIBUTE1的结果被转换为text类型(字符串类型)

cke_179.png

获取新的语句的performance如下(详细见附件 case-step3-倾斜优化.txt)

id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs

----+------------------------------------------------------------------------------------------------------+-----------------------+-----------+-----------+------------+----------------+----------------+-----------+---------+------------

1 | -> Row Adapter | 9045.793 | 69237018 | 69237018 | | 87KB | | | 573 | 2040755.71

2 | -> Vector Streaming (type: GATHER) | 4842.656 | 69237018 | 69237018 | | 520KB | | | 573 | 2040755.71

3 | -> Vector Hash Left Join (4, 21) | [2673.707, 11389.688] | 69237018 | 69237018 | | [1MB, 1MB] | 16MB | | 573 | 2039329.92

4 | -> Vector Hash Left Join (5, 19) | [1951.482, 10931.220] | 69237018 | 69237018 | 179 | [32MB, 32MB] | 28MB(10018MB) | | 571 | 2009687.71

5 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [1541.777, 10591.702] | 69237018 | 69237018 | 4167 | [1MB, 1MB] | 2MB | | 565 | 1980696.49

6 | -> Vector Hash Left Join (7, 18) | [1703.438, 1980.655] | 69237018 | 69237018 | | [30MB, 30MB] | 22MB(10010MB) | | 565 | 1497479.76

7 | -> Vector Hash Left Join (8, 10) | [1523.277, 1708.622] | 69237018 | 69237018 | 526 | [165MB, 166MB] | 191MB(10151MB) | | 551 | 1473704.77

8 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [94.501, 203.619] | 69237018 | 69237018 | 20271 | [1MB, 1MB] | 2MB | | 553 | 823385.17

9 | -> CStore Scan on dwltax.dwl_tax_taxdp_erp_ap_invoice_tmp5 tmp4 | [142.734, 171.486] | 69237018 | 69237018 | | [4MB, 4MB] | 1MB | | 553 | 340168.44

10 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1) | [811.192, 853.583] | 117191217 | 117191170 | 2441483 | [2MB, 2MB] | 3MB | [44,44] | 17 | 598718.74

11 | -> Vector Hash Left Join (12, 15) | [340.998, 790.399] | 117191170 | 117191170 | | [39MB, 39MB] | 27MB(10015MB) | | 17 | 493224.57

12 | -> Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN) | [53.170, 79.836] | 117191170 | 117191170 | 79721 | [2MB, 2MB] | 3MB | | 41 | 412662.90

13 | -> Vector Partition Iterator | [145.450, 171.527] | 117191170 | 117191170 | | [41KB, 41KB] | 1MB | | 22 | 303514.27

14 | -> Partitioned CStore Scan on dwifin.dwi_ap_invoice s | [112.099, 134.193] | 117191170 | 117191170 | | [6MB, 6MB] | 1MB | | 22 | 300641.93

15 | -> Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST) | [48.632, 99.230] | 28791678 | 28782758 | 282184 | [2MB, 2MB] | 3MB | [75,75] | 16 | 56928.04

16 | -> Vector Subquery Scan on apr | [41.916, 78.189] | 28791678 | 28782758 | | [376KB, 376KB] | 1MB | | 16 | 30826.02

17 | -> CStore Scan on dwifin.dwi_ap_invoice_regstn s | [5.233, 10.667] | 28791678 | 28782758 | | [1MB, 1MB] | 1MB | | 16 | 28004.18

18 | -> CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate | [12.065, 20.667] | 8228448 | 8228448 | 171426 | [2MB, 2MB] | 1MB | [104,104] | 26 | 9056.99

19 | -> Vector Streaming(type: PART LOCAL PART BROADCAST) | [67.272, 97.378] | 15442613 | 15442566 | 321720 | [584KB, 584KB] | 2MB | [58,58] | 19 | 49578.19

20 | -> CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate | [18.605, 31.713] | 15442566 | 15442566 | | [1MB, 2MB] | 1MB | | 19 | 35704.02

21 | -> CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven | [0.378, 0.647] | 135072 | 135072 | 2814 | [1MB, 1MB] | 1MB | [53,53] | 14 | 2823.85

Predicate Information (identified by plan id)

----------------------------------------------------------------------------------------------------------------------------------

3 --Vector Hash Left Join (4, 21)

Hash Cond: (tmp4.po_shipment_target_inv_org_key = inven.inventory_org_key)

4 --Vector Hash Left Join (5, 19)

Hash Cond: (tmp4.item_id = mate.item_id)

Skew Join Optimized by Statistic

5 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): ((tmp4.item_id = (-999999)::numeric) OR (tmp4.item_id IS NULL))

6 --Vector Hash Left Join (7, 18)

Hash Cond: (tmp4.item_category_key = cate.pur_item_catg_key)

7 --Vector Hash Left Join (8, 10)

Hash Cond: (tmp4.ap_invoice_id = s.ap_invoice_id)

Skew Join Optimized by Statistic

8 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): (tmp4.ap_invoice_id = 1001113812002::numeric)

10 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST ng: LC_DL1->LC_DW1)

Skew Filter(type: BROADCAST): (s.ap_invoice_id = 1001113812002::numeric)

11 --Vector Hash Left Join (12, 15)

Hash Cond: ((('6600'::text || (s.attribute1)::text)) = ((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text)

Skew Join Optimized by Hint

12 --Vector Streaming(type: PART REDISTRIBUTE PART ROUNDROBIN)

Skew Filter(type: ROUNDROBIN): ((('6600'::text || (s.attribute1)::text)) = '6600'::text)

13 --Vector Partition Iterator

Iterations: 147

14 --Partitioned CStore Scan on dwifin.dwi_ap_invoice s

Partitions Selected by Static Prune: 1..147

15 --Vector Streaming(type: PART REDISTRIBUTE PART BROADCAST)

Skew Filter(type: BROADCAST): ((((numeric_out(apr.ap_invoice_regstn_id))::character varying)::text) = '6600'::text)

18 --CStore Scan on dwrdim_dw1.dwr_dim_pur_item_category_d cate

Filter: ((cate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((cate.del_flag)::text = 'N'::text)

19 --Vector Streaming(type: PART LOCAL PART BROADCAST)

Skew Filter(type: BROADCAST): (mate.item_id = (-999999)::numeric)

20 --CStore Scan on dwrdim_dw1.dwr_dim_material_code_d mate

Filter: ((mate.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((mate.del_flag)::text = 'N'::text)

21 --CStore Scan on dwrdim_dw1.dwr_dim_inventory_org_d inven

Filter: ((inven.del_flag)::text = 'N'::text)

Pushdown Predicate Filter: ((inven.del_flag)::text = 'N'::text)

 

点击关注,第一时间了解华为云新鲜技术~

与数仓性能优化:倾斜优化-表达式计算倾斜的hint优化相似的内容:

数仓性能优化:倾斜优化-表达式计算倾斜的hint优化

本文主要讲解表达式倾斜的hint优化。

数仓性能调优:大宽表关联MERGE性能优化

摘要:本文主要为大家讲解在数仓性能调优过程中,关于大宽表关联MERGE性能优化过程。 本文分享自华为云社区《GaussDB(DWS)性能调优:大宽表关联MERGE性能优化》,作者:譡里个檔。 【业务背景】 如下MERGE语句执行耗时长达2034s MERGE INTO sdifin.hah_ae_l

解读数仓常用模糊查询的优化方法

摘要:本文讲解了GaussDB(DWS)上模糊查询常用的性能优化方法,通过创建索引,能够提升多种场景下模糊查询语句的执行速度。 本文分享自华为云社区《GaussDB(DWS) 模糊查询性能优化》,作者: 黎明的风 。 在使用GaussDB(DWS)时,通过like进行模糊查询,有时会遇到查询性能慢的

详解数仓中sequence的应用场景及优化

摘要:本文简单介绍sequence的使用场景及如何修改sequence的cache值提高性能。 本文分享自华为云社区《GaussDB(DWS)关于sequence的那些事》,作者:Arrow0lf 。 什么是sequence sequence,也称作序列,是用来产生唯一整数的数据库对象。序列的值按照

数仓性能调优:如何进行函数下推

摘要:本文主要描述下函数在满足特征的前提下可以把函数属性定义为下推属性。 本文分享自华为云社区《GaussDB(DWS)性能调优:函数下推》,作者:譡里个檔 。 DWS作为MPP架构的数仓产品,其性能优势主要在分布式计算上。默认情况下,DWS为了保证结果的正确性,自定义函数默认属性是不下推的,这会导

带你掌握数仓的作业级监控TopSQL

摘要:目前TopSQL功能被用户广泛使用,是性能定位、劣化分析、审计回溯等重要的基石,为用户提供覆盖内存、耗时、IO、网络、空间等多方面的监控能力。 本文分享自华为云社区《GaussDB(DWS)监控工具指南(一)作业级监控TopSQL》,作者:幕后小黑爪 。 1、引言: 监控系统是智能化管理和自动

文盘Rust -- rust 连接云上数仓 starwift

最近想看看 rust 如何集成 clickhouse,又犯了好吃懒做的心理(不想自己建环境),刚好京东云发布了兼容ck 的云原生数仓 Starwfit,于是搞了个实例折腾一番。 Starwfit 是京东云自主研发的新一代云原生数据仓库,通过存算分离降低了存储成本,同时兼具性能和扩展弹性。其写入和查询速度可达到传统数据仓库的数倍,为用户提供实时数据分析能力。广泛应用于流量分析、精准营销、用户画像、广

[转帖]国产数据库到底行不行?人大金仓KINGBASE数据库与主流开源数据库性能实测

近年来,人大金仓的数据库产品受到了外界诸多的关注。做产品,免不了要接受用户的对比和选择,数据库因其行业的自身特点,还有很多开源的技术产品同台比拼,用户因此也会产生诸多疑问,国产数据库相比开源数据库到底如何,今天我们选择数据库的一项核心能力——性能,将金仓KingbaseES和目前业界主流的两种开源数

数仓在线运维:如何进行在线增删CN?

摘要:集群运行过程中,根据集群的综合负载和业务接入情况进行分析:增加CN可以适当降低CPU消耗,增大接入连接数,分散CN节点业务压力,根据实际情况来识别是否要增加CN,如果是提升集群容量和扩展比能力,建议进行扩容操作。 本文分享自华为云社区《【玩转PB级数仓GaussDB(DWS)】在线运维-在线增

数仓专家面对面 | 为什么我选择GaussDB(DWS)

摘要:你知道数仓是如何应运而生的吗?你了解数仓未来的发展趋势吗?想知道国内数仓专家的看法吗? 导语 数据仓库的发展一直是备受关注的议题,随着近年来技术的不断演进,数仓也在更新迭代。 你知道数仓是如何应运而生的吗?你了解数仓未来的发展趋势吗?想知道国内数仓专家的看法吗? 今天我们邀请到了华为云数据仓库