业务反馈在某套ob 3.2.3.3集群抓到某个多表关联的语句执行很慢,查看执行计划发现是驱动表选择错误,使用leading指定合适的表驱动顺序后SQL执行很快,随后对驱动表选择错误问题进行分析。
原因分析:
1、 查看原执行计划,可以发现驱动表R表过滤后的预估行数是1,所以优化器认为该表作为驱动表执行后的cost成本最低
Query Plan: =================================================================================|ID|OPERATOR |NAME |EST. ROWS|COST|---------------------------------------------------------------------------------|0 |SCALAR GROUP BY | |1 |174 ||1 | PX COORDINATOR | |1 |174 ||2 | EXCHANGE OUT DISTR |:EX10001 |1 |173 ||3 | MERGE GROUP BY | |1 |173 ||4 | NESTED-LOOP JOIN | |1 |173 ||5 | NESTED-LOOP JOIN | |1 |130 ||6 | EXCHANGE IN DISTR | |1 |100 ||7 | EXCHANGE OUT DISTR (PKEY)|:EX10000 |1 |100 ||8 | PX PARTITION ITERATOR | |1 |100 ||9 | TABLE SCAN |R(IDX_STUDENTER_SCHDATE) |1 |100 ||10| PX PARTITION ITERATOR | |1 |29 ||11| TABLE SCAN |S(IDX_TEACHERCER_TEAHNUMBER)|1 |29 ||12| DISTRIBUTED TABLE SCAN |C(PK_CM_CU_SCHOOLER) |2 |5 |=================================================================================
2、 单独对R表进行分析,谓词过滤后的返回行数有14万行,拿14万行作为嵌套循环的驱动表必定是灾难性的根本跑不出来
select count(*)from tbcs.studenter rwhere r.schdate >= sysdate - NUMTODSINTERVAL('90','DAY')and r.schoolid in ('OpenStudeXS')\G+----------+| COUNT(*) |+----------+| 145532 |+----------+1 row in set (0.099 sec)*************************** 1. row ***************************Query Plan: ======================================================================|ID|OPERATOR |NAME |EST. ROWS|COST|----------------------------------------------------------------------|0 |SCALAR GROUP BY | |1 |801 ||1 | PX COORDINATOR | |1 |801 ||2 | EXCHANGE OUT DISTR |:EX10000 |1 |801 ||3 | MERGE GROUP BY | |1 |801 ||4 | PX PARTITION ITERATOR| |1 |801 ||5 | TABLE SCAN |R(IDX_studenter_schdate)|1 |801 |======================================================================Outputs & filters:-------------------------------------0 - output([T_FUN_COUNT_SUM(T_FUN_COUNT(*)(0x7e5548c1d6f0))(0x7e5548cb1ed0)]), filter(nil),group(nil), agg_func([T_FUN_COUNT_SUM(T_FUN_COUNT(*)(0x7e5548c1d6f0))(0x7e5548cb1ed0)])1 - output([T_FUN_COUNT(*)(0x7e5548c1d6f0)]), filter(nil)2 - output([T_FUN_COUNT(*)(0x7e5548c1d6f0)]), filter(nil), dop=13 - output([T_FUN_COUNT(*)(0x7e5548c1d6f0)]), filter(nil),group(nil), agg_func([T_FUN_COUNT(*)(0x7e5548c1d6f0)])4 - output([remove_const(1)(0x7e5548cbfe40)]), filter(nil),force partition granule.5 - output([remove_const(1)(0x7e5548cc0690)]), filter(nil),access([R.schoolid(0x7e5548cbf4b0)]), partitions(p40, p81, p122, p163, p204, p245, p286, p327, p368, p409, p450, p491, p532, p573, p614, p655, p696, p737, p[786-803]),is_index_back=false,range_key([R.schoolid(0x7e5548c1d270)], [R.schdate(0x7e5548c16f10)], [R.STATUS(0x7e5548c92080)], [R.REGION(0x7e5548c16c20)], [R.__pk_increment(0x7e5548c823b0)]),range(OpenStudeXS,2025-08-10 06:23:34,MIN,MIN,MIN ; OpenStudeXS,NULL,MIN,MIN,MIN),range_cond([R.schdate(0x7e5548c16f10) >= ?(0x7e5548c193e0)], [R.schoolid(0x7e5548c1d270) IN (?, ?)(0x7e5548c1c590)(0x7e5548c1bba0)])
小结:
查看执行计划详细信息partitions部分扫描的分区,结合下面查询表分区统计信息的结果,可以发现该sql扫描了表上的所有分区,优化器会在扫描的所有分区中指定行数最多的分区进行估算。
OB优化器估行规则:
优化器估行时依赖的统计信息可以从以下3个地方获取,其优先级为:存储层估行> 直方图统计信息> 基础统计信息(合并时搜集)
est_method:存储层采样方式。有 local_storage、remote_storage、basic_stat 三种。目前仅支持对本地存储进行采样,对于数据分区在远程节点上的情况,只能使用默认收集的统计信息进行代价估计,可能会引入代价偏差。
physical_range_rows:存储层对表估行估出的物理行数,优化器算 cost 参考的行数,当多分区表全表扫描时,优化器会根据数据量最大的分区的行数*分区数(扫描的分区数)来估算。
--表的分区统计信息(预估行数最多的分区是 P_451_202507)select a.svr_ip, b.part_id, b.part_name, a.row_countfrom __all_virtual_partition_table aleft join__all_virtual_partition_item bon a.partition_id = b.partition_id and a.table_Id = b.table_idwhere a.table_id = 1100611139649270and a.role=1order by part_id;+---------------+---------+--------------+-----------+| svr_ip | part_id | part_name | row_count |+---------------+---------+--------------+-----------+| 134.85.20.134 | 40 | P_430_MAX | 2654052 || 192.92.80.110 | 81 | P_431_MAX | 4626111 || 134.85.20.134 | 122 | P_432_MAX | 1243404 || 134.85.20.134 | 163 | P_433_MAX | 3479456 || 192.92.80.110 | 204 | P_434_MAX | 2575294 || 192.92.80.110 | 245 | P_435_MAX | 648928 || 192.92.80.110 | 286 | P_436_MAX | 1025867 || 134.85.20.134 | 327 | P_451_MAX | 0 || 134.85.20.134 | 368 | P_438_MAX | 1249465 || 134.85.20.134 | 409 | P_439_MAX | 250290 || 134.85.20.134 | 450 | P_443_MAX | 479238 || 134.85.20.134 | 491 | P_446_MAX | 780327 || 134.85.20.134 | 532 | P_631_MAX | 2255353 || 192.92.80.110 | 573 | P_632_MAX | 315325 || 192.92.80.110 | 614 | P_633_MAX | 314753 || 134.85.20.134 | 655 | P_634_MAX | 0 || 134.85.20.134 | 696 | P_635_MAX | 2425383 || 134.85.20.134 | 737 | P_888_MAX | 0 || 192.92.80.110 | 786 | P_451_202507 | 10622506 |--统计信息行数最多的分区| 192.92.80.110 | 787 | P_888_202507 | 668884 || 192.92.80.110 | 788 | P_451_202508 | 9014439 || 192.92.80.110 | 789 | P_888_202508 | 1357195 || 192.92.80.110 | 790 | P_451_202509 | 9307493 || 192.92.80.110 | 791 | P_888_202509 | 3723321 || 192.92.80.110 | 792 | P_451_202510 | 9014622 || 192.92.80.110 | 793 | P_888_202510 | 3415043 || 192.92.80.110 | 794 | P_451_202511 | 2425119 || 192.92.80.110 | 795 | P_888_202511 | 128298 || 192.92.80.110 | 796 | P_451_202512 | 0 || 192.92.80.110 | 797 | P_888_202512 | 0 || 192.92.80.110 | 798 | P_451_202601 | 0 || 192.92.80.110 | 799 | P_888_202601 | 0 || 192.92.80.110 | 800 | P_451_202602 | 0 || 192.92.80.110 | 801 | P_888_202602 | 0 || 192.92.80.110 | 802 | P_451_202603 | 0 || 192.92.80.110 | 803 | P_888_202603 | 0 |+---------------+---------+--------------+-----------+
3、 代入过滤查询此分区没有数据返回,优化器就会估出1行
select count(*)from tbcs.studenter partition(P_451_202507) rwhere r.schdate >= sysdate - NUMTODSINTERVAL('90','DAY')and r.schoolid in ('OpenStudeXS');+----------+| COUNT(*) |+----------+| 0 |+----------+
此时就已明了优化器为何出错,由于该笔是一个分区表,我们可以加上一个分区键从而减少分区扫描范围,确定一个有数据的分区做正常估行验证
explain extended select count(*)from tbcs.studenter rwhere r.schdate >= sysdate - NUMTODSINTERVAL('90','DAY')and region=451and r.schoolid in ('OpenStudeXS')\G*************************** 1. row ***************************Query Plan: ========================================================================|ID|OPERATOR |NAME |EST. ROWS|COST |------------------------------------------------------------------------|0 |SCALAR GROUP BY | |1 |211261||1 | PX COORDINATOR | |1 |202141||2 | EXCHANGE OUT DISTR |:EX10000 |1 |202141||3 | MERGE GROUP BY | |1 |202141||4 | PX PARTITION ITERATOR| |238940 |193022||5 | TABLE SCAN |R(IDX_studenter_schdate)|238940 |193022|========================================================================Outputs & filters:-------------------------------------0 - output([T_FUN_COUNT_SUM(T_FUN_COUNT(*)(0x7e6ea34e9bf0))(0x7e6ea3571600)]), filter(nil),group(nil), agg_func([T_FUN_COUNT_SUM(T_FUN_COUNT(*)(0x7e6ea34e9bf0))(0x7e6ea3571600)])1 - output([T_FUN_COUNT(*)(0x7e6ea34e9bf0)]), filter(nil)2 - output([T_FUN_COUNT(*)(0x7e6ea34e9bf0)]), filter(nil), dop=13 - output([T_FUN_COUNT(*)(0x7e6ea34e9bf0)]), filter(nil),group(nil), agg_func([T_FUN_COUNT(*)(0x7e6ea34e9bf0)])4 - output([remove_const(1)(0x7e6ea3581290)]), filter(nil),force partition granule, asc.5 - output([remove_const(1)(0x7e6ea3581ae0)]), filter([R.REGION(0x7e6ea34e2d90) = 451(0x7e6ea34e82c0)]),access([R.REGION(0x7e6ea34e2d90)]), partitions(p327, p368, p788, p790, p792, p794, p796, p798, p800, p802),is_index_back=false, filter_before_indexback[false],range_key([R.schoolid(0x7e6ea34e97f0)], [R.schdate(0x7e6ea34e3080)], [R.STATUS(0x7e6ea3557740)], [R.REGION(0x7e6ea34e2d90)], [R.__pk_increment(0x7e6ea3543060)]),range(OpenStudeXS,2025-08-10 06:24:57,MIN,451,MIN ; OpenStudeXS,NULL,MIN,MIN,MIN),range_cond([R.schdate(0x7e6ea34e3080) >= ?(0x7e6ea34e5550)], [R.schoolid(0x7e6ea34e97f0) = ?(0x7e6ea34e8b00)])Optimization Info:-------------------------------------R:table_rows:31972723, physical_range_rows:477880, logical_range_rows:477880, index_back_rows:0, output_rows:238940, est_method:remote_storage, optimization_method=cost_based, avaiable_index_name[IDX_studenter_schdate,studenter], pruned_index_name[IDX_studenter_RELATERECOID,IDX_studenter_SERVNUM,IDX_studenter_CUSTID,IDX_studenter_ORDERID,PK_CS_REC_studenter,IDX_studenter_ENTITYID,IDX_studenter_FORMNUM,IDX_studenter_GROUPID,IDX_studenter_RECOPID,IDX_studenter_RECORGID], estimation info[table_id:1100611139649489, (table_type:8, version:0-1762542036995292-1762542036995292, logical_rc:47788, physical_rc:47788), (table_type:5, version:1762542000927080-1762542000927080-1762542036995292, logical_rc:0, physical_rc:0), (table_type:0, version:1762542036995292-1762542036995292-9223372036854775807, logical_rc:0, physical_rc:0)]
查看执行计划详细信息partitions部分扫描的分区:partitions(p327, p368, p788, p790, p792, p794, p796, p798, p800, p802)
+---------------+---------+--------------+-----------+| svr_ip | part_id | part_name | row_count |+---------------+---------+--------------+-----------+| 192.92.80.135 | 788 | P_451_202508 | 9014439 || 192.92.80.110 | 790 | P_451_202509 | 9307493 |--统计信息行数最多的分区| 192.92.80.110 | 792 | P_451_202510 | 9014772 || 192.92.80.110 | 794 | P_451_202511 | 3386516 || 192.92.80.110 | 796 | P_451_202512 | 0 || 192.92.80.110 | 798 | P_451_202601 | 0 || 192.92.80.110 | 800 | P_451_202602 | 0 || 192.92.80.110 | 802 | P_451_202603 | 0 || 134.85.20.134 | 327 | P_451_MAX | 0 || 134.85.20.134 | 368 | P_452_MAX | 1249503 |+---------------+---------+--------------+-----------+
代入行数最多的分区P_451_202509做正常估行验证
select count(*)from tbcs.studenter partition(P_451_202509) rwhere r.schdate >= sysdate - NUMTODSINTERVAL('90','DAY')and region=451and r.schoolid in ('OpenStudeXS');+----------+| COUNT(*) |+----------+| 48813 |+----------+
优化器选择数据量最大的分区统计信息行数*扫描的分区数来估算。
则估行为 48813*10(扫描分区数)=488130,与ob估行logical_range_rows:477880 非常接近。
总结:
1. 理解ob的优化器估行逻辑,sql执行三步走,第一步查看扫描分区,第二步锁定行数最多的分区,第三步分区中过滤数据进行估算。
2. 此类问题的规避方法有两个,第一使用hint指定合适的表关联顺序(强烈推荐),第二谓词条件中指定分区(不推荐,还有扫描空数据分区的可能性)
想了解更多干货,可通过下方扫码关注

可扫码添加上智启元官方客服微信👇

17认证网








