Bo's Oracle Station

【博客文章2023】Hands-on:Oracle RAC数据库中的全局资源目录

2023-4-11 10:00| 发布者: admin| 查看: 68| 评论: 0

摘要: 本文介绍:全局资源目录(Global Resource Directory简称GRD)、gcs和ges。了解了原理之后,做了手动remaster操作,并验证结果。
【博客文章2023】Hands-on:Oracle RAC数据库中的全局资源目录

Author: Bo Tang

1. Oracle RAC数据库中的全局资源目录:

    在RAC的体系结构中,全局资源目录(Global Resource Directory简称GRD)是Oracle RAC数据库服务器中最重要的内存结构。它是一套哈希分布于各个实例间,由被称为ges resource pools”、“gcs mastership buckets”、“gcs res hash bucket”和“gcs resources”等内存结构组成的的元数据集。这个哈希分布元数据集用以描述Oracle RAC数据库服务器中数据块的状态、属主信息以及数据块内部和数据块自身的锁信息。GRD分布在所有实例的共享池中,每个实例维护GRD的一部份。所有实例维护的GRD合起来形成哈希分布式的整体集。GRD内部包含“转换队列”和“写队列”,这两个队列被GCS(Global Cache Service)和GES(Global Enqueue Service)维护。所有维护信息通过内连网传输。以下实验在Oracle Database 12.2.0.1 RAC数据库上操作。

2. GES

    查看GES:


select * from v$sgastat where name like 'ges%';

----------------------------------------------------------------

POOL NAME BYTES CON_ID

shared pool ges resource permanent 6673128 0

shared pool ges pdbs uid cache 16400 0

shared pool ges enqueue cur. usage pe 16 0

shared pool ges regular msg buffers 1683208 0

shared pool ges resource dynamic 14591256 0

shared pool ges resource pools 936 0

shared pool ges big mbuf pool 560 0

shared pool ges big msg buffers 15936504 0

shared pool ges process hash table 47872 0

shared pool ges shared global area 255808 0

shared pool ges lmd array 200 0

shared pool ges reg mbuf pool 560 0

shared pool ges enqueue max. usage pe 16 0

shared pool ges deadlock xid hash tab 54032 0

shared pool ges resource hash table 819680 0

shared pool ges resource quarantine q 16 0

shared pool ges s-lock bitvec 16384 0

shared pool ges enqueue multiple free 480 0

shared pool ges ipc instance maps 384 0

shared pool ges value block freelist 336 0

shared pool ges value block freelist 16 0

shared pool ges res mastership bucket 3072 0

shared pool ges mbuf pointer array 16 0

shared pool ges process array 2001920 0

shared pool ges reserved msg buffers 5110912 0

shared pool ges enqueues 9048672 0

shared pool ges process descriptor 42320 0

shared pool ges recovery domain table 11264 0

shared pool ges msg pool watchlist 288 0

shared pool ges process descriptor ar 8 0

shared pool ges recovery domain index 65584 0



    GES(Global Enqueue Service)进程:

    GES(Global Enqueue Service)监控进程:


[oracle@station3 ~]$ ps aux | grep lmon oracle 19896 0.3 0.1 2569040 59072 ? Ss Apr01 34:51 ora_lmon_c01orcl_2 grid 25677 0.2 0.1 2526216 48864 ? Ss Mar28 43:12 asm_lmon_+ASM1 oracle 31464 0.0 0.0 105436 824 pts/10 S+ 06:03 0:00 grep lmon


3. GCS
    

    查看GCS:


select * from v$sgastat where name like 'gcs%';

----------------------------------------------------------------

POOL NAME BYTES CON_ID

shared pool gcs node mapp pool 280 0 shared pool gcs nodemap 87360 0 shared pool gcs I/O statistics struct 136 0 shared pool gcs nodemap pool array 8 0 shared pool gcs process descriptor ar 8 0 shared pool gcs scan queue array 8 0 shared pool gcs resource segmented ar 48 0 shared pool gcs opaque info 19048 0 shared pool gcs opaque info freelist 200 0 shared pool gcs scan queue 176 0 shared pool gcs procsss descriptor 42320 0 shared pool gcs resource freelist seg 8 0 shared pool gcs shadows 14084464 0 shared pool gcs mastership buckets 3584 0 shared pool gcs opaque info freelist 8 0 shared pool gcs commit sga state 426000 0 shared pool gcs delta freelist 8 0 shared pool gcs shadows freelist arra 8 0 shared pool gcs resv res hash bucket 4194304 0 shared pool gcs shadows seg array 8 0 shared pool gcs pcmhv bucket 3072 0 shared pool gcs resource freelist arr 8 0 shared pool gcs res latch table 36864 0 shared pool gcs affinity object 2352 0 shared pool gcs shadows segmented arr 48 0 shared pool gcs resources 22887248 0 shared pool gcs resource freelist 264 0 shared pool gcs pnp history 4160 0 shared pool gcs shadows freelist 288 0 shared pool gcs resv pcmhv bucket 3072 0


    GCS(Global Cache Service)监控进程:

[oracle@station3 ~]$ ps aux | grep lms oracle 19900 0.9 2.3 2605236 861608 ? Ss Apr01 102:23 ora_lms0_c01orcl_2 grid 25681 0.6 0.1 2525152 50332 ? Ss Mar28 116:39 asm_lms0_+ASM1


    数据块的状态和属主等信息被存储成每128个块的信息一个master单元,即128个数据块的状态和属主等信息构成一个“gcs mastership bucket”。超过128个块的表的数据块可以被多个实例分布式地分段master。如果发生自动Remaster(Object Affinity and Dynamic Remastering引起)或手工Remaster(oradebug命令),整个对象将作为master单元而不进行多个实例分布式地分段master:即不管表多大,它的数据块都由同一个实例master。任何时候undo段整段必需由同一个实例master。

4. 一个数据块在多个实例上可能会有的多份拷贝(XI/PI/BI)

    一个数据块在多个实例上可能会有的多份拷贝。某个实例的数据库缓冲区缓存中拥有该数据块拷贝的实例被称作该数据块的Owner实例。Owner实例的个数可以是0(最小)到集群节点总数(最大)中的任何数值。
    如果该数据块内存拷贝在多个实例的数据库缓冲区缓存中同时被找到,也就是说该数据块有多个Owner实例,那么证明在近期有多个实例先后修改或访问过它。在所有这些该数据块的内存拷贝中,SCN最大的那个内存拷贝被称为Current Image(XI), 其他的那些内存拷贝就都被统统称为该数据块的 Past Image(PI)。PI的存在主要是为了在实例恢复过程中能被利用来减少实例交叉恢复的时间。如果由于检查点事件XI被写回硬盘,那么所有它所对应的PI都将被从内存中直接flush掉。还有一点值得注意的是:XI和PI都可以包含各自的new value和old value。通常old value在Oracle文档中又被称为Before Image即:BI。




5. 数据块的BL锁

    某个实例对数据库缓冲区缓存中的数据块的修改或访问都要先得到它的master实例的“Protected Read”授权,简称PR授权(锁状态为:KJUSERPR)。PR授权就是获得该数据块BL锁的过程。申请成功,并对该数据块在该实例的缓冲区缓存中的拷贝作了更改后,该拷贝就变成XI。某个数据块的BL锁申请的次数可以通过X$OBJECT_POLICY_STATISTICS查询XOPENS得到。

SQL> select object_id from dba_objects where owner='HR' and object_name='T04209_UNAME'; OBJECT_ID ---------- 93261

SQL> select  tablespace_name from dba_tables where table_name='T04209_UNAME' and owner='HR';


TABLESPACE_NAME

------------------------------

USERS


SQL> select * from v$tablespace where name='USERS';


       TS# NAME                           INC BIG FLA ENC     CON_ID

---------- ------------------------------ --- --- --- --- ----------

         4 USERS                          YES NO  YES              0


     以sys用户查询X$OBJECT_POLICY_STATISTICS:

select * from x$OBJECT_POLICY_STATISTICS where object = 93261;

----------------------------------------------------------------------------------------------------------------------------

ADDR INDX INST_ID CON_ID PDB TSN OBJECT NODE SOPENS XOPENS XFERS

00007F2B6E7D1D98 0 1 0 0 4 93261 1 1 245 0 232

00007F2B6E7D1D98 1 1 0 0 4 93261 2 245 2 0 0


    每一个XOPEN代表一次BL锁申请。具体的锁情况:

select to_char(93261, 'xxxxxxxx') from dual;

----------------------------------------------------------------------------------------------------------------------------

TO_CHAR(93261, 'XXXXXXXX')

16c4d

     查询gv$ges_resource,看见TM锁:

select inst_id, resource_name, on_convert_q, on_grant_q, master_node, next_cvt_level from gv$ges_resource where resource_name like '[0x16c4d]%';
-------------------------------------------------------------------------------------------------------------------------- 1 [0x16c4d][0x0],[TM][ext 0x0,0x 0 1 1 KJUSERNL 2 [0x16c4d][0x0],[TM][ext 0x0,0x 0 0 1 KJUSERNL


    查询gv$ges_enqueue,看见TM锁:

select inst_id, grant_level, request_level, resource_name2, PID, TRANSACTION_ID0, TRANSACTION_ID1 from gv$ges_enqueue where resource_name2 like '93261%';  -------------------------------------------------------------------------------------------------------------------------- 1 KJUSERCW KJUSERCW 93261,0,TM 0 0 0


6. 拥有某个块XI的实例不一定是该块的master实例

    拥有某个块XI的实例不一定是该块的master实例,这就会产生跨实例的通信。进一步分析“gc [current/cr] [multiblock] request”、“gc [current/cr] [2/3]-way“、”gc [current/cr] block busy“、”gc [current/cr] grant 2-way“、”gc [current/cr] [block/grant] congested“和“gc [current/cr] [failure/retry]“等待事件中的current(new value)和cr(old value)分别对应的master实例有可能不是同一个。
    下面这个查询语句能够查出master实例和owner实例(XI/PI/BI)不是同一个的数据块:

select kj.block#, kjblname, kjblname2, kjblowner+1 "OWNER_Instance", kjblmaster+1 "MASTER_Instance", kjbllockp from ( select kjblname, kjblname2, kjblowner, kjblmaster, kjbllockp, (substr ( kjblname2, instr(kjblname2,',')+1, instr(kjblname2,',',1,2)-instr(kjblname2,',',1,1)-1))/65536 file#, substr ( kjblname2, 1, instr(kjblname2,',')-1) block# from x$kjbl ) kj, ( select block_id block#_begin, block_id+blocks-1 block#_end, e.file_id file# from dba_extents e where e.owner='HR' and e.segment_name='T04209_UNAME' ) e where kj.block# between e.block#_begin and e.block#_end and kjblowner <> kjblmaster order by block# ; ---------------------------------------------------------------------------------------------------------------------------------------------------------------- BLOCK# KJBLNAME KJBLNAME2 OWNER_Instance MASTER_Instance KJBLLOCKP 218 [0xda][0x6],[BL][ext 0x0,0x0] 218,6,BL 2 3 000000007DF99950 224 [0xe0][0x1],[BL][ext 0x0,0x0] 224,1,BL 2 3 000000007DF8D3C8 225 [0xe1][0x1],[BL][ext 0x0,0x0] 225,1,BL 2 3 00000000AFFA0F90 232 [0xe8][0x1],[BL][ext 0x0,0x0] 232,1,BL 2 3 00000000ACF73898 233 [0xe9][0x1],[BL][ext 0x0,0x0] 233,1,BL 2 3 00000000ACF76398 296 [0x128][0x1],[BL][ext 0x0,0x0] 296,1,BL 3 2 000000006415E2A8 297 [0x129][0x1],[BL][ext 0x0,0x0] 297,1,BL 3 2 0000000064229E28 298 [0x12a][0x1],[BL][ext 0x0,0x0] 298,1,BL 3 2 000000006411B528 299 [0x12b][0x1],[BL][ext 0x0,0x0] 299,1,BL 3 2 0000000064378A28 300 [0x12c][0x1],[BL][ext 0x0,0x0] 300,1,BL 3 2 0000000064265D28 301 [0x12d][0x1],[BL][ext 0x0,0x0] 301,1,BL 3 2 0000000064206DA8 302 [0x12e][0x1],[BL][ext 0x0,0x0] 302,1,BL 3 2 00000000643C7DA8 305 [0x131][0x1],[BL][ext 0x0,0x0] 305,1,BL 3 2 00000000642791A8 306 [0x132][0x1],[BL][ext 0x0,0x0] 306,1,BL 3 2 0000000065F8F538 307 [0x133][0x1],[BL][ext 0x0,0x0] 307,1,BL 3 2 00000000642ABD28 308 [0x134][0x1],[BL][ext 0x0,0x0] 308,1,BL 3 2 00000000642D45A8 309 [0x135][0x1],[BL][ext 0x0,0x0] 309,1,BL 3 2 0000000065F70A38 310 [0x136][0x1],[BL][ext 0x0,0x0] 310,1,BL 3 2 00000000643C44A8 311 [0x137][0x1],[BL][ext 0x0,0x0] 311,1,BL 3 2 00000000643C7A28 312 [0x138][0x1],[BL][ext 0x0,0x0] 312,1,BL 3 2 0000000065F30EB8 313 [0x139][0x1],[BL][ext 0x0,0x0] 313,1,BL 3 2 0000000064419228 314 [0x13a][0x1],[BL][ext 0x0,0x0] 314,1,BL 3 2 00000000641E4228 315 [0x13b][0x1],[BL][ext 0x0,0x0] 315,1,BL 3 2 0000000064200628 316 [0x13c][0x1],[BL][ext 0x0,0x0] 316,1,BL 3 2 00000000640FC728 317 [0x13d][0x1],[BL][ext 0x0,0x0] 317,1,BL 3 2 00000000644541A8 318 [0x13e][0x1],[BL][ext 0x0,0x0] 318,1,BL 3 2 00000000642D44A8 319 [0x13f][0x1],[BL][ext 0x0,0x0] 319,1,BL 3 2 00000000641B7128 321 [0x141][0x1],[BL][ext 0x0,0x0] 321,1,BL 3 2 000000006420DD28 329 [0x149][0x1],[BL][ext 0x0,0x0] 329,1,BL 3 2 0000000064221E28 393 [0x189][0x1],[BL][ext 0x0,0x0] 393,1,BL 2 3 00000000B1FD9538 401 [0x191][0x1],[BL][ext 0x0,0x0] 401,1,BL 2 3 00000000AAF915A0 409 [0x199][0x1],[BL][ext 0x0,0x0] 409,1,BL 2 3 000000007CFE9488 416 [0x1a0][0x1],[BL][ext 0x0,0x0] 416,1,BL 2 3 00000000ADF95620 417 [0x1a1][0x1],[BL][ext 0x0,0x0] 417,1,BL 2 3 0000000081F69CB8 418 [0x1a2][0x1],[BL][ext 0x0,0x0] 418,1,BL 2 3 00000000B1FC2BC0 419 [0x1a3][0x1],[BL][ext 0x0,0x0] 419,1,BL 2 3 000000007BF7D070 420 [0x1a4][0x1],[BL][ext 0x0,0x0] 420,1,BL 2 3 00000000A7FF4F50 450 [0x1c2][0x3],[BL][ext 0x0,0x0] 450,3,BL 2 3 00000000B3FEF950 451 [0x1c3][0x3],[BL][ext 0x0,0x0] 451,3,BL 2 3 00000000B3F89CB0 452 [0x1c4][0x3],[BL][ext 0x0,0x0] 452,3,BL 2 3 00000000B3FA0378 453 [0x1c5][0x3],[BL][ext 0x0,0x0] 453,3,BL 2 3 00000000A6F98528 454 [0x1c6][0x3],[BL][ext 0x0,0x0] 454,3,BL 2 3 00000000AAFFB570 455 [0x1c7][0x3],[BL][ext 0x0,0x0] 455,3,BL 2 3 00000000B2F702D8 473 [0x1d9][0x1],[BL][ext 0x0,0x0] 473,1,BL 2 3 00000000B3FB9540 481 [0x1e1][0x1],[BL][ext 0x0,0x0] 481,1,BL 2 3 0000000081F690A0 489 [0x1e9][0x1],[BL][ext 0x0,0x0] 489,1,BL 2 3 00000000ACFC18F0 490 [0x1ea][0x1],[BL][ext 0x0,0x0] 490,1,BL 2 3 00000000ACFC43F0 491 [0x1eb][0x3],[BL][ext 0x0,0x0] 491,3,BL 2 3 00000000A7FCA0A8 492 [0x1ec][0x1],[BL][ext 0x0,0x0] 492,1,BL 2 3 00000000ADF84400 493 [0x1ed][0x1],[BL][ext 0x0,0x0] 493,1,BL 2 3 0000000079FA5010 494 [0x1ee][0x1],[BL][ext 0x0,0x0] 494,1,BL 2 3 00000000AEFA3A90 496 [0x1f0][0x3],[BL][ext 0x0,0x0] 496,3,BL 2 3 00000000AEF993F0 497 [0x1f1][0x1],[BL][ext 0x0,0x0] 497,1,BL 2 3 0000000079F89E08 498 [0x1f2][0x3],[BL][ext 0x0,0x0] 498,3,BL 2 3 00000000ACF7A010 499 [0x1f3][0x3],[BL][ext 0x0,0x0] 499,3,BL 2 3 00000000B2FA5D80 500 [0x1f4][0x3],[BL][ext 0x0,0x0] 500,3,BL 2 3 00000000A8F82670 501 [0x1f5][0x3],[BL][ext 0x0,0x0] 501,3,BL 2 3 000000007CF804D8 502 [0x1f6][0x3],[BL][ext 0x0,0x0] 502,3,BL 2 3 00000000A9F76398 503 [0x1f7][0x3],[BL][ext 0x0,0x0] 503,3,BL 2 3 0000000078FB5A20 505 [0x1f9][0x1],[BL][ext 0x0,0x0] 505,1,BL 2 3 00000000ACFC6EF0 506 [0x1fa][0x1],[BL][ext 0x0,0x0] 506,1,BL 2 3 0000000081FC5568 507 [0x1fb][0x1],[BL][ext 0x0,0x0] 507,1,BL 2 3 00000000A6F82D28 507 [0x1fb][0x3],[BL][ext 0x0,0x0] 507,3,BL 2 3 00000000B2F5EB58 508 [0x1fc][0x1],[BL][ext 0x0,0x0] 508,1,BL 2 3 0000000082FA2FD0 509 [0x1fd][0x1],[BL][ext 0x0,0x0] 509,1,BL 2 3 00000000AFFD5768 510 [0x1fe][0x1],[BL][ext 0x0,0x0] 510,1,BL 2 3 00000000A6F75220 511 [0x1ff][0x1],[BL][ext 0x0,0x0] 511,1,BL 2 3 0000000084FE27B0

7. 手动Remaster减少节点间通信

    为了减少跨实例的通信。进一步减少“gc [current/cr] [multiblock] request”、“gc [current/cr] [2/3]-way“、”gc [current/cr] block busy“、”gc [current/cr] grant 2-way“、”gc [current/cr] [block/grant] congested“和“gc [current/cr] [failure/retry]“等待事件。我们来执行手动Remaster,以使数据块的XI就在它的master实例上:


SQL> oradebug setmypid; Statement processed. SQL> oradebug lkdebug -a hashcount; Statement processed. SQL> oradebug lkdebug -k; Statement processed. SQL> oradebug lkdebug -m pkey 93261 4; Statement processed. SQL> oradebug lkdebug -k; Statement processed. SQL> oradebug tracefile_name; /u01/app/oracle/diag/rdbms/c01orcl/c01orcl_2/trace/c01orcl_2_ora_109982.trc

    在trace文件上看见对象93261发生了手动Remaster操作:

* >> PT table contents ---: pt table bucket = 3149 pkey 0.4.93261 undo 0 stat 0 masters[32768, 2->2] reminc 4 RM# 3 flg x0 type x0 afftime xe4d0d7d5, acquire time 603977 #replays: 0:0 benefit 0, total 0, remote 0, remote cost 0 cr benefit 0, cr total 0, cr remote 0, cr remote cost 0

    
select * from v$gcspfmaster_info where data_object_id = 93261; ----------------------------------------------------------------------------------------------------
FILE_ID     DATA_OBJECT_ID   GC_MASTERING_POLICY    CURRENT_MASTER     PREVIOUS_MASTER   REMASTER_CNT  CON_ID
0                   93261                                 Affinity 2                                32767                           1 0

    每做一次手动Remaster,上面的REMASTER_CNT会加1。PREVIOUS_MASTER为32767表示之前没有发生过Remaster,下面展示连接不同的实例后连续再执行两次Remaster后的结果:

select * from v$gcspfmaster_info where data_object_id = 93261; ----------------------------------------------------------------------------------------------------
FILE_ID     DATA_OBJECT_ID   GC_MASTERING_POLICY    CURRENT_MASTER     PREVIOUS_MASTER   REMASTER_CNT  CON_ID
0                   93261                                 Affinity 2                               1                                  3  0


路过

雷人

握手

鲜花

鸡蛋

QQ|手机版|Bo's Oracle Station   

GMT+8, 2023-4-12 21:50 , Processed in 0.034299 second(s), 21 queries .

返回顶部