5分钟，彻底精通Oracle DG切换！

没错，就是标题党，5分钟就想精通Oracle DG切换了吗？恐怕原理知识都掌握不了，更别说实操部分了，不要相信这些鬼话，太容易得到的知识你也不会去珍惜。

先看几个思考题？

1.主库切换为备库，成功执行如下命令后，原主库实例是什么状态？

alter database commit to switchover to physical standby with session shutdown;

A.shutdown B.nomount C.mount D.read only

2.备库切换为主库，成功执行如下命令后，原备库实例是什么状态？

alter database commit to switchover to primary with session shutdown;

A.shutdown B.nomount C.mount D.oread write

3.配置了dg broker以后，没有开启FSFO，是否还可以正常通过之前的SQL语句进行dg切换呢？

A.可以 B.不可以

4.配置了dg broker并且启用了FSFO，是否还可以正常通过之前的SQL语句进行dg切换呢？

A.可以 B.不可以

5.配置了dg broker并且启用了FSFO，主库执行shutdown immediate后，会自动进行主备切换吗？

A.会 B.不会

本篇文章主要讲解如下内容：

一：通过sqlplus进行switchover 二：通过dg broker进行switchover 三：通过dg broker进行自动failover 四：通过keepalived进行vip自动切换

一：通过sqlplus工具进行switchover

主机信息

172.16.6.137 cjc-db-01  172.16.6.138 cjc-db-02 OS:Redhat 7.9 DB:Oracle 11.2.0.4.0 单机

切换前检查

1.检查数据库基本信息

set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a10 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 col name for a10 select name,current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

主库：

NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS ---------- --------------- -------------------- ---------------- --- ---------- ---------- -------------------- CJC    978784 MAXIMUM PERFORMANCEPRIMARY  YES NO READ WRITE TO STANDBY

从库：

NAME       CURRENT_SCN PROTECTION_MODEDATABASE_ROLE FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS ---------- --------------- -------------------- ---------------- --- ---------- ---------- -------------------- CJC    971317 MAXIMUM PERFORMANCEPHYSICAL STANDBY YES NO READ ONLY  NOT ALLOWED

从库的OPEN_MODE为READ ONLY，切换前需要开启MRP。

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

SWITCHOVER_STATUS说明：

NOT ALLOWED             当前的数据库不是带有备用数据库的主数据库 PREPARING DICTIONARY    该逻辑备用数据库正在向一个主数据库和其他备用数据库发送它的重做数据，以便为切换做准备 PREPARING SWITCHOVER    接受用于切换的重做数据时，逻辑备用配置会使用它 RECOVERY NEEDED         备用数据库还没有接收到切换请求 SESSIONS ACTIVE         在主数据库中存在活动的SQL会话；在继续执行之前必须断开这些会话 SWITCHOVER PENDING      适用于那些已收到主数据库切换请求但是还没有处理该请求的备用数据库 SWITCHOVER LATENT       切换没有完成并返回到主数据库 TO LOGICAL STANDBY      主数据库已经收到了来自逻辑备用数据库的完整的字典 TO PRIMARY              该备用数据库可以转换为主数据库 TO STANDBY              该主数据库可以转换为备用数据库

2.检查DG参数

set linesize 500 pages 100 col value for a70  col name for a30  select name, value  from v$parameter  where name in ('db_name','db_unique_name',  'log_archive_config',  'log_archive_dest_1','log_archive_dest_2',  'log_archive_dest_state_1', 'log_archive_dest_3',  'log_archive_dest_state_3',  'log_archive_dest_state_2',  'remote_login_passwordfile',  'log_archive_format',  'log_archive_max_processes',  'fal_server','db_file_name_convert',  'log_file_name_convert',  'standby_file_management');

主库：

NAME       VALUE ------------------------------ ---------------------------------------------------------------------- db_file_name_convert       cjc2, cjc1 log_file_name_convert       cjc2, cjc1 log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc1 log_archive_dest_2       SERVICE=cjc2 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_                NAME=cjc2 
 
 log_archive_dest_3 log_archive_dest_state_1       ENABLE log_archive_dest_state_2       ENABLE log_archive_dest_state_3       enable fal_server               cjc2 log_archive_config       DG_CONFIG=(cjc1,cjc2) log_archive_format       cjc_%t_%s_%r.arc log_archive_max_processes      4 standby_file_management        AUTO remote_login_passwordfile      EXCLUSIVE db_name            cjc db_unique_name       cjc1 
 16 rows selected.

从库：

NAME       VALUE ------------------------------ ---------------------------------------------------------------------- db_file_name_convert       cjc1, cjc2 log_file_name_convert       cjc1, cjc2 log_archive_dest_1       LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2 log_archive_dest_2       SERVICE=cjc1 ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_        NAME=cjc1 
 
 log_archive_dest_3 log_archive_dest_state_1       ENABLE log_archive_dest_state_2       ENABLE log_archive_dest_state_3       enable fal_server           cjc1 log_archive_config       DG_CONFIG=(cjc2,cjc1) log_archive_format       cjc_%t_%s_%r.arc log_archive_max_processes      4 standby_file_management        AUTO remote_login_passwordfile      EXCLUSIVE db_name            cjc db_unique_name       cjc2 
 16 rows selected.

3.检查DG进程状态

col dest_name for a30 col error for a20 set lin 200 pages 100 col applied_scn for 9999999999999 select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;

主库：

 DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES ---------- -------------------- --------- ------------ -------------- --------------- ----------- -------  1VALID    18    0    10 DISABLE  2VALID    19       9713171   30 DISABLE  3INACTIVE     0    0    10 DISABLE  4INACTIVE     0    0    10 DISABLE

从库：

  DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES ---------- -------------------- --------- ------------ -------------- --------------- ----------- -------  1VALID    18    0    10 DISABLE  2VALID     0    0    1   30 DISABLE  3INACTIVE     0    0    10 DISABLE  4INACTIVE     0    0    10 DISABLE

4.检查进程状态

select INST_ID,process,status,thread#,sequence#,block#,blocks from gv$managed_standby order by INST_ID ;

主库：

   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS ---------- --------- ------------ ---------- ---------- ---------- ----------  1 ARCH      CLOSING   1     19      20480  831  1 ARCH      CLOSING   1     17  1   75  1 LNS     WRITING   1     20        168    1  1 ARCH      CLOSING   1     18  1   22  1 ARCH      CLOSING   1     14  1 2404

从库：

   INST_ID PROCESS   STATUS     THREAD#  SEQUENCE#     BLOCK#     BLOCKS ---------- --------- ------------ ---------- ---------- ---------- ----------  1 ARCH      CONNECTED   0      0  0    0  1 ARCH      CONNECTED   0      0  0    0  1 ARCH      CLOSING   1     18  1   22  1 ARCH      CLOSING   1     19      20480  831  1 MRP0      APPLYING_LOG   1     20        152     102400  1 RFS     IDLE   1     20        152    1  1 RFS     IDLE   0      0  0    0  1 RFS     IDLE   0      0  0    0  1 RFS     IDLE   0      0  0    0 
 9 rows selected.

5.检查从库应用状态

set lines 200 col dest_name for a30 select DEST_ID,DEST_NAME,RECOVERY_MODE from gv$archive_dest_status where RECOVERY_MODE <>'IDLE';

  DEST_ID DEST_NAME  RECOVERY_MODE ---------- ------------------------------ -----------------------  1 LOG_ARCHIVE_DEST_1  MANAGED REAL TIME APPLY

6.gap检查

主库：

SELECT LOG_ARCHIVED-LOG_APPLIED+1 LOGGAP FROM (SELECT MAX(SEQUENCE#) LOG_ARCHIVED FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES' AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V$ARCHIVED_LOG)), (SELECT MAX(SEQUENCE#) LOG_APPLIED FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES' AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V$ARCHIVED_LOG)) ;     LOGGAP ----------  1

7.主库创建测试数据，检查数据是否正常同步

SQL> col name for a50 SQL> select name from v$dbfile; NAME -------------------------------------------------- /oradata/cjc/users01.dbf /oradata/cjc/undotbs01.dbf /oradata/cjc/sysaux01.dbf /oradata/cjc/system01.dbf 
 SQL> select * from cjc.t1; ID ----------  1

8.主切备

sqlplus  as sysdba SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE; DBROLE ---------------- PRIMARY

alter system switch logfile; alter system flush SHARED_POOL; ALTER SYSTEM SET log_archive_trace=8191 sid='*';

执行命令后，原主库自动关闭实例

alter database commit to switchover to physical standby with session shutdown;

对应切换日志如下：

Tue Sep 06 14:15:08 2022 alter database commit to switchover to physical standby with session shutdown ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 21460] (cjc1) krss_find_arc: Selecting ARC2 to receive message as last resort Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Waiting for all FAL entries to be archived... All FAL entries have been archived. Waiting for potential Physical Standby switchover target to become synchronized... Tue Sep 06 14:15:08 2022 OCISessionBegin with PasswordVerifier succeeded Client pid [10698] attached to RFS pid [21661] at remote instance number [1] at dest 'cjc2' Active, synchronized Physical Standby switchover target has been identified Tue Sep 06 14:15:10 2022 Process (ospid 10671) is suspended due to switchover to physical standby operation. Switchover End-Of-Redo Log thread 1 sequence 21 has been fixed Switchover: Primary highest seen SCN set to 0x0.0xf62e8 ARCH: Noswitch archival of thread 1, sequence 21 ARCH: End-Of-Redo Branch archival of thread 1 sequence 21 ARCH: Evaluating archive   log 3 thread 1 sequence 21 ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2 ARCH: Beginning to archive thread 1 sequence 21 (988339-Infinity) (cjc1) ARCH: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (thread 1 sequence 21) (cjc1) ARCH: Transmitting activation ID 0xdfebe7b0 OCISessionBegin with PasswordVerifier succeeded Client pid [21460] attached to RFS pid [21665] at remote instance number [1] at dest 'cjc2' ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2 ARCH: Standby redo logfile selected for thread 1 sequence 21 for destination LOG_ARCHIVE_DEST_2 ARCH: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (thread 1 sequence 21) (cjc1) ARCH: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'cjc2' (cjc1) ARCH: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_21_1114613364.arc' (cjc1) Committing creation of archivelog '/arch/cjc_1_21_1114613364.arc' Archived Log entry 25 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 1: Archived Log entry 26 added for thread 1 sequence 21 ID 0xdfebe7b0 dest 2: ARCH: Completed archiving thread 1 sequence 21 (988339-1008360) (cjc1) ARCH: Archiving is disabled due to current logfile archival Primary will check for some target standby to have received alls redo Final check for a synchronized target standby. Check will be made once. ARCH: Transmitting activation ID 0xdfebe7b0 LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target Active, synchronized target has been identified Target has also received all redo ----------------------------------------------------------- |          Target Standby Status                           | | LOG_ARCHIVE_DEST_1  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_2  : HAS RECEIVED ALL DATA          | | LOG_ARCHIVE_DEST_3  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_4  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_5  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_6  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_7  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_8  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_9  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_10  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_11  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_12  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_13  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_14  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_15  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_16  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_17  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_18  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_19  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_20  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_21  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_22  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_23  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_24  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_25  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_26  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_27  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_28  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_29  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_30  : NOT ACTIVE                     | | LOG_ARCHIVE_DEST_31  : NOT ACTIVE                     | ------------------------------------------------------------ Backup controlfile written to trace file oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_ora_21460.trc Clearing standby activation ID 3756779440 (0xdfebe7b0) The primary database controlfile was created using the 'MAXLOGFILES 16' clause. There is space for up to 13 standby redo logfiles Use the following SQL commands on the standby database to create standby redo logfiles that match the primary database: ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800; Archivelog for thread 1 sequence 21 required for standby recovery Switchover: Primary controlfile converted to standby controlfile succesfully. Switchover: Complete - Database shutdown required USER (ospid: 21460): terminating the instance Instance terminated by USER, pid = 21460 Completed: alter database commit to switchover to physical standby with session shutdown Shutting down instance (abort) License high water mark = 6 Tue Sep 06 14:15:11 2022 Instance shutdown complete

启动实例

startup mount

9.备切主

su - oracle sqlplus  as sysdba show parameter instance_name SELECT trim(DATABASE_ROLE) DBROLE FROM v$DATABASE; DBROLE ---------------- PHYSICAL STANDBY

ALTER SYSTEM SET log_archive_trace=8191 sid='*';

执行命令后，原备库自动启动到mount

alter database commit to switchover to primary with session shutdown;

切换日志如下：

Tue Sep 06 14:18:05 2022 alter database commit to switchover to primary with session shutdown ALTER DATABASE SWITCHOVER TO PRIMARY (cjc2) Maximum wait for role transition is 15 minutes. Switchover: Media recovery is still active Role Change: Canceling MRP - no more redo to apply Tue Sep 06 14:18:06 2022 MRP0: Background Media Recovery cancelled with status 16037 Errors in file oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc: ORA-16037: user requested cancel of managed recovery operation Managed Standby Recovery not using Real Time Apply Recovery interrupted! Errors in file oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_pr00_20372.trc: ORA-16037: user requested cancel of managed recovery operation Tue Sep 06 14:18:07 2022 Errors in file oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_mrp0_20358.trc: ORA-10877: error signaled in parallel recovery slave  MRP0: Background Media Recovery process shutdown (cjc2) Role Change: Canceled MRP All dispatchers and shared servers shutdown CLOSE: killing server sessions. CLOSE: all sessions shutdown successfully. Tue Sep 06 14:18:08 2022 SMON: disabling cache recovery Backup controlfile written to trace file oracle/app/oracle/diag/rdbms/cjc2/cjc2/trace/cjc2_ora_20276.trc SwitchOver after complete recovery through change 1008360 Online log oradata/cjc/redo01.log: Thread 1 Group 1 was previously cleared Online log oradata/cjc/redo02.log: Thread 1 Group 2 was previously cleared Online log oradata/cjc/redo03.log: Thread 1 Group 3 was previously cleared Standby became primary SCN: 1008358 Switchover: Complete - Database mounted as primary Completed: alter database commit to switchover to primary with session shutdown Tue Sep 06 14:18:29 2022 ARC2: Becoming the 'no SRL' ARCH Tue Sep 06 14:18:48 2022 idle dispatcher 'D000' terminated, pid = (17, 1)

重启实例

shutdown immediate; startup;

9.切换后检查

set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- ---------- -------------------- 1008736 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE RESOLVABLE GAP

ALTER SYSTEM SET log_archive_trace=0 sid='*';

10.新备库启动mrp

sqlplus  as sysdba alter database open; ALTER SYSTEM SET log_archive_trace=0 sid='*'; ###recover managed standby database using current logfile disconnect from session; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

从库

  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- ---------- -------------------- 1009025 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLYNOT ALLOWED      WITH APPLY

主库：

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODESWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- ---------- -------------------- 1008800 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE TO STANDBY

主库：

SQL> insert into cjc.t1 values(2); SQL> insert into cjc.t1 values(3); SQL> commit;

从库：

SQL> select * from cjc.t1; ID ----------  1  2  3

回切：

主切备

alter database commit to switchover to physical standby with session shutdown;  startup mount

备切主

alter database commit to switchover to primary with session shutdown; shutdown immediate; startup;

备库启动MRP

alter database open; ###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

二：通过dg broker进行switchover

手动进行DG switchover，步骤有些麻烦，是否有更简单的方式呢，可以试试dg broker。

Oracle DataGuard Broker分为Client Side和Server Side。

Client Side可以通过EM和DGMGRL两种工具对服务端进行管理和维护。

Server side会有一个配置文件和一个后台进程叫Data Guard Broker monitor process(DMON)。

DMON：它是一个用来管理Broker的后台进程，这个进程负责本地数据库与standby数据库的DMON进程进行通讯，当主库上接收到一个请求的时候，它会协调其他数据库上的DMON进程处理相应的请求，比如switchover。

同时会更新本地系统中的配置文件，并与standby数据库上的DMON进程进行通信，更新Standby上的配置文件。

1.启用dg broker

主库、从库：

配置DG_BROKER_START参数

show parameter dg_broker_start; NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ dg_broker_start      boolean FALSE

启用dg_broker_start，启用后oracle会自动启动一个dmon进程

alter system set dg_broker_start = true;

对应日志：

Tue Sep 06 15:04:40 2022 ALTER SYSTEM SET dg_broker_start=TRUE SCOPE=BOTH; Tue Sep 06 15:04:40 2022 DMON started with pid=31, OS id=3206  Starting Data Guard Broker (DMON) Tue Sep 06 15:04:48 2022 INSV started with pid=32, OS id=3215

2.调整监听文件

在监听文件中加入DGMGRL静态监听

主库

SID_LIST_LISTENER =    (SID_LIST =      (SID_DESC =        (GLOBAL_DBNAME = cjc1)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )      (SID_DESC =        (GLOBAL_DBNAME = cjc1_DGMGRL)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )    )    LISTENER =   (DESCRIPTION_LIST =     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))     )   )

从库

SID_LIST_LISTENER =    (SID_LIST =      (SID_DESC =        (GLOBAL_DBNAME = cjc2)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc2)      )      (SID_DESC =        (GLOBAL_DBNAME = cjc2_DGMGRL)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc2)      )    )    LISTENER =   (DESCRIPTION_LIST =     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.138)(PORT = 1521))     )   )

监听状态

lsnrctl reload lsnrctl status

主库：

Services Summary... Service "cjc1" has 2 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1_DGMGRL" has 1 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service... Service "cjcXDB" has 1 instance(s).   Instance "cjc1", status READY, has 1 handler(s) for this service... The command completed successfully

从库：

Services Summary... Service "cjc2" has 2 instance(s).   Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service...   Instance "cjc2", status READY, has 1 handler(s) for this service... Service "cjc2_DGMGRL" has 1 instance(s).   Instance "cjc2", status UNKNOWN, has 1 handler(s) for this service... Service "cjcXDB" has 1 instance(s).   Instance "cjc2", status READY, has 1 handler(s) for this service... The command completed successfully

3.配置broker

[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production Copyright (c) 2000, 2009, Oracle. All rights reserved. Welcome to DGMGRL, type "help" for information. Connected. DGMGRL>

显示配置

DGMGRL> show configuration ORA-16532: Data Guard broker configuration does not exist 
 Configuration details cannot be determined by DGMGRL

添加配置

DGMGRL> create configuration 'cjcdgbroker' as primary database is 'cjc1' connect identifier is cjc1; Configuration "cjcdgbroker" created with primary database "cjc1"

显示配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database 
 Fast-Start Failover: DISABLED 
 Configuration Status: DISABLED

新增备库配置

DGMGRL> add database 'cjc2' as connect identifier is 'cjc2' maintained as physical; Database "cjc2" added

显示配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: DISABLED

启用配置

DGMGRL> enable configuration Enabled.

显示配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCESS

4.测试switchover

DGMGRL> switchover to cjc2 Performing switchover NOW, please wait... Operation requires a connection to instance "cjc2" on database "cjc2" Connecting to instance "cjc2"... Connected. New primary database "cjc2" is opening... Operation requires startup of instance "cjc1" on database "cjc1" Starting instance "cjc1"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc2"

检查配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database     cjc1 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCESS

SQL检查

set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;

新备：

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1052426 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES NO     READ ONLY WITH APPLY NOT ALLOWED

新主

 CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1052498 MAXIMUM PERFORMANCE  PRIMARY      YES NO     READ WRITE   SESSIONS ACTIVE

5.回切

DGMGRL> switchover to cjc1; Performing switchover NOW, please wait... Operation requires a connection to instance "cjc1" on database "cjc1" Connecting to instance "cjc1"... Connected. New primary database "cjc1" is opening... Operation requires startup of instance "cjc2" on database "cjc2" Starting instance "cjc2"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc1"

检查配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCESS

思考：配置了dg broker以后，是否还可以通过之前的SQL语句进行dg切换呢？

此时，通过SQL命令仍然可以进行切换

主切备

alter database commit to switchover to physical standby with session shutdown;  startup mount

备切主

alter database commit to switchover to primary with session shutdown; shutdown immediate; startup;

备库启动MRP

alter database open; ###recover managed standby database using current logfile disconnect from session; ###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

可以使用，既然已经启动了dg broker，建议通过broker进行切换。

三：通过dg broker进行自动failover

启动自动故障转移FSFO

Fast-Start Failove

FSFO允许代理在主库故障的情况下自动故障转移到先前选择的备库，无需手动执行任何步骤，以便快速可靠地恢复业务。

FSFO只能在代理配置中使用，并且只能通过DGMGRL或OEM进行配置。

FSFO支持在最高可用性与最高性能模式下使用。

最大可用性模式保证不会丢失任何数据，最高性能模式可保证丢失的数据量不超过FastStartFailoverLagLimit属性指定的数据量（单位为秒）。

思考：何时会进行FSFO？

配置完成后，FSFO将在以下情况下起作用：

1.主库与 Observer和目标备库 失联时间均超过FastStartFailoverThreshold属性配置阈值（单位为秒） 2.单实例数据库中主实例崩溃 3.RAC中所有主实例崩溃 4.shutdown abort主库 5.应用程序通过调用DBMS_DG.INITIATE_FS_FAILOVER函数启动FSFO 6.Oracle 错误：可以指定启动 FSFO 故障切换的 ORA 错误列表（默认为空）

Broker 可配置为遇到以下任一条件时启动FSFO

1.Datafile Offline    数据文件由于写错误脱机Yes 2.Corrupted Dictionary关键数据库对象的字典损坏，该状态可在数据库open时被检测到Yes 3.Corrupted Controlfile控制文件由于写入错误永久损坏Yes 4.Inaccessible Logfile由于IO错误，LGWR无法写入日志组的任何成员No 5.Stuck Archiver        由于设备已满或不可用，归档进程无法归档redo log

设置保护模式与LogXptMode

LOGXPTMODE属性在最大可用性模式下应为SYNC，在最大性能模式下应为ASYNC，主备库LOGXPTMODE设置必须相同

###DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability; DGMGRL> edit database cjc1 set property 'LogXptMode'='ASYNC';  Property "LogXptMode" updated DGMGRL> edit database cjc2 set property 'LogXptMode'='ASYNC'; Property "LogXptMode" updated

设置FSFO阈值

如果主库与 Observer和目标备库失联时间均超过FastStartFailoverThreshold属性配置阈值（单位为秒），将启动FSFO。

换句话说，FastStartFailoverThreshold表示，Observer和目标备库在检测到主库不可用后、希望等多少秒才启动FSFO。再此期间，它们会尝试重连主库。

默认值为30秒，最小6秒。

DGMGRL> edit configuration set property faststartfailoverthreshold=20; Property "faststartfailoverthreshold" updated

设置最大可接受延迟时间

FastStartFailoverLagLimit表示，自动故障转移允许的最大数据丢失量（单位为秒），仅在最高性能保护模式时才可使用。

FastStartFailoverLagLimit作为DG可接受的最大延迟时间，备库的apply lag只有在此限制之内，才允许FSFO。

如果无法维持已配置的数据丢失保证，主库上的redo生成将停止。

为了避免长时间停顿，Observer或者目标备库可能会在第一次记录到无法发生FSFO之后，允许主库继续生成redo。

此时若主库故障，将无法发生FSFO。

默认值为30秒，最小值为10秒。

DGMGRL> edit configuration set property faststartfailoverlaglimit=60;

设置自动恢复数据库属性

如果FastStartFailoverAutoReinstate属性设置为TRUE，在原主库故障修复后，会自动尝试将其恢复为新主库的备库。

DGMGRL> EDIT CONFIGURATION SET PROPERTY FASTSTARTFAILOVERAUTOREINSTATE=TRUE; Property "faststartfailoverautoreinstate" updated

启动Observer

创建脚本

[oracle@cjc-db-02 dg]$ pwd /home/oracle/dg [oracle@cjc-db-02 dg]$ mkdir observer [oracle@cjc-db-02 dg]$ cd observer/ vi observer.sh nohup dgmgrl sys/oracle@cjc2 "start observer file=FSFO.dat">>fsfo.log 2>&1 &

执行脚本

[oracle@cjc-db-02 observer]$ sh observer.sh

启动FSFO，报错ORA-16651

DGMGRL> enable fast_start failover; Error: ORA-16651: requirements not met for enabling fast-start failover

检查报错信息，发现主从库必须开启闪回

[oracle@cjc-db-02 observer]$ oerr ora 16651 16651, 0000, "requirements not met for enabling fast-start failover" // *Cause:  The attempt to enable fast-start failover could not be completed //          because one or more requirements were not met: //          - The Data Guard configuration must be in either MaxAvailability //            or MaxPerformance protection mode. //          - The LogXptMode property for both the primary database and //            the fast-start failover target standby database must be //            set to SYNC if the configuration protection mode is set to //            MaxAvailability mode. //          - The LogXptMode property for both the primary database and //            the fast-start failover target standby database must be //            set to ASYNC if the configuration protection mode is set to //            MaxPerformance mode. //          - The primary database and the fast-start failover target standby //            database must both have flashback enabled. //          - No valid target standby database was specified in the primary //            database FastStartFailoverTarget property prior to the attempt //            to enable fast-start failover, and more than one standby //            database exists in the Data Guard configuration. // *Action: Retry the command after correcting the issue: //          - Set the Data Guard configuration to either MaxAvailability //            or MaxPerformance protection mode. //          - Ensure that the LogXptMode property for both the primary //            database and the fast-start failover target standby database //            are set to SYNC if the configuration protection mode is set to //            MaxAvailability. //          - Ensure that the LogXptMode property for both the primary //            database and the fast-start failover target standby database //            are set to ASYNC if the configuration protection mode is set to //            MaxPerformance. //          - Ensure that both the primary database and the fast-start failover //            target standby database have flashback enabled. //          - Set the primary database FastStartFailoverTarget property to //            the DB_UNIQUE_NAME value of the desired target standby database //            and the desired target standby database FastStartFailoverTarget //            property to the DB_UNIQUE_NAME value of the primary database.

启动闪回

alter database flashback on;

再次开启FSFO

DGMGRL> enable fast_start failover; Enabled.

对应日志：

Tue Sep 06 16:31:43 2022 Fast-Start Failover (FSFO) has been enabled between:   Primary = "cjc1"   Standby = "cjc2" Tue Sep 06 16:31:43 2022 FSFP started with pid=38, OS id=8869  OCISessionBegin with PasswordVerifier succeeded

查看配置,Fast-Start Failover状态变成ENABLED了。

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS

测试自动切换

主库

SQL> shutdown abort     ORACLE instance shut down.

查看fsfo日志

[oracle@cjc-db-02 observer]$ tail -10f fsfo.log 16:38:41.35  Tuesday, September 06, 2022 Initiating Fast-Start Failover to database "cjc2"... Performing failover NOW, please wait... Failover succeeded, new primary is "cjc2" 16:38:48.09  Tuesday, September 06, 2022

完成了自动切换

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database       Warning: ORA-16829: fast-start failover configuration is lagging 
     cjc1 - (*) Physical standby database (disabled)       ORA-16661: the standby database needs to be reinstated 
 Fast-Start Failover: ENABLED 
 Configuration Status: WARNING

已完成自动切换

  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1159699 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

启动原主库

SQL> startup ORACLE instance started. 
 Total System Global Area 1135747072 bytes Fixed Size    2252544 bytes Variable Size  754974976 bytes Database Buffers  369098752 bytes Redo Buffers    9420800 bytes Database mounted. ORA-16649: possible failover to another database prevents this database from being opened

检查错误

[oracle@cjc-db-01 ~]$ oerr ora 16449 16449, 00000, "incomplete redo thread enable operation" // *Cause:  The switchover operation could not continue because it failed to //          disable a thread that was left in an incomplete thread enable //          state. // *Action: Check alert log for more details.

启动数据库时报错ORA-16649，不需要处理，后台会自动执行闪回数据库，自动open实例。

SQL> set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database; 
     CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1160947 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

查看配置

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS

原主库自动变为备库

告警日志

FLASHBACK DATABASE TO SCN 1160354 Flashback Restore Start Flashback Restore Complete Flashback Media Recovery Start  started logmerger process Parallel Media Recovery started with 2 slaves Flashback Media Recovery Log arch/cjc_1_2_1114706326.arc Flashback Media Recovery Log arch/cjc_1_3_1114706326.arc Tue Sep 06 16:53:25 2022 Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0   Mem# 0: oradata/cjc/redo01.log Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0   Mem# 0: oradata/cjc/redo02.log Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0   Mem# 0: oradata/cjc/redo03.log Incomplete Recovery applied until change 1160355 time 09/06/2022 16:50:59 Flashback Media Recovery Complete Completed: FLASHBACK DATABASE TO SCN 1160354 alter database convert to physical standby ALTER DATABASE CONVERT TO PHYSICAL STANDBY (cjc2) Flush standby redo logfile failed:1649 Clearing standby activation ID 3756857555 (0xdfed18d3) The primary database controlfile was created using the 'MAXLOGFILES 16' clause. There is space for up to 13 standby redo logfiles Use the following SQL commands on the standby database to create standby redo logfiles that match the primary database: ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800; ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800; Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Clearing online redo logfile 1 oradata/cjc/redo01.log Clearing online log 1 of thread 1 sequence number 4 Clearing online redo logfile 1 complete Clearing online redo logfile 2 oradata/cjc/redo02.log Clearing online log 2 of thread 1 sequence number 5 Tue Sep 06 16:53:28 2022 RFS[2]: Assigned to RFS process 10506 RFS[2]: Opened log for thread 1 sequence 6 dbid -538205520 branch 1114706326 Clearing online redo logfile 2 complete Archived Log entry 87 added for thread 1 sequence 6 rlc 1114706326 ID 0xdfed18d3 dest 2: Clearing online redo logfile 3 oradata/cjc/redo03.log Clearing online log 3 of thread 1 sequence number 6 Tue Sep 06 16:53:31 2022 RFS[3]: Assigned to RFS process 10503 RFS[3]: Opened log for thread 1 sequence 1 dbid -538205520 branch 1114707085 RFS[2]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085 Tue Sep 06 16:53:31 2022 Clearing online redo logfile 3 complete A new recovery destination branch has been registered RFS[3]: New Archival REDO Branch(resetlogs_id): 1114707085  Prior: 1114706326 RFS[3]: Archival Activation ID: 0xdfeddf20 Current: 0x0 RFS[3]: Effect of primary database OPEN RESETLOGS RFS[3]: Incarnation entry added for Branch(resetlogs_id): 1114707085 (cjc2) Tue Sep 06 16:53:31 2022 Setting recovery target incarnation to 4 Completed: alter database convert to physical standby Archived Log entry 88 added for thread 1 sequence 1 rlc 1114707085 ID 0xdfeddf20 dest 2: Killing 2 processes with pids 10503,10506 (all RFS) in order to disallow current and future RFS connections. Requested by OS process 10491 Archived Log entry 89 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2: Tue Sep 06 16:53:33 2022 RFS[4]: Assigned to RFS process 10515 RFS[4]: Opened log for thread 1 sequence 3 dbid -538205520 branch 1114707085 Tue Sep 06 16:53:33 2022 Shutting down instance (immediate) Shutting down instance: further logons disabled Stopping background process MMNL Archived Log entry 89 added for thread 1 sequence 3 rlc 1114707085 ID 0xdfeddf20 dest 2: Stopping background process MMON License high water mark = 5 All dispatchers and shared servers shutdown alter database CLOSE NORMAL ORA-1109 signalled during: alter database CLOSE NORMAL... alter database DISMOUNT Shutting down archive processes Archiving is disabled Completed: alter database DISMOUNT ARCH: Archival disabled due to shutdown: 1089 Shutting down archive processes Archiving is disabled Shutting down Data Guard Broker processes Tue Sep 06 16:53:39 2022 Completed: Data Guard Broker shutdown ARCH: Archival disabled due to shutdown: 1089 Shutting down archive processes Tue Sep 06 16:53:41 2022 Stopping background process VKTM Archiving is disabled Tue Sep 06 16:53:43 2022 Instance shutdown complete Tue Sep 06 16:53:44 2022 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Initial number of CPU is 2 Number of processor cores in the system is 2 Number of processor sockets in the system is 1 CELL communication is configured to use 0 interface(s): CELL IP affinity details:     NUMA status: non-NUMA system     cellaffinity.ora status: N/A CELL communication will use 1 IP group(s):     Grp 0:  Picked latch-free SCN scheme 3 Autotune of undo retention is turned on.  IMODE=BR ILAT =27 LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options. ORACLE_HOME = oracle/app/oracle/product/11.2/db System name:Linux Node name:cjc-db-02 Release:5.4.17-2102.201.3.el7uek.x86_64 Version:#2 SMP Fri Apr 23 09:05:55 PDT 2021 Machine:x86_64 Using parameter settings in server-side spfile oracle/app/oracle/product/11.2/db/dbs/spfilecjc2.ora System parameters with non-default values:   processes                = 150   event                    = "28401 trace name context forever,level 1"   event                    = "10949 trace name context forever,level 1"   memory_target            = 1088M   control_files            = "/oradata/cjc/control01.ctl"   control_files            = "/oracle/app/oracle/fast_recovery_area/cjc/control02.ctl"   db_file_name_convert     = "cjc1"   db_file_name_convert     = "cjc2"   log_file_name_convert    = "cjc1"   log_file_name_convert    = "cjc2"   control_file_record_keep_time= 31   db_block_size            = 8192   compatible               = "11.2.0.4.0"   log_archive_dest_1       = "LOCATION=/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=cjc2"   log_archive_dest_2       = "service="cjc1""   log_archive_dest_2       = "LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="cjc1" net_timeout=30"   log_archive_dest_2       = "valid_for=(all_logfiles,primary_role)"   log_archive_dest_state_1 = "ENABLE"   log_archive_dest_state_2 = "ENABLE"   log_archive_min_succeed_dest= 1   fal_server               = "cjc1"   log_archive_trace        = 0   log_archive_config       = "DG_CONFIG=(cjc2,cjc1)"   log_archive_format       = "cjc_%t_%s_%r.arc"   log_archive_max_processes= 4   archive_lag_target       = 0   _use_adaptive_log_file_sync= "FALSE"   db_files                 = 2048   db_recovery_file_dest    = "/home/oracle/dg/recover"   db_recovery_file_dest_size= 10G   standby_file_management  = "AUTO"   _cleanup_rollback_entries= 10000   undo_tablespace          = "UNDOTBS1"   _partition_large_extents = "FALSE"   remote_login_passwordfile= "EXCLUSIVE"   audit_sys_operations     = FALSE   db_domain                = ""   dispatchers              = "(PROTOCOL=TCP) (SERVICE=cjcXDB)"   parallel_execution_message_size= 32768   _PX_use_large_pool       = TRUE   result_cache_max_size    = 0   audit_file_dest          = "/oracle/app/oracle/admin/cjc/adump"   audit_trail              = "NONE"   cell_offload_processing  = FALSE   db_name                  = "cjc"   db_unique_name           = "cjc2"   open_cursors             = 300   _optimizer_null_aware_antijoin= FALSE   _b_tree_bitmap_plans     = FALSE   _optimizer_extended_cursor_sharing= "NONE"   _optimizer_extended_cursor_sharing_rel= "NONE"   _optimizer_adaptive_cursor_sharing= FALSE   deferred_segment_creation= FALSE   _optimizer_use_feedback  = FALSE   dg_broker_start          = TRUE   diagnostic_dest          = "/oracle/app/oracle"   max_dump_file_size       = "1024M" Tue Sep 06 16:53:45 2022 PMON started with pid=2, OS id=10533  Tue Sep 06 16:53:45 2022 PSP0 started with pid=3, OS id=10535  Tue Sep 06 16:53:46 2022 VKTM started with pid=4, OS id=10538 at elevated priority VKTM running at (1)millisec precision with DBRM quantum (100)ms Tue Sep 06 16:53:46 2022 GEN0 started with pid=5, OS id=10543  Tue Sep 06 16:53:46 2022 DIAG started with pid=6, OS id=10545  Tue Sep 06 16:53:46 2022 DBRM started with pid=7, OS id=10547  Tue Sep 06 16:53:46 2022 DIA0 started with pid=8, OS id=10549  Tue Sep 06 16:53:46 2022 MMAN started with pid=9, OS id=10551  Tue Sep 06 16:53:46 2022 DBW0 started with pid=10, OS id=10553  Tue Sep 06 16:53:46 2022 LGWR started with pid=11, OS id=10555  Tue Sep 06 16:53:46 2022 CKPT started with pid=12, OS id=10557  Tue Sep 06 16:53:46 2022 SMON started with pid=13, OS id=10559  Tue Sep 06 16:53:46 2022 RECO started with pid=14, OS id=10561  Tue Sep 06 16:53:46 2022 MMON started with pid=15, OS id=10563  Tue Sep 06 16:53:46 2022 MMNL started with pid=16, OS id=10565  starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'... starting up 1 shared server(s) ... Tue Sep 06 16:53:46 2022 DMON started with pid=19, OS id=10571  ORACLE_BASE from environment = oracle/app/oracle Tue Sep 06 16:53:46 2022 alter database  mount ARCH: STARTING ARCH PROCESSES Tue Sep 06 16:53:51 2022 ARC0 started with pid=21, OS id=10582  ARC0: Archival started ARCH: STARTING ARCH PROCESSES COMPLETE ARC0: STARTING ARCH PROCESSES Tue Sep 06 16:53:52 2022 ARC1 started with pid=22, OS id=10585  Tue Sep 06 16:53:52 2022 ARC2 started with pid=23, OS id=10587  ARC1: Archival started ARC2: Archival started ARC2: Becoming the 'no FAL' ARCH ARC2: Becoming the 'no SRL' ARCH ARC2: Thread not mounted Tue Sep 06 16:53:52 2022 ARC3 started with pid=24, OS id=10589  ARC1: Becoming the heartbeat ARCH ARC1: Thread not mounted Successful mount of redo thread 1, with mount id 3756855130 Allocated 8388608 bytes in shared pool for flashback generation buffer Starting background process RVWR Tue Sep 06 16:53:52 2022 RVWR started with pid=25, OS id=10591  Physical Standby Database mounted. Lost write protection disabled ARC1: Becoming the active heartbeat ARCH Completed: alter database  mount ARC3: Archival started ARC0: STARTING ARCH PROCESSES COMPLETE Starting Data Guard Broker (DMON) Tue Sep 06 16:53:55 2022 INSV started with pid=26, OS id=10597  Tue Sep 06 16:54:01 2022 NSV0 started with pid=27, OS id=10607  Tue Sep 06 16:54:04 2022 Using STANDBY_ARCHIVE_DEST parameter default value as arch Tue Sep 06 16:54:04 2022 RSM0 started with pid=29, OS id=10615  Tue Sep 06 16:54:04 2022 RFS[1]: Assigned to RFS process 10617 RFS[1]: Opened log for thread 1 sequence 2 dbid -538205520 branch 1114707085 Archived Log entry 90 added for thread 1 sequence 2 rlc 1114707085 ID 0xdfeddf20 dest 2: RFS[1]: Opened log for thread 1 sequence 4 dbid -538205520 branch 1114707085 Archived Log entry 91 added for thread 1 sequence 4 rlc 1114707085 ID 0xdfeddf20 dest 2: RFS[1]: Selected log 4 for thread 1 sequence 5 dbid -538205520 branch 1114707085 Tue Sep 06 16:54:06 2022 Primary database is in MAXIMUM PERFORMANCE mode RFS[2]: Assigned to RFS process 10621 RFS[2]: Selected log 5 for thread 1 sequence 6 dbid -538205520 branch 1114707085 Tue Sep 06 16:54:06 2022 Archived Log entry 92 added for thread 1 sequence 5 ID 0xdfeddf20 dest 1: Data Guard: Failover target was a Real Time Query standby; attempting to open this standby after reinstatement ... ALTER DATABASE OPEN READ ONLY Data Guard Broker initializing... Data Guard Broker initialization complete Tue Sep 06 16:54:07 2022 SMON: enabling cache recovery Dictionary check beginning Dictionary check complete Database Characterset is AL32UTF8 No Resource Manager plan active replication_dependency_tracking turned off (no async multimaster replication found) Physical standby database opened for read only access. Completed: ALTER DATABASE OPEN READ ONLY Tue Sep 06 16:54:08 2022 db_recovery_file_dest_size of 10240 MB is 0.49% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup. ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='cjc2'; ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc2'; ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='cjc1','cjc2' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='cjc1','cjc2' SCOPE=SPFILE; ALTER SYSTEM SET fal_server='cjc1' SCOPE=BOTH; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE Attempt to start background Managed Standby Recovery process (cjc2) Tue Sep 06 16:54:09 2022 MRP0 started with pid=33, OS id=10629  MRP0: Background Managed Standby Recovery process started (cjc2)  started logmerger process Tue Sep 06 16:54:15 2022 Managed Standby Recovery starting Real Time Apply Parallel Media Recovery started with 2 slaves Media Recovery start incarnation depth : 1, target inc# : 4, irscn : 1160356 Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Media Recovery Log arch/cjc_1_6_1114706326.arc Identified End-Of-Redo (failover) for thread 1 sequence 6 at SCN 0x0.11b4a4 Tue Sep 06 16:54:15 2022 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE Resetting standby activation ID 3756857555 (0xdfed18d3) Media Recovery End-Of-Redo indicator encountered Media Recovery Continuing Media Recovery Log arch/cjc_1_1_1114707085.arc Archived Log entry 93 added for thread 1 sequence 6 ID 0xdfeddf20 dest 1: Tue Sep 06 16:54:16 2022 Primary database is in MAXIMUM PERFORMANCE mode Media Recovery Log arch/cjc_1_2_1114707085.arc RFS[3]: Assigned to RFS process 10651 RFS[3]: Selected log 4 for thread 1 sequence 7 dbid -538205520 branch 1114707085 Media Recovery Log arch/cjc_1_3_1114707085.arc Media Recovery Log arch/cjc_1_4_1114707085.arc Media Recovery Log arch/cjc_1_5_1114707085.arc Media Recovery Log arch/cjc_1_6_1114707085.arc Media Recovery Waiting for thread 1 sequence 7 (in transit) Recovery of Online Redo Log: Thread 1 Group 4 Seq 7 Reading mem 0   Mem# 0: oradata/cjc/standby_redo04.log

思考：

主库shutdown immediate会自动切换吗？

主库：

SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down.

没有自动切换

过10min后启动数据库

startup

检查当前配置信息

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS

SQL>  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1164197 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   TO STANDBY

思考：

启动FSFO后还可以通过之前的SQL命令进行主备切换吗？

主切备

alter database commit to switchover to physical standby with session shutdown;  startup mount

备切主，报错ORA-16109

alter database commit to switchover to primary with session shutdown; ORA-16109: failed to apply log data from previous primary

[oracle@cjc-db-02 ~]$ oerr ora 16109 16109, 00000, "failed to apply log data from previous primary" // *Cause:  Log data from previous primary could not be completely applied. // *Action: Check DBA_LOGSTDBY_EVENTS for failures and take corrective //          action.  Then, reissue command

检查fsfo日志

[oracle@cjc-db-02 observer]$ tail -10f fsfo.log 17:41:12.95  Tuesday, September 06, 2022 Initiating Fast-Start Failover to database "cjc2"... Performing failover NOW, please wait... Failover succeeded, new primary is "cjc2" 17:41:22.02  Tuesday, September 06, 2022

备库自动切成主库了

set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database; 
  CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1185108 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   NOT ALLOWED

备库需要手动启动MRP

alter database open; ###recover managed standby database using current logfile disconnect from session; ###ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

检查当前配置，显示需要重建备库

[oracle@cjc-db-01 ~]$ dgmgrl sys/oracle DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production 
 Copyright (c) 2000, 2009, Oracle. All rights reserved. 
 Welcome to DGMGRL, type "help" for information. Connected. DGMGRL> show configuration; ORA-16795: the standby database needs to be re-created 
 Configuration details cannot be determined by DGMGRL

尝试回切

DGMGRL> switchover to cjc1; ORA-16795: the standby database needs to be re-created 
 Configuration details cannot be determined by DGMGRL 
 
 DGMGRL> show configuration; 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database       Warning: ORA-16829: fast-start failover configuration is lagging 
     cjc1 - (*) Physical standby database (disabled)       ORA-16661: the standby database needs to be reinstated 
 Fast-Start Failover: ENABLED 
 Configuration Status: WARNING

备库提示

DGMGRL> show configuration; ORA-16795: the standby database needs to be re-created 
 Configuration details cannot be determined by DGMGRL

检查cjc1 DGMGRL> show database cjc1; 
 Database - cjc1 
   Role:            PHYSICAL STANDBY   Intended State:  APPLY-ON   Transport Lag:   (unknown)   Apply Lag:       (unknown)   Apply Rate:      (unknown)   Real Time Query: OFF   Instance(s):     cjc1 
 Database Status: ORA-16661: the standby database needs to be reinstated 
 检查cjc2 DGMGRL> show database cjc2; 
 Database - cjc2 
   Role:            PRIMARY   Intended State:  TRANSPORT-ON   Instance(s):     cjc2 
   Database Warning(s):     ORA-16829: fast-start failover configuration is lagging 
 Database Status: WARNING

查看参数

备库

DGMGRL> show database verbose cjc1; 
 Database - cjc1 
   Role:            PHYSICAL STANDBY   Intended State:  APPLY-ON   Transport Lag:   (unknown)   Apply Lag:       (unknown)   Apply Rate:      (unknown)   Real Time Query: OFF   Instance(s):     cjc1 
   Properties:     DGConnectIdentifier             = 'cjc1'     ObserverConnectIdentifier       = ''     LogXptMode                      = 'async'     DelayMins                       = '0'     Binding                         = 'optional'     MaxFailure                      = '0'     MaxConnections                  = '1'     ReopenSecs                      = '300'     NetTimeout                      = '30'     RedoCompression                 = 'DISABLE'     LogShipping                     = 'ON'     PreferredApplyInstance          = ''     ApplyInstanceTimeout            = '0'     ApplyParallel                   = 'AUTO'     StandbyFileManagement           = 'AUTO'     ArchiveLagTarget                = '0'     LogArchiveMaxProcesses          = '4'     LogArchiveMinSucceedDest        = '1'     DbFileNameConvert               = 'cjc2, cjc1'     LogFileNameConvert              = 'cjc2, cjc1'     FastStartFailoverTarget         = 'cjc2'     InconsistentProperties          = '(monitor)'     InconsistentLogXptProps         = '(monitor)'     SendQEntries                    = '(monitor)'     LogXptStatus                    = '(monitor)'     RecvQEntries                    = '(monitor)'     ApplyLagThreshold               = '0'     TransportLagThreshold           = '0'     TransportDisconnectedThreshold  = '30'     SidName                         = 'cjc1'     StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc1_DGMGRL)(INSTANCE_NAME=cjc1)(SERVER=DEDICATED)))'     StandbyArchiveLocation          = '/arch'     AlternateLocation               = ''     LogArchiveTrace                 = '8191'     LogArchiveFormat                = 'cjc_%t_%s_%r.arc'     TopWaitEvents                   = '(monitor)' 
 Database Status: ORA-16661: the standby database needs to be reinstated

主库

DGMGRL> show database verbose cjc2; 
 Database - cjc2 
   Role:            PRIMARY   Intended State:  TRANSPORT-ON   Instance(s):     cjc2 
   Database Warning(s):     ORA-16829: fast-start failover configuration is lagging 
   Properties:     DGConnectIdentifier             = 'cjc2'     ObserverConnectIdentifier       = ''     LogXptMode                      = 'async'     DelayMins                       = '0'     Binding                         = 'OPTIONAL'     MaxFailure                      = '0'     MaxConnections                  = '1'     ReopenSecs                      = '300'     NetTimeout                      = '30'     RedoCompression                 = 'DISABLE'     LogShipping                     = 'ON'     PreferredApplyInstance          = ''     ApplyInstanceTimeout            = '0'     ApplyParallel                   = 'AUTO'     StandbyFileManagement           = 'AUTO'     ArchiveLagTarget                = '0'     LogArchiveMaxProcesses          = '4'     LogArchiveMinSucceedDest        = '1'     DbFileNameConvert               = 'cjc1, cjc2'     LogFileNameConvert              = 'cjc1, cjc2'     FastStartFailoverTarget         = 'cjc1'     InconsistentProperties          = '(monitor)'     InconsistentLogXptProps         = '(monitor)'     SendQEntries                    = '(monitor)'     LogXptStatus                    = '(monitor)'     RecvQEntries                    = '(monitor)'     ApplyLagThreshold               = '0'     TransportLagThreshold           = '0'     TransportDisconnectedThreshold  = '30'     SidName                         = 'cjc2'     StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=cjc-db-02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=cjc2_DGMGRL)(INSTANCE_NAME=cjc2)(SERVER=DEDICATED)))'     StandbyArchiveLocation          = '/arch'     AlternateLocation               = ''     LogArchiveTrace                 = '0'     LogArchiveFormat                = 'cjc_%t_%s_%r.arc'     TopWaitEvents                   = '(monitor)' 
 Database Status: WARNING

主从不同步了，归档收不到

 主库  DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES ---------- -------------------- --------- ------------ -------------- --------------- ----------- -------  1VALID    12    0    10 DISABLE  2DEFERRED     0    0    1       30 DISABLE  3INACTIVE     0    0    10 DISABLE  4INACTIVE     0    0    10 DISABLE   从库 SQL> col dest_name for a30 col error for a20 set lin 200 pages 100 col applied_scn for 9999999999999 select dest_id,error,status,log_sequence,applied_scn,MAX_CONNECTIONS,NET_TIMEOUT,COMPRESSION from v$archive_dest where dest_id<5;SQL> SQL> SQL> SQL>  
    DEST_ID ERRORSTATUS  LOG_SEQUENCE  APPLIED_SCN MAX_CONNECTIONS NET_TIMEOUT COMPRES ---------- -------------------- --------- ------------ -------------- --------------- ----------- -------  1VALID     0    0    10 DISABLE  2VALID     0    0    1       30 DISABLE  3INACTIVE     0    0    10 DISABLE  4INACTIVE     0    0    10 DISABLE

主库

SQL> show parameter log_archive_dest_state_2 
 NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_state_2     string RESET

重新激活

SQL> alter system set log_archive_dest_state_2=defer; System altered. 
 SQL> alter system set log_archive_dest_state_2=enable; System altered.

归档可以正常接收，主库新增数据可以正常同步到从库。

但是dgmgr显示还是不对

ORA-16829: fast-start failover configuration is lagging

ORA-16661: the standby database needs to be reinstated

重建备库

DGMGRL> reinstate database cjc1;   Reinstating database "cjc1", please wait... Reinstatement of database "cjc1" succeeded

查看备库对应告警日志

Wed Sep 07 14:04:45 2022 NSV1 started with pid=38, OS id=3216  OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:04:49 2022 RSM0 started with pid=39, OS id=3223  Killing 4 processes with pids 2641,2710,2652,2654 (all RFS) in order to reinstate the database after a failover. Requested by OS process 3223 Data Guard: Stopping apply to check viability of standby ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Wed Sep 07 14:04:54 2022 MRP0: Background Media Recovery cancelled with status 16037 Errors in file oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc: ORA-16037: user requested cancel of managed recovery operation Managed Standby Recovery not using Real Time Apply Recovery interrupted! Recovered data files to a consistent state at change 1186859 Errors in file oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_pr00_2725.trc: ORA-16037: user requested cancel of managed recovery operation Wed Sep 07 14:04:54 2022 Errors in file oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_mrp0_2719.trc: ORA-10877: error signaled in parallel recovery slave  MRP0: Background Media Recovery process shutdown (cjc1) Managed Standby Recovery Canceled (cjc1) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL krsk_ler_purge_scn: Purged kccle 4 (next SCN 65535:-1) ALTER SYSTEM SET log_archive_trace=8191 SCOPE=BOTH SID='cjc1'; ALTER SYSTEM SET log_archive_format='cjc_%t_%s_%r.arc' SCOPE=SPFILE SID='cjc1'; ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*'; ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*'; ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*'; ALTER SYSTEM SET db_file_name_convert='cjc2','cjc1' SCOPE=SPFILE; ALTER SYSTEM SET log_file_name_convert='cjc2','cjc1' SCOPE=SPFILE; ALTER SYSTEM SET fal_server='cjc2' SCOPE=BOTH; ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE Attempt to start background Managed Standby Recovery process (cjc1) Wed Sep 07 14:04:56 2022 MRP0 started with pid=22, OS id=3233  MRP0: Background Managed Standby Recovery process started (cjc1)  started logmerger process Wed Sep 07 14:05:02 2022 Managed Standby Recovery starting Real Time Apply Parallel Media Recovery started with 2 slaves krss_find_arc: Selecting ARC2 to receive message as last resort Waiting for all non-current ORLs to be archived... All non-current ORLs have been archived. Media Recovery Waiting for thread 1 sequence 17 OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:05:02 2022 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE Wed Sep 07 14:05:03 2022 Redo Shipping Client Connected as PUBLIC -- Connected User is Valid RFS[6]: Assigned to RFS process 3248 RFS[6]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 2512 Primary database is in MAXIMUM PERFORMANCE mode RFS[6]: Begin archive primary thread 1 sequence 18 (cjc1) Primary database is in MAXIMUM PERFORMANCE mode RFS[6]: Successfully opened standby log 4: '/oradata/cjc/standby_redo04.log' RFS[6]: Selected log 4 for thread 1 sequence 18 dbid -538205520 branch 1114707085 Wed Sep 07 14:05:12 2022 Fetching gap sequence in thread 1, gap sequence 17-17 FAL[client]: Trying FAL server: cjc2 OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:05:23 2022 FAL[client]: Trying FAL server: cjc2 OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:05:33 2022 FAL[client]: Trying FAL server: cjc2 OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:05:43 2022 FAL[client]: Trying FAL server: cjc2 OCISessionBegin with PasswordVerifier succeeded Wed Sep 07 14:05:44 2022 Redo Shipping Client Connected as PUBLIC -- Connected User is Valid RFS[7]: Assigned to RFS process 3288 RFS[7]: Identified database type as 'physical standby': Client is ARCH pid 2506 Wed Sep 07 14:05:44 2022 Redo Shipping Client Connected as PUBLIC -- Connected User is Valid RFS[8]: Assigned to RFS process 3290 RFS[8]: Identified database type as 'physical standby': Client is ARCH pid 2503 RFS[8]: Begin archive primary thread 1 sequence 17 (cjc1) RFS[8]: Successfully opened standby log 5: '/oradata/cjc/standby_redo05.log' RFS[8]: Selected log 5 for thread 1 sequence 17 dbid -538205520 branch 1114707085 RFS[8]: Completed archive log 0 thread 1 sequence 17 (cjc1) Wed Sep 07 14:05:44 2022 ARC3: Evaluating archive   log 5 thread 1 sequence 17 ARC3: Beginning to archive thread 1 sequence 17 (1186140-1186874) (cjc1) ARC3: Creating local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (thread 1 sequence 17) (cjc1) ARC3: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/cjc_1_17_1114707085.arc' (cjc1) Committing creation of archivelog '/arch/cjc_1_17_1114707085.arc' Archived Log entry 127 added for thread 1 sequence 17 ID 0xdfed0f5a dest 1: ARC3: Completed archiving thread 1 sequence 17 (0-0) (cjc1) Wed Sep 07 14:05:45 2022 Redo Shipping Client Connected as PUBLIC -- Connected User is Valid RFS[9]: Assigned to RFS process 3294 RFS[9]: Identified database type as 'physical standby': Client is ARCH pid 2510 Media Recovery Log arch/cjc_1_17_1114707085.arc RFS[9]: Begin archive primary thread 1 sequence 17 (cjc1) Errors in file oracle/app/oracle/diag/rdbms/cjc1/cjc1/trace/cjc1_rfs_3294.trc: ORA-16401: archive log rejected by Remote File Server (RFS) Media Recovery Waiting for thread 1 sequence 18 (in transit) Recovery of Online Redo Log: Thread 1 Group 4 Seq 18 Reading mem 0   Mem# 0: oradata/cjc/standby_redo04.log

再次检查，恢复正常

DGMGRL> show configuration; 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database     cjc1 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS 
 回切 DGMGRL> switchover to cjc1; Performing switchover NOW, please wait... New primary database "cjc1" is opening... Operation requires startup of instance "cjc2" on database "cjc2" Starting instance "cjc2"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc1

总结：

在使用了DGMGRL来管理Dataguard时，不要再使用SQLPLUS命令行来管理了，所有的操作做好在DGMGRL下进行修改配置，否则会造成参数冲突，状态异常。

四：通过keepalived进行vip自动切换

既然DG可以自动切换了，如何保障主库出现异常时，应用中断时间最短呢，或者尽量避免人为干预？

这需要VIP或一个域名，应用连接数据库连接的是VIP或域名，当主库异常时，通过 FSFO 进行自动主备切换，VIP地址通过keepalived自动切换，切换完成后，应用仍然可以通过VIP进行访问数据库。

1.安装keepalived

[root@cjc-db-02 ~]# cd soft/ [root@cjc-db-02 soft]# tar -zxvf keepalived-2.0.15.tar.gz  [root@cjc-db-02 soft]# cd keepalived-2.0.15/ [root@cjc-db-02 keepalived-2.0.15]# ./configure  --prefix=/usr/local/keepalived

报错

configure: error:    !!! OpenSSL is not properly installed on your system. !!!   !!! Can not include OpenSSL headers files.            !!!

解决

[root@cjc-db-01 keepalived-2.0.15]# yum install openssl openssl-devel

编译

make && make install echo $?

将命令复制到/usr/sbin里:

cp usr/local/keepalived/sbin/keepalived usr/sbin/ cp usr/local/keepalived/etc/sysconfig/keepalived etc/sysconfig/

配置keepalived,VIP地址为172.16.6.150

节点1 172.16.6.137

vi /etc/keepalived/keepalived.conf ! Configuration File for keepalived 
 
 vrrp_script chk_dg_stats {       script "/etc/keepalived/check_dataguard.sh"      interval 2     weight -5     fall 2       rise 1   }         vrrp_instance VI_1 {     state MASTER         interface enp0s3      mcast_src_ip 172.16.6.137     virtual_router_id 100      priority 100     inopreempt     advert_int 1              authentication {            auth_type PASS          auth_pass 888888        }     virtual_ipaddress {             172.16.6.150     }            track_script {                       chk_dg_stats                  } }

节点2 172.16.6.138

[root@test05 ~]# cat etc/keepalived/keepalived.conf ! Configuration File for keepalived 
 
 vrrp_script chk_dg_stats {       script "/etc/keepalived/check_dataguard.sh"      interval 2     weight -5     fall 2       rise 1   }         vrrp_instance VI_1 {     state BACKUP     interface enp0s3      mcast_src_ip 172.16.6.138     virtual_router_id 100      priority 100      inopreempt     advert_int 1              authentication {            auth_type PASS          auth_pass 888888        }     virtual_ipaddress {             172.16.6.150     }            track_script {                       chk_dg_stats                  } }

检查脚本

vi etc/keepalived/check_dataguard.sh  #!/bin/bash dbstats=`ps -ef | grep ora_smon | grep -v grep | wc -l` dgstats=`ps -ef | grep ora_mrp | grep -v grep | wc -l` 
 if [ "${dbstats}" -eq 0  ]; then systemctl stop keepalived.service #elif [[ "${dbstats}" -gt 0 ]] && [[ "${dgstats}" -gt 0 ]]; then #systemctl stop keepalived.service fi

添加执行权限

chmod a+x etc/keepalived/check_dataguard.sh

启动keepalived

[root@cjc-db-01 keepalived]# systemctl start keepalived.service  [root@cjc-db-01 keepalived]# systemctl status keepalived.service  ?.keepalived.service - LVS and VRRP High Availability Monitor    Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)    Active: active (running) since Wed 2022-09-07 15:28:01 CST; 6s ago   Process: 16728 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)  Main PID: 16730 (keepalived)     Tasks: 2    CGroup: system.slice/keepalived.service            ?..16730 usr/local/keepalived/sbin/keepalived -D            ?..16731 usr/local/keepalived/sbin/keepalived -D 
 Sep 07 15:28:01 cjc-db-01 Keepalived_vrrp[16731]: VRRP_Script(chk_dg_stats) succeeded Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Receive advertisement timeout Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Entering MASTER STATE Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) setting VIPs. Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150 Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: (VI_1) Sending/queueing gratuitous ARPs on enp0s3 for 172.16.6.150 Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150 Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150 Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150 Sep 07 15:28:05 cjc-db-01 Keepalived_vrrp[16731]: Sending gratuitous ARP on enp0s3 for 172.16.6.150

检查vip

ip a 2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3        valid_lft forever preferred_lft forever     inet6 fe80::2d75:f64c:8022:6f66/64 scope link noprefixroute         valid_lft forever preferred_lft forever

测试VIP漂移

[root@cjc-db-01 keepalived]# systemctl stop keepalived.service  [root@cjc-db-02 keepalived]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00     inet 127.0.0.1/8 scope host lo        valid_lft forever preferred_lft forever     inet6 ::1/128 scope host         valid_lft forever preferred_lft forever 2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3        valid_lft forever preferred_lft forever     inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed         valid_lft forever preferred_lft forever     inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute         valid_lft forever preferred_lft forever

禁用自动切换

DGMGRL> disable fast_start failover; Disabled. DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCESS

配置监听

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc1 
 SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 15:33:18 2022 
 Copyright (c) 1982, 2013, Oracle.  All rights reserved. 
 ERROR: ORA-12541: TNS:no listener

[oracle@cjc-db-01 admin]$ cat listener.ora SID_LIST_LISTENER =    (SID_LIST =      (SID_DESC =        (GLOBAL_DBNAME = cjc1)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )      (SID_DESC =        (GLOBAL_DBNAME = cjc1_DGMGRL)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )    )    LISTENER =   (DESCRIPTION_LIST =     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))     )     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))     )   )

配置服务

SQL> show parameter name NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ cell_offloadgroup_name     string db_file_name_convert     string cjc2, cjc1 db_name      string cjc db_unique_name     string cjc1 global_names     boolean FALSE instance_name     string cjc1 lock_name_space      string log_file_name_convert     string cjc2, cjc1 processor_group_name     string service_names     string cjc1 
 SQL> alter system set service_names='cjc1,cjc' scope=both; 
 System altered. 
 SQL> show parameter service_name 
 NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ service_names     string cjc1,cjc 
 从库 alter system set service_names='cjc2,cjc' scope=both;

重启监听

lsnrctl reload lsnrctl stop lsnrctl start

查看监听状态

[oracle@cjc-db-01 admin]$ lsnrctl status 
 LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 15:45:15 
 Copyright (c) 1991, 2013, Oracle.  All rights reserved. 
 Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521))) STATUS of the LISTENER ------------------------ Alias                     LISTENER Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production Start Date                07-SEP-2022 15:44:08 Uptime                    0 days 0 hr. 1 min. 7 sec Trace Level               off Security                  ON: Local OS Authentication SNMP                      OFF Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml Listening Endpoints Summary...   (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))   (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521))) Services Summary... Service "cjc1" has 2 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1_DGB" has 1 instance(s).   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1_DGMGRL" has 1 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service... Service "cjcXDB" has 1 instance(s).   Instance "cjc1", status READY, has 1 handler(s) for this service... The command completed successfully

连接测试

SQL> conn system/oracle@172.16.6.137:1521/cjc1 Connected. SQL> conn system/oracle@172.16.6.150:1521/cjc1 Connected.

监听文件

[oracle@cjc-db-01 admin]$ cat listener.ora SID_LIST_LISTENER =    (SID_LIST =      (SID_DESC =        (GLOBAL_DBNAME = cjc1)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )      (SID_DESC =        (GLOBAL_DBNAME = cjc1_DGMGRL)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc1)      )      (SID_DESC =        (GLOBAL_DBNAME = cjc)        (ORACLE_HOME = /oracle/app/oracle/product/11.2/db)        (SID_NAME = cjc)      )    )    LISTENER =   (DESCRIPTION_LIST =     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.137)(PORT = 1521))     )     (DESCRIPTION =       (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.6.150)(PORT = 1521))     )   )       测试 SQL> conn system/oracle@172.16.6.137:1521/cjc1 Connected. SQL> conn system/oracle@172.16.6.137:1521/cjc Connected. SQL> conn system/oracle@172.16.6.150:1521/cjc1 Connected. SQL> conn system/oracle@172.16.6.150:1521/cjc Connected. SQL> conn system/oracle@172.16.6.138:1521/cjc2 Connected. SQL> conn system/oracle@172.16.6.138:1521/cjc Connected. 
 监听状态 SQL> ho lsnrctl status 
 LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 07-SEP-2022 16:05:22 
 Copyright (c) 1991, 2013, Oracle.  All rights reserved. 
 Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.6.137)(PORT=1521))) STATUS of the LISTENER ------------------------ Alias                     LISTENER Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production Start Date                07-SEP-2022 15:44:08 Uptime                    0 days 0 hr. 21 min. 13 sec Trace Level               off Security                  ON: Local OS Authentication SNMP                      OFF Listener Parameter File   /oracle/app/oracle/product/11.2/db/network/admin/listener.ora Listener Log File         /oracle/app/oracle/diag/tnslsnr/cjc-db-01/listener/alert/log.xml Listening Endpoints Summary...   (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.137)(PORT=1521)))   (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.6.150)(PORT=1521))) Services Summary... Service "cjc" has 2 instance(s).   Instance "cjc", status UNKNOWN, has 1 handler(s) for this service...   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1" has 2 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service...   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1_DGB" has 1 instance(s).   Instance "cjc1", status READY, has 1 handler(s) for this service... Service "cjc1_DGMGRL" has 1 instance(s).   Instance "cjc1", status UNKNOWN, has 1 handler(s) for this service... Service "cjcXDB" has 1 instance(s).   Instance "cjc1", status READY, has 1 handler(s) for this service... The command completed successfully

测试VIP切换

停主库

SQL> shutdown immediate

查看VIP已经飘到从库

ip a : enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3        valid_lft forever preferred_lft forever

连接测试

SQL> conn system/oracle@172.16.6.150:1521/cjc Connected. 
 sqlplus system/oracle@172.16.6.150:1521/cjc

set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database;   CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1219061 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY NOT ALLOWED

只是VIP飘了，数据库角色没有变

需要执行手动切换或配置自动切换后才能进行使用

通过dg broker进行手动切换

DGMGRL> switchover to cjc2; Performing switchover NOW, please wait... Operation requires a connection to instance "cjc2" on database "cjc2" Connecting to instance "cjc2"... Connected. New primary database "cjc2" is opening... Operation requires startup of instance "cjc1" on database "cjc1" Starting instance "cjc1"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc2"

检查，切换成功

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:28:14 2022 Copyright (c) 1982, 2013, Oracle.  All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> show parameter name NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ cell_offloadgroup_name     string db_file_name_convert     string cjc1, cjc2 db_name      string cjc db_unique_name     string cjc2 global_names     boolean FALSE instance_name     string cjc2 lock_name_space      string log_file_name_convert     string cjc1, cjc2 processor_group_name     string service_names     string cjc2,cjc

通过VIP连接已经是cjc2了

CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1240051 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

回切

回切前需要先启动从库的keepalived服务

[root@cjc-db-01 keepalived]# systemctl start keepalived.service

开始回切

DGMGRL> switchover to cjc1 Performing switchover NOW, please wait... New primary database "cjc1" is opening... Operation requires startup of instance "cjc2" on database "cjc2" Starting instance "cjc2"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc1"

通过VIP连接数据库

SQL> show parameter instance_name NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ instance_name     string cjc1 
 sqlplus system/oracle@172.16.6.150:1521/cjc 
     CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1260596 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE

测试 FSFO+keepalived自动切换

启动自动切换

DGMGRL> enable fast_start failover; DGMGRL> show configuration 
 DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS

主库

enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:29:44:a2 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.137/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3

主库执行非正常关库操作

SQL> shutdown abort ORACLE instance shut down.

查看fsfo.log日志，成功切换到cjc2

16:38:20.23  Wednesday, September 07, 2022 Initiating Fast-Start Failover to database "cjc2"... Performing failover NOW, please wait... Failover succeeded, new primary is "cjc2" 16:38:24.18  Wednesday, September 07, 2022

vip 已经切换到138上

enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3        valid_lft forever preferred_lft forever     inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed         valid_lft forever preferred_lft forever     inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute         valid_lft forever preferred_lft forever

主库也是切换到138 cjc2上

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database       Warning: ORA-16829: fast-start failover configuration is lagging 
     cjc1 - (*) Physical standby database (disabled)       ORA-16661: the standby database needs to be reinstated 
 Fast-Start Failover: ENABLED 
 Configuration Status: WARNING

恢复原主库137 cjc1

SQL> startup ORACLE instance started. 
 Total System Global Area 1135747072 bytes Fixed Size    2252544 bytes Variable Size  754974976 bytes Database Buffers  369098752 bytes Redo Buffers    9420800 bytes Database mounted. ORA-16649: possible failover to another database prevents this database from being opened

报错可以忽略，后台自动执行闪回数据库，重新open数据库

可以看到已经加会到备库

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database     cjc1 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: SUCCESS

普通关闭数据库，理论上主备角色是不会自动切换的，VIP会发生漂移

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 16:44:40 2022 Copyright (c) 1982, 2013, Oracle.  All rights reserved. 
 Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options 
 
 SQL> show parameter instance_name 
 NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ instance_name     string cjc2

启动keepalived

[root@cjc-db-01 keepalived]# systemctl start keepalived.service

主库关闭实例

查看IP

enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000     link/ether 08:00:27:5c:ac:40 brd ff:ff:ff:ff:ff:ff     inet 172.16.6.138/16 brd 172.16.255.255 scope global noprefixroute enp0s3        valid_lft forever preferred_lft forever     inet 172.16.6.150/32 scope global enp0s3        valid_lft forever preferred_lft forever     inet6 fe80::2d75:f64c:8022:6f66/64 scope link tentative noprefixroute dadfailed         valid_lft forever preferred_lft forever     inet6 fe80::f061:4b23:c393:ca5/64 scope link noprefixroute         valid_lft forever preferred_lft forever

SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down.

主从角色并没有自动切换

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database     cjc1 - (*) Physical standby database 
 Fast-Start Failover: ENABLED 
 Configuration Status: ORA-01034: ORACLE not available ORA-16625: cannot reach database "cjc2" DGM-17017: unable to determine configuration status

但是VIP进行切换了

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc SQL> show parameter instance_name NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ instance_name     string cjc1 
 set lin 200 pages 100 set lin 200 pages 100 col FLASHBACK_ON for a10 col current_scn for 99999999999999 col open_mode for a20 col SWITCHOVER_STATUS for a20 col PROTECTION_MODE for a20 select current_scn,protection_mode,database_role,force_logging,FLASHBACK_ON,open_mode,switchover_status from v$database; CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1261913 MAXIMUM PERFORMANCE  PHYSICAL STANDBY YES YES     READ ONLY WITH APPLY SWITCHOVER PENDING

手动切换

启动主库实例

startup

启动主库keepalived

systemctl start keepalived.service

重启从库keepalived，目的是将VIP切换到主库

[root@cjc-db-01 keepalived]# systemctl stop keepalived.service  [root@cjc-db-01 keepalived]# systemctl start keepalived.service

关闭自动切换

DGMGRL> disable fast_start failover; Disabled. DGMGRL>  show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc2 - Primary database     cjc1 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCES

回切

DGMGRL> switchover to cjc1 Performing switchover NOW, please wait... New primary database "cjc1" is opening... Operation requires startup of instance "cjc2" on database "cjc2" Starting instance "cjc2"... ORACLE instance started. Database mounted. Database opened. Switchover succeeded, new primary is "cjc1"

DGMGRL> show configuration 
 Configuration - cjcdgbroker 
   Protection Mode: MaxPerformance   Databases:     cjc1 - Primary database     cjc2 - Physical standby database 
 Fast-Start Failover: DISABLED 
 Configuration Status: SUCCESS

[oracle@cjc-db-01 ~]$ sqlplus system/oracle@172.16.6.150:1521/cjc 
 SQL*Plus: Release 11.2.0.4.0 Production on Wed Sep 7 17:29:02 2022 
 Copyright (c) 1982, 2013, Oracle.  All rights reserved. 
 
 Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options 
 
 SQL> show parameter instance_name 
 NAME     TYPE VALUE ------------------------------------ ----------- ------------------------------ instance_name     string cjc1 
 
     CURRENT_SCN PROTECTION_MODE      DATABASE_ROLE    FOR FLASHBACK_ OPEN_MODE  SWITCHOVER_STATUS --------------- -------------------- ---------------- --- ---------- -------------------- -------------------- 1304918 MAXIMUM PERFORMANCE  PRIMARY      YES YES     READ WRITE   SESSIONS ACTIVE