Yaping's Weblog

August 30, 2008

Physical Data Guard创建及恢复

Filed under: Oracle — Yaping @ 4:25 am
Tags:

准备1

案例一2

案例二3

案例三4

维护7

激活Data Guard Database. 7

通过RESETLOGS的恢复8

附件A11

附件B12

参考文档:13

 

 

 

该文档是布置完standby后为客户写的参考手册,所有测试时在9204/Linux上完成,RMAN备份未使用catalog。该测试用的Data Guard是最简单的类型,与full backup recoveryDatabase无本质区别。

准备

注:该部分主要针对用于案例一

l         Primary Database设为archive log mode

SQL>alter system set log_archive_dest_1=location=/u01/oradata/prod/archive’

SQL>alter system set log_archive_start=true;

SQ>satartup mount force

SQL>alter database archivelog;

SQL>alter databae open;

 

l         Primary Database 设为 FORCE LOGGING

SQL>alter database force logging;

 

l         Primary SiteData Guard Site设置tnsnames.ora文件,见附件A

 

l         设置初始化参数

修改Primary Database的初始化参数文件,然后scpData Guard Site

 

以下参数是在Primary Database上的:

log_archive_dest_1=location=/u01/oradata/prod/archive’

log_archive_dest_2=location=/nfs/oradata/prod/archive’

log_archive_dest_3=’service=prod_stby’

standby_file_management=auto

 

Primary Site上创建pfile,并把它scpData Guard Site

 

Data Guard Database修改以下参数:

#log_archive_dest_2=location=/nfs/oradata/prod/archive’

# log_archive_dest_3=’service=prod_stby’

standby_archive_dest=/u01/oradata/prod/archive

fal_client=prod_stby

fal_server=prod_prim

 

注:fal_client, fal_server配置在standby端。 fal_server对应的service namestandbytnsnames.ora里,fal_client对应的service nameprimarytnsnames.ora里。

 

l         Primary Site创建Data Guard的控制文件, scpData Guard Site

SQL>alter database create standby controlfile as ‘/tmp/stby.ctl’;

 

l         Data Guard Site创建password file

$orapwd file=orapwprod password=sys_user_password entries=5;

 

l         Primary Site上最近的日常全备份的数据文件scpData Guard Site上。

 

l         Data Guard Site上创建必要的文件目录。

 

案例一

说明:这里讨论的是采用RMAN备份的资料来创建Data Guard DatabasePrimary/Data Guard Site在不同的服务器上,并且文件目录结构一致。

 

l         打开Data Guard Database实例:

SQL>startup nomount

 

l         连接rman,创建Data Guard Database,这里采用了restore后进行recover

的方式,文件路径与Primary Database一致:

 

$rman target sys/sys_user_password@prod_prim auxiliary /

RMAN>run {

  allocate auxiliary channel aux1 device type disk;

  allocate auxiliary channel aux2 device type disk;

  duplicate target database for standby nofilenamecheck dorecover;

}

 

如果在该过程中出现下面的错误,请在Primary Database归档并备份当前日志文件,并scpData Guard Site

RMAN-05507: standby controlfile checkpoint (5965178905648) is more recent than duplication point in time (5965178904203)

 

如果该过程没有显示错误信息,说明Data Guard Database已经成功创建了。

 

 

案例

说明:Primary DatabaseData Guard Database在同一台服务器,参数的设置稍有些不一样。

 

l         假设在Primary/Data Guard的文件目录结构如下:

 

Primary

Data Guard

Datafile

/u/oradata/chen/db

/u/oradata/chen/stby

Redo log

/u/oradata/chen/db

/u/oradata/chen/stby

Control file

/u/oradata/chen/db

/u/oradata/chen/stby

Backup directory

/u/backup/chen

/u/backup/chen

 

l         初始化参数文件:

Primary Database spfilechen.ora

*.log_archive_dest_1=’LOCATION=/u/oradata/chen/arch’

*.log_archive_dest_2=’service=chen_stby dependency=log_archive_dest_1 optional’

*.standby_file_management=’AUTO’

 

Data Guard Databse inittest.ora

*.db_file_name_convert=’/u/oradata/chen/db/’,’/u/oradata/chen/stby/’

*.fal_client=’CHEN_STBY’

*.fal_server=’CHEN_PRIM’

*.instance_name=’test’

*.lock_name_space=’test’

*.log_archive_dest_1=’LOCATION=/u/oradata/chen/arch’

*.log_file_name_convert=’/u/oradata/chen/db/’,’/u/oradata/chen/stby/’

*.standby_archive_dest=’/u/oradata/chen/arch’

 

说明:Primary Database初始化参数log_archive_dest_2使用了dependency属性,因此不必为Data Guard再生成一份归档日志文件,相应地,在Data Guard Databaselog_archive_dest_1standby_archive_dest也要设置正确。

 

l         创建Data Guard

$rman target / auxiliary sys/sys_user_password@chen_stby

RMAN>run {

  allocate auxiliary channel aux1 device type disk;

  allocate auxiliary channel aux2 device type disk;

  duplicate target database for standby nofilenamecheck dorecover;

}

 

 

案例

 

说明:下面介绍用RMAN备份,创建Data Guard时不连接Primary Database,并且文件的目录结构与Primary Database不一致。

 

l         假设在Primary/Data Guard的文件目录结构如下:

 

Primary

Data Guard

Datafile

/oradata/prod

/u/oradata/prod

Redo log

/oradata/prod, /opt/app/oracle/oradata/prod

/u/oradata/prod,

/opt/app/oracle/oradata/prod

Control file

/oradata/prod,

/opt/app/oracle/oradata/prod

/u/oradata/prod

/opt/app/oracle/oradata/prod

Backup directory

/u01/backup/prod

/u/backup/prod

 

准备过程与前面的方法一样,以下过程无特别说明,都是在Data Guard Site上处理。

 

l         加载实例

SQL>STARTUP NOMOUNT;

 

l         文件恢复

 

设置DBID

RMAN> SET DBID 2330521740;

 

恢复control file

RMAN> RUN {

ALLOCATE CHANNEL c1 DEVICE TYPE disk;

RESTORE CONTROLFILE FROM ‘/u/backup/prod/controlfile_20060405.bak’;

ALTER DATABASE MOUNT;

}

 

恢复数据文件:

SQL>declare

  devtype varchar2(256);

  done boolean;

begin

  devtype:=sys.dbms_backup_restore.deviceallocate (type=>”,ident=>’t1′);

  sys.dbms_backup_restore.restoresetdatafile;  sys.dbms_backup_restore.restoredatafileto(dfnumber=>03,toname=>’/u/oradata/prod/drsys01.dbf’);  sys.dbms_backup_restore.restoredatafileto(dfnumber=>05,toname=>’/u/oradata/prod/tools01.dbf’);  sys.dbms_backup_restore.restoredatafileto(dfnumber=>06,toname=>’/u/oradata/prod/users01.dbf’);  sys.dbms_backup_restore.restoredatafileto(dfnumber=>08,toname=>’/u/oradata/prod/statspack01.dbf’);  sys.dbms_backup_restore.restorebackuppiece(done=>done,handle=>’/u/backup/prod/db_20060405_342_1_586985480′,params=>null);

  sys.dbms_backup_restore.devicedeallocate;

end;

/

 

SQL>declare

  devtype varchar2(256);

  done boolean;

begin

  devtype:=sys.dbms_backup_restore.deviceallocate (type=>”,ident=>’t1′);

  sys.dbms_backup_restore.restoresetdatafile;  sys.dbms_backup_restore.restoredatafileto(dfnumber=>01,toname=>’/u/oradata/prod/system01.dbf’);  sys.dbms_backup_restore.restoredatafileto(dfnumber=>02,toname=>’/u/oradata/prod/undotbs01.dbf’);  sys.dbms_backup_restore.restoredatafileto(dfnumber=>04,toname=>’/u/oradata/prod/indx01.dbf’);

sys.dbms_backup_restore.restoredatafileto(dfnumber=>07,toname=>’/u/oradata/prod/xdb01.dbf’);

sys.dbms_backup_restore.restorebackuppiece(done=>done,handle=>’/u/backup/prod/db_20060405_343_1_586985480′,params=>null);

  sys.dbms_backup_restore.devicedeallocate;

end;

/

 

dbms_backup_restore的说明可以参考$ORACLE_HOME/rdbms/admin/dbmsbkrs.sqlprvtbkrs.plb文件。

恢复文件时,单独处理每个备份片,否者可能会报备份里没有要恢复的文件的错误。可以通过备份的日志文件或者连接RMAN,用下面命令来获取每个备份片里对应的文件:

RMAN>list backup of database;

 

恢复archived log files

SQL>declare

  devtype varchar2(256);

  done boolean;

begin

  devtype:=sys.dbms_backup_restore.deviceallocate(type=>”,ident=>’t1′);  sys.dbms_backup_restore.restoresetarchivedlog(destination=>’/u/oradata/prod/arch’);

sys.dbms_backup_restore.restoreArchivedLogRange(low_change=>415);

sys.dbms_backup_restore.restorebackuppiece(done=>done,handle=>’/u/backup/prod/arch_20060405_341_1_586985454′,params=>null);

  sys.dbms_backup_restore.devicedeallocate;

end;

/

 

可以通过备份的日志文件或者连接RMAN,用下面命令来获取每个备份片具体包含的archived log file

RMAN> list backup of archivelog all/list backup of archivelog from sequence 300;

 

l         文件重命名

alter database rename file ‘/oradata/prod/system01.dbf’ to

‘/u/oradata/prod/system01.dbf’;

alter database rename file ‘/oradata/prod/undotbs01.dbf’ to

‘/u/oradata/prod/undotbs01.dbf’;

alter database rename file ‘/oradata/prod/drsys01.dbf’ to

‘/u/oradata/prod/drsys01.dbf’;

alter database rename file ‘/oradata/prod/indx01.dbf’ to

 ‘/u/oradata/prod/indx01.dbf’;

alter database rename file ‘/oradata/prod/tools01.dbf’ to

‘/u/oradata/prod/tools01.dbf’;

alter database rename file ‘/oradata/prod/users01.dbf’ to

‘/u/oradata/prod/users01.dbf’;

alter database rename file ‘/oradata/prod/xdb01.dbf’ to

‘/u/oradata/prod/xdb01.dbf’;

alter database rename file ‘/oradata/prod/statspack01.dbf’ to

‘/u/oradata/prod/statspack01.dbf’;

 

l         Recover database

RMAN>RECOVER DATABASE UNTIL SEQUENCE 457 thread 1;

 

至此,我们相当于恢复了一个普通的数据库,下面将把该数据库设成Data Guard Database

Primary Database上生成standby control file,替换掉Data Guard Sitecontrol file

 

Data Guard Database的初始化参数里增加下面参数:

db_file_name_convert=’/oradata/prod/’,’/u/oradata/prod/’

log_file_name_convert’/oradata/prod/’,’/u/oradata/prod/’

 

加载Data Guard Database

SQL>startup nomount

SQL>alter database mount standby database;

SQL> alter database recover automatic standby database;

 

 

维护

l         加载数据库:

SQL>startup nomount;

SQL>alter database mount standby database;

 

l         打开/取消Data Guard Databasemanaged recovery

SQL>alter database recover managed standby database disconnect from session;

SQL>alter database recover managed standby database cancel;

 

l         只读打开,并增加临时文件,如果处于恢复模式,则先取消恢复:

SQL>alter database open read only;

 

SQL>alter tablespace temp add tempfile ‘/oradata/prod/temp01.dbf’ size 1025m;

SQL>alter tablespace temp add tempfile ‘/oradata/prod/temp02.dbf’ size 1025m;

 

注:在目前版本,physical standby 不支持分布式查询。

激活Data Guard Database

有几种方法可以把Data Guard Database变成Primary Database,在激活Data Guard Database时,只要有可能,就应该避免数据丢失。

 

l         通过发布ACTIVATE命令激活:

该方法激活Data Guard Database后,需要resetlog,是最方便的,激活后要尽快做个备份。如果可能的话,归档所有的Primary Database redo log,并在Data Guard Database上恢复。

 

SQL>startup nomount

SQL>alter database mount standby database;

SQL>alter database activate standby database;

 

l         通过重新创建控制文件来激活:

如果Primary Database出现故障,存在无法被归档的redo log资料,并且那些未被归档的redo log文件没有被破坏,可以考虑该方法,能够做到资料不丢失。与ACTIVATE相比,该方法应是首选。

 

1  Primary Databaseredo log拷贝到Data Guard Siteredo log目录上;

 

2  Data Guard Database上:

alter database backup controlfile to trace;

取出创建controlfile sql,使用NORESETLOGS

 

3         创建新的控制文件。

 

4         recover database

 

5         打开数据库。

 

l         Primary DatabaseData Guard Database进行角色互换,这种方法一般在计划的维护中用到,在做角色转换时要保证网络稳定,否则可能出现问题需要手工解决。

如果计划的时间比较紧,可以考虑如下处理(在primary site):

1          switch redo log

2          issue checkpoint

3          stop listener

4          startup force

5          switchover

6          start listener

 

激活Data Guard Database后,如果没有添加临时数据文件的,需要添加临时数据文件。

 

 

通过RESETLOGS的恢复

采用case 2创建的Data Guard做测试,在激活该Data Guard后,在里面创建新的数据库对象,数据文件。

通过resetlogs恢复的关键是:

l         拥有RESETLOGS前后的控制文件,可以是备份的控制文件;

l         知道RESETLOGSSCN

l         RESETLOGS前的数据库备份文件;

l         恢复时必要的archived logs

 

l         准备:

获取RESETLOGSSCN

可以通过几种方法来获取该SCN

l         查询v$database

SQL>col RESETLOGS_CHANGE# format 9999999999999

SQL>select RESETLOGS_CHANGE# from v$database;

RESETLOGS_CHANGE#

—————–

5965186246274

 

l         通过RMAN列出

RMAN> list incarnation of database;

using target database controlfile instead of recovery catalog

List of Database Incarnations

DB Key  Inc Key DB Name  DB ID            CUR Reset SCN  Reset Time

——- ——- ——– —————- — ———- ———-

1       1       PROD 2330521740       NO  1          29-DEC-05

2       2       PROD 2330521740       YES 5965186246274 08-APR-06

 

l         dump出控制文件/system文件

SQL>alter session set events ‘immediate trace name controlf level 3’;

dump文件的DATABASE ENTRY部分,有如下内容:

Incmplt recovery scn: 0x0000.00000000

Resetlogs scn: 0x056c.e0cea682 Resetlogs Timestamp  04/08/2006 13:32:47

Prior resetlogs scn: 0x0000.00000001 Prior resetlogs Timestamp  12/29/2005 14:45:32

 

转换一下:

Hex(0x056c.e0cea682) = Dec(5965186246274)

 

l         通过查看alert_SID.log文件

ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE

RESETLOGS after incomplete recovery UNTIL CHANGE 5965186246273

Resetting resetlogs activation ID 2330494092 (0x8ae8848c)

 

注意:通过前三种方法获取的SCN都比从alert_SID.log中获得的SCN1,在alert_SID.log里的SCN才是真实的,因此通过前三种方法获得的SCN需要减去1

 

l         模拟一些事务:

创建了两个表空间和用户chen,并在chen里创建了几个表。

SQL>create tablespace test datafile ‘/u/oradata/prod/test01.dbf’ size 129m uniform size 1m;

SQL>create tablespace data datafile ‘/u/oradata/prod/data01.dbf’ size 257m uniform size 1m;

SQL>create user chen identified by chen default tablespace users temporary tablespace temp;

SQL>grant create session,create table to chen;

SQL>alter user chen quota unlimited on users;

SQL>alter user chen quota unlimited on test;

SQL>alter user chen quota unlimited on data;

chen里创建表,加载数据。

 

SQL>select table_name,tablespace_name from user_tables;

TABLE_NAME                     TABLESPACE_NAME

—————————— ——————————

T1                             USERS

T2                             USERS

T3                             TEST

T4                             USERS

T5                             DATA

 

SQL>@filelist.sql

Datafiles info:

 

TABLESPACE_NAME  FILE_NAME                                          TOTAL    FREE     USED     PCT_FREE   MAX_BLKS STATUS     AUT

—————- ————————————————– ——– ——– ——– ———- ——– ———- —

DATA             /u/oradata/prod/data01.dbf                     257      170      87       66.15      21760    AVAILABLE  NO

TEST             /u/oradata/prod/test01.dbf                     129      42       87       32.56      5376     AVAILABLE  NO

USERS            /u/oradata/prod/users01.dbf                    1000     482      518      48.2       61688    AVAILABLE  YES

。。。。。。

 

Tempfiles info:

 

TABLESPACE_NAME  FILE_NAME                                          TOTAL    FREE     USED     PCT_FREE   MAX_BLKS STATUS     AUT

—————- ————————————————– ——– ——– ——– ———- ——– ———- —

TEMP             /u/oradata/prod/temp01.dbf                     513      513      0        100                 AVAILABLE  NO

 

Controlfiles info:

 

FILE_NAME

————————————————–

/u/oradata/prod/control01.ctl

/u/oradata/prod/control02.ctl

 

Redo log info:

THREAD# GROUP#  KBYTES    ARCH    STATUS      SEQ#      TYPE    MEMBER                                  PCT_USED

1       1       102400    YES     INACTIVE    11        ONLINE  /u/oradata/prod/redo01_a.log

1       2       102400    YES     INACTIVE    10        ONLINE  /u/oradata/prod/redo02_a.log

1       3       102400    NO      CURRENT     12        ONLINE  /u/oradata/prod/redo03_a.log        89.54%

 

当前的redo log已经写了89.54%的内容了。

 

SQL>shutdown abort

 

保存一下现有的control file, redo log

 

l         恢复:

RESETLOGS前的恢复

这部分恢复与case 2的前面过程差不多,参照case 2。在数据文件restore后,发布如下命令:

SQL>recover database until change 5965186246273 using backup controlfile;

 

这个执行完后,关闭数据库,把前面备份的在RESETLOGS后的control fileredo log恢复到对应的位置。

 

RESETLOGS后的恢复:

 

SQL>startup mount

SQL>recover database;

将提示如下错误:

ORA-01110: data file 9: ‘/u/oradata/prod/test01.dbf’

ORA-01157: cannot identify/lock data file 9 – see DBWR trace file

ORA-01110: data file 9: ‘/u/oradata/prod/test01.dbf’

 

这是因为后来创建的数据文件,当前的控制文件里有该文件信息,而在磁盘上没有的缘故,因为创建该文件后的archived log都存在,可以从archived log里恢复出来,现在把该文件建起来:

SQL>alter database create datafile ‘/u/oradata/prod/test01.dbf’ as

‘/u/oradata/prod/test01.dbf’;

SQL>alter database create datafile ‘/u/oradata/prod/data01.dbf’ as

‘/u/oradata/prod/data01.dbf’;

 

SQL> recover database;

SQL>alter database open;

 

至此,恢复完成了,可以检查一下数据是否完整。

附件A

 

Standby site tnsnames.ora

prod_prim =

  (DESCRIPTION =

    (ADDRESS_LIST =

      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.8.9)(PORT = 1521))

    )

    (CONNECT_DATA =

      (SID = prod)

    )

  )

 

Primary site tnsnames.ora

prod_stby = (DESCRIPTION_LIST =

          (DESCRIPTION=

             (ADDRESS_LIST=

              (ADDRESS=(PROTOCOL=tcp) (HOST=192.168.8.119) (PORT=1521))

)

             (CONNECT_DATA=(SID=prod))

           )

)

 

 

附件B

检查Standby DB恢复。主库半小时切换一次日志,redo log size 50M,传输速率约200kB/s,考虑到远程传输日志及恢复时间,假设Standby DB40分钟内恢复是正常的。

check_stby_recovery.sh

#!/bin/bash

 

cd `dirname $0`/..

. conf/define

 

TIME_LAG=2400

RESENDLAG=1800

RESENDTIMES=3

SQLPLUS=${ORACLE_HOME}/bin/sqlplus

SPOOLLOG=${BASEDIR}/spool/stb_rcv.log

TMPLOG=${BASEDIR}/spool/stb.${ORACLE_SID}

 

RES=`${SQLPLUS} -s /nolog <<EOF

conn / as sysdba

set feedback off

set termout  off

set pagesize 0

set line 130

col seq format a10

spool ${SPOOLLOG}

select ‘#’||SEQUENCE# seq,’#’||REGISTRAR,’#’||APPLIED,’#’||to_char(COMPLETION_TIME,’yyyy/mm/dd hh24:mi:ss’)||’         #’||trunc((sysdate-COMPLETION_TIME)*1440) from v\\$archived_log where SEQUENCE# = (select max(SEQUENCE#) from v\\$archived_log) and COMPLETION_TIME < sysdate – ${TIME_LAG}/86400;

spool off

exit

EOF`

 

ERRCNT=`echo ${RES}|grep -c ORA-`

if [ ${ERRCNT} -gt 0 ] ; then

   NOW=`date +%s`

   if [ ! -f $TMPLOG ]; then

      echo “1:$NOW” > $TMPLOG

      /bin/echo ${RES}|/usr/bin/email -s “Data Guard ${ORACLE_SID} on `hostname` may have problems, pls check.” ${MAILTO}

   else

      LASTSEND=`tail -1 $TMPLOG|cut -d: -f1`

      LASTTIME=`tail -1 $TMPLOG|cut -d: -f2`

      if [ $LASTSEND -ge $RESENDTIMES ]; then

         exit

      else

         LAG=`expr $NOW – $LASTTIME`

         if [ $LAG -ge $RESENDLAG ]; then

              LASTSEND=`expr $LASTSEND + 1`

              echo “$LASTSEND:$NOW” >> $TMPLOG

              /bin/echo ${RES}|/usr/bin/email -s “Data Guard ${ORACLE_SID} on `hostname` may have problems, pls check.” ${MAILTO}

              exit

         fi

      fi

   fi

else

   if [ -f $TMPLOG ]; then

      rm -f $TMPLOG

   fi

fi

 

CNT=`/bin/grep -c “#” ${SPOOLLOG}`

if [ ${CNT} -gt 0 ] ; then

   /bin/cat ${SPOOLLOG}|/usr/bin/email -s “Data Guard ${ORACLE_SID} on `hostname` recovery delay, pls check!” ${MAILTO}

fi

 

参考文档:

Oracle8i备份与恢复手册

Oracle® Data Guard —— Concepts and Administration

Oracle9i Recovery Manger User’s Guide

Oracle 9i RMAN备份与恢复技术——配置和使用Oracle恢复管理器

 

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: