首页 » 技术分享 » ora-16433 重建控制文件恢复

ora-16433 重建控制文件恢复

 

11.2.0.3版本,非归档,大概是rm掉current的log,然后重建
controlfille后恢复导致一系列问题,并最终出现ora-600 2662错误,虽然这个错误很常见,但是你发现推进scn
也是无法open,感觉有点怪,远程了女神的电脑,操作不便,最后将文件压缩传过来,我在自己的vmware进行了恢复。

由于环境的差异,所以解压后我先进行rename file操作,如下:

SELECT name FROM v$datafile;
 
NAME
------------------------------------------------
/u01/app/oracle/oradata/travel/system01.dbf
.........
/u01/app/oracle/oradata/travel/users01.dbf
 
SELECT member FROM v$logfile;
 
MEMBER
-------------------------------------------------
/u01/app/oracle/oradata/travel/redo03.log
/u01/app/oracle/oradata/travel/redo02.log
/u01/app/oracle/oradata/travel/redo01.log
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/system01.dbf' TO '/home/oracle/travel/travel/system01.dbf';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/sysaux01.dbf' TO '/home/oracle/travel/travel/sysaux01.dbf';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/undotbs01.dbf' TO '/home/oracle/travel/travel/undotbs01.dbf';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/users01.dbf' TO '/home/oracle/travel/travel/users01.dbf';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/redo01.log' TO '/home/oracle/travel/travel/redo01.log';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/redo02.log' TO '/home/oracle/travel/travel/redo02.log';
 
DATABASE altered.
 
ALTER DATABASE RENAME file '/u01/app/oracle/oradata/travel/redo03.log' TO '/home/oracle/travel/travel/redo03.log';   
 
DATABASE altered.
 
SELECT name FROM v$datafile;
 
NAME
-------------------------------------------------
/home/oracle/travel/travel/system01.dbf
/home/oracle/travel/travel/sysaux01.dbf
/home/oracle/travel/travel/undotbs01.dbf
/home/oracle/travel/travel/users01.dbf
 
SELECT member FROM v$Logfile;
 
MEMBER
-------------------------------------------------
/home/oracle/travel/travel/redo03.log
/home/oracle/travel/travel/redo02.log
/home/oracle/travel/travel/redo01.log
 

此时,进行recover,会报错ORA-16433,如下:

 recover DATABASE;
ORA-00283: recovery SESSION canceled due TO errors
ORA-16433: The DATABASE must be opened IN READ/WRITE mode.
 
recover DATABASE USING backup controlfile until cancel;
ORA-00283: recovery SESSION canceled due TO errors
ORA-16433: The DATABASE must be opened IN READ/WRITE mode.

 ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: DATA file 1: '/home/oracle/travel/travel/system01.dbf'
 

关于该错误,通过oerr命令可以大概了解一下其含义,如下:

[oracle@11gR2_primary ~]$ oerr ora 16433
16433, 00000, "The database must be opened in read/write mode."
// *Cause:  An attempt was made to open the database in read-only mode after an
//          operation that requires that the database be opened in read/write
//          mode.
// *Action: Open the database in read/write mode. The database can then be
//          opened in read-only mode.
 
通过这个错误,我们可以得出一个信息,数据库可以以read only模式打开,换句话讲,你查询如下几个值都是一样的:
select checkpoint_change# from v$database;   --来自controlfile
select checkpoint_change# from v$datafile;   --来自controlfile
select checkpoint_change# from v$datafile_orader;  --来自datafile header
 

其实,通常来讲,只要上述几个值一样,那么我们的db应该都是可以直接open打开的,然而,这里却不行,read only都不行,我试过。
 然后下面就开始我的恢复工作。

第一次尝试恢复:

---重建controlfile
 startup nomount  pfile=/home/oracle/travel/travel/a.ora
ORACLE instance started.
 
Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             184550744 bytes
Database Buffers          436207616 bytes
Redo Buffers                3338240 bytes
CREATE CONTROLFILE REUSE DATABASE "TRAVEL" RESETLOGS  NOARCHIVELOG

  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/home/oracle/travel/travel/redo01.log'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 '/home/oracle/travel/travel/redo02.log'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 '/home/oracle/travel/travel/redo03.log'  SIZE 50M BLOCKSIZE 512
 11  -- STANDBY LOGFILE
 12  DATAFILE
 13    '/home/oracle/travel/travel/system01.dbf',
 14    '/home/oracle/travel/travel/sysaux01.dbf',
 15    '/home/oracle/travel/travel/undotbs01.dbf',
 16    '/home/oracle/travel/travel/users01.dbf'
 17  CHARACTER SET AL32UTF8
 18  ;
 
Control file created.
 
;
 
开始开始进行recover:
; alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/home/oracle/travel/travel/system01.dbf'
 
; recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
 
; recover database using backup controlfile until cancel;
ORA-00279: change 244977 generated at 01/19/2013 01:56:54 needed for thread 1
ORA-00289: suggestion : /oracle/product/11.2.0/db_1/dbs/arch1_1_805082211.dbf
ORA-00280: change 244977 for thread 1 is in sequence #1
 
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/home/oracle/travel/travel/system01.dbf'
 
ORA-01112: media recovery not started
 

因为本身是非归档,所以这个不完全恢复的步骤无非是为了后面可以进行open resetlog。然后停库,加入隐含参数进行open:

; shutdown immediate
ORA-01109: DATABASE NOT OPEN
 
DATABASE dismounted.
ORACLE instance shut down.
; startup nomount  pfile=/home/oracle/travel/travel/b.ora
ORACLE instance started.
 
Total System Global Area  626327552 bytes
Fixed SIZE                  2230952 bytes
Variable SIZE             184550744 bytes
DATABASE Buffers          436207616 bytes
Redo Buffers                3338240 bytes
; ALTER DATABASE mount
  2  ;
 
DATABASE altered.
 
; ALTER DATABASE OPEN resetlogs;
ALTER DATABASE OPEN resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance TERMINATED. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [244985], [0],
[244998], [4194432], [], [], [], [], [], []
Process ID: 3641
SESSION ID: 1 Serial NUMBER: 5
 

此时alert log对应的错误如下:

Fri Dec 14 21:58:39 2012
SMON: enabling cache recovery
Errors in file /oracle/diag/diag/rdbms/travel/travel/trace/travel_ora_3641.trc  (incident=4937):
ORA-00600: internal error code, arguments: [2662], [0], [244985], [0], [244998], [4194432], [], [], [], [], [], []
Incident details in: /oracle/diag/diag/rdbms/travel/travel/incident/incdir_4937/travel_ora_3641_i4937.trc
Fri Dec 14 21:58:43 2012
Dumping diagnostic data in directory=[cdmp_20121214215843], requested by (instance=1, osid=3641), summary=[incident=4937].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/diag/diag/rdbms/travel/travel/trace/travel_ora_3641.trc:
ORA-00600: internal error code, arguments: [2662], [0], [244985], [0], [244998], [4194432], [], [], [], [], [], []
Errors in file /oracle/diag/diag/rdbms/travel/travel/trace/travel_ora_3641.trc:
ORA-00600: internal error code, arguments: [2662], [0], [244985], [0], [244998], [4194432], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3641): terminating the instance due to error 600
Instance terminated by USER, pid = 3641
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3641) as a result of ORA-1092
 

甚至手工执行如下操作都无法open数据库,如下:

; ALTER SESSION SET events '10015 trace name ADJUST_SCN level 14';
 
SESSION altered.
 
; ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: DATA file 1: '/home/oracle/travel/travel/system01.dbf'
 
最后dunmp controlfile和CURRENT redo logfile,发现low cache rba信息居然是最大值:
; oradebug setmypid
Statement processed.
; ALTER system  SET events 'immediate trace name controlf level 4'   ;
 
System altered.
 
; ALTER system  dump logfile '/home/oracle/travel/travel/redo01.log';
 
SQL> oradebug tracefile_name
/oracle/diag/diag/rdbms/travel/travel/trace/travel_ora_4229.trc
;
 

++++++++ controlfile dump 片段

***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
 (size = 8180, compat size = 8180, section max = 11, section in-use = 0,
  last-recid= 0, old-recno = 0, last-recno = 0)
 (extent = 1, blkno = 2, numrecs = 11)
THREAD #1 - status:0x2 flags:0x0 dirty:0
low cache rba:(0xffffffff.ffffffff.ffff) on disk rba:(0x1.3.0)  ---low cache rba为无穷大
on disk scn: 0x0000.0003bcf7 12/14/2012 21:58:39
resetlogs scn: 0x0000.0003bcf2 12/14/2012 21:58:36
heartbeat: 802069494 mount id: 2869233386
THREAD #2 - status:0x0 flags:0x0 dirty:0
low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)
on disk scn: 0x0000.00000000 01/01/1988 00:00:00
resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00
heartbeat: 0 mount id: 0
THREAD #3 - status:0x0 flags:0x0 dirty:0
low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)
on disk scn: 0x0000.00000000 01/01/1988 00:00:00
resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00
heartbeat: 0 mount id: 0
 
+++++++redo01.log dump 片段
DUMP OF REDO FROM FILE '/home/oracle/travel/travel/redo01.log'
 Opcodes *.*
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
 Times: creation thru eternity
 FILE HEADER:
        Compatibility Vsn = 186646528=0xb200000
        Db ID=2872261344=0xab333ae0, Db Name='TRAVEL'
        Activation ID=2872292516=0xab33b4a4
        Control Seq=233=0xe9, File size=102400=0x19000
        File Number=1, Blksiz=512, File Type=2 LOG
 descrip:"Thread 0001, Seq# 0000000001, SCN 0x00000003bcee-0xffffffffffff"
 thread: 1 nab: 0xffffffff seq: 0x00000001 hws: 0x3 eot: 1 dis: 0
 resetlogs count: 0x2ffc9463 scn: 0x0000.0003bcee (244974)
 prev resetlogs count: 0x2ffc7a41 scn: 0x0000.0003bcea (244970)
 Low  scn: 0x0000.0003bcee (244974) 01/19/2013 01:56:51
 Next scn: 0xffff.ffffffff 01/01/1988 00:00:00
 Enabled scn: 0x0000.0003bcee (244974) 01/19/2013 01:56:51
 Thread closed scn: 0x0000.0003bcee (244974) 01/19/2013 01:56:51
 Disk cksum: 0x3467 Calc cksum: 0x3467
 Terminal recovery stop scn: 0x0000.00000000
 Terminal recovery  01/01/1988 00:00:00
 Most recent redo scn: 0x0000.00000000
 Largest LWN: 0 blocks
 End-of-redo stream : No
 Unprotected mode
 Miscellaneous flags: 0x800000
 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000
 Zero blocks: 0
 Format ID is 2
 redo log key is 853c461da2eec7ed4b45ce75b8c27d7
 redo log key flag is 5
 Enabled redo threads: 1
 
REDO RECORD - Thread:1 RBA: 0x000001.00000002.0010 LEN: 0x0070 VLD: 0x05
SCN: 0x0000.0003bcf2 SUBSCN:  1 01/19/2013 01:56:54
(LWN RBA: 0x000001.00000002.0010 LEN: 0001 NST: 0001 SCN: 0x0000.0003bcee)  ,
CHANGE #1 MEDIA RECOVERY MARKER SCN:0x0000.00000000 SEQ:0 OP:17.3 ENC:0
END OF REDO DUMP
 

可以看到实际上current redo的LWN RBA是对的,为1.2.10,是小于on disk rba (0×1.3.0)的。那么这里为什么会出现这个情况?
 猜测可能是写紊乱了。到这里也就比较明白了,仍然是conrolfile有问题。

既然如此,那么我直接将controlfile文件rm掉,然后再次重建。如下:

; startup nomount pfile='/home/oracle/travel/travel/a.ora
ORACLE instance started.
 
Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             184550744 bytes
Database Buffers          436207616 bytes
Redo Buffers                3338240 bytes
; CREATE CONTROLFILE REUSE DATABASE "TRAVEL" RESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/home/oracle/travel/travel/redo01.log'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 '/home/oracle/travel/travel/redo02.log'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 '/home/oracle/travel/travel/redo03.log'  SIZE 50M BLOCKSIZE 512
 11  -- STANDBY LOGFILE
 12  DATAFILE
 13    '/home/oracle/travel/travel/system01.dbf',
 14    '/home/oracle/travel/travel/sysaux01.dbf',
 15    '/home/oracle/travel/travel/undotbs01.dbf',
 16    '/home/oracle/travel/travel/users01.dbf'
 17  CHARACTER SET AL32UTF8
 18  ;
 
Control file created.
; recover database until cancel using backup controlfile;
ORA-00279: change 244985 generated at 12/14/2012 23:05:06 needed for thread 1
ORA-00289: suggestion : /oracle/product/11.2.0/db_1/dbs/arch1_1_802047904.dbf
ORA-00280: change 244985 for thread 1 is in sequence #1
 
Specify log: {=suggested | filename | AUTO | CANCEL}
AUTO
ORA-00308: cannot open archived log
'/oracle/product/11.2.0/db_1/dbs/arch1_1_802047904.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
 
ORA-00308: cannot open archived log
'/oracle/product/11.2.0/db_1/dbs/arch1_1_802047904.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
 
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/home/oracle/travel/travel/system01.dbf'
 

; select file#,CHECKPOINT_CHANGE#,LAST_CHANGE# from v$datafile order by  1;
 
     FILE# CHECKPOINT_CHANGE# LAST_CHANGE#
---------- ------------------ ------------
         1             244985
         2             244985
         3             244985
         4             244985
 
; select file#,CHECKPOINT_CHANGE# from v$datafile_header order by 1;
 
     FILE# CHECKPOINT_CHANGE#
---------- ------------------
         1             244985
         2             244985
         3             244985
         4             244985
 
; select CHECKPOINT_CHANGE# from v$database;
 
CHECKPOINT_CHANGE#
------------------
                0
 

此时停库,然后在pfile中加入隐含参数:
*._allow_resetlogs_corruption=TRUE
 *._allow_error_simulation=TRUE

接着再次进行mount,并进行scn 推进:

; ALTER DATABASE OPEN resetlogs;
ALTER DATABASE OPEN resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery TO be consistent
ORA-01110: DATA file 1: '/home/oracle/travel/travel/system01.dbf'
 
shutdown immediate
ORA-01109: DATABASE NOT OPEN
 
DATABASE dismounted.
ORACLE instance shut down.
 startup nomount pfile='/home/oracle/travel/travel/b.ora
ORACLE instance started.
 
Total System Global Area  626327552 bytes
Fixed Size                  2230952 bytes
Variable Size             184550744 bytes
Database Buffers          436207616 bytes
Redo Buffers                3338240 bytes
 alter database mount
  2  ;
 
Database altered.
 
alter session set events '10015 trace name ADJUST_SCN level 10';
 
Session altered.
alter database open;
alter database open
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
 
alter database open noresetlogs;
alter database open noresetlogs
*
ERROR at line 1:
ORA-01588: must use RESETLOGS option for database open
 
alter database open resetlogs;
 
Database altered.
 
show parameter name
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string
db_name                              string      travel
db_unique_name                       string      travel
global_names                         boolean     FALSE
instance_name                        string      travel
lock_name_space                      string
log_file_name_convert                string
processor_group_name                 string
service_names                        string      travel
select open_mode from v$database;
 
OPEN_MODE
--------------------
READ WRITE
 
alter system switch logfile;
 
System altered.
 

 

这里说明一下是我的alert log里面还出现了数据字典不一致的问题,因为毕竟是强制open的,如下:

Sun Jan 20 00:25:34 2013
Errors in file /oracle/diag/diag/rdbms/travel/travel/trace/travel_m000_4681.trc:
ORA-25153: Temporary Tablespace is Empty
Sun Jan 20 00:25:38 2013
Errors in file /oracle/diag/diag/rdbms/travel/travel/trace/travel_j004_4692.trc  (incident=21817):
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/diag/rdbms/travel/travel/incident/incdir_21817/travel_j004_4692_i21817.trc
Sun Jan 20 00:25:43 2013
Dumping diagnostic data in directory=[cdmp_20130120002543], requested by (instance=1, osid=4692 (J004)), summary=[incident=21817].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Sun Jan 20 00:25:47 2013
 
这个ora-600错误很常见了,通过是index信息不一致。
 
其trace内容如下:
* kdsgrp1-1: *************************************************
            row 0x0040589e.4b continuation at
            0x0040589e.4b file# 1 block# 22686 slot 75 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 75 ..... nrows: 75
kdsgrp - dump CR block dba=0x0040589e
Block header dump:  0x0040589e
 Object id on Block? Y
 seg/obj: 0x12  csc: 0x00.3bca9  itc: 1  flg: O  typ: 1 - DATA
     fsl: 0  fnx: 0x40589f ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0001.01a.000000a1  0x00c05e97.001d.0d  --U-    1  fsc 0x0053.0003bcab
bdba: 0x0040589e
data_block_dump,data header at 0x7a26a044
===============
tsiz: 0x1fb8
hsiz: 0xa8
pbl: 0x7a26a044
     76543210
flag=--------
ntab=1
nrow=75
frre=-1
fsbo=0xa8
...........
 

seg/obj: 0×12 这是obj$对象,属于bootstrap$核心对象,涉及到这类的对象处理相对麻烦,针对这类情况,数据库open后都建议
 把数据导出,然后重建库,我这里就不在继续描述这个ora-600错误了,我博客也有类似的例子。

最后来个小节:

1. oracle通过系统checkpoint scn,datafile checkpoint scn,start scn三者之间的比较来判断数据文件是否需要进行介质恢复.
2. 在redo 线程打开的情况下,即数据库open的情况下,stop scn会被设置为无穷大,当正常关闭时,stop scn等于datafile scn.
这里需要注意的是,stop scn是存放在controlfile中的,网上部分资料说是存在datafile header中,这个说法是错误的。
3. oracle在open之前是先判断是否进行介质恢复,然后再是判断是否进行instance recovery。
4. 关于4种scn的关系如下:

system checkpoint scn — 存放在controlfile中
datafile checkpoint scn –存放在controlfile中
start scn —存放在datafile header中
stop scn —存放在controlfile中

system scn,datafile checkpoint scn,start scn,这3种scn用于判断数据文件是否需要进行介质恢复。这3个相等这不需要介质恢复。
 如何这4个都相等,那么就不需要进行实例恢复。stop scn是用于判断是否进行实例恢复的。

5. 如果stop scn比其他的几个scn要大,那么就需要进行instance recover,需要进行扫描redo,实例恢复的起点是low cache rba,终点
 是redo log的最末端。

 

转载自原文链接, 如需删除请联系管理员。

原文链接:ora-16433 重建控制文件恢复,转载请注明来源!

0