ORA-08103: object no longer exists

“ORA-08103: object no longer exists”,一位客户那里遇到了这个问题,其实这个问题还是比较经典的。之前遇到过,不过与这次的情况有些区别。以前遇到的案例类似如下(note:[8103.1]):

Cause

ORA-8103 is caused by an invalid block type. The block header has an invalid block type or the block type inside the block is not expected; e.g. a data block (Type=6) was expected but the actual block information is not a data block (Type!=6).

ORA-8103 is also caused by an unexpected data_object_id where it is changing for the involved objects while the affected SQL statement is executed.

These two causes might be due to an expected behavior or other problems.

Tables are being dropped/truncated while a SQL statement for those tables is still in execution。
In other words the object has been deleted by another session since the operation began。
Look if dba_objects.data_object_id is changing for the affected object while queries are being executed.
data_object_id is changed by DDL statements like:
———————–
truncate table
alter index .. rebuild
alter table .. move
etc.
———————–

更确切的现象是,当数据库中的对象遭遇多条并行SQL语句执行,同时某条语句做了更改,例如:truncate等,其中某条查询的语句会报出”ORA-08103: object no longer exists”

而这次客户的问题与上述现象不符,因为环境相对安静,仅仅是一条涉及某张表A的select无法执行,全表扫面的select * from能够查询,最关键的是analyze table报了错。
analyze table报错,基本可以断定是block corruption的问题,而且很可能是在memory级别,例如[note 81301.1]提到的下面现象:

Block Corruption.

Hardware, IO subsystem or OS problems may cause block corruptions overwriting the Block Type in the block header causing the error ORA-8103.
The block is healthy but it is an “OLD/STALE” block. These kind of corruptions might be caused by LOST IO/LOST WRITE or a bug in external non-oracle tools that migrate file systems while the database is OPEN.   In that case the data_object_id for the affected object could have changed but the actual block does not reflect it.
Note that the block may also be temporarily corrupted in the buffer cache (SGA Memory).

如果是block corruption,可以按照如下的思路设置event来诊断。

1)ANALYZE TABLE A VALIDATE STRUCTURE; <===do not use ‘CASCADE’
2)ANALYZE INDEX xxxx VALIDATE STRUCTURE; <===analyze index on the table one by one
3)If you found which index or table is corrupted, please get trace on that index or table:
alter session set max_dump_file_size=unlimited;
alter session set db_file_multiblock_read_count=1;
alter session set events ‘immediate trace name trace_buffer_on level 1048576’;
alter session set events ‘10200 trace name context forever, level 1’;
alter session set events ‘8103 trace name ERRORSTACK level 3’;
ANALYZE TABLE A VALIDATE STRUCTURE;

然后在udump中拿到相应的trace文件,里面查找”corrupt”关键字,如下:

——————————————————————————————
BH (0000000ACBFB9F08) file#: 7 rdba: 0x01d2c622 (7/1230370) class: 1 ba: 0000000ACB7FE000
set: 66 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
dbwrid: 5 obj: 52656 objn: 52656 tsn: 6 afn: 7
hash: [ee90b078,ee90b078] lru: [cbfbab38,cbfb93d8]
lru-flags: hot_buffer
obj-flags: object_ckpt_list
ckptq: [cbfba048,c4fd54f8] fileq: [cbfba058,c4fd5508] objq: [cbfba108,c4fd55b8]
use: [dbf29bf8,dbf29bf8] wait: [NULL]
st: XCURRENT md: SHR tch: 0
flags: buffer_dirty gotten_in_current_mode block_written_once
redo_since_read
LRBA: [0x0.0.0] HSCN: [0x0.1b7da2] HSUB: [1]
buffer tsn: 6 rdba: 0x01d2c622 (7/1230370)
scn: 0x0000.001b7da2 seq: 0x01 flg: 0x02 tail: 0x7da20601
frmt: 0x01 chkval: 0x0000 type: 0x2b=unknown
Hex dump of corrupt header 4 = CORRUPT

<===================================CORRUPT. type: 0x2b=unknown
——————————————————————————————-
之所以能够full scan的查询出正常结果,很可能这个corrupt仅仅是发生在SGA中,alter system flush buffer_cache,将buffer刷出去也许即可解决了。
这次的案例是客户reboot OS之后,查询就恢复正常了,这里强调的是,遇到类似的问题,可以借助set events来诊断。

alter session set max_dump_file_size=unlimited;
alter session set db_file_multiblock_read_count=1;
alter session set events ‘immediate trace name trace_buffer_on level 1048576’;
alter session set events ‘10200 trace name context forever, level 1’;
alter session set events ‘8103 trace name ERRORSTACK level 3’;

–EOF–