I Have Never Seen Such a Strange Oracle Problem! (17认证网)

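A few days ago, Boss Feng mentioned that a customer in the northern region had hit a very strange problem. In short, whenever our northern-region colleagues ran an RMAN duplicate of the customer's Oracle database, the job died every time it reached 99%.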

Hearing about something that odd immediately piqued my interest.

Nobody gave me any logs or data that evening, so all I could do was guess.

As we know, RMAN duplicate works over TNS, which means it goes through the Oracle SQL*Net protocol. So why would the session be dropped after running for a while?

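At first my colleagues assumed it was a firewall problem, but setting the DCD (dead connection detection) parameter, SQLNET.EXPIRE_TIME, in sqlnet.ora turned out not to solve it.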

It looked as though I would need a deeper understanding of how SQL*Net fundamentally works before I could find a solution.

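Based on the basic information my colleagues had reported so far, I judged that the problem was probably in the TTC layer or the Oracle Net layer.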

To explain that judgment, here is the official description from the Oracle Net Services administration guide:

Presentation Layer

Character set differences can occur if the client and database server run on different operating systems. The presentation layer resolves any differences. It is optimized for each connection to perform conversion when required. The presentation layer used by client/server applications is Two-Task Common (TTC). TTC provides character set and data type conversion between different character sets or formats on the client and database server. At the time of initial connection, TTC is responsible for evaluating differences in internal data and character set representations and determining whether conversions are required for the two computers to communicate.

Oracle Net Foundation Layer

The Oracle Net foundation layer is responsible for establishing and maintaining the connection between the client application and database server, as well as exchanging messages between them. The Oracle Net foundation layer can perform these tasks because of Transparent Network Substrate (TNS) technology. TNS provides a single, common interface for all industry-standard OSI transport and network layer protocols. TNS enables peer-to-peer application connectivity, where two or more computers can communicate with each other directly, without the need for any intermediary devices. On the client side, the Oracle Net foundation layer receives client application requests and resolves all generic computer-level connectivity issues, such as:

■ The location of the database server or destination

■ How many protocols are involved in the connection

■ How to handle interrupts between client and database server based on the capabilities of each

On the server side, the Oracle Net foundation layer performs the same tasks as it does on the client side. It also works with the listener to receive incoming connection requests.

In addition to establishing and maintaining connections, the Oracle Net foundation layer communicates with naming methods to resolve names and uses security services to ensure secure connections.

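RMAN ran for a while before it was cut off, and the customer had already opened up the firewalls along the path. Reading the description above with that in mind, I suspected that Oracle Net was running into trouble when handling interrupts (breaks).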

Following that line of thought, I searched for descriptions of the sqlnet.ora parameters and found the following document.

Data exception or break is a function in Oracle NET that allows a transaction to be interrupted before it is completed. It returns both the client and the server to a condition from which they can continue. A break such as Ctrl-c can be sent as part of the normal data stream (inband), or as a separate asynchronous message (outband). An outband break is much faster and interrupts the flow of data. Out Of Band Breaks (OOB) are enabled by default provided the underlying protocol supports sending urgent data.

 If the parameter DISABLE_OOB is set to OFF then it enables Oracle Net to send and receive “break” messages using urgent data provided by the underlying protocol. If turned on, disables the ability to send and receive “break” messages using urgent data provided by the underlying protocol. Out of band breaks are communication breaks that occur on the underlying network level. 

On occasions these break packets can cause the client and server process to become out of sync. By setting DISABLE_OOB=ON you can force both the client and server to use in-band break. Oracle TWO_TASK layer has break/reset logic to make sure that both the client and server are in sync. 

The break/reset logic works effectively if Operating System support OOB (out of band breaks), otherwise there might be TWO_TASK error followed by ORA-3113. So, setting the parameter DISABLE_OOB=ON in order to avoid these TWO_TASK/ORA-3113 errors in the above situation makes sense. DISABLE_OOB is set in the sqlnet.ora file.
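The "urgent data provided by the underlying protocol" in the passage above corresponds, for TCP, to the urgent (out-of-band) byte. The following is only a minimal Python sketch of that TCP mechanism on a loopback connection. It does not speak the Oracle Net protocol, and the function name `demo_break`, the payload, and the `!` break byte are made up for illustration. Whether the urgent byte survives the trip also depends on the operating system and on any network devices in between.

```python
import select
import socket


def demo_break(payload: bytes = b"query-result", break_byte: bytes = b"!"):
    """Send normal data plus one urgent byte over loopback TCP.

    Returns (urgent_byte, normal_data) as seen by the receiving side.
    Hypothetical illustration only; this is not Oracle Net traffic.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        client.sendall(payload)                  # ordinary in-band stream data
        client.send(break_byte, socket.MSG_OOB)  # one byte flagged as TCP urgent data

        # Pending urgent data shows up as an "exceptional condition" in select().
        _, _, exceptional = select.select([], [], [server], 5.0)
        if not exceptional:
            raise TimeoutError("no urgent data arrived")

        urgent = server.recv(1, socket.MSG_OOB)  # read the urgent byte out of band
        normal = server.recv(1024)               # the normal stream is read separately
        return urgent, normal
    finally:
        client.close()
        server.close()
        listener.close()


if __name__ == "__main__":
    print(demo_break())
```

The receiver learns about the urgent byte separately from the ordinary stream, which is what lets an out-of-band break overtake queued data. With an in-band break, the break marker simply travels inside the normal stream and is seen in order.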

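That same evening I suggested that my colleagues adjust the parameter described above, expecting that it would solve the problem.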

Unexpectedly, the next day my colleagues set up a WeChat group and posted the relevant errors, which further confirmed my guess.

Indeed, after setting disable_oob=on in sqlnet.ora and rerunning the RMAN duplicate, my colleagues reported that the interruption never occurred again.
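As a rough illustration of that change, here is a small Python sketch that sets a parameter in the text of a sqlnet.ora file. The helper name `set_sqlnet_param` and the sample file content are hypothetical, and the sketch only handles simple single-line `NAME = value` entries. It works on a string rather than a real file, so where sqlnet.ora lives and whether to edit it on the client, the server, or both are left to the reader's environment.

```python
import re


def set_sqlnet_param(text: str, name: str, value: str) -> str:
    """Return sqlnet.ora content with `name = value` set.

    Replaces the first existing single-line assignment (case-insensitive),
    or appends a new line if none exists. Lines commented out with '#'
    are left untouched. Hypothetical helper, not an Oracle tool.
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(name)}[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f"{name} = {value}"
    if pattern.search(text):
        # A callable replacement avoids backslash processing in `value`.
        return pattern.sub(lambda _m: new_line, text, count=1)
    separator = "" if text == "" or text.endswith("\n") else "\n"
    return f"{text}{separator}{new_line}\n"


if __name__ == "__main__":
    # Made-up sample content standing in for an existing sqlnet.ora.
    sample = "SQLNET.EXPIRE_TIME = 10\n# DISABLE_OOB = off\n"
    print(set_sqlnet_param(sample, "DISABLE_OOB", "ON"))
```

Keeping the helper as a pure string transformation makes it easy to review the resulting file content before anything is written to disk.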

To be honest, this is the first time I have run into this problem too, so I am making a brief note of it here.

For more detailed information, see the Net Services Administrator's Guide.
