硬件说明:
操作系统版本:ORACLE LINUX 6.3 64位
数据库版本:11.2.0.3 64位
问题说明:
在检查数据库的alert日志的时候,发现大量的12170和TNS-12535的错误;
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
Time: 06-APR-2014 10:46:14
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505
TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=110.80.1.83)(PORT=50226))
Sun Apr 06 10:46:24 2014
问题解决:
在metalink平台上面查找,该症状和文档 ID (1628949.1)描述的症状完全一样,根据文档的内容整理如下:
1、出现问题的版本
Oracle Net Services - Version 11.2.0.3 to 12.1.0.1 [Release 11.2 to 12.1]Information in this document applies to any platform.
2、出现错误的症状或报错格式如下:
Fatal NI connect error 12170.VERSION INFORMATION:TNS for 64-bit Windows: Version 11.2.0.3.0 - ProductionOracle Bequeath NT Protocol Adapter for 64-bit Windows: Version 11.2.0.3.0 - ProductionWindows NT TCP/IP NT Protocol Adapter for 64-bit Windows: Version 11.2.0.3.0 - ProductionTime: 22-FEB-2014 12:45:09Tracing not turned on.Tns error struct:ns main err code: 12535TNS-12535: TNS:operation timed outns secondary err code: 12560nt main err code: 505TNS-00505: Operation timed outnt secondary err code: 60nt OS err code: 0***Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=121.23.142.141)(PORT=45679))
The PORT field here is the ephemeral port assigned to the client for this connection. This does not correspond to the listener port.
3、问题的原因
The alert.log message indicates that a connection was terminated AFTER it was established to the instance. In this case, it was terminated 2 hours and 3 minutes after the listener handed the connection to the database.
This would indicate an issue with a firewall where a maximum idle time setting is in place.
The connection would not necessarily be "idle". This issue can arise during a long running query or when using JDBC Thin connection pooling. If there is no data 'on the wire' for lengthy
periods of time for any reason, the firewall might terminate the connection.
4、解决方法:
The following parameter, set at the **RDBMS_HOME/network/admin/sqlnet.ora, can resolve this kind of problem. DCD or SQLNET.EXPIRE_TIME can mimic data transmission between the server and the client during long periods of idle time. SQLNET.EXPIRE_TIME=n Where <n> is a non-zero value set in minutes.
进入ORACLE_HOME/network/admin目录下,添加sqlnet.ora文件,增加一行SQLNET.EXPIRE_TIME=10
5、补充说明SQLNET.EXPIRE_TIME
To specify a time interval, in minutes, to send a check to verify that client/server connections are active. The following usage notes apply to this parameter:
- Setting a value greater than 0 ensures that connections are not left open indefinitely, due to an abnormal client termination.
- If the probe finds a terminated connection, or a connection that is no longer in use, then it returns an error, causing the server process to exit.
- This parameter is primarily intended for the database server, which typically handles multiple connections at any one time.
-
Limitations on using this terminated connection detection feature are:
- It is not allowed on bequeathed connections.
- Though very small, a probe packet generates additional traffic that may downgrade network performance.
- Depending on which operating system is in use, the server may need to perform additional processing to distinguish the connection probing event from other events that occur. This can also result in degraded network performanc
6、做完以上操作后,重启数据库的监听;
其他补充
原因
1、网络攻击,例如:半开连接攻击
Server gets a connection request from a malcious client which is not supposed to connect to the database,in which case the error thrown is the correct behavior.You can get the client address for which the error was thrown via sqlnet log file.
2、Client在default 60秒内没有完成认证
The server receives a valid client connection request but the client tabkes a long time to authenticate more than the default 60 seconds.
3、DB负载太高
The DB server is heavily loaded due to which it cannot finish the client logon within the timeout specified.
WANGING:inbound connection timed out (ORA-3136)
解决方法:
其实这个参数跟监听的一个参数有关:SQLNET.INBOUND_CONNECT_TIMEOUT
这个参数从9i开始引入,指定了客户端连接服务器并且提供认证信息的超时时间,如果超过这个时间客户端没有提供正确的认证信息,服务器会自动中止连接请求,同时会记录试图连接的IP地址和ORA-12170:TNS:Connect timeout occurred错误。
这个参数的引入,主要是防止DoS攻击,恶意攻击者可以通过不停的开启大量连接请求,占用服务器的连接资源,使得服务器无法提供有效服务。在10.2.0.1起,该参数默认设置为60秒。
但是,这个参数的引入也导致了一些相关的Bug。比如:
Bug 5594769 - REMOTE SESSION DROPPED WHEN LOCAL SESSION SHARED AND INBOUND_CONNECT_TIMEOUT SET
Bug 5249163 - CONNECTS REFUSED BY TNSLSNR EVERY 49 DAYS FOR INBOUND_CONNEC_TIMEOUT SECONDS
该参数可以通过设置为0来禁用,在服务端:
1)、设置sqlnet.ora文件:SQLNET.INBOUND_CONNECT_TIMEOUT=0;
2)、设置listener.ora文件:INBOUND_CONNECT_TIMEOUT_listenername=0;
3)、然后reload或者重启监听。