Ivan Kartik - Oracle and Linux Blog

How to not put company in to a risk (from DBA perspective)

Let me start this post with following quotation "People are definitely a company's greatest asset. A company is only as good as the people it keeps".
I fully agree with these sentences because regarding current topic it's definitely true and appropriate.

Internet is a great thing, it's a perfect source of information. Back to 1998 in times when I started with my first steps with Oracle Database there wasn't so many public knowledge sources about it. Software had been shipped on CDs and it had been very rare to see online documentation. Not many books in the book store, no Blogs and of course no Oracle ACE program as well :-). Those days are just memories now and by the time Internet became great source of information or software like Oracle Database XE thus place where one can get great portion of knowledge or even lessons learnt from other people thus it's very easy to leverage the experience. The side effects of this transformation of Internet is that many times people take information found there as a single point of truth, many people have slow down their skills development, because many solutions for potential problems can be easily found there moreover less experienced people don't study things closely they found or they are not so careful. The potential result could be at least unpleasant.


1. Your company may become under-licensed

I hear very often that "Oracle is too expensive" or "We pay a lot of money for it". Honestly I agree, it is expensive but also has lot to offer for that price. But most of the time when I hear such complains, I always ask questions like "Does it need to be?", "Do you really need all stuff (options, packs or other features) you pay for?". But there is another potential problem I'd like to point out in this paragraph. Every company has (should have) department or specific group of people responsible for Software Asset Management (or precisely Software Licenses Management). They maintain the evidence of license contracts, current price holds and usually they should have software repository which holds all information about used software and purchased licenses. What does it with IT staff like DBAs? What does it have with Oracle as technology itself? A quite lot.

In old times DBA's world and scope was everything what was stored or has happened within a database. Experienced DBA knew that not just DB configuration and application code have influenced database performance, so they felt that it has been necessary to see things in full picture thus to have solid understanding of full stack from the bottom to the top which means to understand things around storage layer, OS layer, network layer, DB layer (of course) and application layer as well (SQL and PL/SQL). This knowledge is and always has been influential to the fast identification and resolution of issues. In fact exactly this approach came later known as DBA 2.0 - The Next Generation DBA initiative back in 2008 (perhaps too late but at least...).
Database area has moved further since 2008 (now probably it's time for DBA 3.0) and in these days modern DBA should be familiar with current trends, e.g. Cloud technologies, automation and also probably the most underestimated thing in DBA's world - Licensing.
As I stated in the beginning of this article amount of information on Internet growth over years dramatically and it is possible to find many Cookbooks, HOWTOs or even scripts.

Consider this example taken from known web page found on Internet listed in top 3 results of search engine) according to page rank it's obvious that this page might have huge number of visitors:

---------------------------------
If you need to create a snapshot manually, because you don’t like the one-hour interval, or if you disabled taking snapshots at all:

EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;

Create a report:

@$ORACLE_HOME/rdbms/admin/awrrpt.sql

This script will ask you for the format of the report (html or plain text), the snapshot ID for start and end of the report (an overview of the last n day’s snapshots is given) and for a report name, that’s used for the report file name (.html or .lst).

Stay happy with your databases,
---------------------------------

Now what's the catch here? Let me ask you couple of questions:

- Did you know that AWR feature is part of Diagnostic pack which is separately licensed pack?
- Does have your Oracle Database Diagnostic pack licensed?
- Does your instance control usage of Diagnostic pack?

If your answer was 3x "No" (or "No" to last two questions) your database might be under-licensed at this moment. In this particular case the absence of this knowledge and using the default configuration settings resulted to under-licensed environment. Answer for the second question should be given by department or team responsible for Software Asset Management (mentioned at the top of this article).
Each new version of the database comes with new and interesting features, not just for administrators but also for developers. It's almost impossible to expect from developer that he/she would know how the features are licensed but this is the point which DBA should be aware of, moreover only DBA is able to control usage of extra paid features.
Well perhaps, in this case the script is free of charge but database software and especially feature which is being used is not free at all.
Let's say that your DB environment has (just) 4xCPU (Intel) cores while list price (CPU metric) for Diagnostic pack is approx. 5000 USD (check your price hold), that means approx. 10000 USD (CapEx) and (approx.) 2200+ USD (OpEx every year) "damage" for your company. The web page states "Stay happy with your databases", unfortunately in this perspective it sounds more like sarcasm. Although Diagnostic pack is very useful pack but this is not definitely the best way how to tell to the management "We need it!".

I checked several Blogs listed in top 20 results given by my favourite search engine. Almost of them were from well known people in Oracle community (most of them had been written by members of the Oracle ACE program, thus more less trusted persons in the Oracle community) but only one of them contained disclaimer or warning about usage of extra paid option/pack. But also this particular page provides scripts where the disclaimer/warning is missing among other remarks in top of the script.

Knowledge is key here and study of new features shouldn't be limited to features only but I believe that study of licensing implications should be part of your standard practice. This is probably bad message for a Google/Stackoverflow DBAs, which are (very often) lazy and refusing continuous self-education, relying on the knowledge of the other's instead on their own. Unfortunately during my consultancies I often see that database environments are not licensed correctly, let's say 7 of 10 environments and that's really bad ratio.
Note that Licensing is really complex topic (don't hesitate to hire some consultant well experienced in this area) but to understand the basics could protect you from potential (non-technical) problems and your company as well.


2. Your system may become accidentaly unstable, unaccessible or even compromized

This paragraph is about different type of hidden danger while the source (Internet of course) remains the same. I'm not going to talk about it's generally bad idea to have accessible listener/dispatcher port accessible from Internet, about using of default passwords or other basic things. Let me explain the issue by example. I have created simple demo which is using the same command as in previous paragraph, you can find it on this link: http://ivan.kartik.sk/demos/pjdemo.html

Have you ever heard about "PasteJacking"? So, this is it. Now consider less skilled DBA with lack of knowledge, with bad (or not so good) stress factor in stressful situation such as performance degradation, outage of mission or business critical database. He finds a "quick-win solution", a script published in some article on the Internet. What he will do in this situation?
Now consider the script which may contain DROP commands or other sophisticated evil code...

Know the source and even if source is page of well known person in the community or industry doesn't necessarily mean that his pages weren't compromised thanks to bug in publishing system he is using.
You should always be careful and check and fully understand what you are going to "paste" to the SQLPlus, shell, etc.

Another story is using "solution" which is not appropriate for the problem. Some time ago some guys were facing to some issue where the flushing of buffer cache was a workaround that helped. Of course they found this workaround on Internet. You may guess what happened during another completely different issue although with similar symptoms, but unfortunately not the same. Yes, flushing of buffer cache was the first thing that guys tried in order to "fix" the problem. It didn't work for that particular issue moreover it has created another one and outstanding issue suddenly became even worse.
Well, glucose is commonly used in medicine but don't try give it to a patient with diabetes when he fainted due to lack of insulin.

Rock solid knowledge and adopting a good practices or habits is the key how to not put your database in to problems or even danger. With rock solid knowledge you will understand Blogs and articles published on Internet as a additional information sources and not as a easy problem solvers. Knowledge is the asset for thus skilled people make company's greatest asset.

Let me finish this post with a funny quote: "Don't believe everything you read on or copying from the Internet. Abraham Lincoln"

Installation of Oracle Database 12c Release 2 on Redhat 7, Oracle Linux 7

Here is short installation HOWTO for Oracle Database 12c Release 2 for Redhat 7, Oracle Linux 7 (and possibly CentOS 7 which is not officially certified and supported distribution). This post covers Oracle Database software installation (only). In case you want to use ASM or other Grid features and you don't have Grid Infrastructure 12c R2 installed refer to my previous post: http://ivan.kartik.sk/index.php?controller=post&action=view&id_post=76 If you don't need ASM or GI features you can continue with this part directly.
This post also contains some steps which were used during Oracle Grid Infrastructure 12c R2 installation, so you can skip those if you followed my GI installation HOWTO.

1. OS preparation

# - stands for "root" action
$ - stands for "oracle" (owner of Oracle software)

Note: You can skip this step if you have installed the Grid Infrastructure Software.

# groupadd dba
# groupadd oinstall
# useradd -g oinstall -G dba oracle

Create ORACLE_HOME directory. (Note: In my example I use /opt/oracle/12201 directory, change it to your preferred location):

# mkdir -p /opt/oracle/12201
# chown -R oracle:oinstall /opt/oracle

2. Install required packages

Note: You can skip this step completely once you have installed the Grid Infrastructure Software (as described in my previous post related to Grid Infrastructure installation).

First install required packages (use this command if your computer has access to Internet or you have registered and enabled your media as yum repository).

Add/set kernel parameters to/in /etc/sysctl.d/99-sysctl.conf file:

fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
kernel.panic_on_oops = 1
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500

Execute sysctl -p command to apply settings:

# sysctl -p

Add/set limits to/in /etc/security/limits.d/99-oracle-rdbms.conf file:

oracle   soft   nofile    1024
oracle   hard   nofile    65536
oracle   soft   nproc    16384
oracle   hard   nproc    16384
oracle   soft   stack    10240
oracle   hard   stack    32768
oracle   hard   memlock    134217728
oracle   soft   memlock    134217728

4. Software download and install

Download installation archive (linuxx64_12201_database.zip) from http://www.oracle.com/technetwork/database/enterprise-edition/downloads/oracle12c-linux-12201-3608234.html

Unpack the installation archive and start the installer:

$ unzip linuxx64_12201_database.zip
$ cd database
$ ./runInstaller

Next steps after installation

In case you want startup your Oracle Database(s) automatically and you don't have Oracle Grid Infrastructure installed, you can follow the steps described in the post Configuring automatic startup of Oracle Database under systemd on RHEL 7/OEL 7/CentOS 7 at http://ivan.kartik.sk/index.php?controller=post&action=view&id_post=65

 

 

 

Installation of Oracle Grid Infrastructure 12c Release 2 on Redhat Linux 7 and Oracle Linux 7

I prepared short installation HOWTO Oracle Grid Infrastructure 12c Release 2 for Redhat 7, Oracle Linux 7 (and Centos 7). Installation on first two (supported) Linux distros was almost fine, just some workaround were needed. But you can expect some problems on (unsupported) Centos distro where some workarounds were needed in order to have fully working product. These specific workarounds you can at the end of this post. This post also tries to answer some potential questions you may have.

Oracle 12cR2 brought interesting new feature called Oracle ASM Filter Driver (ASMFD) which is in fact replacement for ASMLIB. As official documentation states ASMFD simplifies configuration and management of disk devices for Oracle ASM (just like AMSLIB does) moreover ASMFD is using a "Filter Driver" that rejects any invalid I/O request in order to protect ASM disks by filtering out non-Oracle I/O operations which could cause accidental overwrites thus corruption. Note that ASMFD cannot be used simultaneously with ASMLIB.

So (logically) this article covers installation of Oracle GI 12.2.0.1 aka 12cR2 for x86-64 Linux using ASMFD feature. I've used "Minimal" version of Linux installation.

1. OS preparation

# - stands for "root" user action
$ - stands for "oracle" user (owner of Oracle software) action

# groupadd dba
# groupadd oinstall
# useradd -g oinstall -G dba oracle

Create $GI_HOME directory. (Note: In my example I use /opt/oraclecrs/12201 directory, change it to your preferred location):

# mkdir -p /opt/oraclecrs/12201
# chown -R oracle:oinstall /opt/oraclecrs

2. Install required packages

Install required packages (use this command if your computer has access to Internet or you have registered and enabled your media as yum repository). Note: Some packages are required by Database installation which I expect it will be the next step once Oracle GI has been installed. If you are using Oracle Linux or in case of Redhat if have configured OL public YUM repository then you can use "yum install oracle-database-server-12cR2-preinstall" and it will install all required packages and also perform Linux parameters configuration described in next step (So, you can skip this step and step#3 as well). Note, in case of Redhat this package will install also UEK kernel.

# yum install unzip xorg-x11-utils xorg-x11-apps xorg-x11-utils compat-libcap1 libstc++-devel \ 
gcc-c++ glibc-devel nfs-utils-1.2.3-15 ksh libaio-devel sysstat smartmontools

3. Set OS kernel parameters and user limits

Add/set kernel parameters to/in /etc/sysctl.d/99-sysctl.conf

fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
kernel.panic_on_oops = 1
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500

Execute sysctl -p command:

# sysctl -p

Add/set limits to/in /etc/security/limits.d/99-oracle-rdbms.conf:

oracle   soft   nofile    1024
oracle   hard   nofile    65536
oracle   soft   nproc    16384
oracle   hard   nproc    16384
oracle   soft   stack    10240
oracle   hard   stack    32768
oracle   hard   memlock    134217728
oracle   soft   memlock    134217728

4. Software download and deploy

Create install directory in /home/oracle for downloaded software:

$ mkdir /home/oracle/install

Download linuxx64_12201_grid_home.zip file from http://www.oracle.com/technetwork/database/enterprise-edition/downloads/oracle12c-linux-12201-3608234.html to the /home/oracle/install directory.

Deploy GRID software:

$ cd /opt/oraclecrs/12201
$ unzip -q /home/oracle/install/linuxx64_12201_grid_home.zip

Note: Since Oracle 12cR2 you can unzip archive within $GI_HOME.

5. Install package cvuqdisk RPM package

Install $GI_HOME/cv/rpm/cvuqdisk-1.0.10-1.rpm package from $GI_HOME. Note in case of Oracle Linux with internet connection or Redhat with configured OL public YUM repository you can use "yum install cvuqdisk" command.

# yum localinstall /opt/oraclecrs/12201/cv/rpm/cvuqdisk-1.0.10-1.rpm

6. ASM disk configuration (ASMFD)

Now we need to prepare (stamp) the disks that will be used for Oracle ASM. Note in my case /dev/sdb and /dev/sdc are dedicated for ASM.

# /opt/oraclecrs/12201/bin/asmcmd afd_label DATA_0000 /dev/sdb --init
# /opt/oraclecrs/12201/bin/asmcmd afd_label DATA_0001 /dev/sdc --init

Just simple check if disks are prepared:

# /opt/oraclecrs/12201/bin/asmcmd afd_lslbl /dev/sdb
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA_0000                             /dev/sdb

# /opt/oraclecrs/12201/bin/asmcmd afd_lslbl /dev/sdc
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA_0001                             /dev/sdc

Possible error:

# ./bin/asmcmd afd_label DATA_0000 /dev/sdb
/opt/oraclecrs/12201/bin/kfod: line 22: /scratch/app/user/product/12.2.0/grid/bin/kfod.bin: No such file or directory

Solution:

Modify $GI_HOME/bin/kfod (in my case /opt/oraclecrs/12201/bin/kfod ) and replace path in OHOME variable to correct one.
Change this entry: OHOME=/scratch/app/user/product/12.2.0/grid to this: OHOME=/opt/oraclecrs/12201

 

7. Run gridSetup in order to configure Grid infrastructure

$ cd /opt/oraclecrs/12201
$ ./gridSetup.sh

 8. Use "Configure Oracle ASM Filter Driver" option during ASM disk group creation

 undefined

When your installation was finished successfully you can log on to ASM instance in order to check v$asm_disk.

 

SQL> select path, name, library, header_status from v$asm_disk;

PATH            NAME            LIBRARY                                                          HEADER_STATU
--------------- --------------- ---------------------------------------------------------------- ------------
AFD:DATA_0000   DATA_0000       AFD Library - Generic , version 3 (KABI_V3)                      MEMBER
AFD:DATA_0001   DATA_0001       AFD Library - Generic , version 3 (KABI_V3)                      MEMBER

As you can see AFD Library (thus ASMFD) is being used. Output is very similar to output from ASM Library (ASMLIB).
Just for comparison there is output from ASMLIB configuration:

SQL> select path, name, library, header_status from v$asm_disk;

PATH            NAME            LIBRARY                                                          HEADER_STATU
--------------- --------------- ---------------------------------------------------------------- ------------
ORCL:DATA_0000  DATA_0000       ASM Library - Generic Linux, version 2.0.12 (KABI_V2)            MEMBER
ORCL:DATA_0001  DATA_0001       ASM Library - Generic Linux, version 2.0.12 (KABI_V2)            MEMBER

 

Possible Errors and Solutions:

Problem:
[INS-41223] ASM Filter Driver is not supported on this platform

AFD-620: AFD is not supported on this operating system version: ...
AFD-9201: Not Supported

Solution: Well, message is clear, you are using unsupported Linux distribution (e.g.: Centos). If you have a good reason to play with Centos see "Centos specific important steps" bellow, otherwise I would suggest to use certified Linux distribution (e.g.: Redhat Enteprise Linux, Oracle Linux).


Problem:
[INS-41223] ASM Filter Driver is not supported on this platform
Action - To proceed, do not specify or select the Oracle ASM Filter Driver option.
Additional Information:
AFD-9202: AFD can not be installed/loaded because ASMLib is installed.
AFD-9201: Not Supported

Solution: Again, message is very clear. You can not use ASMFD while ASMLIB is in use. So you have to choose one. If there is reason for using ASMLIB then don't check the "Configure Oracle ASM Filter Driver" option during ASM disk group creation.

Problem:
bin/kfod: line 22: /scratch/app/user/product/12.2.0/grid/bin/kfod.bin: No such file or directory

Solution:
Edit $GI_HOME/bin/kfod script and replace path in OHOME=/scratch/app/user/product/12.2.0/grid to correct one.

 

Centos specific important steps:

So you are stubborn, well then... :-) Couple of tweaks have to be perfomed prior "gridSetup" execution (before step #7).

1. Edit $GI_HOME/lib/osds_acfslib.pm file and add following line:

($release =~ /^centos-release/) ||    # Centos Linux

- after line 359 (as next line after "enterprise-release" condition)
- after line 911 (as next line "enterprise-release" condition, originally at 910 line)
- after line 1146 (as next line "enterprise-release" condition, originally at 1144 line)

2. Edit /etc/redhat-release file

# cp /etc/redhat-release /etc/redhat-release.old
# sed -e 's/Centos/Redhat/' /etc/redhat-release.old > /etc/redhat-release

 Now you can execute gridSetup.

 

Next step after installation:

Installation of Oracle Database 12c Release 2 Database Software - http://ivan.kartik.sk/index.php?controller=post&action=view&id_post=78

 

Oracle Database 12c Release 2 available for download

Pretty long awaited second release of 12c (On-Premise) version is (apparently) officially available for download at: http://www.oracle.com/technetwork/database/enterprise-edition/downloads/index.html

Currently Oracle Solaris SPARC, Oracle Solaris x86-64 and (of course) Linux x86-64 versions are available.
As always I suggest to read a New features Guide in order to explore new things brought by current release: http://docs.oracle.com/database/122/NEWFT/toc.htm

Enjoy learning and exploring...

 

Oracle Database 12cR2 online documentation available

Following the Cloud-first strategy Oracle has announced the general availability of Oracle Database 12c Release 2 via the new Oracle Exadata Express Cloud Service. Software for on-premises installation has not been released yet but if you want to learn what's new or what is coming in the 12cR2 you can browse documentation which is available at: http://docs.oracle.com/en/database/

ACFS issue with latest UEK3 kernels on OEL 7/RHEL 7

One of my colleagues has been facing to weird behaviour of ACFS during installation of two nodes RAC on RHEL 7 and OEL7 (started on RHEL, then tried OEL). ACFS volume creation (during mkfs execution) one of the nodes has either hung or thrown error like this:

mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-32
mkfs.acfs: CLSU-00100: operating system function: ioctl failed with error data: 1
mkfs.acfs: CLSU-00101: operating system error message: Operation not permitted
mkfs.acfs: CLSU-00103: error location: OI_0
mkfs.acfs: ACFS-00546: failed to change on-disk signature
mkfs.acfs: ACFS-01004: /dev/asm/goldengate-32 was not formatted.

After couple of days of his struggling I decided to help him to find out the problem because my colleague is very experienced thus I knew that he has done everything correctly (well, definitely it wasn't his first rodeo). Moreover I have done similar installation just few days before, without any problem. And that was interesting. Firstly I checked logs and of course configuration (udev, multipath, etc.), it had been correct and traced (using one of my closest friends since 2000) that process with this result:

2216  stat("/dev/asm/goldengate-34", {st_mode=S_IFBLK|0770, st_rdev=makedev(251, 17409), ...}) = 0
2216  open("/dev/ofsctl", O_RDWR)       = 9
2216  ioctl(9, 0xffffffffc1387015, 0x7ffdbee52880) = -1 EPERM (Operation not permitted)

 And this was really weird as permissions for these devices are defined by Udev configuration which is created during installation.

Gathered all important information about environment I've created an action plan. Installation of the same environment on my virtual machine. Well I'd rather say almost the same as I decided to not install the clustered environment in first step and of course used hardware was different and hypervisor has been as well. Version of all software was pretty much same as in original (my colleague's) environment, except one so important thing - Linux kernel. I decided to use original (non UEK) kernel first.
Installation went smoothly, ACFS had smooth configuration and later functionality too, so two options (or unanswered question) remained - whether it's kernel related or RAC related problem where the first option was my preference according to previous tracing.
So I've installed exact version (3.8.13-118.6.2.el7uek) of Unbreakable Kernel provided by Oracle just as my colleague did. And here is the result (Note that it was an intention to perform all steps manually):

$ uname -a
Linux el1 3.8.13-118.6.2.el7uek.x86_64 #2 SMP Thu May 19 13:15:51 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
$ acfsdriverstate supported
ACFS-9200: Supported

# acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.

# lsmod | grep ora
oracleacfs           3498177  0
oracleadvm            594197  0
oracleoks             503994  2 oracleacfs,oracleadvm

$ asmcmd volinfo --all
no volumes found
$ asmcmd volcreate -G GG GOLDENGATE -s 4G
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

$ mkfs -t acfs /dev/asm/goldengate-34
mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-34
mkfs.acfs: CLSU-00100: operating system function: ioctl failed with error data: 1
mkfs.acfs: CLSU-00101: operating system error message: Operation not permitted
mkfs.acfs: CLSU-00103: error location: OI_0
mkfs.acfs: ACFS-00546: failed to change on-disk signature
mkfs.acfs: ACFS-01004: /dev/asm/goldengate-34 was not formatted.

Bingo! I've got the same problem on my test machine, so I was able to reproduce the problem. Now it's clear that is not related to a clustered configuration. Ok, let's see the trace output:

2061  stat("/dev/asm/goldengate-34", {st_mode=S_IFBLK|0770, st_rdev=makedev(251, 17409), ...}) = 0
2061  open("/dev/ofsctl", O_RDWR)       = 9
2061  ioctl(9, 0xffffffffc1387015, 0x7ffd26305310) = -1 EPERM (Operation not permitted)

and permissions:

$ ll /dev/asm/.asm_ctl_spec
brwxrwx--- 1 root dba 251, 0 Jun  1 23:13 /dev/asm/.asm_ctl_spec
$ ll /dev/asm/goldengate-34
brwxrwx--- 1 root dba 251, 17409 Jun  1 23:24 /dev/asm/goldengate-34
$ ll /dev/ofsctl
brw-rw-r-- 1 root dba 250, 0 Jun  1 23:24 /dev/ofsctl

Output from tracing was the same and permissions are the same as defined in Udev rules and also the are correct. So according to this result I decided to remove ACFS related configuration and check the configuration and functionality of ACFS again using several older UEK3 kernels. Here is the result from the test of one of them (3.8.13-118.4.2.el7uek.x86_64):

$ uname -a
Linux el1 3.8.13-118.4.2.el7uek.x86_64 #2 SMP Tue Mar 22 20:46:48 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
$ acfsdriverstate supported
ACFS-9200: Supported

# acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
# lsmod | grep ora
oracleacfs           3498177  0
oracleadvm            594197  0
oracleoks             503994  2 oracleacfs,oracleadvm

$ asmcmd volinfo --all
no volumes found
$ asmcmd volcreate -G GG GOLDENGATE -s 4G
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

$ mkfs -t acfs /dev/asm/goldengate-34
mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-34
mkfs.acfs: volume size               = 4294967296  (   4.00 GB )
mkfs.acfs: Format complete.

# mount -t acfs /dev/asm/goldengate-34 /mnt/
# mount | grep mnt
/dev/asm/goldengate-34 on /mnt type acfs (rw,relatime,device,rootsuid,ordered)
# touch /mnt/testfile
# ll /mnt/testfile
-rw-r--r-- 1 root root 0 Jun  2 00:52 /mnt/testfile

 It works, we were able to format, mount and use the ACFS volume thus now we know that downgrade the UEK3 kernel is a workaround which solves the issue with ACFS until time when bug will be fixed by newer release of UEK3 kernel.

 A bonus case: What if the Sys Admin will upgrade kernel thus we use the "buggy" version of UEK3 kernel on already configured ACFS volumes? I did several repeating tests on already configured and previously working ACFS volume. Let see what will happen:

1st run:

$ uname -a
Linux el1 3.8.13-118.6.2.el7uek.x86_64 #2 SMP Thu May 19 13:15:51 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux

# acfsload start
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed

$ asmcmd volenable -G GG GOLDENGATE
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /mnt

# mount -t acfs /dev/asm/goldengate-34 /mnt/
# mount | grep mnt
/dev/asm/goldengate-34 on /mnt type acfs (rw,relatime,device,rootsuid,ordered)
# touch /mnt/testfile2
# ll /mnt/testfile*
-rw-r--r-- 1 root root 0 Jun  2 00:52 /mnt/testfile
-rw-r--r-- 1 root root 0 Jun  2 01:18 /mnt/testfile2

Result of 1st run - working.

2nd run:

# Kernel panic - not syncing: Holding spin lock
rid: 1925, comm: mount Tainted: PF 0 3.8.13-118.6.2.e17uek.x86_64 #2
all Trace:
[<ffffffff81574fb0>] panic+Oxc8/0x1d7
[<ffffffffa0484a7a>] __KsPanic+0x9a/Oxa0 [oracleoks]
[<ffffffffa072275d>] ? OfsDoMountRecovery+Oxldd/Ox2b0 [oracleacfs]
[<ffffffffa0722a5d>] OfsDoPhaselRecovery+0x22d/Ox4e0 [oracleacfs]
[<ffffffffa0722e28>] OfsWaitForRecoveryToComplete+0x118/0x3c0 [oracleacfs]
[<ffffffffa0667446>] OfsSetupVolume+Oxc6/0x2ff0 [oracleacfs]
[<ffffffffa048507c>] ? KsSleepEvent+Ox8c/Oxce [oracleoks]
[<ffffffff81081e90>] ? wake_up_bit+Ox30/0x30
[<ffffffffa04851bc>] ? KsCreateSystemThread+Ox10c/Ox1b0 [oracleoks]
[<ffffffffa066b6dd>] OfsMountVolume+0x136d/Ox27e0 [oracleacfs]
[<ffffffffa0766e8f>] ofs_fill_sb+0x6f/Ox7e0 [oracleacfs]
[<ffffffff8118af58>] mount_bdev+0x1b8/0x200
[<ffffffffa0766e20>] ? ofs_parse_flags+0x140/0x140 [oracleacfs]
[<ffffffffa0763ec7>] ofs_mount+Ox107/0x2f0 [oracleacfs]
[<ffffffff8118b8e9>] mount_fs+0x39/0x1b0
[<ffffffff811a5dc7>] ? alloc_vfsmnt+Oxd7/0x1b0
[<ffffffff811a5f3f>] vfs_kern_mount+Ox5f/Oxf0
[<ffffffff811a8250>] do_mount+0x220/0xaf0
[<ffffffff8114343b>] ? strndup_user+Ox4b/Oxf0
[<ffffffff811a8ba3>] sys_mount+0x83/0xc0
[<ffffffff81587179>] system_call_fastpath+0x16/0x1b

Result of the 2nd run was kernel panic, machine hung of course.

3rd run:

# acfsload start
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
acfsutil plogconfig: CLSU-00100: operating system function: ioctl failed with error data: 22
acfsutil plogconfig: CLSU-00101: operating system error message: Invalid argument
acfsutil plogconfig: CLSU-00103: error location: OI_0
acfsutil plogconfig: ACFS-03500: Unable to access kernel persistent log entries.
ACFS-9225: Failed to start OKS persistent logging.
ACFS-9322: completed

Result of 3rd run: finished with CLSU-00100: operating system function: ioctl failed with error data: 22 , CLSU-00101: operating system error message: Invalid argument, CLSU-00103: error location: OI_0 error messages. After several tries of manual startup it has started successfully.

I did seven more runs and final result was 4x working, 5x  kernel panic, 1x error message during startup, so there is 60% chance that your system won't work.

Final conclusions

  • problem is not related to clustered environment
  • based on several tests (not shown in this article) 11gR2 (11.2.0.4 + APR 2016 PSU) and 12cR1 (12.1.0.2 + Apr 2016 PSU) are affected by this issue
  • problem is related to specific UEK3 kernel versions, to be precise: 3.8.13-118.6.1.el7uek and 3.8.13-118.6.2.el7uek
  • (currently) last properly working version of UEK3 is 3.8.13-118.4.2.el7uek
  • supported version of default kernels (non UEK) is one of possible solution/workaround
  • using older UEK3 kernel is a temporary workaround, not solution until the bug will be fixed. There are two reasons to have updated kernel (security, bugfix)

 

Update:

Updated kernel 3.8.13-118.8.1.el7uek.x86_64 has been released and this bug doesn't occur with this version (and probably later versions which will come in future).

 Hope that helps...

 

Upgrade to 12c: Put a spotlight on Database client

It's been almost one year I had touched for last time draft version of this article and thanks to couple days of free time I could finish it so I'm happy that it will not remain as unfinished symphony :-). Everybody talks today about upgrade Oracle Database to 12c or discussing many new features of 12c, no matter if they are interesting or not so interesting, really important or not so important, extra paid or not.
You've decided that you can rely on new version, it seems stable, you gained sufficient information about new and obsoleted/deprecated stuff and made couple of tests of upgrade process on your testing environment. Everything looks fine, so you proceed to real upgrade. Database is upgraded and everything seems to be OK but your applications and colleagues are reporting problems with access to freshly upgraded database. Well... database upgrade is not just about database upgrade.

Case 1: ORA-28040: No matching authentication protocol

This error is thrown when used version of authentication protocol by client does not meet minimum authentication protocol allowed for clients.
In 12cR1 comes with default settings for minimal version of authentication protocol set to 11, thus all clients which are using older version (older than 11g) will produce above error.

Workaround: Add following line to sqlnet.ora (on server side)

SQLNET.ALLOWED_LOGON_VERSION_CLIENT=8


Case 2: ORA-28040: No matching authentication protocol or ORA-03134: Connections to this server version are no longer supported

This error is thrown when you are using database link to older release for the same reason as in previous case.

Workaround: Add following line to sqlnet.ora

SQLNET.ALLOWED_LOGON_VERSION_SERVER=8

Case 3: Segmentation fault

Solution:
Password versions problem. Upgrade your client (at least to 10gR1 version) as it's obviously older than old.

If you hit one of above problems it means that you didn't pay (so much) attention (or you weren't so precise) to clients and 3rd party software that is connecting to your database and the quality of your preparation (I will discuss the preparation phase in one of future posts) for upgrade is... questionable because you probably didn't (and you definitely should) read this:

Client / Server Interoperability Support Matrix for Different Oracle Versions (Doc ID 207303.1)
and this:
Database Net Services Reference - https://docs.oracle.com/database/121/NETRF/sqlnet.htm#NETRF006

Note that above parameters in sqlnet.ora are just bloody workaround, it's not a solution. By setting of lower version by these parameters you've just bought a uncertain amount of time. Again I suggest to check the MOS Doc ID 207303.1.

Getting more information about Database client

In this part we take a look at some possibilities how to get more information about clients in order to identify who and what is using old client and which version client is used for particular session.
For that purpose we can use (G)V$SESSION_CONNECT_INFO (view based on X$KSUSECON) which since 11gR1 version is a great source of additional information about connections. This view (and X$ table) has been present even in earlier versions of Oracle Database (AFAIK I've seen it in 9i) but since 11gR1 provides more detailed thus useful information. For our report we will use it in combination with (G)V$SESSION view.

SYS@ORA11GR2:SQL>
SELECT
      /* a.sid, a.serial#, */ b.username,
      a.client_version, a.osuser,
      b.machine, b.module, b.program,
      a.client_charset , a.client_driver,
      substr(a.client_connection, 0, 4) client_conn,
      b.service_name,
      substr(b.server,0,1) server
FROM gv$session_connect_info a, gv$session b
WHERE a.sid = b.sid
      and a.serial# = b.serial#
      and b.module <> 'DBMS_SCHEDULER'
      and network_service_banner like 'TCP%'
 14  ORDER BY client_version, username;

USERNAME  CLIENT_VER OSUSER     MACHINE       MODULE                       PROGRAM                        CLIENT_CHARSET  CLIENT_DRI CLIE SERVICE_NAME  S
--------- ---------- ---------- ------------- ---------------------------- ------------------------------ --------------- ---------- ---- ------------- -
TEST1     11.1.0.7.0 oracle     lab01         JDBC Thin Client             JDBC Thin Client               Unknown         jdbcthin   Hete MYSERV        N
TEST1     11.2.0.2.0 updater    lab01         JDBC Thin Client             JDBC Thin Client               Unknown         jdbcthin   Hete ORA11GR2.TEST N
TEST2     11.2.0.2.0 oracle     lab01         JDBC Thin Client             JDBC Thin Client               Unknown         jdbcthin   Hete ORA11GR2.TEST D
TEST2     11.2.0.3.0 root       lab01         java@lab01 (TNS V1-V3)       java@lab01 (TNS V1-V3)         UTF8            jdbcoci    Hete ORA11GR2.TEST N
TEST3     11.2.0.4.0 kartik     MYDOM\LAB10   ngator.exe                   ngator.exe                     EE8MSWIN1250               Hete ORA11GR2.TEST D
TEST3     12.1.0.2.0 oracle     lab00         oracle@lab00 (TNS V1-V3)     oracle@lab00 (TNS V1-V3)       EE8ISO8859P2               Homo ORA11GR2.TEST N
TEST4     12.1.0.2.0 oracle     lab00         oracle@lab00 (TNS V1-V3)     oracle@lab00 (TNS V1-V3)       AL32UTF8                   Hete ORA11GR2.TEST D
TEST4     Unknown    oracle     oel01         httpd@oel01 (TNS V1-V3)      httpd@oel01 (TNS V1-V3)        Unknown                    Hete MYSERV        N
TEST5     Unknown    gtw        oel01         dbgtw.bin@oel01 (TNS V1-V3)  dbgtw.bin@oel01 (TNS V1-V3)    Unknown                    Homo ORA11GR2.TEST D
TEST5     Unknown    apache     oel02         httpd@oel02 (TNS V1-V3)      httpd@oel02 (TNS V1-V3)        Unknown                    Hete ORA11GR2.TEST N
TEST6     Unknown    apache     lab01         httpd@lab01 (TNS V1-V3)      httpd@lab01 (TNS V1-V3)        Unknown                    Hete ORA11GR2.TEST N
TEST6     Unknown    apache     oel02         httpd@oel02 (TNS V1-V3)      httpd@oel02 (TNS V1-V3)        Unknown                    Hete ORA11GR2.TEST N
TEST7     Unknown    kartik     mybook        SQL*Plus                     sqlplus@mybook (TNS V1-V3)     Unknown                    Homo ORA11GR2.TEST D

 According to above output we can easily identify servers which are using older version of Oracle Database client, older than 11gR1 to be precise. That session have the "Unknown" value in second column. And also we have other useful informations about client usage. For example, when we look at columns MACHINE, MODULE, PROGRAM, CLIENT_CONNECTION (CLIE column in above output) we know that sessions from OEL01 server are established from an Apache server which is running application (we can guess that there is some fastcgi, PHP, PL/SQL or other Apache module) and some other (probably standalone) application and both of them are using OCI connection. OK, this report is fine but it covers only current sessions but we can have clients/application/users which are connecting iregulary or occasionally.
So we need to be sure that we can identify the most of older clients as we can but this good concept for further step which will be automatic gathering of information.

For purpose of recording information about old clients we may use AFTER LOGON trigger.

CREATE TABLE systools.t_oclient_info
(
    version        varchar2(40),
    username       varchar2(30),
    osuser         varchar2(30),
    machine        varchar2(64),
    hostip         varchar2(15),
    module         varchar2(48),
    terminal       varchar2(30),
    oci_library    varchar2(27),
    client_charset varchar2(40),
    client_driver  varchar2(9),
    client_conn    varchar2(13),
    service_name   varchar2(64)
) TABLESPACE users;

CREATE OR REPLACE TRIGGER al_tr_log_oclient_info
AFTER LOGON ON DATABASE
WHEN (USER != 'SYS')
BEGIN
  INSERT INTO systools.t_oclient_info
  SELECT a.client_version,
         sys_context('USERENV', 'SESSION_USER') username,
         sys_context('USERENV', 'OS_USER') osuser,
         sys_context('USERENV', 'HOST') machine,
         sys_context('USERENV', 'IP_ADDRESS', 15),
         sys_context('USERENV', 'MODULE') module,
         sys_context('USERENV', 'TERMINAL') terminal,
         a.client_oci_library oci_library,
         a.client_charset,
         a.client_driver,
         substr(a.client_connection, 0, 4) client_conn,
         sys_context('USERENV', 'SERVICE_NAME') service_name
  FROM v$session_connect_info a
  WHERE  a.sid = sys_context('USERENV', 'SID')
         and a.network_service_banner like 'TCP%'
         and a.client_version = 'Unknown'
         and not exists (SELECT null
                          FROM systools.t_oclient_info
                          WHERE osuser = sys_context('USERENV', 'OS_USER')
                            and username = sys_context('USERENV', 'SESSION_USER')
                            and hostip = sys_context('USERENV', 'IP_ADDRESS', 15)
                            and module = sys_context('USERENV', 'MODULE'));
END;
/

Note that On-logon trigger can produce/add some overhead (in case of frequently connecting applications), so it's important to keep it as minimal as it's possible and do only necessary work. Logging is executed for all users except SYS, you might consider to reduce number of users if you are sure that no old client is connecting to particular user.
I've used information from SYS_CONTEXT instead of joining the V$SESSION view. The side effect of this change is that we lost PROGRAM column but at the same time we've got HOSTIP column with an IP address.
Also you should keep the T_OCICLIENT_INFO table as small as possible and that's the reason why I added the "not exists" condition (you can remove it, so no checking for existing entry, just pure insert for each session) and also you can limit number of rows in table by removing "OSUSER=sys_context('USERENV', 'OS_USER')" from conditions. Also keep on mind that only 10gR2 and older clients will be logged (as the client version number is returned for 11gR1). And finally note that this is a my way, perhaps it might not be a best way but from my point this was effective solution, as always you can choose your own way.

Now you need to give some time in order to log most (ideally and hopefully all) of the clients.

SYS@ORA11GR2:SQL> select * from systools.t_oclient_info;

VERSION    USERNAME  OSUSER     MACHINE   HOSTIP          MODULE                         TERMINAL   OCI_LIBRARY  CLIENT_CHARSET  CLIENT_DRI CLIE SERVICE_NAME
---------- --------- ---------- --------- --------------- ------------------------------ ---------- ------------ --------------- ---------- ---- -------------
Unknown    TEST7     kartik     mybook    192.168.56.1    sqlplus@mybook (TNS V1-V3)     pts/6      Unknown      Unknown                    Homo ORA11GR2.TEST
Unknown    TEST4     oracle     oel01     192.168.56.101  httpd@oel01 (TNS V1-V3)                                Unknown                    Hete MYSERV
Unknown    TEST5     gtw        oel01     192.168.56.101  dbgtw.bin@oel01 (TNS V1-V3)    pts/5      Unknown      Unknown                    Homo ORA11GR2.TEST
Unknown    TEST5     apache     oel02     192.168.56.102  httpd@oel02 (TNS V1-V3)                                Unknown                    Hete ORA11GR2.TEST
Unknown    TEST6     apache     oel02     192.168.56.102  httpd@oel02 (TNS V1-V3)                                Unknown                    Hete ORA11GR2.TEST
Unknown    TEST6     apache     lab01     192.168.56.111  httpd@lab01 (TNS V1-V3)                                Unknown                    Hete ORA11GR2.TEST

 Deep diving aka Oracle Client version identification and tracing for "Unknown"s

OK, we have got identified all clients older than 11gR1 and I hope (or believe) that previous part is sufficient for you. But there could be some reasons when you might need to know exact version of Oracle database client.
For instance you don't have access to remote computer which is connecting to your database and it might be very time consuming to gather this information or you have some legacy application
which can work with (or supports) the 12cR1 database but it's not so easy to replace the client or there could be another issue or reason typical for corporate environment (My story was that somebody didn't do his job by providing required information/action and I couldn't wait with the upgrade any longer so I had to know what to expect after upgrade thus prevent some unexpected behaviour due to old client).
By knowing of exact version we can better predict and address potential implications, issues or bugs (for example by checking existing bug database or knowledge base at MOS site) for that client version.

Question is how we can detect version remotely? What exactly we are looking for? And thanks to my little research (for security related reason) more than ten years ago I knew where to start. For better understanding we have to get answer for those two questions and to answer these questions we can use output from good old SQLNet (server) tracing. I've used 10gR2 client and 12cR1 database for that purpose. In produced trace file we will focus on handshake/negotiation phase where can find the needed hint(s) (Note I'm not going to explain whole structure of trace information nor detailed description of packets, it's not an topic and also it would need separate article for full topic coverage and that's not my intention).

undefined

So we've got first interesting line/entry:
nsconneg: vsn=313, lov=300, opt=0xc01, sdu=2048, tdu=32767, ntc=0x7f08

Where VSN represents the protocol version which can help to identify release of the client. So we can easily find that:

Version         Dec      Hex
-----------------------------
8               310     0x136
9iR1            311     0x137
9iR2            312     0x138
10gR1/10gR2     313     0x139
11gR1/11gR2     314     0x13a
12cR1           315     0x13b

But this method is not sufficient as:
1. This information is not reliable as the value may change. For instance 11.2.0.4 JDBC (in this case provided by Instant client package) contained following entry:
nsconneg: vsn=310, lov=300, opt=0xc41, sdu=8192, tdu=32767, ntc=0x4f98

2. This information doesn't provide exact version number.


So we have the look further and we can see that there is another interesting part wich comes with other two lines and this is the information we are looking for:
nau_rpv: Version received for Linux: Version 10.2.0.5.0 - UNKNOWN
nau_rpv: Connection version for Linux: Version 10.2.0.5.0 - UNKNOWN

Now let's check the trace file again and the corresponding packet in order to find out the location of this information.

undefined

If you look at the above picture you can see the TNS packet and the version in red rectangles. Also note the "DEADBEEF" sequence which is the way how we can identify the right packet in order to get this information. Using utility I've created years ago I translated the Hex value from red rectangles to Dec value and version value in order to check the version:

$ ./oravsn.sh a200500

HEX: A200500 VSNUM: 169870592 VER: 10.2.0.5.0

 And you can do the same thing by different way for example by using the Oracle Database:

SQL> select sys_op_version(to_number('a200500', 'xxxxxxxxx')) version from dual;

VERSION
----------------------------------------
10.2.0.5.0

SQL> select to_number('b200400', 'xxxxxxxxx') vsnum from dual;

     VSNUM
----------
 186647552

 Now we can use this information as input to search in the MOS bug database or knowledge base to find out existing or known issues associated with this version.

Maybe you ask why Oracle clients version 10gR2 and below have "Unknown" value instead of version number. Answer for this question can be found in trace file as well but we have to use 11gR1 client or later. For this purpose I've created another trace file using the 11gR2 client.

undefined

As you can see at the picture above there is SESSION_CLIENT_VERSION sent with the value 186647552 which is decimal representation of Oracle version.

$ ./oravsn.sh 186647552

HEX: b200400 VSNUM: 186647552 VER: 11.2.0.4.0

When you check SQLNet server trace files for clients older than 11gR1 you will find that this information is missing there and that's the reason for an "Unknown" value.

Automatic information gathering for "Unknown"s

Now we have to solve the question how to collect these information automatically and what tools we need to use for that purpose.
Firstly we need to think about the rules and known facts. For instance:
a) overhead must be as low as possible
b) we need to limit information gathering for specific hosts only
c) output should be as minimal as possible
d) Database is running on Linux

Based on above points it's clear that using of server side sqlnet tracing is pointless as it conflicts with a, b and c.
As the Database is running on Linux box with the 2.6 kernel so I chose the solution based on combination of iptables/netfilter (local firewall) and wireshark. Of course in case of different OS I would have to choose different solution.

So let's see the setup:

# iptables -N ORATRACE
# iptables -A ORATRACE -p tcp -m string --hex-string "|deadbeef|" --algo kmp --from 50 --to 110 -j NFLOG --nflog-group 15
# iptables -A ORATRACE -p tcp -j ACCEPT
# iptables -A INPUT -s 192.168.56.1 -p tcp -m state --state ESTABLISHED -m tcp --dport 1521 -j ORATRACE
# iptables -A INPUT -s 192.168.56.101 -p tcp -m state --state ESTABLISHED -m tcp --dport 1521 -j ORATRACE
# iptables -A INPUT -s 192.168.56.102 -p tcp -m state --state ESTABLISHED -m tcp --dport 1521 -j ORATRACE

I've created a new ORATRACE chain. All established connections from three specific hosts (logged in systools.t_oclient_info - see the first section of the article) which are connection to port 1521 (where the Oracle Listener is listening) are redirected to ORATRACE chain. Each packet which comes to ORATRACE chain is checked (using of Knuth–Morris–Pratt string searching algorithm) for presence of the "DEADBEEF" sequence. If this sequence has been found packet is logged using NFLOG. Note that this setting is not permanent use iptables-save in order to save this configuration.

Using a Wireshark tool we read the specific NFLOG group (15) and write it to file in Hexadecimal form as a text to file:

# tshark -i nflog:15 -x >> /home/oracle/oracle_clients.log &

Now we can periodically check the oracle_clients.log file and if there is existing entry for specific source IP address we can remove specific rule for that address from the iptables (You can eventually write and setup script for that purpose).
Now we check what we've got.

At first content of one of logged packet:

 24.460554 192.168.56.1 -> 192.168.56.103 TNS 292 Request, Data (6), SNS

0000  02 00 00 0f 08 00 01 00 08 00 01 02 05 00 0a 00   ................
0010  00 00 00 00 08 00 04 00 00 00 00 02 10 00 08 00   ................
0020  00 06 20 02 0a 00 27 00 00 00 45 81 06 00 0f 00   .. ...'...E.....
0030  00 01 00 00 06 00 11 00 00 0e 00 00 12 00 10 00   ................
0040  08 00 27 1e 87 94 0a 00 27 00 00 00 08 00 00 00   ..'.....'.......
0050  d4 00 09 00 45 00 00 d0 13 b4 40 00 40 06 34 bb   ....E.....@.@.4.
0060  c0 a8 38 01 c0 a8 38 67 c4 d5 05 f1 83 88 54 66   ..8...8g......Tf
0070  8d 96 78 7b 80 18 00 e5 0b 57 00 00 01 01 08 0a   ..x{.....W......
0080  04 f4 41 41 01 ae a9 05 00 9c 00 00 06 00 00 00   ..AA............
0090  00 00 de ad be ef 00 92 0a 20 04 00 00 04 00 00   ......... ......
00a0  04 00 03 00 00 00 00 00 04 00 05 0a 20 04 00 00   ............ ...
00b0  08 00 01 00 00 76 4f fd 8f 35 88 00 12 00 01 de   .....vO..5......
00c0  ad be ef 00 03 00 00 00 04 00 04 00 01 00 01 00   ................
00d0  02 00 01 00 03 00 00 00 00 00 04 00 05 0a 20 04   .............. .
00e0  00 00 02 00 03 e0 e1 00 02 00 06 fc ff 00 02 00   ................
00f0  02 00 00 00 00 00 04 00 05 0a 20 04 00 00 0c 00   .......... .....
0100  01 00 11 06 10 0c 0f 0a 0b 08 02 01 03 00 03 00   ................
0110  02 00 00 00 00 00 04 00 05 0a 20 04 00 00 03 00   .......... .....
0120  01 00 03 01

And now we use grep for shortening the output in order for better reading:

$ cat oracle_clients.log | egrep 'TNS|be ef'| grep -v 00c0
 21.740456 192.168.56.102 -> 192.168.56.103 TNS 292 Request, Data (6), SNS
0090  00 00 de ad be ef 00 92 09 20 08 00 00 04 00 00   ......... ......
 24.460554 192.168.56.1 -> 192.168.56.103 TNS 292 Request, Data (6), SNS
0090  00 00 de ad be ef 00 92 0a 20 04 00 00 04 00 00   ......... ......
 27.850304 192.168.56.101 -> 192.168.56.103 TNS 292 Request, Data (6), SNS
0090  00 00 de ad be ef 00 92 0a 10 05 00 00 04 00 00   ................

$ ./oravsn.sh a200400

HEX: a200400 VSNUM: 169870336 VER: 10.2.0.4.0

$ ./oravsn.sh a100500

HEX: a100500 VSNUM: 168822016 VER: 10.1.0.5.0

$ ./oravsn.sh 9200800

HEX: 9200800 VSNUM: 153094144 VER: 9.2.0.8.0

According to above output host 192.168.56.1 is connecting using 10.2.0.4 version of client, host 192.168.56.101 using 10.1.0.5 and third host 192.168.56.102 shows is using 9.2.0.8 version of client (which makes him #1 candidate for client upgrade).

 Final thoughts

Be aware of that as Oracle Database comes with new features by each release, it also comes with changes in SQLNet protocol (since 11gR1 it has changed significantly), changes of default settings (SDU for example) and changes in authentication. Keep on mind these facts when you plan to upgrade your Database so you have to put a spotlight on Database client as well. It's possible to upgrade clients after upgrade of Database but I suggest to do that as first step (before Database upgrade) whenever it's possible. Otherwise you should be at least aware of possible consequences and implications of using older client version.

 

Home ← Older posts