Ivan Kartik - Oracle and Linux Blog

How not to put your company at risk (from a DBA perspective)

Let me start this post with the following quotation: "People are definitely a company's greatest asset. A company is only as good as the people it keeps".
I fully agree with these sentences, because for the current topic they are definitely true and appropriate.

The Internet is a great thing and a perfect source of information. Back in 1998, when I was taking my first steps with Oracle Database, there weren't many public sources of knowledge about it. Software was shipped on CDs and online documentation was very rare. There were few books in the bookstores, no blogs and, of course, no Oracle ACE program either :-). Those days are just memories now. Over time the Internet became a great source of information and of software such as Oracle Database XE, a place where one can gain a great deal of knowledge or even pick up lessons learnt by other people, so it is very easy to leverage their experience. The side effect of this transformation is that people often take information found there as the single source of truth. Many people have slowed down their skills development because solutions for many potential problems can easily be found online. Moreover, less experienced people don't closely study what they have found, or they are not careful enough. The result can be unpleasant, to say the least.


1. Your company may become under-licensed

I very often hear that "Oracle is too expensive" or "We pay a lot of money for it". Honestly, I agree: it is expensive, but it also has a lot to offer for that price. Whenever I hear such complaints, I ask questions like "Does it need to be?" or "Do you really need all the stuff (options, packs or other features) you pay for?". But there is another potential problem I'd like to point out in this section. Every company has (or should have) a department or a specific group of people responsible for Software Asset Management (or, more precisely, Software License Management). They keep records of license contracts and current price holds, and they should usually have a software repository which holds all information about the software in use and the licenses purchased. What does this have to do with IT staff like DBAs? What does it have to do with Oracle as a technology? Quite a lot.

In the old days, the DBA's world and scope was everything that was stored or happened within a database. Experienced DBAs knew that database performance was influenced by more than just DB configuration and application code, so they felt it necessary to see the full picture and to have a solid understanding of the full stack from bottom to top: the storage layer, OS layer, network layer, DB layer (of course) and the application layer as well (SQL and PL/SQL). This knowledge is, and always has been, essential for fast identification and resolution of issues. In fact, exactly this approach later became known as the "DBA 2.0 - The Next Generation DBA" initiative back in 2008 (perhaps too late, but at least...).
The database area has moved on since 2008 (it's probably time for DBA 3.0 now), and these days a modern DBA should be familiar with current trends, e.g. Cloud technologies, automation and also probably the most underestimated thing in the DBA's world: licensing.
As I stated at the beginning of this article, the amount of information on the Internet has grown dramatically over the years, and it is possible to find many cookbooks, HOWTOs or even scripts.

Consider this example, taken from a well-known web page found on the Internet (listed in the top 3 results of a search engine). Judging by its page rank, this page probably has a huge number of visitors:

---------------------------------
If you need to create a snapshot manually, because you don’t like the one-hour interval, or if you disabled taking snapshots at all:

EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;

Create a report:

@$ORACLE_HOME/rdbms/admin/awrrpt.sql

This script will ask you for the format of the report (html or plain text), the snapshot ID for start and end of the report (an overview of the last n day’s snapshots is given) and for a report name, that’s used for the report file name (.html or .lst).

Stay happy with your databases,
---------------------------------

Now what's the catch here? Let me ask you a couple of questions:

- Did you know that the AWR feature is part of the Diagnostic pack, which is a separately licensed pack?
- Is the Diagnostic pack licensed for your Oracle Database?
- Does your instance control usage of the Diagnostic pack?

If your answer was 3x "No" (or "No" to the last two questions), your database might be under-licensed at this moment. In this particular case, the lack of this knowledge combined with the default configuration settings results in an under-licensed environment. The answer to the second question should come from the department or team responsible for Software Asset Management (mentioned earlier in this article).
Each new version of the database comes with new and interesting features, not just for administrators but also for developers. It's almost impossible to expect developers to know how the features are licensed, but this is something the DBA should be aware of; moreover, only the DBA is able to control usage of extra-cost features.
The script in this case may well be free of charge, but the database software, and especially the feature being used, is not free at all.
Let's say that your DB environment has (just) 4 CPU cores (Intel) and the list price (CPU metric) for the Diagnostic pack is approx. 5000 USD (check your price hold). That means approx. 10000 USD (CapEx) and approx. 2200+ USD (OpEx every year) of "damage" for your company. The web page says "Stay happy with your databases"; unfortunately, from this perspective it sounds more like sarcasm. The Diagnostic pack is a very useful pack, but this is definitely not the best way to tell management "We need it!".
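
The arithmetic behind those numbers can be sketched in a few lines of Python. This is only an illustration: the 0.5 core factor for Intel cores and the 22% yearly support rate are assumptions used to reproduce the figures above, and the function name is made up for this example. Always check your own contract and price hold.

```python
import math

def pack_cost(cores, list_price_per_processor, core_factor=0.5, support_rate=0.22):
    """Return (processor_licenses, capex, yearly_opex) for a per-processor pack.

    core_factor and support_rate are assumptions for this sketch,
    not values taken from any contract.
    """
    # A fractional result is rounded up to the next whole processor license.
    processors = math.ceil(cores * core_factor)
    capex = processors * list_price_per_processor
    opex_per_year = capex * support_rate
    return processors, capex, opex_per_year

processors, capex, opex = pack_cost(cores=4, list_price_per_processor=5000)
print(f"{processors} processor licenses, CapEx {capex} USD, OpEx {opex:.0f} USD/year")
```

With the example values this gives 2 processor licenses, 10000 USD up front and about 2200 USD per year, which matches the estimate above.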

I checked several blogs listed in the top 20 results given by my favourite search engine. Almost all of them were from well-known people in the Oracle community (most were written by members of the Oracle ACE program, thus more or less trusted persons in the community), but only one of them contained a disclaimer or warning about the use of an extra-cost option/pack. Yet even this particular page provides scripts where the disclaimer/warning is missing from the remarks at the top of the script.

Knowledge is key here. Studying new features shouldn't be limited to the features themselves; I believe that studying the licensing implications should be part of your standard practice. This is probably bad news for Google/Stack Overflow DBAs, who are (very often) lazy and refuse continuous self-education, relying on the knowledge of others instead of their own. Unfortunately, during my consulting engagements I often see database environments that are not licensed correctly, let's say 7 out of 10, and that's a really bad ratio.
Note that licensing is a really complex topic (don't hesitate to hire a consultant experienced in this area), but understanding the basics can protect you and your company from potential (non-technical) problems.


2. Your system may accidentally become unstable, inaccessible or even compromised

This section is about a different type of hidden danger, while the source (the Internet, of course) remains the same. I'm not going to talk about why it's generally a bad idea to have a listener/dispatcher port accessible from the Internet, about default passwords or other basic things. Let me explain the issue by example. I have created a simple demo which uses the same command as in the previous section; you can find it at this link: http://ivan.kartik.sk/demos/pjdemo.html

Have you ever heard of "PasteJacking"? Well, this is it: the text that ends up in your clipboard differs from the text you saw and selected on the page. Now consider a less skilled DBA with a lack of knowledge and poor resistance to stress, in a stressful situation such as performance degradation or an outage of a mission-critical or business-critical database. He finds a "quick-win solution", a script published in some article on the Internet. What will he do in this situation?
Now consider that the script may contain DROP commands or other sophisticated evil code...

Know your source. Even if the source is the page of a well-known person in the community or industry, that doesn't necessarily mean his pages weren't compromised thanks to a bug in the publishing system he is using.
You should always be careful, and check and fully understand what you are going to "paste" into SQL*Plus, a shell, etc.
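
One cheap habit is to paste into a plain text editor first, or to run the copied text through a small checker before it gets anywhere near a prompt. The following is a minimal sketch in Python, not a security tool: the function name and the keyword list are hypothetical, and the checks are simple heuristics that only catch the most obvious tricks.

```python
import unicodedata

# Hypothetical keyword list for illustration; extend it for your environment.
RISKY_KEYWORDS = ("DROP ", "TRUNCATE ", "DELETE ", "SHUTDOWN", "RM -RF")

def inspect_snippet(text):
    """Return a list of warnings for text about to be pasted into a terminal."""
    warnings = []
    # A pasted line break is executed immediately by most shells and SQL*Plus.
    if "\n" in text or "\r" in text:
        warnings.append("contains line breaks (commands may execute on paste)")
    for ch in text:
        # Unicode categories Cc/Cf cover control and invisible format characters.
        if ch not in "\n\r\t" and unicodedata.category(ch) in ("Cc", "Cf"):
            warnings.append(f"hidden character U+{ord(ch):04X}")
            break
    upper = text.upper()
    for keyword in RISKY_KEYWORDS:
        if keyword in upper:
            warnings.append(f"risky keyword: {keyword.strip()}")
    return warnings

# What you thought you copied versus what might actually be in the clipboard.
print(inspect_snippet("EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;"))
print(inspect_snippet("EXEC DBMS_WORKLOAD_REPOSITORY.create_snapshot;\nDROP TABLE t;\n"))
```

A clean result does not mean the snippet is safe; it only means none of these simple checks fired. Reading and understanding the code remains the real safeguard.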

Another story is using a "solution" which is not appropriate for the problem. Some time ago, some guys were facing an issue where flushing the buffer cache was a workaround that helped. Of course, they had found this workaround on the Internet. You may guess what happened during another, completely different issue with similar, but unfortunately not the same, symptoms. Yes, flushing the buffer cache was the first thing the guys tried in order to "fix" the problem. It didn't work for that particular issue; moreover, it created another one, and the outstanding issue suddenly became even worse.
Well, glucose is commonly used in medicine, but don't try giving it to a patient with diabetes who has fainted due to a lack of insulin.

Rock-solid knowledge and adopting good practices and habits are the key to keeping your database out of trouble or even danger. With rock-solid knowledge you will treat blogs and articles published on the Internet as additional sources of information and not as easy problem solvers. Knowledge is the asset, and thus skilled people are a company's greatest asset.

Let me finish this post with a funny quote: "Don't believe everything you read on, or copy from, the Internet." - Abraham Lincoln

Oracle Database 12cR2 online documentation available

Following its Cloud-first strategy, Oracle has announced the general availability of Oracle Database 12c Release 2 via the new Oracle Exadata Express Cloud Service. Software for on-premises installation has not been released yet, but if you want to learn what's new or what is coming in 12cR2, you can browse the documentation, which is available at: http://docs.oracle.com/en/database/

ACFS issue with latest UEK3 kernels on OEL 7/RHEL 7

One of my colleagues was facing weird behaviour of ACFS during the installation of a two-node RAC on RHEL 7 and OEL 7 (he started on RHEL, then tried OEL). During ACFS volume creation (during mkfs execution), one of the nodes either hung or threw an error like this:

mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-32
mkfs.acfs: CLSU-00100: operating system function: ioctl failed with error data: 1
mkfs.acfs: CLSU-00101: operating system error message: Operation not permitted
mkfs.acfs: CLSU-00103: error location: OI_0
mkfs.acfs: ACFS-00546: failed to change on-disk signature
mkfs.acfs: ACFS-01004: /dev/asm/goldengate-32 was not formatted.

After he had struggled for a couple of days, I decided to help him track down the problem. My colleague is very experienced, so I knew that he had done everything correctly (it definitely wasn't his first rodeo). Moreover, I had done a similar installation just a few days before without any problem. And that was interesting. First I checked the logs and, of course, the configuration (udev, multipath, etc.), which was correct. Then I traced the process (using one of my closest friends since 2000), with this result:

2216  stat("/dev/asm/goldengate-34", {st_mode=S_IFBLK|0770, st_rdev=makedev(251, 17409), ...}) = 0
2216  open("/dev/ofsctl", O_RDWR)       = 9
2216  ioctl(9, 0xffffffffc1387015, 0x7ffdbee52880) = -1 EPERM (Operation not permitted)

And this was really weird, as permissions for these devices are defined by the udev configuration which is created during installation.

Having gathered all the important information about the environment, I created an action plan: install the same environment on my virtual machine. Well, I'd rather say almost the same, as I decided not to install the clustered environment as the first step, and of course the hardware and the hypervisor were different. The versions of all software were pretty much the same as in the original (my colleague's) environment, except for one very important thing: the Linux kernel. I decided to use the original (non-UEK) kernel first.
The installation went smoothly, and ACFS configured and later worked without problems, so two options (or unanswered questions) remained: whether the problem is kernel related or RAC related. Based on the previous tracing, I favoured the first option.
So I installed the exact version (3.8.13-118.6.2.el7uek) of the Unbreakable Enterprise Kernel provided by Oracle, just as my colleague did. And here is the result (note that performing all steps manually was intentional):

$ uname -a
Linux el1 3.8.13-118.6.2.el7uek.x86_64 #2 SMP Thu May 19 13:15:51 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
$ acfsdriverstate supported
ACFS-9200: Supported

# acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.

# lsmod | grep ora
oracleacfs           3498177  0
oracleadvm            594197  0
oracleoks             503994  2 oracleacfs,oracleadvm

$ asmcmd volinfo --all
no volumes found
$ asmcmd volcreate -G GG GOLDENGATE -s 4G
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

$ mkfs -t acfs /dev/asm/goldengate-34
mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-34
mkfs.acfs: CLSU-00100: operating system function: ioctl failed with error data: 1
mkfs.acfs: CLSU-00101: operating system error message: Operation not permitted
mkfs.acfs: CLSU-00103: error location: OI_0
mkfs.acfs: ACFS-00546: failed to change on-disk signature
mkfs.acfs: ACFS-01004: /dev/asm/goldengate-34 was not formatted.

Bingo! I got the same error on my test machine, so I was able to reproduce the problem. Now it's clear that it is not related to a clustered configuration. OK, let's see the trace output:

2061  stat("/dev/asm/goldengate-34", {st_mode=S_IFBLK|0770, st_rdev=makedev(251, 17409), ...}) = 0
2061  open("/dev/ofsctl", O_RDWR)       = 9
2061  ioctl(9, 0xffffffffc1387015, 0x7ffd26305310) = -1 EPERM (Operation not permitted)

and permissions:

$ ll /dev/asm/.asm_ctl_spec
brwxrwx--- 1 root dba 251, 0 Jun  1 23:13 /dev/asm/.asm_ctl_spec
$ ll /dev/asm/goldengate-34
brwxrwx--- 1 root dba 251, 17409 Jun  1 23:24 /dev/asm/goldengate-34
$ ll /dev/ofsctl
brw-rw-r-- 1 root dba 250, 0 Jun  1 23:24 /dev/ofsctl

The tracing output was the same, and the permissions match the udev rules and are correct. Based on this result, I decided to remove the ACFS-related configuration and check the configuration and functionality of ACFS again using several older UEK3 kernels. Here is the result from the test of one of them (3.8.13-118.4.2.el7uek.x86_64):

$ uname -a
Linux el1 3.8.13-118.4.2.el7uek.x86_64 #2 SMP Tue Mar 22 20:46:48 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
$ acfsdriverstate supported
ACFS-9200: Supported

# acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
# lsmod | grep ora
oracleacfs           3498177  0
oracleadvm            594197  0
oracleoks             503994  2 oracleacfs,oracleadvm

$ asmcmd volinfo --all
no volumes found
$ asmcmd volcreate -G GG GOLDENGATE -s 4G
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage:
         Mountpath:

$ mkfs -t acfs /dev/asm/goldengate-34
mkfs.acfs: version                   = 12.1.0.2.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/goldengate-34
mkfs.acfs: volume size               = 4294967296  (   4.00 GB )
mkfs.acfs: Format complete.

# mount -t acfs /dev/asm/goldengate-34 /mnt/
# mount | grep mnt
/dev/asm/goldengate-34 on /mnt type acfs (rw,relatime,device,rootsuid,ordered)
# touch /mnt/testfile
# ll /mnt/testfile
-rw-r--r-- 1 root root 0 Jun  2 00:52 /mnt/testfile

It works: we were able to format, mount and use the ACFS volume. So now we know that downgrading the UEK3 kernel is a workaround which solves the issue with ACFS until the bug is fixed in a newer release of the UEK3 kernel.

A bonus case: what if the sysadmin upgrades the kernel, so that we end up running the "buggy" version of the UEK3 kernel on already configured ACFS volumes? I did several repeated tests on an already configured and previously working ACFS volume. Let's see what happens:

1st run:

$ uname -a
Linux el1 3.8.13-118.6.2.el7uek.x86_64 #2 SMP Thu May 19 13:15:51 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux

# acfsload start
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed

$ asmcmd volenable -G GG GOLDENGATE
$ asmcmd volinfo --all
Diskgroup Name: GG

         Volume Name: GOLDENGATE
         Volume Device: /dev/asm/goldengate-34
         State: ENABLED
         Size (MB): 4096
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /mnt

# mount -t acfs /dev/asm/goldengate-34 /mnt/
# mount | grep mnt
/dev/asm/goldengate-34 on /mnt type acfs (rw,relatime,device,rootsuid,ordered)
# touch /mnt/testfile2
# ll /mnt/testfile*
-rw-r--r-- 1 root root 0 Jun  2 00:52 /mnt/testfile
-rw-r--r-- 1 root root 0 Jun  2 01:18 /mnt/testfile2

Result of 1st run - working.

2nd run:

# Kernel panic - not syncing: Holding spin lock
Pid: 1925, comm: mount Tainted: PF O 3.8.13-118.6.2.el7uek.x86_64 #2
Call Trace:
[<ffffffff81574fb0>] panic+0xc8/0x1d7
[<ffffffffa0484a7a>] __KsPanic+0x9a/0xa0 [oracleoks]
[<ffffffffa072275d>] ? OfsDoMountRecovery+0x1dd/0x2b0 [oracleacfs]
[<ffffffffa0722a5d>] OfsDoPhase1Recovery+0x22d/0x4e0 [oracleacfs]
[<ffffffffa0722e28>] OfsWaitForRecoveryToComplete+0x118/0x3c0 [oracleacfs]
[<ffffffffa0667446>] OfsSetupVolume+0xc6/0x2ff0 [oracleacfs]
[<ffffffffa048507c>] ? KsSleepEvent+0x8c/0xce [oracleoks]
[<ffffffff81081e90>] ? wake_up_bit+0x30/0x30
[<ffffffffa04851bc>] ? KsCreateSystemThread+0x10c/0x1b0 [oracleoks]
[<ffffffffa066b6dd>] OfsMountVolume+0x136d/0x27e0 [oracleacfs]
[<ffffffffa0766e8f>] ofs_fill_sb+0x6f/0x7e0 [oracleacfs]
[<ffffffff8118af58>] mount_bdev+0x1b8/0x200
[<ffffffffa0766e20>] ? ofs_parse_flags+0x140/0x140 [oracleacfs]
[<ffffffffa0763ec7>] ofs_mount+0x107/0x2f0 [oracleacfs]
[<ffffffff8118b8e9>] mount_fs+0x39/0x1b0
[<ffffffff811a5dc7>] ? alloc_vfsmnt+0xd7/0x1b0
[<ffffffff811a5f3f>] vfs_kern_mount+0x5f/0xf0
[<ffffffff811a8250>] do_mount+0x220/0xaf0
[<ffffffff8114343b>] ? strndup_user+0x4b/0xf0
[<ffffffff811a8ba3>] sys_mount+0x83/0xc0
[<ffffffff81587179>] system_call_fastpath+0x16/0x1b

The result of the 2nd run was a kernel panic; the machine hung, of course.

3rd run:

# acfsload start
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
acfsutil plogconfig: CLSU-00100: operating system function: ioctl failed with error data: 22
acfsutil plogconfig: CLSU-00101: operating system error message: Invalid argument
acfsutil plogconfig: CLSU-00103: error location: OI_0
acfsutil plogconfig: ACFS-03500: Unable to access kernel persistent log entries.
ACFS-9225: Failed to start OKS persistent logging.
ACFS-9322: completed

Result of 3rd run: it finished with the error messages CLSU-00100: operating system function: ioctl failed with error data: 22, CLSU-00101: operating system error message: Invalid argument and CLSU-00103: error location: OI_0. After several attempts at manual startup, it started successfully.

I did seven more runs, and the final result over all ten runs was 4x working, 5x kernel panic and 1x error message during startup, so there is a 60% chance that your system won't work.
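
For completeness, here is the tally as a tiny Python sketch. Only the totals come from the tests above; the order of the outcomes in the list is arbitrary, and ten runs are of course a very small sample, so treat the percentage as a rough indication rather than a measured probability.

```python
from collections import Counter

# Totals reported above: 4x working, 5x kernel panic, 1x startup error.
# The order of the entries is not meaningful.
runs = ["ok"] * 4 + ["panic"] * 5 + ["startup_error"] * 1

counts = Counter(runs)
failures = len(runs) - counts["ok"]
failure_rate = failures / len(runs)
print(f"{failures} of {len(runs)} runs failed ({failure_rate:.0%})")
```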

Final conclusions

  • the problem is not related to a clustered environment
  • based on several tests (not shown in this article), 11gR2 (11.2.0.4 + Apr 2016 PSU) and 12cR1 (12.1.0.2 + Apr 2016 PSU) are affected by this issue
  • the problem is related to specific UEK3 kernel versions, to be precise: 3.8.13-118.6.1.el7uek and 3.8.13-118.6.2.el7uek
  • (currently) the last properly working version of UEK3 is 3.8.13-118.4.2.el7uek
  • using a supported version of the default (non-UEK) kernel is one possible solution/workaround
  • using an older UEK3 kernel is only a temporary workaround until the bug is fixed, not a solution, because there are two reasons to keep the kernel updated (security and bug fixes)
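
If you want to check a host before touching ACFS, the running kernel can be compared against the two affected builds listed above. This is a minimal Python sketch, not an official check: the function name is made up for this example, and the list contains only the versions I tested, so a kernel missing from it is not proven safe.

```python
import platform

# Affected UEK3 builds named in the conclusions above.
AFFECTED_KERNELS = ("3.8.13-118.6.1.el7uek", "3.8.13-118.6.2.el7uek")

def is_affected(release):
    """True if the kernel release string starts with one of the affected builds."""
    return release.startswith(AFFECTED_KERNELS)

if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    status = "AFFECTED" if is_affected(release) else "not in the affected list"
    print(f"{release}: {status}")
```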

 

Update:

An updated kernel, 3.8.13-118.8.1.el7uek.x86_64, has been released, and this bug doesn't occur with this version (and probably not with later versions to come).

 Hope that helps...

 

Wrong count of CPU on HP-UX 11.31

We have two identical machines with identical HW resources and identical configurations, but there is one significant difference. Output from machine A:


$ uname -srm
HP-UX B.11.31 ia64

SQL> select CPU_COUNT_CURRENT,CPU_CORE_COUNT_CURRENT,CPU_SOCKET_COUNT_CURRENT,CPU_COUNT_HIGHWATER from v$license;

CPU_COUNT_CURRENT CPU_CORE_COUNT_CURRENT CPU_SOCKET_COUNT_CURRENT CPU_COUNT_HIGHWATER
----------------- ---------------------- ------------------------ -------------------
               28                                              28                  28

$ machinfo
CPU info:
  8 Intel(R) Itanium 2 9100 series processors (1.6 GHz, 24 MB)
          533 MT/s bus, CPU version A1
          28 logical processors


# ioscan -kf | grep processor | wc -l
28

So everything seems to be fine (only the output from v$license shows misleading info) and we've got 28 cores. No, it's not! Output from machine B:


$ uname -srm
HP-UX B.11.31 ia64

SQL> select
CPU_COUNT_CURRENT,CPU_CORE_COUNT_CURRENT,CPU_SOCKET_COUNT_CURRENT,CPU_COUNT_HIGHWATER from v$license;

CPU_COUNT_CURRENT CPU_CORE_COUNT_CURRENT CPU_SOCKET_COUNT_CURRENT CPU_COUNT_HIGHWATER
----------------- ---------------------- ------------------------ -------------------
               14                                              14                  14
$ machinfo
CPU info:
  8 Intel(R) Itanium 2 9100 series processors (1.6 GHz, 24 MB)
          533 MT/s bus, CPU version A1
          14 logical processors


# ioscan -kf | grep processor | wc -l
14

Machine A shows twice as many CPUs as machine B. Both vPars have 7x Intel Itanium 2 processors (2 cores each), which means each vPar has 14 CPU cores. Machine B shows the correct info but A doesn't. How do we find the correct information about CPUs in this case? We can use /sbin/vparstatus to clearly see that there are 14 cores assigned to the vPar. Machine A:


[Virtual Partition Resource Summary]
                                CPU      Num   Num     Memory Granularity
Virtual Partition Name          Min/Max  CPUs  IO       ILM         CLM
==============================  =======  ====  ====  ==========  ==========
DB1                               1/ 24    14     7         512         512

Machine B:


[Virtual Partition Resource Summary]
                                CPU      Num   Num     Memory Granularity
Virtual Partition Name          Min/Max  CPUs  IO       ILM         CLM
==============================  =======  ====  ====  ==========  ==========
DB2                               1/ 24    14     6         512         512

It is very important to know the correct information in at least two cases: performance (monitoring, tuning or troubleshooting) and correct product licensing. What is the difference between those two machines? Machine "B" has the latest patches, whilst machine "A" is patched too but (according to swlist) certainly not with the latest patches. Unfortunately, I can't say at the moment which patch or bundle fixes this problem, but when I find it I'll post it right here.

Update: My theory was correct and the final answer is very simple: multithreading. This type of Itanium processor has two threads per core. You can set multithreading at the partition level (this had been done on both machines) and also at the OS level via lcpu_attr, and that was where the difference lay. You can verify the settings using kctune |grep lcpu. So the conclusion is that in this case it doesn't matter which patches are installed.
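
The relation between the numbers can be written down as a short Python sketch. The function and parameter names are made up for this illustration, and mapping machine A to "logical CPUs enabled" and machine B to "disabled" is inferred from the outputs above (28 versus 14), not read from the actual kernel tunables.

```python
def logical_cpus(processors, cores_per_processor, threads_per_core, lcpu_enabled):
    """Return (cores, logical_cpus) as the OS would report them."""
    cores = processors * cores_per_processor
    # With OS-level multithreading enabled, every hardware thread shows up as a CPU.
    logical = cores * threads_per_core if lcpu_enabled else cores
    return cores, logical

# Values from this post: 7 processors x 2 cores, 2 threads per core.
print(logical_cpus(7, 2, 2, lcpu_enabled=True))   # machine A (assumed setting)
print(logical_cpus(7, 2, 2, lcpu_enabled=False))  # machine B (assumed setting)
```

Either way the number of cores is 14, which is the figure that matters for licensing and matches what vparstatus reports.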
