Andrew's DBA blog

Tuesday, 20 September 2011

Oracle Database Support annoucement on the Itanium Processor

To anyone haven't heard about this news from Oracle. Oracle has an update about what will be supported and not supported on Itanium processor. For my case, we actually just upgraded RAC running on HP UX Itanium from 10.2.0.4 to 11.2.0.1. It is quiet a "big" news for us since most of the Databases are running on HP UX Itanium. See below notes :

"Oracle Database 11g Release 2 is currently shipping on HP-UX Itanium, latest available patchset level is 11.2.0.2. We will continue to ship further 11g release 2 patchsets, PSUs and CPUs on HP-UX Itanium, just as we do on other platforms. The next release of Oracle Database, currently called 12g, will not be shipped on HP-UX Itanium. Customers can either stay on an Oracle Database 11g Release, as per Oracle's Lifetime Support Policy, or move to another platform when they want to upgrade to the next major Oracle release.

Oracle Database 11g Release 2 (including all the 11.2 patchsets)will continue to be supported on HP-UX 11.31. Oracle will certify Database 11.2 on future versions of the HP-UX operating system that are generally available within the Extended Support period for 11g Release 2 as documented in the Oracle Lifetime Support Policy. Oracle does not require explicit certification on future versions of the Itanium processor provided HP supports the versions of the HP-UX operating system that Oracle Database 11g Release 2 is certified on.
..."

For more details please check this article on Oracle support site : 1307745.1

Friday, 26 August 2011

How to recover Oracle TDE wallet Password which almost "Lost"

I recently have an issue with Oracle TDE wallet password. As part of our quarterly Security patch, we are planning to patch one of the database which using this Wallet for the encryption. Means I will need to shutdown and startup the database. The problem is when we restart the database then we need to open the wallet so the tables that encrypted can be visible for the trusted connection. Just guess what ? nobody sure about what is the wallet password in production.( The previous DBA has left the company, also with the password - "Very good").

Back to the issue, now I have to make sure I can open the wallet with the correct password before I can shutdown this database. This issue as good as "This Database should be alive all the time", once it is down without the wallet password it will become useless. I actually raise SR about this issue on how to test the wallet password without affecting our production database operation. As in the documentation written, there is no way to recover the password if the password is lost. But they never mention on how to test it if somehow nobody remember the password.Anyhow Oracle support still telling me there is no way to test the password.

Well I almost give up. But I just do my own testing, I just copy the wallet file from production to my testing environment. Let say my wallet in production are located here(sqlnet.ora) :

ENCRYPTION_WALLET_LOCATION=
(SOURCE=(METHOD=FILE)(METHOD_DATA=
(DIRECTORY=/u01/app/oracle/admin/DBPROD/wallet)))

I just copy "ewallet.p12" to my testing environment directory :

ENCRYPTION_WALLET_LOCATION=
(SOURCE=(METHOD=FILE)(METHOD_DATA=
(DIRECTORY=D:\Oracle\product\10.2.0\admin\DB10G\wallet)))

The first thing is we need to test to open that wallet with any password that we think is correct. Once we can open the wallet with the correct password, we can test if the wallet is valid to open the data in production. I import a table from production that having encryption on one of the columns. Well this tricks works for me and I managed to patch my database in production and open the wallet safely.
Of course if the wallet password can't be find, my suggestion is make sure that database keep alive :)
Let's keep the wallet password safely now.

Thursday, 25 August 2011

Oracle Global Pricing and Licensing

Just in case anyone need to know Oracle products pricing can refer to this link :
http://www.oracle.com/us/corporate/pricing/price-lists/index.html

Thursday, 14 July 2011

Some Note on Firewall with Oracle RAC

If anyone implementing Firewall filtering for SQL*Net Proxy, please make sure that packet redirection is using fix port not random, because this will make the Listener redirection failed by assigning random port number.
For details please see this note on Oracle Support (361284.1).
On one of our system, this issue suddenly come up when some application that calling tnsnames to use the loadbalancing intermittently failed with ORA-12541. Initially I thought it is a connection issue, but further I do a trace from client side and found below :

....
[000001 05-APR-2011 14:55:15:106] nspsend: 01 09 00 00 06 00 00 00 |........|
[000001 05-APR-2011 14:55:15:106] nspsend: 00 00 28 44 45 53 43 52 |..(DESCR|
[000001 05-APR-2011 14:55:15:106] nspsend: 49 50 54 49 4F 4E 3D 28 |IPTION=(|
[000001 05-APR-2011 14:55:15:106] nspsend: 41 44 44 52 45 53 53 3D |ADDRESS=|
[000001 05-APR-2011 14:55:15:106] nspsend: 28 50 52 4F 54 4F 43 4F |(PROTOCO|
[000001 05-APR-2011 14:55:15:106] nspsend: 4C 3D 54 43 50 29 28 48 |L=TCP)(H|
[000001 05-APR-2011 14:55:15:106] nspsend: 4F 53 54 3D 6D 6E 65 74 |OST=mnet|
[000001 05-APR-2011 14:55:15:106] nspsend: 64 62 32 2D 76 69 70 29 |db2-vip)|
[000001 05-APR-2011 14:55:15:106] nspsend: 28 50 4F 52 54 3D 31 35 |(PORT=15|
[000001 05-APR-2011 14:55:15:106] nspsend: 32 31 29 29 28 4C 4F 41 |21))(LOA|
[000001 05-APR-2011 14:55:15:106] nspsend: 44 5F 42 41 4C 41 4E 43 |D_BALANC|
[000001 05-APR-2011 14:55:15:106] nspsend: 45 3D 79 65 73 29 28 43 |E=yes)(C|
[000001 05-APR-2011 14:55:15:106] nspsend: 4F 4E 4E 45 43 54 5F 44 |ONNECT_D|
[000001 05-APR-2011 14:55:15:106] nspsend: 41 54 41 3D 28 53 45 52 |ATA=(SER|
[000001 05-APR-2011 14:55:15:106] nspsend: 56 45 52 3D 44 45 44 49 |VER=DEDI|
[000001 05-APR-2011 14:55:15:106] nspsend: 43 41 54 45 44 29 28 53 |CATED)(S|
[000001 05-APR-2011 14:55:15:106] nspsend: 45 52 56 49 43 45 5F 4E |ERVICE_N|
[000001 05-APR-2011 14:55:15:106] nspsend: 41 4D 45 3D 43 48 4E 4E |AME=CHNN|
[000001 05-APR-2011 14:55:15:106] nspsend: 45 4C 29 28 46 41 49 4C |EL)(FAIL|
[000001 05-APR-2011 14:55:15:106] nspsend: 4F 56 45 52 5F 4D 4F 44 |OVER_MOD|
[000001 05-APR-2011 14:55:15:106] nspsend: 45 3D 28 54 59 50 45 3D |E=(TYPE=| ----> Here the client has sent connect packet to the listener on mnetdb2-vip
[000001 05-APR-2011 14:55:15:106] nspsend: 53 45 4C 45 43 54 29 28 |SELECT)(|
[000001 05-APR-2011 14:55:15:106] nspsend: 4D 45 54 48 4F 44 3D 42 |METHOD=B|
[000001 05-APR-2011 14:55:15:106] nspsend: 41 53 49 43 29 28 52 45 |ASIC)(RE|
[000001 05-APR-2011 14:55:15:106] nspsend: 54 52 49 45 53 3D 31 38 |TRIES=18|
[000001 05-APR-2011 14:55:15:106] nspsend: 30 29 28 44 45 4C 41 59 |0)(DELAY|
[000001 05-APR-2011 14:55:15:106] nspsend: 3D 35 29 29 28 43 49 44 |=5))(CID|
[000001 05-APR-2011 14:55:15:106] nspsend: 3D 28 50 52 4F 47 52 41 |=(PROGRA|
[000001 05-APR-2011 14:55:15:106] nspsend: 4D 3D 73 71 6C 70 6C 75 |M=sqlplu|
[000001 05-APR-2011 14:55:15:106] nspsend: 73 29 28 48 4F 53 54 3D |s)(HOST=|
[000001 05-APR-2011 14:55:15:106] nspsend: 6D 6E 65 74 61 70 70 33 |mnetapp3|
[000001 05-APR-2011 14:55:15:106] nspsend: 29 28 55 53 45 52 3D 6F |)(USER=o|
[000001 05-APR-2011 14:55:15:106] nspsend: 72 61 63 6C 65 29 29 29 |racle)))|
.....

[000001 05-APR-2011 14:55:15:110] nscall: connecting...
[000001 05-APR-2011 14:55:15:110] nsc2addr: entry
[000001 05-APR-2011 14:55:15:110] nsc2addr: (ADDRESS=(PROTOCOL=TCP)(HOST=mnetdb1-vip)(PORT=33949))
[000001 05-APR-2011 14:55:15:110] nttbnd2addr: entry
[000001 05-APR-2011 14:55:15:110] snlinGetAddrInfo: entry
[000001 05-APR-2011 14:55:15:110] snlinGetAddrInfo: Invalid IP address string mnetdb1-vip ----> Here the transport was unable to open to the PORT 33949 on mnetdb1-vip
[000001 05-APR-2011 14:55:15:110] snlinFreeAddrInfo: entry

At that point we notice it assign to port number 33949 which wrong port number. We only use port 1521.
The challenge now is to ask the network engineer to change the firewall rule. In our case the firewall use by many other subsytem. Well this issue never been an issue for all other application that using Weblogic.

Wednesday, 29 June 2011

Bug 9572787 Long waits for ‘enq: AM – disk offline’ following cell or disk crash (can cause further instance crashes)

If anyone is implementing Oracle 11g R2 (11.2.0.2) with extended cluster might get affected by this bugs.

In my case, this bug comes out when we tested to offline the whole second Disk array on the Failgroup 2 which we using ASM normal redundancy.

Until now we still waiting for the merge patch for this issue. Hope they will release the patch soon. For more details about this bugs please check on Oracle support portal.

Update 8 July 2011:
Oracle released the merge patch. The test result is much better, the instances was still rebooted. Only until we set the hidden parameter ,recommend by Oracle support then the instances survived. The parameter is

_lm_rcvr_hang_kill=FALSE

Do not use this parameter unless Oracle support told us to do so.

Tuesday, 28 June 2011

ORA-12514 during Switchover using DG broker with multiple IP segment

Our new system 2 node RAC Oracle 11.2.0.2 are setup with 1 instance of Standby database. Yesterday we tested to switchover to standby database using 'dgmgrl'. The switchover was ok , only that my Primary Node (Node1) was not automatically restarted instead throwing ORA-12514 - TNS:listener does not currently know of service requested in connect descriptor.

I checked all my listener for DG broker have been setup with static listener, which usually this is the cause of this error. Everything looks properly setup until I checked each instance Broker configuration "StaticConnectIdentifier"(New parameter in 11g) property :

dgmgrl > show instance verbose "PRD1"
...
StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.22.2.29)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=PRD_DGMGRL)(INSTANCE_NAME=PRD1)(SERVER=DEDICATED)))';

dgmgrl > show instance verbose "PRD2"
StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.22.2.30)(PORT=1532))(CONNECT_DATA=(SERVICE_NAME=PRD_DGMGRL)(INSTANCE_NAME=PRD2)(SERVER=DEDICATED)))';

Well, we configure all the static listener on 172.22.12.7 (VIP1 - Segment 12) and 172.22.12.9 (VIP2 - Segment 12). I remember that IP segment 172.22.2.xx was the initial VIP segment during the installation.

Then I just change it to the correct IP and port number.

dgmgrl> edit instance 'PRD1' set property StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.22.12.7)(PORT=1523))(CONNECT_DATA=(SERVICE_NAME=PRD_DGMGRL)(INSTANCE_NAME=PRD1)(SERVER=DEDICATED)))';

dgmgrl> edit instance 'PRD2' set property StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.22.12.9)(PORT=1523))(CONNECT_DATA=(SERVICE_NAME=PRD_DGMGRL)(INSTANCE_NAME=PRD2)(SERVER=DEDICATED)))';

After this changes my switchover was running smoothly without any error. Problem solved!

Thursday, 23 June 2011

Running Catalog/Catproc.sql on Primary DB with Dataguard, will this invalid my Physical standby?

A few days back, when I checked one of my newly created DB was having many invalid objects. I checked dba_registry and giving me some of the component are invalid. After a several time running utlrp.sql still didn't help to compile all the SYS invalid objects. There are a few objects of Advanced Queuing objects that keep invalid even after manual compile. I raise an SR to oracle to check if this objects are corrupted.

Then Oracle support ask me to clean up those AQ objects and run Catalog/Catproc.sql on my Primary DB. I just wondering if this will make my physical standby DB become invalid. Since this is Pre-Production environment and many people are testing on it now, then I try to google and check in oracle support if any article about this. I give up and open a new SR about this issue. They confirmed it is fine to rerun catalog/catproc.sql in my environment, everything will be replicated to the standby site.

I tested and it run's well without any issue.Of course, in my case I need to turn off cluster parameter temporary before running it and startup the db in upgrade mode(startup upgrade). Now all my missing corrupted objects are back and no more invalid component in the dba_registry. I check my dataguard for the whole day looks good.