Sunday, April 10, 2011

DB2 9.7 HADR with TSA - Part 08 - Performing DB2 9.7 HADR failover and failback tests

In this article I will provide the failover and failback commands for DB2 HADR with TSA, along with a set of DB2 HADR test cases. I will also describe the HADR steps that I followed for testing.

Part 0 : DB2 9.7 HADR with TSA Part 00 - Introduction







 
A) Normal Operation
When DB2 HADR is configured under TSA clustering, the status of the TSA resources and resource groups during normal operation can be displayed with the command below.

lssam -top
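
To compare the TSA view with what HADR itself reports, both can be printed from one small helper. This is only a sketch: the database name HADRDB is a placeholder (the original setup does not name the database here), and the helper calls plain lssam rather than lssam -top so that it prints a single snapshot and returns.

```shell
#!/bin/sh
# HADRDB is a placeholder database name, not taken from the original setup.
DB_NAME="${DB_NAME:-HADRDB}"

show_cluster_status() {
    # TSA view: resource groups with their nominal and observed states
    lssam
    # DB2 view: HADR role, state and sync mode of the local database
    db2pd -db "$DB_NAME" -hadr
}
```

Calling show_cluster_status on each node before and after every test below gives a consistent before/after picture.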



B) Controlled Failover


1. Current Primary Node => mumbai
2. Current Standby Node => london
3. Node on which failover command is executed => mumbai

Note:
For the failover operation, the TSA command must be executed as the "root" user on the existing primary node.
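
A controlled move of the HADR resource group can be requested with the TSA rgreq command. The sketch below assumes a resource group named db2_db2inst1_db2inst1_HADRDB-rg, which is a hypothetical name that follows the usual db2haicu pattern; the real name should be taken from the lssam output.

```shell
#!/bin/sh
# Hypothetical resource group name (pattern: db2_<instance>_<instance>_<DBNAME>-rg).
# Check the real name with lssam before using this.
HADR_RG="${HADR_RG:-db2_db2inst1_db2inst1_HADRDB-rg}"

move_hadr_group() {
    # The move request has to be issued by root on the current primary node
    if [ "$(id -u)" -ne 0 ]; then
        echo "must be run as root" >&2
        return 1
    fi
    # Ask TSA to move the HADR resource group to the other node
    rgreq -o move "$HADR_RG"
}
```

The same request, issued later on the new primary, moves the group back again.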





C) Controlled Failback

1. Current Primary Node => london
2. Current Standby Node => mumbai
3. Node on which failback command is executed => london

Note:
For the failback operation, the TSA command must be executed as the "root" user on the existing primary node.


 
D) Primary Instance Failure


In this case, TSA will try to restart the primary instance automatically.
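
One way to watch the automatic restart is to poll lssam until the instance resource group is reported Online again (the failure itself can be simulated, for example, by killing the instance with db2_kill as the instance owner). The sketch assumes a resource group named db2_db2inst1_mumbai_0-rg, which is a placeholder, and the usual lssam line layout of "Online IBM.ResourceGroup:<name> ...".

```shell
#!/bin/sh
# Placeholder resource group name for the primary instance on node mumbai.
INSTANCE_RG="${INSTANCE_RG:-db2_db2inst1_mumbai_0-rg}"

wait_for_online() {
    # $1 = number of polls (default 30), $2 = seconds between polls (default 10)
    tries="${1:-30}"
    pause="${2:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Assumes lssam prints "Online IBM.ResourceGroup:<name>" for a running group
        if lssam | grep -qF "Online IBM.ResourceGroup:${INSTANCE_RG}"; then
            echo "online"
            return 0
        fi
        i=$((i + 1))
        sleep "$pause"
    done
    echo "timeout"
    return 1
}
```

Setting INSTANCE_RG to the standby instance's group covers the standby instance failure test in the same way.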




E) Standby Instance Failure


In this case, TSA will try to restart the standby instance automatically.



F) Primary Node failure


This test scenario needs a third node that acts as a tie-breaker when communication between the two HADR nodes is lost. When the link between the HADR nodes goes down, the network tie-breaker node is used to decide which node keeps ownership of the cluster resources; the other node is rebooted.


In order to test this scenario:
1) Create the TSA domain with a network quorum that refers to the IP address of a third, non-HADR node.
2) Then bring down the eth0 card of the primary node.
3) Bringing down the eth0 card on the primary node forces a hard reboot of the primary node, and TSA performs a forced takeover of HADR on the standby node.
4) After the primary node restarts, its database takes the new role of STANDBY.
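
For step 1, a network tie-breaker can be defined with the RSCT mkrsrc and chrsrc commands. This is a sketch rather than the exact commands used in this setup: the tie-breaker name networktb and the address 192.168.1.1 are placeholders for the third node, and db2haicu can also create the quorum device for you during cluster setup.

```shell
#!/bin/sh
# Both values are placeholders for illustration only.
TB_NAME="${TB_NAME:-networktb}"
TB_ADDRESS="${TB_ADDRESS:-192.168.1.1}"

create_network_tiebreaker() {
    # Define an EXEC tie-breaker that checks reachability of the third node's IP address
    mkrsrc IBM.TieBreaker Type="EXEC" Name="$TB_NAME" \
        DeviceInfo="PATHNAME=/usr/sbin/rsct/bin/samtb_net Address=$TB_ADDRESS Log=1" \
        PostReserveWaitTime=30 &&
    # Make it the active operational quorum tie-breaker of the domain
    chrsrc -c IBM.PeerNode OpQuorumTieBreaker="$TB_NAME"
}
```

Run create_network_tiebreaker as root on one node of the domain before taking the primary's network interface down.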

G) Stopping TSA monitoring for Database


This stops the TSA monitoring of the HADR databases. DB2 HADR itself is not terminated by this operation and continues to work as normal; only the automatic failover of DB2 HADR is disabled.
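
Monitoring is typically switched off with db2haicu -disable, run as the DB2 instance owner. The sketch below also prints the HADR status afterwards to confirm that HADR keeps running; the database name HADRDB is a placeholder.

```shell
#!/bin/sh
# HADRDB is a placeholder database name.
DB_NAME="${DB_NAME:-HADRDB}"

disable_tsa_monitoring() {
    # Stop TSA from monitoring and failing over this instance's resources
    db2haicu -disable
    # HADR should still report its role and state as before
    db2pd -db "$DB_NAME" -hadr
}
```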







H) Starting TSA Monitoring for Database


This operation starts the TSA monitoring of the DB2 HADR database again and re-enables automatic failover following a primary node failure.







I) Standby instance TSA Resource Group failure




Primary enters the disconnected state



Standby Instance Resource Group restored
At this stage, if the standby resource was stopped because of an error, TSA will try to start it. If the standby resource was stopped manually, TSA will not try to start it.
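
The difference between an error and a manual stop comes down to the nominal state of the resource group: TSA only restarts resources whose nominal state is online. A sketch of a manual stop and restore with chrg, assuming a standby instance group named db2_db2inst1_london_0-rg (a placeholder; take the real name from lssam):

```shell
#!/bin/sh
# Placeholder resource group name for the standby instance on node london.
STANDBY_RG="${STANDBY_RG:-db2_db2inst1_london_0-rg}"

stop_standby_group() {
    # Nominal state offline: TSA stops the group and will not restart it
    chrg -o offline "$STANDBY_RG"
}

start_standby_group() {
    # Nominal state online: TSA brings the group back and keeps it running
    chrg -o online "$STANDBY_RG"
}
```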




J) Primary Instance TSA Resource Group Failure





Standby enters the DISCONNECTEDPEER state because HADR_PEER_WINDOW=300 (seconds)
Standby enters the REMOTECATCHUPPENDING state after HADR_PEER_WINDOW expires
Standby continues to stay in the REMOTECATCHUPPENDING state
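
How long the standby stays in the disconnected peer state depends on the configured peer window, so it is worth reading the value before running this test. A small sketch, assuming a database named HADRDB (placeholder) and the usual layout of the db cfg output, where the value is the last field on the HADR_PEER_WINDOW line:

```shell
#!/bin/sh
# HADRDB is a placeholder database name.
DB_NAME="${DB_NAME:-HADRDB}"

peer_window_seconds() {
    # Print the last field of the line that mentions HADR_PEER_WINDOW
    db2 get db cfg for "$DB_NAME" | awk '/HADR_PEER_WINDOW/ { print $NF }'
}
```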

Restore the PRIMARY instance RESOURCE group


K) Failover using DB2 TAKEOVER command
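
The DB2 TAKEOVER HADR command is issued by the instance owner on the current standby node. A minimal sketch, assuming a database named HADRDB (placeholder); the optional "force" argument is an addition of this helper and should only be used when the primary cannot be reached.

```shell
#!/bin/sh
# HADRDB is a placeholder database name.
DB_NAME="${DB_NAME:-HADRDB}"

hadr_takeover() {
    # Run on the current STANDBY node as the instance owner
    if [ "${1:-}" = "force" ]; then
        # Forced takeover: does not wait for the old primary to cooperate
        db2 takeover hadr on db "$DB_NAME" by force
    else
        # Graceful role switch between primary and standby
        db2 takeover hadr on db "$DB_NAME"
    fi
}
```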



L) Failback using DB2 TAKEOVER command
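
Failback is the same TAKEOVER HADR command, issued on the original primary node once it is running as standby. Because it is easy to run it on the wrong node, the sketch below first checks the local role in the database configuration. The database name HADRDB is a placeholder, and the parsing assumes the usual "HADR database role = ..." line in the db cfg output.

```shell
#!/bin/sh
# HADRDB is a placeholder database name.
DB_NAME="${DB_NAME:-HADRDB}"

local_hadr_role() {
    # Print the last field of the "HADR database role" line (PRIMARY, STANDBY or STANDARD)
    db2 get db cfg for "$DB_NAME" | awk '/HADR database role/ { print $NF }'
}

failback_if_standby() {
    role=$(local_hadr_role)
    if [ "$role" = "STANDBY" ]; then
        # Local database is the standby, so it is safe to take the primary role back
        db2 takeover hadr on db "$DB_NAME"
    else
        echo "local role is ${role:-unknown}; takeover not issued" >&2
        return 1
    fi
}
```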


1 comment:

Unknown said...

Hi,

When I tried to move the resource, nothing happened. I can switch both Primary and Standby on and off, but I cannot move the resource...
The HADR roles got switched during the db2haicu configuration, and that was the last time it worked.
Do you have any suggestions ?