This blog presents a procedure that has been successfully applied on a few production Exadatas.
0/ Information
- Each server on which we increase or decrease the number of cores has to be rebooted
- I will use the rac-status.sh script to check the status of all the running resources before the maintenance and after the maintenance
- Keep in mind that there is a minimum and maximum number of cores that can be activated. It is documented here
- It is possible to have a different number of cores activated on database servers part of the same GI -- it is not recommended though
1/ Save the status of the resources before the maintenance
Before any maintenance, I like to save the status of every resource so that I can compare it with the status after the maintenance, make sure that everything is back to normal and avoid any unpleasant surprise.
[root@exadatadb01]# ./rac-status.sh -a | tee -a ~/status_before_cpu_change
2/ Ensure that the ~/dbs_group file is up to date
This step is optional as the ~/dbs_group file is supposed to be quite static. I personally like to double check it before an important maintenance so as not to forget any node.
-- The 2 commands below should return the same output
[root@exadatadb01]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort
[root@exadatadb01]# cat ~/dbs_group
-- if not, update the ~/dbs_group file
[root@exadatadb01]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort > ~/dbs_group
[root@exadatadb01]# cat ~/dbs_group
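The comparison above can also be scripted instead of being checked by eye. Here is a minimal sketch, assuming both lists hold one node name per line; `node_lists_match` is a hypothetical helper name and is not part of any Exadata tool.

```shell
# Hypothetical helper: succeeds when the two files list the same node names,
# whatever their order. Both inputs are sorted and de-duplicated first.
node_lists_match() {
    [ "$(sort -u "$1")" = "$(sort -u "$2")" ]
}

# Usage on a database node (not run here):
#   ibhosts | sed s'/"//' | grep db | awk '{print $6}' > /tmp/ibhosts_nodes
#   node_lists_match /tmp/ibhosts_nodes ~/dbs_group || echo "~/dbs_group is out of date"
```

The helper only reports a mismatch; updating ~/dbs_group stays a manual decision.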
3/ Check the current configuration
Here, we are just checking the current configuration to know what we will modify.
[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'
      Node       Cpu     Cores   Pending

exadatadb01:|     36/36|     72/72|
exadatadb02:|     36/36|     72/72|
exadatadb03:|     36/36|     72/72|
exadatadb04:|     36/36|     72/72|
[root@exadatadb01]#
The Pending column is expected to be empty at this stage.
4/ Modify the pending core count
Depending on your needs, you can modify the pending core count on one node or on all the nodes of the Exadata (you can also modify only some of the nodes by updating the ~/dbs_group file accordingly).
Here, I will reduce the number of CPUs from 36 to 16, so I will set the number of cores to 32 as this Exadata is an X6-2 (2 cores per CPU).
-- To modify the pending core count on one server
[root@exadatadb01]# dbmcli -e alter dbserver pendingCoreCount = 32 force
-- To modify the pending core count on all the nodes in one command
[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e alter dbserver pendingCoreCount = 32 force"
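The arithmetic used above (16 CPUs at 2 cores per CPU gives a pending core count of 32) and the minimum / maximum rule from the Information section can be checked before running the alter command. This is only a sketch: `pending_core_value` and `in_core_range` are hypothetical helper names, and the `min` and `max` values below are placeholders to be replaced with the documented limits for your model.

```shell
# Value to pass to pendingCoreCount, following the rule used in this post:
# wanted CPUs ($1) multiplied by the number of cores per CPU ($2).
pending_core_value() {
    echo $(( $1 * $2 ))
}

# Succeeds when the value ($1) is a whole number between min ($2) and max ($3).
in_core_range() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

min=2    # placeholder, NOT the documented minimum
max=72   # placeholder, taken from the 72/72 shown in step 3
value=$(pending_core_value 16 2)
if in_core_range "$value" "$min" "$max"; then
    echo "pendingCoreCount = $value"
else
    echo "value $value is outside the allowed range" >&2
fi
```

With the numbers from this post, the script prints `pendingCoreCount = 32`.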
5/ Verify the pending core count setting before reboot
We can see here that the Pending column now contains our new setting, which will be applied at the next reboot.
[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'
      Node       Cpu     Cores   Pending

exadatadb01:|     36/36|     72/72|     32/32
exadatadb02:|     36/36|     72/72|     32/32
exadatadb03:|     36/36|     72/72|     32/32
exadatadb04:|     36/36|     72/72|     32/32
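On a larger cluster, a node that was missed is easy to overlook in the listing. The sketch below flags any node whose pending value is not the expected one. It assumes the raw dcli output (before the formatting awk) has whitespace-separated fields with the pending value in 4th position, as the awk command above does; `nodes_with_wrong_pending` is a hypothetical helper name.

```shell
# Reads lines such as "exadatadb01: 36/36 72/72 32/32" on stdin and prints
# every node whose 4th field differs from the expected value ($1).
# The exit status is 0 only when no node is printed.
nodes_with_wrong_pending() {
    awk -v want="$1" '
        $4 != want { sub(/:$/, "", $1); print $1; bad = 1 }
        END        { exit bad + 0 }'
}

# Usage on a database node (not run here):
#   dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" \
#       | nodes_with_wrong_pending 32/32
```

Passing an empty string as the expected value turns it into a check that nothing is pending on any node.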
6/ Reboot
A reboot is needed to apply the changes. Note that you can relocate the database services to a server that is not being rebooted to avoid any downtime from an application perspective.
[root@exadatadb01]# reboot
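When the nodes are rebooted one at a time, it helps to wait until a node answers again before moving on to the next one. Below is a generic retry loop as a sketch; `wait_until` is a hypothetical helper name, and the ssh test in the usage comment is only one possible readiness check, not something this procedure prescribes.

```shell
# Runs a command repeatedly until it succeeds.
# $1 = maximum number of attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # no need to sleep after the last attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$(( i + 1 ))
    done
    return 1
}

# Usage (not run here): wait up to about 30 minutes for exadatadb02 to answer
#   wait_until 60 30 ssh -o ConnectTimeout=5 root@exadatadb02 true
```

A node answering over ssh does not mean that the cluster resources are back, so the status check of the last step is still needed.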
7/ Verify the pending core count setting after reboot
Check that everything looks good after reboot.
[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'
      Node       Cpu     Cores   Pending

exadatadb01:|     16/36|     32/32|
exadatadb02:|     16/36|     32/32|
exadatadb03:|     16/36|     32/32|
exadatadb04:|     16/36|     32/32|
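The same check can be scripted. In the listing above, the value set in step 4 (32) shows up as the first number of the third field and the pending field is empty again. The sketch below flags any node where that is not the case. It assumes the raw, whitespace-separated dcli output, and `nodes_not_applied` is a hypothetical helper name.

```shell
# Reads lines such as "exadatadb01: 16/36 32/32" on stdin and prints every
# node where the first number of the 3rd field is not the expected value ($1)
# or where a 4th (pending) field is still present.
# The exit status is 0 only when no node is printed.
nodes_not_applied() {
    awk -v want="$1" '
        {
            split($3, c, "/")
            if (c[1] != want || $4 != "") {
                sub(/:$/, "", $1)
                print $1
                bad = 1
            }
        }
        END { exit bad + 0 }'
}

# Usage on a database node (not run here):
#   dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" \
#       | nodes_not_applied 32
```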
8/ Verify the status of the resources after the reboot(s)
Here, we check the status of the running resources and compare it with the status saved before the maintenance to be sure that everything is back as it was.
[root@exadatadb01]# ./rac-status.sh -a | tee -a ~/status_after_cpu_change
[root@exadatadb01]# diff ~/status_before_cpu_change ~/status_after_cpu_change
Thanks a lot - great info
Useful info and simplified checks. Just a pointer that the Cpu and Cores heading should be swapped.