Exadata : Increase or decrease the number of activated CPUs / cores (aka Capacity-On-Demand)

You may need to increase or decrease the number of CPUs/cores activated on an Exadata. This feature, known as "Capacity-On-Demand", has been available since the X4-2 and X4-8.

This post presents a procedure that has been successfully applied on a few production Exadatas.

0/ Information

  • Each server whose number of cores is increased or decreased has to be rebooted
  • I will use the rac-status.sh script to check the status of all the running resources before and after the maintenance
  • Keep in mind that there is a minimum and a maximum number of cores that can be activated. It is documented here
  • It is possible to have a different number of cores activated on database servers that are part of the same GI; it is not recommended, though
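
The minimum and maximum bounds mentioned above can be checked before touching any server. This is a minimal sketch, assuming you have looked up the limits for your model yourself; the function name `core_count_in_range` and the limits used in the example are hypothetical placeholders, not values taken from the Oracle documentation:

```shell
# Hypothetical helper: succeeds only if the requested core count is a
# non-negative integer within [min, max]. The limits must come from the
# documentation for your Exadata model; nothing is looked up automatically.
core_count_in_range() {
    requested=$1 min=$2 max=$3
    case $requested in
        ''|*[!0-9]*) return 2 ;;   # not a non-negative integer
    esac
    [ "$requested" -ge "$min" ] && [ "$requested" -le "$max" ]
}

# Example with placeholder limits (replace 14 and 36 with the documented ones):
#   core_count_in_range 32 14 36 && echo "32 is within the allowed range"
```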

1/ Save the status of the resources before the maintenance

Before any maintenance, I like to save the status of every resource so that I can compare it with the status after the maintenance and make sure that everything is back to normal.

[root@exadatadb01]# ./rac-status.sh -a | tee -a ~/status_before_cpu_change

2/ Ensure that the ~/dbs_group file is up to date

This step is optional as the ~/dbs_group file is supposed to be quite static. I personally like to double-check it before an important maintenance so as not to forget any node.

-- The 2 below commands should return the same output
[root@exadatadb01]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort
[root@exadatadb01]# cat ~/dbs_group

-- if not, update the ~/dbs_group file
[root@exadatadb01]# ibhosts | sed s'/"//' | grep db | awk '{print $6}' | sort > ~/dbs_group
[root@exadatadb01]# cat ~/dbs_group
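
Rather than comparing the two outputs by eye, the comparison can be scripted. This is a minimal sketch; the function name `node_lists_match` and the temporary file path in the example are assumptions made for illustration:

```shell
# Hypothetical helper: succeeds when two node-list files contain the same
# host names, ignoring order and duplicate lines.
node_lists_match() {
    [ "$(sort -u "$1")" = "$(sort -u "$2")" ]
}

# Example (the /tmp path is an arbitrary choice):
#   ibhosts | sed s'/"//' | grep db | awk '{print $6}' > /tmp/dbs_group.new
#   node_lists_match /tmp/dbs_group.new ~/dbs_group || echo "dbs_group is out of date"
```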

3/ Check the current configuration

Here, we are just checking the current configuration to know what we will modify.

[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'
      Node          Cpu       Cores    Pending

 exadatadb01:|     36/36|     72/72|          
 exadatadb02:|     36/36|     72/72|          
 exadatadb03:|     36/36|     72/72|          
 exadatadb04:|     36/36|     72/72|          
[root@exadatadb01]#

The Pending column is expected to be empty at this stage.
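
On a large cluster, an empty Pending column can be verified automatically. This is a minimal sketch, assuming the raw dcli output (before the formatting awk) has one whitespace-separated line per node with the node name, the two counts and an optional pending value, as the `$1` to `$4` fields used above suggest; the function name `nodes_with_pending` is hypothetical:

```shell
# Hypothetical helper: reads raw dcli output on stdin (one line per node:
# node name, corecount, cpucount, optional pending value) and prints the
# nodes that have a pending core count set.
# Exit status is 0 when no node has one, 1 otherwise.
nodes_with_pending() {
    awk 'NF >= 4 { node = $1; sub(/:$/, "", node); print node; found = 1 }
         END { exit found ? 1 : 0 }'
}

# Example:
#   dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | nodes_with_pending
```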


4/ Modify the pending core count

Depending on your needs, you can modify the pending core count on one node or on all the nodes of the Exadata (you can also modify it on some nodes only by updating the ~/dbs_group file accordingly).
Here, I will change the number of CPUs from 36 to 16, so I will set the number of cores to 32 as this Exadata is an X6-2 (hence 2 cores per CPU).

-- To modify the pending core count on one server
[root@exadatadb01]# dbmcli -e alter dbserver pendingCoreCount = 32 force

-- To modify the pending core count on all the nodes in one command
[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e alter dbserver pendingCoreCount = 32 force"

5/ Verify the pending core count setting before reboot

We can see here that the Pending column contains our new setting that will be applied at next reboot.

[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'

      Node          Cpu       Cores    Pending

 exadatadb01:|     36/36|     72/72|   32/32 
 exadatadb02:|     36/36|     72/72|   32/32
 exadatadb03:|     36/36|     72/72|   32/32
 exadatadb04:|     36/36|     72/72|   32/32
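
Before rebooting, it can be worth confirming that every node reports the intended pending value. This is a minimal sketch, assuming the raw dcli output has one whitespace-separated line per node (node name, corecount, cpucount, pending value) and that the pending value starts with the requested count followed by a slash, as in the output above; the function name `pending_mismatch` is hypothetical:

```shell
# Hypothetical helper: reads raw dcli output on stdin and prints every node
# whose pending value (4th field, text before the "/") differs from the
# expected core count given as the first argument.
# Exit status is 0 only when all nodes match.
pending_mismatch() {
    awk -v want="$1" '
        { split($4, p, "/"); node = $1; sub(/:$/, "", node) }
        p[1] != want { print node; bad = 1 }
        END { exit bad ? 1 : 0 }'
}

# Example:
#   dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | pending_mismatch 32
```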


6/ Reboot

A reboot is needed to apply the changes. Note that you can relocate the database services to a server that is not being rebooted to avoid any downtime from an application perspective.

[root@exadatadb01]# reboot
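
When the nodes are rebooted one at a time, it helps to wait for a node to answer again before moving on to the next one. This is a minimal sketch of a generic retry loop; the function name `wait_until` and the ssh probe in the example are assumptions, and a node answering over ssh does not mean its cluster resources are back, so the status check of the last step is still needed:

```shell
# Hypothetical helper: retries a command until it succeeds.
# Usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
wait_until() {
    attempts=$1 delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (the ssh probe and its timings are assumptions):
#   wait_until 60 10 ssh -o ConnectTimeout=5 root@exadatadb01 true
```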

7/ Verify the pending core count setting after reboot

Check that everything looks good after reboot.

[root@exadatadb01]# dcli -g ~/dbs_group -l root "dbmcli -e list dbserver attributes corecount, cpucount, pendingCoreCount" | awk 'BEGIN{printf("%10s%10s%10s%10s\n\n", "Node", "Cpu", "Cores", "Pending")} {printf("%10s|%10s|%10s|%10s\n", $1, $2, $3, $4)}'
      Node          Cpu       Cores    Pending

 exadatadb01:|     16/36|     32/32|          
 exadatadb02:|     16/36|     32/32|          
 exadatadb03:|     16/36|     32/32|          
 exadatadb04:|     16/36|     32/32|       

8/ Verify the status of the resources after the reboot(s)

Here, we check the status of the running resources and compare it with the status saved before the maintenance to be sure that nothing has changed.

[root@exadatadb01]# ./rac-status.sh -a | tee -a ~/status_after_cpu_change
[root@exadatadb01]# diff ~/status_before_cpu_change ~/status_after_cpu_change


2 comments:

  1. Useful info and simplified checks. Just a pointer that the Cpu and Cores heading should be swapped.
