
Exadata: How to reboot a database server using its ILOM

During Exadata patching, for example, a database server may fail to reboot properly; you can no longer connect to it, and the patch hangs forever. You then have to reboot the server manually. There is nothing tricky here, but it is always good to have the procedure handy when it happens.
For this we will use the server's ILOM (Integrated Lights Out Manager), the administration console that each Exadata component has. Be sure to have:
  • The database server's ILOM hostname or IP (the hostname is usually <dbserver-name>-ilom, for example <mycluster>db02-ilom)
  • The ILOM root password (in case it was never changed, the default password is welcome1)
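
As a small convenience, the usual naming convention above can be sketched as a shell helper. This is only an illustration: the function name ilom_name is hypothetical, the handling of fully qualified names is an assumption, and your site may name its ILOMs differently.

```shell
#!/bin/sh
# ilom_name HOSTNAME
# Hypothetical helper: derive the usual ILOM hostname (<dbserver-name>-ilom)
# from a database server hostname. Expects a hostname, not an IP address.
# Assumption: for a fully qualified name, "-ilom" goes after the short name.
ilom_name() {
  case "$1" in
    *.*) printf '%s-ilom.%s\n' "${1%%.*}" "${1#*.}" ;;  # short name + domain
    *)   printf '%s-ilom\n' "$1" ;;                      # short name only
  esac
}

ilom_name myclusterdb04   # prints myclusterdb04-ilom
```

You could then connect with something like ssh root@"$(ilom_name myclusterdb04)", provided the derived name resolves in your environment.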

Please note that I describe the command-line way of rebooting a server through its ILOM here, not the graphical way. Depending on how you have to connect to the servers (firewalls, jump servers, sudo, etc.), you may not be able to access the ILOM graphical interface.


Here is the procedure:

[root@myclusterdb01 ~]# ssh myclusterdb04-ilom
Warning: Permanently added the RSA host key for IP address '10.191.84.24' to the list of known hosts.
Password
Oracle(R) Integrated Lights Out Manager
Version 3.2.8.25 r114493
Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
Hostname: myclusterdb04-ilom
-> reset /SYS
Are you sure you want to reset /SYS (y/n)? y
Performing hard reset on /SYS
->

This starts a hard reboot of the myclusterdb04 database server.

You can then connect to the console to watch what is happening (the server boot logs):
-> start /sp/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started.  To stop, type ESC (
. . .
[INFO] /usr/sbin/ipmitool user set name 4 iu_ngtmh
[INFO] /usr/sbin/ipmitool user set password 4 ********
[INFO] Executing: /usr/bin/mstflint -y -d /proc/bus/pci/40/00.0 -i /var/log/exadatatmp/firmware/ActualFirmwareFiles/fw-ConnectX3-rel-2_35_5532-15-7046442_7092757.bin  burn

    Current FW version on flash:  2.11.1280
    New FW version:               2.35.5532

[INFO] run /usr/sbin/ipmitool cmd to set /SP/users/iu_ngtmh/role=aucro
Burning FS2 FW image without signatures - 7[INFO] export IPMI_PASSWORD=********
[INFO] /usr/sbin/ipmiflash -v -I lanplus -H 10.191.84.24 -U iu_ngtmh -E write /var/log/exadatatmp/firmware/ActualFirmwareFiles/ILOM-3_2_10_22_a_r121452-Sun_Server_X4-2.pkg force script config delaybios warning=0
Burning FS2 FW image without signatures - OK
Restoring signature                     - OK
[INFO] Waiting for the service processor to finish firmware upgrade for up to 1200 seconds.
. . .
Give the server a few minutes to reboot and you're done.
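
If you would rather not watch the console, the waiting can be scripted. Below is a minimal sketch of a generic retry loop; the function name wait_for, the retry count and the delay are all assumptions made for illustration, not part of any Exadata tooling.

```shell
#!/bin/sh
# wait_for TRIES DELAY COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
  _wf_tries=$1
  _wf_delay=$2
  shift 2
  _wf_attempt=1
  while [ "$_wf_attempt" -le "$_wf_tries" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the last failed attempt
    if [ "$_wf_attempt" -lt "$_wf_tries" ]; then
      sleep "$_wf_delay"
    fi
    _wf_attempt=$((_wf_attempt + 1))
  done
  return 1
}

# Example (assumed host name and timings; requires key-based SSH access):
# wait_for 60 10 ssh -o BatchMode=yes -o ConnectTimeout=5 root@myclusterdb04 true
```

The probe command is passed as arguments, so any check can be substituted for the SSH one. Keep in mind that a reboot that includes firmware updates, as in the console log above, can take noticeably longer than a plain reboot, so size the retry count accordingly.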

Please keep in mind that:
  • Unlike on an InfiniBand switch, you do not have to use the spsh command to jump into the ILOM shell, because you connect directly to the dedicated ILOM IP address
  • To quit the console, you have to use this weird ILOM key sequence: ESC and then "("

Hope it helps

2 comments:

  1. Hi,
    Thank you very much for this post. I assume we can use a command to exit the ILOM shell. Am I right?

    Thank you.

    1. Hi,

      Could you please clarify what you want to do ?

      Thanks,
