
Exadata: re-image a Cell Storage Server to 19c (OS configuration)

Now that we have gathered the information required to re-image our lost cell in part 1 and configured the system to boot from the ISO in part 2, let's connect to the console and do the OS configuration with the installer.

5/ OS configuration


5.0/ Connect to the console

Connect to the console to see the installer logs:
[root@exadata01_db01]# ssh exadata01_cel02-ilom
Password:
Oracle(R) Integrated Lights Out Manager
Version 4.0.4.36 r128807
Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
Hostname: exadata01_cel02-ilom
-> set /sp/cli timeout=0
Set 'timeout' to '0'
-> start /sp/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started.  To stop, type ESC (
[200446.510699] usb 2-1.7: new high-speed USB device number 3 using ehci-pci
[200446.603855] usb 2-1.7: New USB device found, idVendor=0430, idProduct=a101
[200446.611631] usb 2-1.7: New USB device strings: Mfr=1, Product=2, SerialNumber=3
. . .
Note the life-saving option set /sp/cli timeout=0 here: it disables the console timeout, which is 15 minutes by default. The whole re-image takes 1h30 ~ 2h, so being disconnected every 15 minutes and losing the history is not handy at all. Worse, some steps take 20+ minutes, and as you cannot see anything at the console when you reconnect after a timeout, you would be under the impression that the installation is stuck. So set /sp/cli timeout=0 is a must-have option.

5.1/ Set eth0

After around 15 minutes, you will be asked to configure eth0; use the information collected before to fill in this section:
IP Address of this host: 10.1.2.3
Netmask of this host: 255.255.254.0
Default gateway: 10.1.2.1
A very important thing here: neither Backspace nor CTRL+H seems to work at this prompt, so be very careful when entering this information; if you mess it up, you'll have to restart the whole process. If someone knows how to correct a typo here, please let me know in the comments, I haven't found how.
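Since a typo at this prompt cannot be corrected, it can help to sanity-check the three values on another machine before typing them. The following is only a minimal sketch using Python's standard ipaddress module; the function name check_eth0 is hypothetical and not part of any Exadata tooling. It verifies that the netmask is valid and that the gateway sits in the same subnet as the host address.

```python
import ipaddress


def check_eth0(ip: str, netmask: str, gateway: str) -> ipaddress.IPv4Network:
    """Return the subnet if the three values are consistent, else raise ValueError.

    Hypothetical helper for illustration only.
    """
    # IPv4Interface accepts "address/netmask" and rejects malformed values
    # (its errors are subclasses of ValueError).
    iface = ipaddress.IPv4Interface(f"{ip}/{netmask}")
    gw = ipaddress.IPv4Address(gateway)
    net = iface.network

    if gw not in net:
        raise ValueError(f"gateway {gw} is outside of {net}")
    if gw == iface.ip:
        raise ValueError("gateway and host address are identical")
    if iface.ip in (net.network_address, net.broadcast_address):
        raise ValueError(f"{iface.ip} is the network or broadcast address of {net}")
    return net


if __name__ == "__main__":
    # Values taken from the eth0 prompt above.
    print(check_eth0("10.1.2.3", "255.255.254.0", "10.1.2.1"))
```

With the values used in this post, the check passes and reports the 10.1.2.0/23 subnet; a gateway typed as 10.1.4.1, for instance, would be rejected.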

5.2/ DNS servers

It may take 45 minutes ~ 1 hour to reach this configuration point after the previous one:
Nameserver:
Add more nameservers (y/n) [n]: y
Nameserver: 10.200.200.4
Add more nameservers (y/n) [n]: y
Nameserver: 10.200.200.5
Add more nameservers (y/n) [n]: y
Nameserver: 10.89.1.28
Add more nameservers (y/n) [n]: n

5.3/ Timezone

1) Andorra
2) United Arab Emirates
3) Afghanistan
4) Antigua & Barbuda
5) Anguilla
6) Albania
7) Armenia
8) Angola
9) Antarctica
10) Argentina
11) Samoa (American)
12) Austria
13) Australia
14) Aruba
15) Åland Islands
16) Azerbaijan
Select country by number, [f]irst, [b]ack, [n]ext, [l]ast: 233
Selected country: United States (US). Now choose a zone
For a US timezone, the country number is 233.

5.4/ NTP

Note the screen bug here: the NTP server IP being typed (10.248.1.1) wraps around and overwrites the beginning of the line:
The current NTP server(s):
Do you want to change it (y/n) [n]: y
48.1.1qualified hostname or ip address for NTP server. Press enter if none: 10.2
Continue adding more ntp servers (y/n) [n]: n

5.5/ Configure network interfaces

Network interfaces
Name  Bonding  Speed    Status  IP address  Netmask  Gateway  Net type  Hostname
ib0                     UNCONF
ib1                     UNCONF
eth0                    UNCONF
eth1                    UNCONF
eth2                    UNCONF
eth3                    UNCONF
Select interface name to configure or press Enter to continue: ib0
Selected interface. ib0
IP address or none: 192.168.1.3
Netmask: 255.255.252.0
Fully qualified hostname or none: exadata01_cel02-priv1.domain.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y

Network interfaces
Name  Bonding  Speed    Status  IP address    Netmask       Gateway  Net type  Hostname
ib0                     UP      192.168.1.3 255.255.252.0          Private   exadata01_cel02-priv1.domain.com
ib1                     UNCONF
eth0                    UNCONF
eth1                    UNCONF
eth2                    UNCONF
eth3                    UNCONF
Select interface name to configure or press Enter to continue: ib1
Selected interface. ib1
IP address or none: 192.168.1.4
Netmask: 255.255.252.0
Fully qualified hostname or none: exadata01_cel02-priv2.domain.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y

Network interfaces
Name  Bonding  Speed    Status  IP address    Netmask       Gateway  Net type  Hostname
ib0                     UP      192.168.1.3 255.255.252.0          Private   exadata01_cel02-priv1.domain.com
ib1                     UP      192.168.1.4 255.255.252.0          Private   exadata01_cel02-priv2.domain.com
eth0                    UNCONF
eth1                    UNCONF
eth2                    UNCONF
eth3                    UNCONF
Select interface name to configure or press Enter to continue: eth0
Selected interface. eth0
IP address or none: 10.1.2.3
Netmask: 255.255.254.0
Gateway (IP address or none) or none: 10.1.2.1
Link speed (default,10000,25000):
[Warning]: Invalid value. Try again
Link speed (default,10000,25000): default
Fully qualified hostname or none: exadata01_cel02.domain.com
Continue configuring or re-configuring interfaces? (y/n) [y]: y

Network interfaces
Name  Bonding  Speed    Status  IP address    Netmask       Gateway     Net type   Hostname
ib0                     UP      192.168.1.3 255.255.252.0             Private    exadata01_cel02-priv1.domain.com
ib1                     UP      192.168.1.4 255.255.252.0             Private    exadata01_cel02-priv2.domain.com
eth0           default  UP      10.1.2.3 255.255.254.0 10.1.2.1 Management exadata01_cel02.domain.com
eth1                    UNCONF
eth2                    UNCONF
eth3                    UNCONF
Select interface name to configure or press Enter to continue:

5.6/ Canonical hostname

Select canonical hostname from the list below
1: exadata01_cel02-priv1.domain.com
2: exadata01_cel02-priv2.domain.com
3: exadata01_cel02.domain.com
Canonical fully qualified domain name: 3

5.7/ Default gateway

Select default gateway interface from the list below
1: eth0
Default gateway interface: 1

5.8/ A summary

Network interfaces
Name  State  Speed    Status  IP address    Netmask       Gateway     Net type   Hostname
ib0   Linked          UP      192.168.1.3 255.255.252.0             Private    exadata01_cel02-priv1.domain.com
ib1   Linked          UP      192.168.1.4 255.255.252.0             Private    exadata01_cel02-priv2.domain.com
eth0  Linked default  UP      10.1.2.3 255.255.254.0 10.1.2.1 Management exadata01_cel02.domain.com
eth1  Linked          UNCONF
eth2  Linked          UNCONF
eth3  Linked          UNCONF
Is this correct (y/n) [y]: y

5.9/ ILOM settings

Let's check / update the ILOM settings:
Do you want to configure basic ILOM settings (y/n) [y]: y
Loading basic configuration settings from ILOM ...
ILOM Fully qualified hostname [exadata01_cel02-ilom.domain.com]:
Inet protocol (IPv4,IPv6) [IPv4]:
ILOM IP address [10.1.2.144]:
ILOM Netmask [255.255.254.0]:
ILOM Gateway or none [10.1.2.1]:
ILOM Nameserver (multiple IPs separated by a comma) or none [10.200.200.4]:
ILOM Use NTP Servers (enabled/disabled) [enabled]:
1]:  First NTP server. Fully qualified hostname or ip address or none [10.248.1.1
ILOM Second NTP server. Fully qualified hostname or ip address or none [none]:
ILOM Vlan id or zero for non-tagged VLAN (0-4079) [0]:

Basic ILOM configuration settings:
Hostname             : exadata01_cel02-ilom.domain.com
IP Address           : 10.1.2.144
Netmask              : 255.255.254.0
Gateway              : 10.1.2.1
DNS servers          : 10.200.200.4
Use NTP servers      : enabled
First NTP server     : 10.248.1.1
Second NTP server    : none
Timezone (read-only) : America/Chicago
VLAN id              : 0
Is this correct (y/n) [y]: y

5.10/ An ignorable error

You can ignore this weird error; at that point, just wait:
[Info]:  Updating runtime sysctl configuration for ib1: net.ipv6.conf.ib1.disable_ipv6=1
[Info]:  Updating runtime sysctl configuration for eth0: net.ipv6.conf.eth0.disable_ipv6=1
[Info]: Adjust settings for IB interfaces in /etc/sysctl.conf
[INFO     ] /opt/oracle.cellos/cellFirstboot.sh: done
exadata01_cel02 login: [ 1948.671627]   MST::  : get_space_support_status 438: At least one SPACE is not supported
Also, you will be disconnected from the ILOM, as it will be rebooted as well, so just reconnect to the console as you did previously.

5.11/ All good !

The installation has now finished; you can log in with the default welcome1 password and see your system back!
Command line is /opt/oracle.cellos/validations/bin/vldrun.pl -mode first_boot -force -quiet -all
Run validation beginfirstboot - PASSED
Run validation ipmisettings - PASSED
Run validation misceachboot - PASSED
Run validation celldstatus - PASSED
Run validation calibration - PASSED
Run validation saveconfig - BACKGROUND RUN
2019-08-01 02:05:25 -0500 The first boot completed with SUCCESS
2019-08-01 02:05:25 -0500 2019-08-01 02:05:25 -0500 [FACTORY_TEST_END] Post installation tests ended with success
2019-08-01 02:05:25 -0500 2019-08-01 02:05:25 -0500 [FACTORY_COMPLETE] Imaging ended with success 

exadata01_cel02 login: root
Password:
Last failed login: Sun Jul 14 18:42:13 CDT 2019 on ttyS0
Last login: Sun Jul 14 18:43:21 on ttyS0
[root@exadata01_cel02]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         47G     0   47G   0% /dev
tmpfs            47G     0   47G   0% /dev/shm
tmpfs            47G  4.1M   47G   1% /run
tmpfs            47G     0   47G   0% /sys/fs/cgroup
/dev/md5        9.8G  3.2G  6.1G  34% /
/dev/md7        2.9G  1.3G  1.5G  46% /opt/oracle
/dev/md4        244M   52M  177M  23% /boot
/dev/md11       4.6G   61M  4.3G   2% /var/log/oracle
tmpfs           9.4G     0  9.4G   0% /run/user/0
/dev/sdm1       7.3G  1.8G  5.2G  25% /mnt/usb.mrdiag
[root@exadata01_cel02]#

5.12/ Password, SSH, reboot and checks

I recommend changing the default passwords (root, celladmin, cellmon) to ones of your choice; you also have to regenerate the SSH key:
[root@exadata01_cel02]# passwd
Changing password for user root.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@exadata01_cel02]# ssh-keygen
. . .
[root@exadata01_cel02]#
Here, I also like to reboot to be sure everything comes back online properly.
[root@exadata01_cel02]# reboot
It is worth doing a few checks here:
[root@exadata01_cel02]# cellcli
CellCLI: Release 19.2.2.0.0 - Production on Thu Aug 01 02:16:05 CDT 2019

Copyright (c) 2007, 2016, Oracle and/or its affiliates. All rights reserved.

CellCLI> list griddisk

CellCLI> list physicaldisk
         252:0           NAAAAA                  normal
         252:1           NBBBBB                  normal
         252:2           NCCCCC                  normal
         252:3           NDDDDD                  normal
         252:4           NEEEEE                  normal
         252:5           NFFFFF                  normal
         252:6           NGGGGG                  normal
         252:7           NHHHHH                  normal
         252:8           NJJJJJ                  normal
         252:9           NKKKKK                  normal
         252:10          NLLLLL                  normal
         252:11          NMMMMM                  normal
         FLASH_10_1      PHLE111111111P4BGN-1    normal
         FLASH_10_2      PHLE222222222P4BGN-2    normal
         FLASH_4_1       PHLE333333333P4BGN-1    normal
         FLASH_4_2       PHLE444444444P4BGN-2    normal
         FLASH_5_1       PHLE555555555P4BGN-1    normal
         FLASH_5_2       PHLE666666666P4BGN-2    normal
         FLASH_6_1       PHLE777777777P4BGN-1    normal
         FLASH_6_2       PHLE888888888P4BGN-2    normal
         M2_SYS_0        PHYHXXXXXXFA240J        normal
         M2_SYS_1        PHYHYYYYYYYR240J        normal

CellCLI> exit
quitting
[root@exadata01_cel02]# imageinfo

Kernel version: 4.1.12-124.26.12.el7uek.x86_64 #2 SMP Wed May 8 22:25:03 PDT 2019 x86_64
Cell version: OSS_19.2.2.0.0_LINUX.X64_190513.2
Cell rpm version: cell-19.2.2.0.0_LINUX.X64_190513.2-1.x86_64

Active image version: 19.2.2.0.0.190513.2
Active image kernel version: 4.1.12-124.26.12.el7uek
Active image activated: 2019-08-01 02:05:25 -0500
Active image status: success
Active system partition on device: /dev/md24p5
Active software partition on device: /dev/md24p7

Cell boot usb partition: /dev/md25p1
Cell boot usb version: 19.2.2.0.0.190513.2

Inactive image version: undefined
Rollback to the inactive partitions: Impossible
[root@exadata01_cel02]#
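With 22 physical disks, eyeballing the status column is error-prone. As a rough sketch, the small Python helper below scans a pasted copy of the list physicaldisk output and reports any disk whose last column is not normal. It assumes the three-column layout shown above (name, serial, status); the function name abnormal_disks is hypothetical, and the "failed" line in the demo is made up for illustration.

```python
from typing import List, Tuple


def abnormal_disks(listing: str) -> List[Tuple[str, str]]:
    """Return (disk name, status) for every disk whose status is not 'normal'.

    Assumes each data line reads: <name> <serial> <status>.
    """
    bad = []
    for raw in listing.splitlines():
        line = raw.strip()
        # Skip blank lines and CellCLI prompt lines.
        if not line or line.startswith("CellCLI>"):
            continue
        # Split into at most 3 fields so a multi-word status stays in one piece.
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        name, _serial, status = fields
        if status.strip() != "normal":
            bad.append((name, status.strip()))
    return bad


if __name__ == "__main__":
    # First two lines come from the output above; the third one is a
    # made-up example of a disk that is not in the 'normal' state.
    pasted = """
         252:0           NAAAAA                  normal
         FLASH_10_1      PHLE111111111P4BGN-1    normal
         252:1           NBBBBB                  failed
    """
    for name, status in abnormal_disks(pasted):
        print(f"check disk {name}: status is '{status}'")
```

An empty result means every listed disk reports normal, which is what you want to see before moving on to ASM.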

6/ Bring back the disks into ASM

Now you have to add the disks back into ASM. Depending on how your system crashed and/or why you had to re-image your storage cell, there may be different scenarios; in our case, we had to recreate the cell disks CD_00 and CD_01 (as they were missing) and add all the disks back into ASM:
-- Recreate celldisks CD_00 and CD_01
CellCLI> create celldisk CD_00_exadata01_cel02 lun=0_0
CellDisk CD_00_exadata01_cel02 successfully created

CellCLI> create celldisk CD_01_exadata01_cel02 lun=0_1
CellDisk CD_01_exadata01_cel02 successfully created

CellCLI>

-- Recreate the grid disks on these cell disks
CellCLI> create griddisk DATAC1_CD_00_exadata01_cel02 celldisk=CD_00_exadata01_cel02, size=5.6953125T
GridDisk DATAC1_CD_00_exadata01_cel02 successfully created

CellCLI> create griddisk RECOC1_CD_00_exadata01_cel02 celldisk=CD_00_exadata01_cel02, size=1.42388916015625T
GridDisk RECOC1_CD_00_exadata01_cel02 successfully created

CellCLI> create griddisk DATAC1_CD_01_exadata01_cel02 celldisk=CD_01_exadata01_cel02, size=5.6953125T
GridDisk DATAC1_CD_01_exadata01_cel02 successfully created

CellCLI> create griddisk RECOC1_CD_01_exadata01_cel02 celldisk=CD_01_exadata01_cel02, size=1.42388916015625T
GridDisk RECOC1_CD_01_exadata01_cel02 successfully created

CellCLI>

-- Add the disks back into ASM
SQL> alter diskgroup DATA add disk
  2  'o/*/DATAC1_CD_00_exadata01_cel02',
  3  'o/*/DATAC1_CD_01_exadata01_cel02',
  4  'o/*/DATAC1_CD_02_exadata01_cel02' force,
  5  'o/*/DATAC1_CD_03_exadata01_cel02' force,
  6  'o/*/DATAC1_CD_04_exadata01_cel02' force,
  7  'o/*/DATAC1_CD_05_exadata01_cel02' force,
  8  'o/*/DATAC1_CD_06_exadata01_cel02' force,
  9  'o/*/DATAC1_CD_07_exadata01_cel02' force,
 10  'o/*/DATAC1_CD_08_exadata01_cel02' force,
 11  'o/*/DATAC1_CD_09_exadata01_cel02' force,
 12  'o/*/DATAC1_CD_10_exadata01_cel02' force,
 13  'o/*/DATAC1_CD_11_exadata01_cel02' force
 14  rebalance power 32;

Diskgroup altered.

SQL> alter diskgroup RECO add disk
  2  'o/*/RECOC1_CD_00_exadata01_cel02',
  3  'o/*/RECOC1_CD_01_exadata01_cel02',
  4  'o/*/RECOC1_CD_02_exadata01_cel02' force,
  5  'o/*/RECOC1_CD_03_exadata01_cel02' force,
  6  'o/*/RECOC1_CD_04_exadata01_cel02' force,
  7  'o/*/RECOC1_CD_05_exadata01_cel02' force,
  8  'o/*/RECOC1_CD_06_exadata01_cel02' force,
  9  'o/*/RECOC1_CD_07_exadata01_cel02' force,
 10  'o/*/RECOC1_CD_08_exadata01_cel02' force,
 11  'o/*/RECOC1_CD_09_exadata01_cel02' force,
 12  'o/*/RECOC1_CD_10_exadata01_cel02' force,
 13  'o/*/RECOC1_CD_11_exadata01_cel02' force
 14  rebalance power 32;

Diskgroup altered.

SQL>
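Typing twelve disk paths per diskgroup by hand is tedious, so the statement can be generated instead. The sketch below is only an illustration in Python (the function add_disk_sql and its parameters are hypothetical); it rebuilds the ALTER DISKGROUP text used above and follows the same rule as in this post: no FORCE for the grid disks that were recreated, FORCE for the ones that ASM still references. Review the generated text before running it against your own system.

```python
from typing import Iterable


def add_disk_sql(diskgroup: str, prefix: str, cell: str,
                 recreated: Iterable[int], count: int = 12,
                 power: int = 32) -> str:
    """Build an 'alter diskgroup ... add disk' statement for one cell.

    recreated: indexes of the grid disks that were recreated from scratch
    (added without FORCE); every other disk gets FORCE.
    """
    recreated = set(recreated)
    disks = []
    for i in range(count):
        force = "" if i in recreated else " force"
        disks.append(f"'o/*/{prefix}_CD_{i:02d}_{cell}'{force}")
    return (f"alter diskgroup {diskgroup} add disk\n"
            + ",\n".join(disks)
            + f"\nrebalance power {power};")


if __name__ == "__main__":
    # Same names as in the statements above: CD_00 and CD_01 were recreated.
    print(add_disk_sql("DATA", "DATAC1", "exadata01_cel02", recreated={0, 1}))
    print()
    print(add_disk_sql("RECO", "RECOC1", "exadata01_cel02", recreated={0, 1}))
```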



And you are all done now !


Quick links: part 1 / part 2 / part 3


8 comments:

  1. Hello,
    thank You for this clear and concise explanation.
    Two questions if I may:
    why for this 'o/*/RECOC1_CD_00_exadata01_cel02', there is no force option
    and how did You know the actual size of the disks added .
    Regards.
    Grzegorz

    Replies
    1. Hi,

      Thanks for your comment.

      For the sizes, I checked on a surviving cell, as all the disks must have the same size.

      There is no FORCE for CD_00 and CD_01 because I recreated them, so they were no longer defined in ASM. The others were still referenced in ASM, so a FORCE was needed.

      Thanks,

      Fred

  2. Hi

    I first looked at our X7-2 storage server ,we found every Flash Card create two phy disks ,may i know the concept here ,why the every flash card two physical disk and what is needed ?

    Please clarify

    Replies
    1. Hi Fred

      Please let us know the concept

      I first looked at our X7-2 storage server ,we found every Flash Card create two phy disks ,may i know the concept here ,why the every flash card two physical disk and what is needed ?

  3. Thanks for sharing this valuable information

  4. for x5-2 server i check in mos that support required exadata 12.1.2 its posible to us to upgrade to 18 or 18 version exadata software ?

    Replies
    1. You can even install 21 on X5, check note 888828.1

