Exadata patching : patchmgr -modify_at_prereq (dbnodeupdate.sh -M)

When you patch an Exadata, and especially its database servers, you may run into OS dependency issues during the prerequisites phase, as we will see below.

1/ A word on patchmgr and dbnodeupdate.sh

Before heading to the OS dependencies and the -modify_at_prereq option, it is important to keep a few points in mind:
  • dbnodeupdate.sh is the shell script that patches each database server individually
  • patchmgr is the orchestration tool that starts dbnodeupdate.sh in parallel across many database servers (those listed in the dbs_group configuration file)
  • before patchmgr existed, we ran dbnodeupdate.sh manually
  • dbnodeupdate.sh has a -M option that removes RPMs to resolve dependency issues
  • patchmgr has the -modify_at_prereq option that removes RPMs to resolve dependency issues
  • even if, like me, you always use patchmgr, you will see error messages that mention the -M option of dbnodeupdate.sh, as it is the script that actually patches the servers
  • to sum up, when you start a ./patchmgr -dbnodes ~/dbs_group command, patchmgr starts the dbnodeupdate.sh script with the proper options on each node listed in the ~/dbs_group file
  • a ./patchmgr -dbnodes ~/dbs_group -modify_at_prereq command therefore launches dbnodeupdate.sh -M on each server listed in the ~/dbs_group file


2/ Dependency issues during the database servers prerequisites

If patchmgr finds dependency issues when executing the database servers prerequisites, you will get an output like the one below (it looks like it failed badly, right?):
[root@exadatacel01 ~]# ./patchmgr -dbnodes ~/dbs_group -precheck -iso_repo /tmp/SAVE/p28666206_*_Linux-x86-64.zip -target_version 18.1.9.0.0.181006 -allow_active_network_mounts
. . .
exadatadb01# The following file lists the commands that would have been executed for removing rpms when specifying -M flag. #
exadatadb01# File: /var/log/cellos/nomodify_results.080718220008.sh. #
exadatadb01# ERROR: Found dependency issues during pre-check. Packages failing:
exadatadb01# ERROR: Package: exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch (Fails because of required removal of Exadata rpms)
exadatadb01# ERROR: Package: oracle-ofed-release-1.0.0-23.el6.x86_64
exadatadb01# ERROR: Consult file exadatadb01:/var/log/cellos/minimum_conflict_report.080718220008.txt for more information on the dependencies failing and for next steps.
exadatadb01# The following known issues will be checked for but require manual follow-up:
exadatadb01# (*) - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12
In this error stack, patchmgr gives us several pieces of information:
  1. The file /var/log/cellos/nomodify_results.080718220008.sh contains the commands that would have been executed to remove RPMs if the -M flag (that is, the -modify_at_prereq option of patchmgr) had been specified
  2. patchmgr found dependency issues with the exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch and oracle-ofed-release-1.0.0-23.el6.x86_64 packages
  3. patchmgr advises us to consult the file /var/log/cellos/minimum_conflict_report.080718220008.txt for more information on the failing dependencies and for next steps
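
Both paths can also be pulled out of a saved copy of the pre-check output rather than read by eye. This is only a minimal sketch, assuming the patchmgr output was captured to a file; the helper name extract_precheck_files is illustrative and not part of patchmgr:

```shell
# Sketch: list the per-node files mentioned in a saved patchmgr pre-check output.
# extract_precheck_files is a hypothetical helper, not something shipped by Oracle.
extract_precheck_files() {
    # stdin: the saved pre-check output
    # stdout: each nomodify_results / minimum_conflict_report path, printed once
    grep -oE '/var/log/cellos/(nomodify_results|minimum_conflict_report)\.[0-9]+\.(sh|txt)' | sort -u
}
```

For example, if the run was captured with `2>&1 | tee precheck.out`, then `extract_precheck_files < precheck.out` prints the two paths. The files themselves live on the database node named at the start of each output line.
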
Let's start with this /var/log/cellos/minimum_conflict_report.080718220008.txt file. It contains 3 next steps; let's have a look at them one by one:
[root@exadatadb01 ~]# cat /var/log/cellos/minimum_conflict_report.080718220008.txt
  Warning: Yum update will fail due to dependency issues. No updates have been made yet.
  THE BELOW STEPS SHOULD BE FOLLOWED FOR RESOLVING THE DEPENDENCY ISSUE - STARTING WITH STEP :
  1. Re-run the updating tool allowing known failing dependencies to be removed during precheck time:
     Use flag -M for dbnodeupdate.sh or flag -modify_at_prereq for patchmgr
So the first suggested step is to re-run the prerequisites with the -modify_at_prereq option.

Step 2 says that if it still fails with the -modify_at_prereq option, we should manually remove the custom RPMs found on the system:
  2. IF AFTER STEP 1) CHECKS STILL FAIL, REMOVE ANY CUSTOM INSTALLED (NON-DEFAULT) RPMS (IF ANY):
#
#   Additional Exadata dbnode package overview:
#   ===========================================
#
   #################################################################################
   # File initialized at 080718_220458 (runid :080718220008) by dbnodeupdate.sh 5.180528
   # NOTE: This list contains rpms which are seen as custom rpms
   #################################################################################

   # Exadata computenode package             : exact (locked)
   # Number of additional packages installed : 3
   #
   # Additional packages installed           :
   # ============================================
   nmap.x86_64
   nc.x86_64
   iftop.x86_64
Note that, in the above example, the custom RPMs look harmless; nothing intrusive here.
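
The list of custom RPMs can be extracted from the report so that it can be reviewed or archived before the upgrade. A minimal sketch, assuming the report keeps the layout shown above (a "# Additional packages installed" header, an underline, then one package per line); list_custom_rpms is a hypothetical helper name:

```shell
# Sketch: print the custom (additional) RPMs listed in a minimum_conflict_report file.
# Assumes the report layout shown in this article; not an official Oracle tool.
list_custom_rpms() {
    # stdin: content of minimum_conflict_report.<runid>.txt
    awk '
        /#[ \t]*Additional packages installed[ \t]*:/ { in_list = 1; next }  # start of the list
        in_list && /^[ \t]*#/ { next }                                       # skip the "# ====" underline
        in_list && /^[ \t]*[A-Za-z0-9._+-]+[ \t]*$/ {                        # a bare package name
            gsub(/[ \t]/, ""); print; next
        }
        in_list { in_list = 0 }                                              # anything else ends the list
    '
}
```

It reads the report on standard input, for example `list_custom_rpms < /var/log/cellos/minimum_conflict_report.080718220008.txt` on the database node.
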

And step 3 details the failed non-custom dependencies ("non-custom dependencies" meaning dependency issues with Exadata system packages):
  3. RPM DEPENDENCIES FAILING:
Error: Package: exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch (exadata_generated_080718220008)
           Requires: libibcm(x86-64) = 1.0.5-1.0.2.el6
           Installed: libibcm-1.0.5-3.el6.x86_64 (installed)
               libibcm(x86-64) = 1.0.5-3.el6
           Available: libibcm-1.0.5-1.0.2.el6.x86_64 (exadata_generated_080718220008)
--
Error: Package: oracle-ofed-release-1.0.0-23.el6.x86_64 (exadata_generated_080718220008)
           Requires: libibcm = 1.0.5-1.0.2.el6
           Installed: libibcm-1.0.5-3.el6.x86_64 (installed)
               libibcm = 1.0.5-3.el6
           Available: libibcm-1.0.5-1.0.2.el6.x86_64 (exadata_generated_080718220008)
--
Error: Package: oracle-ofed-release-1.0.0-23.el6.x86_64 (exadata_generated_080718220008)
           Requires: libibcm = 1.0.5-1.0.2.el6
           Installed: libibcm-1.0.5-3.el6.x86_64 (installed)
               libibcm = 1.0.5-3.el6
           Available: libibcm-1.0.5-1.0.2.el6.x86_64 (exadata_generated_080718220008)

   Optionally: when filing a service request, include the following information and files :
      1. Runid of the failing update : 080718220008
      2. dbnodeupdate.log file       : /var/log/cellos/dbnodeupdate.log
      3. Diagfile for this run       : /var/log/cellos/dbnodeupdate.080718220008.diag

      *  Re-run dbnodeupdate.sh when the issues are resolved.
[root@exadatadb01 ~]#
Here, the packages do indeed look very "system":
  • exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch
  • oracle-ofed-release-1.0.0-23.el6.x86_64 (reported twice in the output above)

Now let's have a look at the /var/log/cellos/nomodify_results.080718220008.sh file which contains "the commands that would have been executed for removing rpms when specifying -M flag":
[root@exadatadb01 ~]# cat /var/log/cellos/nomodify_results.080718220008.sh
#################################################################################
# nomodify result file (/var/log/cellos/nomodify_results.080718220008.sh) initialized at 080718_220354 (runid :080718220008) by dbnodeupdate.sh 5.180528
#################################################################################
rpm -Uvh  /var/log/exadatatmp/080718220008/libibcm-1.0.5-1.0.2.el6.x86_64.rpm --oldpackage --nodeps
rpm -e  --nodeps exadata-sun-computenode-exact
[root@exadatadb01 ~]#
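
To see at a glance which packages would be touched, the file can be summarised without executing it. A minimal sketch, assuming each command in the file has the same shape as the two lines above (rpm, then the mode flag, then options and one package); summarize_nomodify is a hypothetical helper name:

```shell
# Sketch: summarise what a nomodify_results.<runid>.sh file would do, without running it.
# Assumes every command looks like the ones shown in this article.
summarize_nomodify() {
    # stdin: content of the nomodify_results file
    awk '
        $1 != "rpm" { next }                         # skip comments and non-rpm lines
        {
            action = ($2 == "-e") ? "REMOVE" : "REPLACE"
            for (i = 3; i <= NF; i++) {
                if ($i !~ /^-/) {                    # first non-option argument is the package
                    n = split($i, parts, "/")
                    print action, parts[n]           # keep only the file or package name
                    break
                }
            }
        }
    '
}
```

Fed with the file above, it prints one REPLACE line for the libibcm RPM (installed with --oldpackage, so a downgrade) and one REMOVE line for exadata-sun-computenode-exact.
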

So now, 2 things are crystal clear:
  1. patchmgr won't take care of your custom RPMs whether you use -modify_at_prereq or not; we have to take care of the custom RPMs ourselves, fair enough
  2. patchmgr will remove the system RPMs with failed dependencies when using the -modify_at_prereq option

Point number 2 makes -modify_at_prereq look like an awesome option, right? It removes whatever is failing by itself to make the update work, awesome!

This is also what I thought until this option froze all my 12c databases when I executed the Oct 2017 database servers prerequisites.
How come? Let's start by having a look at the documentation (dbnodeupdate.sh and dbserver.patch.zip: Updating Exadata Database Server Software using the DBNodeUpdate Utility and patchmgr (Doc ID 1553103.1)):
If you want to be sure to rely on precheck output then run precheck with the additional flag to allow rpm modifications but do understand that this is might be making changes to your system that are not expected at this time.
It is recommended to run the prereq check with the flags to allow making changes within a maintenance window right before the actual updating.
Known (not custom) conflicting rpms will be always removed when running the update regardless.
Note: what this documentation refers to as "the additional flag" is the -modify_at_prereq option
Note 2: I haven't pasted the whole document, for readability

The official documentation thus clearly states that:
  • System RPMs will be removed during the prerequisites phase if you specify -modify_at_prereq
  • Oracle recommends executing the prerequisites with this flag within a maintenance window
  • System RPMs will be removed when running the update whether you specify -modify_at_prereq or not
  • Sorry to bold and underline almost everything here but this is very important :)

    It now looks more dangerous than you thought, right? Indeed, the -modify_at_prereq option removes system RPMs during the prerequisites phase! (This is what froze all my 12c databases: the Oct 2017 release removed a package used by the Exafusion feature and, as exafusion_enabled=1 by default, all the databases froze.)
    I therefore strongly recommend never using the -modify_at_prereq option.
    

    Indeed:
    • You may run your prerequisites days or weeks before upgrading your Exadata, so you do not want to remove any RPMs at that time!
    • Oracle recommends using this option within a maintenance window, which confirms that there is a potential risk (I experienced it myself)
    • In real life, we do not run the prerequisites during a maintenance window; only the real upgrade happens during a maintenance window. Asking for a maintenance window to run some prerequisites would mean there is a risk to the databases, which would trigger many questions from the process people such as "what can happen?", "how long?", "how do we roll back any modification if it is harmful?" . . . so many questions we cannot answer. Indeed, how would we reinstall the RPMs with dependency issues that -modify_at_prereq removed?
    • Would you remove a system RPM with all your databases running, outside of a maintenance window?
    • The documentation clearly says that the system RPMs will be removed during the upgrade regardless, so why would you want to remove them weeks before?
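
If the pre-check is launched through a site wrapper script, this rule can be enforced mechanically. A minimal sketch of such a guard, assuming a wrapper of your own around patchmgr; reject_modify_at_prereq is a hypothetical helper, only the -modify_at_prereq flag itself comes from patchmgr:

```shell
# Sketch: refuse to go further when -modify_at_prereq is among the arguments.
# Intended for a home-made wrapper around patchmgr; the function name is illustrative.
reject_modify_at_prereq() {
    local arg
    for arg in "$@"; do
        if [ "$arg" = "-modify_at_prereq" ]; then
            echo "refusing: -modify_at_prereq removes system RPMs during the pre-check" >&2
            return 1
        fi
    done
    return 0
}
```

A wrapper would then call something like `reject_modify_at_prereq "$@" && ./patchmgr "$@"`, so that a pre-check launched weeks ahead of the upgrade cannot remove anything by accident.
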


    3/ So . . . how to cope with OS dependency issues?

    Knowing what we now know about -modify_at_prereq from the previous paragraph, how should we manage the failed dependencies? Here is the way to go:

    - For custom RPMs:

    The safer option is to manually remove them before starting the upgrade and reinstall them after the upgrade. Having said that, I would nuance this (from my own experience and how I proceed, nothing official here):
    • for harmless custom RPMs (like screen or those shown in my first example), don't touch them, there will be no issue
    • for more complex RPMs like a third-party tool, remove it before the upgrade and reinstall it after the upgrade to be safe (but depending on the tool, I would bet on having no issue if you don't remove it; try it on a test system first)
    Note: I once saw patchmgr refuse to start the upgrade because of custom RPMs; they were old Red Hat 5 package leftovers from when we upgraded the Exadata to Red Hat 6. As they were not Red Hat 6 packages, they were considered custom even though they came from the previous system, and we had to delete them manually before the upgrade. It was on an X2, and I suspect Oracle does not deeply test the bundles on old X2s, hence the RH 5 leftovers after the upgrade to RH 6.

    - For system RPMs:

    • It is written in the documentation: patchmgr will take care of them during the upgrade so... just do nothing!

    The most attentive of you may wonder about the "If you want to be sure to rely on precheck output" part of the "If you want to be sure to rely on precheck output then run precheck with the additional flag . . ." sentence. It sounds as if no other prerequisite could be checked once patchmgr finds an RPM dependency issue, but this is not the case: you will also be warned about a space issue or any other issue on top of the RPM dependency error messages. Oracle Support confirmed this when I asked about this specific point.


    4/ What does my badly failed example look like now?

    Now comes the time for a test; have a look at the prerequisites output shown in paragraph 2 and answer this question:
    "How bad does this prerequisites output look?"
    If your answer is:
    "it looks perfect, we can proceed straight away, no need to do anything here, this patch will be a walk in the park!"
    Then you now know everything related to the -modify_at_prereq option, congratulations!
    Indeed, the custom RPMs won't do any harm and patchmgr will take care of the system ones!

    As usual on this blog, this example comes from real production life: this specific prerequisites output is from a very important platform where the client can accept a 10-minute outage at most when patching (meaning there is no room for trying things), and I executed patchmgr successfully to patch the database servers with this exact first output, without removing any RPMs. And . . . I still work for this client, meaning it was indeed a walk in the park :)


    5/ Be careful though

    Even if there is usually nothing to modify before starting the real database servers upgrade, watch failed prerequisites outputs closely and have a look at all the logfiles (the logfiles of failed prerequisites are located on the server where they failed, not in the directory where you started ./patchmgr -dbnodes ~/dbs_group -precheck) to ensure that:
    • The custom RPMs are harmless
    • The system RPMs shown look like system ones (*exadata*, *sun*, *lib*, *oracle*, . . .)
    • There are no other issues shown in the error stack (you will soon become so used to this kind of output that you may overlook another issue raised by these prerequisites, a space issue for example)
    It is really worth spending time checking these prerequisites outputs closely. This is key to smooth maintenance.
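
Two of these checks lend themselves to a quick filter over a saved pre-check output. A minimal sketch, assuming the output format shown earlier in this article; both function names are hypothetical, and the name patterns are only the rough heuristic listed above, not an official classification:

```shell
# Sketch: helpers to review a saved patchmgr pre-check output.
# Both helpers are illustrative; they are not part of patchmgr or dbnodeupdate.sh.

# Succeeds when a package name matches the rough "system" patterns from the list above.
looks_like_system_rpm() {
    case "$1" in
        *exadata*|*sun*|*lib*|*oracle*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the ERROR lines that are NOT part of the usual RPM dependency stack,
# so that an unrelated problem is not overlooked among the familiar messages.
other_precheck_errors() {
    # stdin: the saved pre-check output
    grep 'ERROR' | grep -vE 'Found dependency issues|ERROR: Package: |Consult file .*minimum_conflict_report'
}
```

Anything printed by other_precheck_errors deserves a closer look before the maintenance window, and any failing package for which looks_like_system_rpm fails should be reviewed by hand.
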


    Enjoy !

    4 comments:

    1. Perfect, thank you for your efforts. I have one question, please:
      - What do you mean by the following?
      "have a look at all the logfiles (the logfiles of failed prerequisites are located on the server where they failed, not in the directory where you started ./patchmgr -dbnodes ~/dbs_group -precheck)"
      Which logs do you mean, and where can I find them?

      Replies
      1. Hi,

        Sorry if this is unclear; it means that the logfiles are not located in the patchmgr directory but on the nodes themselves.

        Let's say you start the DB nodes prerequisites from cel01; patchmgr will then start the prerequisites on all the DB nodes in parallel; if they fail, you will find the logfiles on the DB nodes, not on cel01. Please have a look at this error stack (look at the lines marked with "==>"):

        **************************************************************
        SUMMARY OF WARNINGS AND ERRORS FOR exadata03:

        exadata03: # The following file lists the commands that would have been executed for removing rpms when specifying -M flag. #
        ==> exadata03: # File: /var/log/cellos/nomodify_results.040618194242.sh. #
        exadata03: ERROR: Found dependency issues during pre-check. Packages failing:
        exadata03: ERROR: Package: exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch (Fails because of required removal of Exadata rpms)
        exadata03: ERROR: Package: oracle-ofed-release-1.0.0-23.el6.x86_64
        ==> exadata03: ERROR: Consult file exadata03:/var/log/cellos/minimum_conflict_report.040618194242.txt for more information on the dependencies failing and for next steps.
        exadata03: The following known issues will be checked for but require manual follow-up:
        exadata03: (*) - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12


        2018-06-04 19:48:12 -0500 :ERROR : DONE: dbnodeupdate.sh precheck on exadata03

        SUMMARY OF WARNINGS AND ERRORS FOR exadata04:

        exadata04: # The following file lists the commands that would have been executed for removing rpms when specifying -M flag. #
        ==> exadata04: # File: /var/log/cellos/nomodify_results.040618194242.sh. #
        exadata04: ERROR: Found dependency issues during pre-check. Packages failing:
        exadata04: ERROR: Package: exadata-sun-computenode-exact-12.2.1.1.7.180506-1.noarch (Fails because of required removal of Exadata rpms)
        exadata04: ERROR: Package: oracle-ofed-release-1.0.0-23.el6.x86_64
        ==> exadata04: ERROR: Consult file exadata04:/var/log/cellos/minimum_conflict_report.040618194242.txt for more information on the dependencies failing and for next steps.
        exadata04: The following known issues will be checked for but require manual follow-up:
        exadata04: (*) - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12
        **********************************************************************

        I hope this clarifies things,

        Fred

    2. This comment has been removed by a blog administrator.

    3. If you use dbnodeupdate.sh to patch, you can use the "-M -R" options during the patching/upgrading step (not the precheck step!). They will remove any conflicting RPM, system or custom.
      Currently there is no similar option for patchmgr in the patching/upgrading step...
