It does not seem possible to me to write an exhaustive blog about troubleshooting an Exadata patching session gone bad (or it would be incredibly pretentious). Indeed, an Exadata stack is a complete and complex mix of software and hardware which can, on top of that, be configured very differently depending on each company's needs, norms, compliance rules, etc.
The best way, in my humble opinion, to be able to efficiently troubleshoot a failure during an Exadata patching session is to:
- Know vi and grep to check the logfiles :)
- Have the full picture of the Exadata patching procedure
- Keep in mind that, with rolling patches, even a crash or a non-responsive server does not impact the uptime of the applications as everything is (at least) redundant, so take your time to troubleshoot and stay cool (as a cucumber)
- Have access to well documented procedures from the real life which help set up and/or manage Exadata components; indeed, the main reasons for problems during Exadata patching sessions are:
  - Failed pre-requisites:
    - Due to hardware alerts; open an SR to have the failed hardware fixed (a quick pre-check example follows this list)
    - Due to a misconfiguration of one of the components; you'll find the procedure below
  - A crash / timeout / non-responsive server during the patching:
    - A space issue, but these are usually detected by the pre-requisites
    - Usually, rebooting the server will be needed; you'll find the procedure below
    - If an SR has to be opened, an ILOM snapshot will be needed; you'll find the procedure below
- Again, this is not exhaustive and this is the beauty of it ! This is why this blog is a living blog and will be updated when new issues and solutions appear.
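Speaking of failed pre-requisites, a quick sanity pass over open hardware alerts and free space before (re)starting a session can save a lot of time; here is a minimal sketch, assuming root access on the servers (the grep filter and the mount points are illustrative, adapt them to your environment):

```
# On a storage cell: look for unresolved hardware alerts
cellcli -e list alerthistory | grep -i critical

# On a database node (dbmcli ships with the recent Exadata software versions)
dbmcli -e list alerthistory | grep -i critical

# Quick space check on the filesystems a patching session usually needs
df -h / /u01 /boot
```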
Troubleshooting a failed patch starts with checking the logfiles to get more information about the issue you just faced; below is a list of the most common logfiles (a quick way of scanning them is sketched right after the list):
- patchmgr.log -- the main patchmgr logfile
- patchmgr.trc -- a more detailed patchmgr output
- nodename.log -- the patchmgr output dedicated to that specific node
- /var/log/cellos/dbnodeupdate.log -- located on the node (not in the patchmgr directory), detailed log of the patch application on this specific node
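A minimal way of scanning these logfiles for the first sign of trouble (the grep patterns are illustrative, not exhaustive):

```
# From the patchmgr working directory: the first occurrences of trouble
grep -inE "error|fail|fatal" patchmgr.log | head -20

# A more verbose view, handy to follow a live run
tail -f patchmgr.trc

# On the node being patched: the dbnodeupdate details
grep -inE "error|fail" /var/log/cellos/dbnodeupdate.log | tail -50
```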
Each procedure listed below has been executed on real-life production Exadatas at least once (many have been used far more than once); a few of them are also sketched as command examples right after the list.
- ILOMs
- InfiniBand Switches ILOM -- stop, start, restart, status
- How to take an ILOM snapshot with the command line (example after this list)
- Set up / fix DNS configuration
- Set up / fix NTP configuration
- Change ILOM hostname
- Database nodes
- How to reboot a database server using its ILOM (same procedure applies for a storage server; example after this list)
- How to re-image an Exadata database server
- Reinstall a broken system RPM
- Repair a corrupted/broken RPM database (example after this list)
- Make your DB node blink ! (example after this list)
- dbnodeupdate.sh backup failed on one or more nodes
- Cells
- Shutdown or reboot a cell without impacting ASM (example after this list)
- How to reboot a storage cell using its ILOM (same procedure applies for a database node)
- Restart SSH on a storage cell with no SSH access
- How to re-image an Exadata cell storage server
- Reinstall a broken system RPM
- Repair a corrupted/broken RPM database
- Make your cell blink !
- Switches
- Patchmgr
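To give a flavour of the procedures listed above, here is the general shape of an ILOM snapshot taken from the command line; a minimal sketch, assuming you can SSH to the ILOM as root and that an SFTP server is reachable (hostnames, credentials and paths are illustrative):

```
# SSH to the ILOM of the affected server
ssh root@myserver-ilom

# From the ILOM prompt: collect a "normal" dataset and push it via SFTP
-> set /SP/diag/snapshot dataset=normal
-> set /SP/diag/snapshot dump_uri=sftp://user:password@mysftphost/tmp

# Check the progress until the collection shows as complete
-> show /SP/diag/snapshot
```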
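In the same spirit, rebooting a non-responsive database or storage server through its ILOM usually looks like the sketch below (the server name is illustrative; make sure ASM can tolerate the outage first, as shown in the cell example further down):

```
# SSH to the ILOM of the non-responsive server
ssh root@myserver-ilom

# Power cycle the host and watch the boot from the ILOM console
-> reset /SYS
-> start /SP/console

# ipmitool alternative from another node, if you prefer
ipmitool -I lanplus -H myserver-ilom -U root -P <ilom-password> chassis power cycle
```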
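The heart of the "shutdown or reboot a cell without impacting ASM" procedure is to verify that ASM can survive the grid disks going offline; a condensed sketch (run as root on the cell, and be patient with the resync at the end):

```
# 1. Check that ASM can cope with the cell going down: every row must say Yes
cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

# 2. Offline all the grid disks, then shut down (or reboot) the cell
cellcli -e alter griddisk all inactive
shutdown -h now

# 3. Once the cell is back, reactivate the grid disks ...
cellcli -e alter griddisk all active

# 4. ... and wait until they are all ONLINE again before touching the next cell
cellcli -e list griddisk attributes name,asmmodestatus
```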
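For the broken RPM and corrupted RPM database items, the classic sequence is worth keeping at hand; a sketch, assuming stale Berkeley DB lock files are the culprit (the package name is a placeholder, and back everything up first):

```
# Back up the RPM database before touching it
cp -a /var/lib/rpm /var/lib/rpm.backup.$(date +%Y%m%d)

# Remove the stale lock files and rebuild the database
rm -f /var/lib/rpm/__db.*
rpm --rebuilddb

# Sanity check, then force-reinstall the broken package if needed
rpm -qa | wc -l
rpm -Uvh --force the_broken_package.rpm
```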
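And the fun one to finish: making a DB node or a cell blink just toggles its locator LED, which is very handy for the field engineer who has to find the right server in a full datacenter; a sketch via the ILOM, with an ipmitool alternative:

```
# From the ILOM prompt
-> set /SYS/LOCATE value=Fast_Blink
-> set /SYS/LOCATE value=Off

# Or from the host itself with ipmitool (blink for 255 seconds)
ipmitool chassis identify 255
```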
Hope it helps !