With AOS 5.0, Nutanix introduced the Life Cycle Manager (LCM) feature, which helps you track and upgrade the software and firmware versions of all entities in the cluster. LCM is supported on all Nutanix NX and SX platforms. Be aware, however, that there are limitations when LACP is enabled. I learned that the hard way!
The Life Cycle Manager is accessible through the Prism interface and basically performs two functions: taking inventory of the cluster and performing updates on the cluster. Before performing an update, LCM runs a pre-check to verify the state of the cluster. If the check fails, the update operation is aborted. The LCM framework can also update itself when necessary.
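The pre-check behaviour described above can be pictured as a simple gate in front of the update. The following is only a minimal sketch of that idea, not Nutanix code: `run_update`, `PreCheckFailed`, and the check functions are hypothetical names made up for illustration.

```python
from typing import Callable, Iterable


class PreCheckFailed(Exception):
    """Raised when a pre-check reports an unhealthy cluster state."""


def run_update(pre_checks: Iterable[Callable[[], bool]],
               apply_update: Callable[[], None]) -> None:
    """Run all pre-checks, then apply the update only if every check passed."""
    for check in pre_checks:
        if not check():
            # Abort before anything on the cluster is touched.
            raise PreCheckFailed(check.__name__)
    apply_update()


if __name__ == "__main__":
    def cluster_is_healthy() -> bool:  # hypothetical check
        return True

    run_update([cluster_is_healthy], lambda: print("update applied"))
```

The point of the design is that a failed check stops the operation before the update step runs at all, which matches the "abort on failure" behaviour described in the text.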
Taking Inventory with the Life Cycle Manager
In Prism, click the gear icon and select Life Cycle Management from the drop-down menu.
On the Life Cycle Manager page, click Inventory.
Click Options > Perform Inventory and wait until the LCM displays all discovered entities.
For details about any entity, click See All. The entity shows the current version, as well as the date and time of the most recent update.
Click All Entities to return to LCM. To close LCM and return to the web console, click the X in the upper right corner.
Performing Updates With the Life Cycle Manager
Click Available Updates.
To specify the location where LCM should look for updates click Options > Advanced Settings.
The Fetch updates from field auto-populates with the URL where LCM will look for updates. To change the location, enter a different address in the Fetch updates from field and click Save.
For each displayed component, you can click Define to see available updates for that component.
If you want to perform only some updates and not others, select the button in the upper left of each update to let Prism know which updates it should execute. If you do not select any updates, Prism assumes you want to perform them all.
From the Options drop-down menu, perform your updates.
To perform all available updates, select Update All. To perform only required updates, select Update Required. To perform only the updates you have selected, select Update Selected. (If you have not selected any updates, this has the same effect as Update All.)
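To make the three options easier to compare, here is a small sketch of the selection rules as described above. It is an illustration only, not how Prism implements them: the `Update` type and the `updates_to_run` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Update:
    component: str
    required: bool = False


def updates_to_run(option: str,
                   available: Sequence[Update],
                   selected: Sequence[Update] = ()) -> List[Update]:
    """Return the updates that would run for a given Options menu entry."""
    if option == "Update All":
        return list(available)
    if option == "Update Required":
        return [u for u in available if u.required]
    if option == "Update Selected":
        # With nothing selected, this behaves like Update All.
        if not selected:
            return list(available)
        return [u for u in available if u in selected]
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    bios = Update("BIOS", required=True)
    bmc = Update("BMC")
    print(updates_to_run("Update Selected", [bios, bmc], [bmc]))
```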
During my first upgrades through Life Cycle Manager I was not aware of the following limitation – I simply didn’t read the release notes:
LCM updates for BIOS, BMC, and SATA DOM are not currently supported on Nutanix clusters that use link-bundling protocols such as LACP or EtherChannel. You must disable any such protocol in order to perform LCM updates. (ENG-74790)
Well, things went badly: my first node didn’t come back online after the reboot and was stuck in maintenance mode. Thankfully, the awesome support from a Nutanix SRE solved my self-inflicted issue within a couple of minutes and saved my weekend. Thank you, Serdar!
After posting in the Nutanix NTC Slack Channel about the issue I faced, fellow Nutanix Technology Champion (NTC) and Nutanix Platform Expert (NPX) Samuel Rothenbühler pointed me to a simple workaround for this limitation: If your switch supports this, enable LACP fallback mode on the LAGs connected to your cluster.
The underlying reason LCM updates for BIOS, BMC, and SATA DOM are not supported is that, during these updates, the nodes boot into the Phoenix ISO, and Phoenix does not currently support LACP. Nutanix is actively looking into this and working on a way to detect LACP configurations during LCM pre-checks, and hopefully Phoenix will support LACP in the future.
Until LCM supports LACP configurations, LACP fallback mode allows Phoenix to connect to the network without the Link Aggregation Control Protocol (LACP). Keep in mind that if your switch does not support this feature, you can always perform the upgrades manually through IPMI. Details about the manual procedures can be found in the Nutanix BIOS Manual Upgrade Guide and the Nutanix BMC Manual Upgrade Guide.
Last but not least, LCM is not supported on single-node clusters either, because firmware updates usually require services to be stopped, and in a single-node cluster there is no other node to take over the workload from the node being updated. The manual upgrade path is the only way to perform the upgrades on single-node clusters.
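The limitations discussed in this post can be summed up as a small decision helper. This is a sketch based on my reading of the release note and the workaround above, not a Nutanix tool: `Cluster`, `upgrade_path`, and the field names are hypothetical, and the sketch assumes that components other than BIOS, BMC, and SATA DOM are not affected by the link-bundling limitation.

```python
from dataclasses import dataclass

# Firmware updates that boot the node into Phoenix, per the release note.
PHOENIX_UPDATES = {"BIOS", "BMC", "SATA DOM"}


@dataclass
class Cluster:
    node_count: int
    uses_link_bundling: bool          # LACP or EtherChannel on the uplinks
    lacp_fallback_enabled: bool = False


def upgrade_path(cluster: Cluster, component: str) -> str:
    """Return "lcm" or "manual" for the given component on this cluster."""
    if cluster.node_count == 1:
        # No other node can take over the workload during the update.
        return "manual"
    if (component in PHOENIX_UPDATES
            and cluster.uses_link_bundling
            and not cluster.lacp_fallback_enabled):
        # Alternatively, disable link bundling or enable LACP fallback first.
        return "manual"
    return "lcm"


if __name__ == "__main__":
    my_cluster = Cluster(node_count=4, uses_link_bundling=True)
    print(upgrade_path(my_cluster, "BIOS"))
```

Had I run a check like this before my first upgrade, the BIOS update on my LACP-enabled cluster would have been flagged before the node rebooted into Phoenix.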
If you want to know more about Nutanix Lifecycle Management, there is an interesting blog article in the Nutanix Next Community (AOS 5.0 New Feature: Life Cycle Management), including the following demo video, authored by Deepa Pottangadi, Instructional Designer; Nikhil Bhatia, Staff Engineer; Manoj Sudheendra, Member Technical Staff; and Viswanathan Vaidyanathan, Member Technical Staff at Nutanix.