As I’m currently in a metal tube cruising at some 35,000 feet on my way to another enablement session, I figure I can put the time to good use and update all my Azure Stack Friends on where things stand on hardware P&U for your Dell EMC Cloud for Microsoft Azure Stack appliances (VxRACK AS). As a teaser, I’ve got some great news.
But before I get into what’s new, let’s review a little bit about P&U on Azure Stack and the various jiggly bits of the process. On the Azure Stack software front, updates arrive on a nearly monthly cadence through the Azure Stack Administrator Portal. The detailed process can be found here, but essentially the update is advertised for the Azure Stack Operator to download and apply via the portal interface. From an automation perspective, the draining and re-hydration of compute nodes and the application of updates are all handled by the Azure Stack software, and Microsoft provides tooling to monitor progress. (It is important to note that updates are characterized as “minimally disruptive”, and it is strongly suggested that they occur during planned maintenance windows, as service interruptions are possible depending on the nature of the update.) The process itself is fairly straightforward.
Great! But that leaves us with an incomplete picture of what actually goes into the care and feeding of an Azure Stack when it comes to P&U, and the hardware piece is an entirely different matter. First, let’s consider the components we are talking about, in general terms. Obviously, the compute nodes within the cluster are part of the equation. When we update these, they must follow a process similar to the Azure Stack software updates in order to avoid unnecessary disruption to the Azure Stack services. It follows that we must put nodes into maintenance mode one at a time, update each one, bring it back into service, validate success, and then rinse and repeat for every node in the cluster. This requires a high degree of interaction and monitoring by an Operator, with multiple manual steps to bring nodes in and out of maintenance mode. While this is time consuming, at first glance it may not seem particularly arduous. But it is a highly repetitive task with many steps, and one that, if done incorrectly, can lead to unanticipated downtime, configuration drift, and generally less than desirable results. For instance, what if I mistime taking nodes down and end up with multiple nodes in maintenance mode simultaneously, or what if steps are inadvertently skipped?
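To make the one-node-at-a-time requirement concrete, here is a minimal Python sketch of that loop. It is not the Dell EMC tooling or any Azure Stack API: the Node class, rolling_update, and the apply_update and validate callables are all hypothetical stand-ins for whatever actually drains, flashes, and checks a node. It only illustrates the two guards described above, namely never having two nodes in maintenance mode at once and never skipping validation.

```python
from typing import Callable, List, Sequence


class Node:
    """Hypothetical stand-in for a compute node (not a real Azure Stack object)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.in_maintenance = False
        self.firmware = "old"


def rolling_update(
    nodes: Sequence[Node],
    apply_update: Callable[[Node], None],
    validate: Callable[[Node], bool],
) -> List[str]:
    """Update nodes strictly one at a time and stop at the first failure."""
    updated: List[str] = []
    for node in nodes:
        # Guard against the "multiple nodes down at once" mistake.
        if any(other.in_maintenance for other in nodes):
            raise RuntimeError("another node is already in maintenance mode")

        node.in_maintenance = True   # drain the node
        apply_update(node)           # apply firmware/driver updates

        # A node that fails validation stays in maintenance mode, and the
        # run stops so the remaining nodes are left untouched.
        if not validate(node):
            raise RuntimeError(f"validation failed on {node.name}")

        node.in_maintenance = False  # return the node to service
        updated.append(node.name)
    return updated
```

Stopping at the first failed validation is a deliberate choice in this sketch: it leaves the rest of the cluster serving workloads rather than pressing on with an update that has already gone wrong once.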
Next, we have the Hardware Lifecycle Host (HLH). The HLH doesn’t sit within the data plane, and the Azure Stack software itself is blissfully unaware of its existence, so it requires a separate update process.
Finally, we have the ToR switching, which is handled by yet another entirely separate process.
The manual instructions for updating the hardware components of VxRACK AS can be found here.
All of this begs for automation and simplification. This has been a promise Dell EMC has committed to since early on in our Azure Stack program.
One final thought before I get into the “update on updates” portion. I’ve seen numerous questions about the cadence of hardware-level updates. The real answer is: when required. Updates will be provided as changes are needed to support new Azure Stack software and features, or as security needs dictate (think Spectre/Meltdown). This could mean back-to-back updates month over month, but that will not always be the case either. Part of your regular process needs to be validating the currently recommended hardware versions for your VxRACK AS at support.dell.com.
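As a rough illustration of that regular check, the Python sketch below compares what is installed against what is currently recommended and lists the components that are behind. The function names are hypothetical, and any component names or version numbers you feed it are your own inventory data, not values published by Dell EMC. It assumes purely numeric, dot-separated version strings.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version like '2.10.0' into (2, 10, 0)."""
    return tuple(int(part) for part in text.split("."))


def components_needing_update(
    installed: Dict[str, str], recommended: Dict[str, str]
) -> List[str]:
    """Return the components that are missing or older than recommended."""
    behind: List[str] = []
    for component, target in recommended.items():
        current = installed.get(component)
        if current is None or parse_version(current) < parse_version(target):
            behind.append(component)
    return sorted(behind)
```

Comparing the parsed tuples rather than the raw strings matters here: as plain text, "2.4.3" sorts after "2.10.0", which would hide an out-of-date component.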
Now for what you really came here for. I’m pleased to announce that the current release of the Dell EMC Cloud for Microsoft Azure Stack (1805) delivers a preview of our Automated Update tooling! It includes automation of the compute node updates as well as the HLH.
Now if I were you, I’d be saying, “Hold on Greg – tell me what you mean by ‘preview’.”
Glad you asked! For the time being, in order to get access to the tool you must open a support ticket with us (or work through your TAM to do so). This allows us to track who is leveraging the tool and allows us to work with you directly as you run through your initial updates via the automated process.
Now let’s take a glimpse at what the tooling and the process look like:
First, you will download the available updates and put them somewhere accessible by the tool (and remember checking for updates is going to be part of your SOP, right?).
You will then launch the tool provided by Dell EMC, which opens a window like the one below:
Upon clicking “Next” you will be queried to provide a set of information including the location of the downloaded update packages, IP ranges for the iDRACs on your nodes, and the appropriate credentials.
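Before typing an iDRAC range into any tool, it can help to sanity-check it against the number of nodes you expect. The Python sketch below expands a range into individual addresses using the standard ipaddress module. The "start-end" text format and the expand_ip_range name are assumptions made for illustration; the format the Dell EMC tool itself accepts is not described here.

```python
import ipaddress
from typing import List


def expand_ip_range(text: str) -> List[str]:
    """Expand 'A.B.C.D-A.B.C.E' (or a single address) into a list of IPv4 addresses."""
    start_text, separator, end_text = text.partition("-")
    start = ipaddress.IPv4Address(start_text.strip())
    # A value with no '-' is treated as a one-address range.
    end = ipaddress.IPv4Address(end_text.strip()) if separator else start
    if end < start:
        raise ValueError("range end precedes range start")
    return [str(ipaddress.IPv4Address(value)) for value in range(int(start), int(end) + 1)]
```

If the length of the returned list does not match the number of nodes in the scale unit, the range is worth a second look before starting an update run that takes hours.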
The HLH and Azure Stack compute nodes can be scheduled together or done individually with a simple check box depending on your requirements:
Note: If updating the firmware on the HLH, you will be prompted to suspend BitLocker on the HLH.
From here, I click to initiate the update process, and the tool runs through it, including bringing the nodes in and out of maintenance mode, validating success, and providing status.
The entire update process will take some time to complete (think multiple hours), but no further Operator interaction is required until the updates are completed. The net of this is that total Operator interaction on a successful update is a matter of minutes. There is no series of repetitive tasks to perform, each one adding risk of human error, outages, or configuration drift.
So what next? As I noted above the current tooling updates the HLH and compute nodes, but as you’ll recall I mentioned an additional set of componentry in my review above – switches. Automated updating of switch firmware is a current roadmap item and should be available in the tooling in the near term.
If you are a VxRACK AS customer, I’d encourage you to reach out to our support teams to get working with the update tool. It will greatly streamline your process and leave you more time to focus on high value activities. If you aren’t a VxRACK AS customer and are interested in checking out the value we bring, we’d love to talk to you!