A Best Practice approach to updating Hyper-V environments
Updating environments with Hyper-V can be more of a challenge than updating an environment that consists only of physical servers. Not only do the workloads need regular updating, but so do the Windows servers and Hyper-V servers underneath them.
Hyper-V relies on a Parent Partition, whether you're using a Full installation of Windows Server 2008, a Server Core installation of Windows Server 2008 or the stand-alone Hyper-V Server. When you restart the Parent Partition, your Child Partitions become unavailable as well. So how do you plan your maintenance window?
Updates can result in loss of functionality. Even though updates get tested thoroughly, there is a chance that a series or combination of updates, or an incompatibility with a third party application or service, hangs your server or results in unexpected behavior. When you install several updates at once, it's hard to troubleshoot these kinds of situations: which update caused the problem?
Some updates address security holes and require immediate installation in some situations: there, the risk of getting compromised outweighs the risk of breaking stuff.
A Best Practice approach
Within a good design for any environment, a distinction is made between physical and virtual machines, safe and unsafe(r) networks, and application, messaging, directory, database and security services.
The Windows Server System Reference Architecture (WSSRA) comes to mind. The basis of this architecture is to unravel an environment into five layers (network, storage, application, management and security) and to supply guidance for meeting the requirements of an enterprise. The purpose of this guidance is to build a highly available, secure, scalable, manageable and reliable enterprise infrastructure.
The same architectural approach can also be applied to virtual environments. The logical division would be virtual infrastructure, hosts and workloads. In a Microsoft server environment this would mean a divide between:
|Level ||Description ||Examples |
|1 ||Virtual infrastructure ||Windows Server 2008 with Hyper-V, Hyper-V Server 2008 |
|2 ||Hosts ||Windows 2000 Server, Windows Server 2003, Windows Server 2008 |
|3 ||Workloads ||Exchange Server 2007, SQL Server 2008, Terminal Services Applications (like Office 2007) |
In a reference architecture, patching would serve three goals:
- Patched systems
- Predictable downtime during maintenance windows
- Possibilities for investigation of relationships between patches and loss of functionality / availability
Formulating best practices
Not all updates need to be applied immediately
Software products need to be patched to stay secure and functional. Depending on your situation, not every patch is equally important. When the main focus for some systems is security, you need to apply all security updates immediately. When your systems perform loads of transactions with other countries, you'd better apply all Daylight Saving Time (DST) patches promptly. Otherwise you can delay applying the updates a little while.
Microsoft offers three levels of updates: Important, Recommended and Optional.
Decide for yourself which updates need to be applied and when they need to be applied.
Test or delay updates
You'd better test updates in a test environment when systems are mission critical. The dependency on these systems usually justifies the cost of a test environment. When you don't have a test environment, wait at least until the third Tuesday of every month (a week after Patch Tuesday) and search online for any signs of updates breaking functionality or availability.
Virtualization offers flexible means to test updates. Snapshot functionality even allows a rollback scenario for updates. Remember, though, that problems may occur on physical machines that you might not experience in virtual machines.
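The "wait a week after Patch Tuesday" rule is easy to compute. Below is a minimal Python sketch, assuming Patch Tuesday is the second Tuesday of the month and a delay of seven days; the function names are my own and purely illustrative.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() counts Monday as 0


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th (1-based) occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first occurrence
    return first + timedelta(days=offset + 7 * (n - 1))


def patch_tuesday(year: int, month: int) -> date:
    """Second Tuesday of the month."""
    return nth_weekday(year, month, TUESDAY, 2)


def earliest_untested_install(year: int, month: int, delay_days: int = 7) -> date:
    """First day to install when you delay instead of test (assumed: 7 days)."""
    return patch_tuesday(year, month) + timedelta(days=delay_days)


print(patch_tuesday(2008, 10))              # 2008-10-14
print(earliest_untested_install(2008, 10))  # 2008-10-21
```

Out-of-band updates fall outside this rhythm, so they still need a decision of their own.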
Automatically applying updates
Windows offers functionality to apply updates automatically. By default, updates will be applied at night, around 3:00 AM. This may not be an ideal method to apply updates:
- A branch office on the other side of the world might be using the system at that time
- The updates might be applied during backup, defragmentation or other maintenance
Furthermore, this setting doesn't offer much control. In a small environment without a dedicated systems manager the setting makes sense, but in a large environment it does not.
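To see whether the 3:00 AM default hits a branch office during working hours, you can convert the install time to each office's local time. The Python sketch below is only an illustration: the offices, the fixed UTC offsets and the 8:00 to 18:00 working day are assumptions of mine, and a real check should use named time zones so that daylight saving time is taken into account.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets, for illustration only (no daylight saving handling).
HEAD_OFFICE = timezone(timedelta(hours=1))
BRANCHES = {
    "Sydney": timezone(timedelta(hours=11)),
    "New York": timezone(timedelta(hours=-5)),
}


def offices_disturbed(install_at, branches, work_start=8, work_end=18):
    """Return (office, local time) for offices that are at work during the install."""
    hit = []
    for name, tz in branches.items():
        local = install_at.astimezone(tz)
        is_weekday = local.weekday() < 5  # Monday to Friday
        if is_weekday and work_start <= local.hour < work_end:
            hit.append((name, local.strftime("%H:%M")))
    return hit


# The default install time on the server: Wednesday 15 October 2008, 3:00 AM.
install = datetime(2008, 10, 15, 3, 0, tzinfo=HEAD_OFFICE)
print(offices_disturbed(install, BRANCHES))  # [('Sydney', '13:00')]
```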
Windows Server Update Services
Windows Server Update Services (WSUS) is a means to gain control over updates and over when your servers (or their services) restart to apply them. Using Organizational Units and Group Policy Objects (GPOs) you can divide servers into logical groups. For each group you can then set the Microsoft products for which to apply updates, when to apply updates and whether to restart automatically.
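As a rough illustration of what such a GPO ends up setting on a server, the Python sketch below builds the Windows Update policy values for one group of servers. It only builds a dictionary and writes nothing to the registry. The value names are the documented Windows Update policy values under HKLM\Software\Policies\Microsoft\Windows; the server URL, the group name and the helper function are hypothetical.

```python
# ScheduledInstallDay codes: 0 = every day, 1 = Sunday ... 7 = Saturday.
DAY_CODES = {
    "every day": 0, "sunday": 1, "monday": 2, "tuesday": 3,
    "wednesday": 4, "thursday": 5, "friday": 6, "saturday": 7,
}


def client_policy(wsus_url, target_group, day, hour, auto_install=True):
    """Build the policy values for one group of servers (illustrative helper)."""
    if day.lower() not in DAY_CODES:
        raise ValueError(f"unknown day: {day}")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return {
        r"WindowsUpdate": {
            "WUServer": wsus_url,
            "WUStatusServer": wsus_url,
            "TargetGroupEnabled": 1,      # client-side targeting
            "TargetGroup": target_group,
        },
        r"WindowsUpdate\AU": {
            "UseWUServer": 1,
            # 4 = download and install on schedule, 3 = download and notify
            "AUOptions": 4 if auto_install else 3,
            "ScheduledInstallDay": DAY_CODES[day.lower()],
            "ScheduledInstallTime": hour,  # hour of the day, 0-23
            "NoAutoRebootWithLoggedOnUsers": 1,
        },
    }


# Hypothetical WSUS server and group name.
policy = client_policy("http://wsus.example.local", "Hyper-V hosts", "Friday", 18)
print(policy[r"WindowsUpdate\AU"])
```

The schedule in these values is weekly at best. To limit a group to one window per month, I would approve updates for that group in WSUS only shortly before its window.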
Optionally you can distribute 3rd party applications and updates through Windows Server Update Services (WSUS) by using the Local Publishing feature in the WSUS 3.0 SP1 API.
Even more control can be obtained using System Center Configuration Manager. The WSUS server integration with Configuration Manager 2007 allows you to scan all clients in the organization and apply the updates.
End users don’t like to be confronted with downtime, but if they are, they prefer it to be announced in advance and to occur with a fair amount of regularity. An IT department that arranges a default maintenance window on Friday from 18:00 to 21:00 will receive fewer complaints, fewer questions and less frustration from end users than an IT department that organizes maintenance windows irregularly. Good candidates for maintenance windows are:
- The company’s weekly happy hour
- A department’s weekly birthday cake eating hour
- Lunch time
Rogue Patch investigation
A critical element in updating your Microsoft environment is investigating which update (if any) was responsible for which broken functionality. This element is more important in virtualized environments than in physical environments, since a rogue patch on the Windows Server in the virtualization layer may cause serious problems for all virtual guests residing on the box.
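Keeping a simple record of what was installed in which maintenance window makes this investigation a lot shorter. The Python sketch below is a minimal illustration; the record layout, the layer names and the update labels are placeholders of mine, not real update identifiers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Window:
    day: date       # date of the maintenance window
    layer: str      # which layer was patched
    updates: tuple  # labels of the updates installed in that window


def suspects(log, problem_seen):
    """Return the most recent window on or before the day the problem appeared."""
    earlier = [w for w in log if w.day <= problem_seen]
    if not earlier:
        return None
    return max(earlier, key=lambda w: w.day)


# Placeholder log entries, for illustration only.
log = [
    Window(date(2008, 10, 3), "virtualization layer", ("update-A", "update-B")),
    Window(date(2008, 10, 10), "guest operating systems", ("update-C",)),
    Window(date(2008, 10, 17), "workloads", ("update-D", "update-E")),
]

first_look = suspects(log, date(2008, 10, 12))
print(first_look.layer, first_look.updates)
```

The fewer updates and layers a single window contains, the shorter the list of suspects.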
In combination with the suggestion of having a maintenance window every week, I suggest updating per logical layer (virtual environment, virtual hosts, workloads). For instance, this would result in a maintenance window for the virtual environment (where all virtual guests go down temporarily when the virtual host reboots) every first Friday of the month, a maintenance window for all virtual Operating Systems every second Friday of the month and a maintenance window for workloads running in the virtual guests (for instance Microsoft Exchange Server and Microsoft SQL Server) every third Friday of the month. One whole maintenance window remains for maintenance on the Storage Area Network (SAN), the network, etc.
Depending on your environment, you’d place your most critical layer on the second Friday of the month, after you’ve tested the updates, since Microsoft releases updates every second Tuesday of the month (except out-of-band updates). When you delay your updates (for lack of testing), place your most critical layer on the third Friday of the month.
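The Friday rotation can be laid out for any month with a few lines of Python. This is a sketch of the example schedule above, not a tool: the layer labels and function names are mine, and Patch Tuesday is assumed to be the second Tuesday of the month.

```python
from datetime import date, timedelta

TUESDAY, FRIDAY = 1, 4  # date.weekday() counts Monday as 0

# Which layer gets which Friday, following the example rotation.
LAYER_BY_FRIDAY = {
    1: "virtual environment",
    2: "virtual Operating Systems",
    3: "workloads",
    4: "SAN and network",
}


def nth_weekday(year, month, weekday, n):
    """Return the n-th (1-based) occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def maintenance_plan(year, month):
    """Return (window date, layer, falls after Patch Tuesday) per Friday."""
    patch_day = nth_weekday(year, month, TUESDAY, 2)
    plan = []
    for n, layer in LAYER_BY_FRIDAY.items():
        window = nth_weekday(year, month, FRIDAY, n)
        plan.append((window, layer, window > patch_day))
    return plan


for window, layer, after_patch_day in maintenance_plan(2008, 10):
    note = "after Patch Tuesday" if after_patch_day else "before Patch Tuesday"
    print(window.isoformat(), layer, f"({note})")
```

One thing this makes visible: in months that start on a Wednesday, Thursday or Friday, the second Friday falls before the second Tuesday. October 2008 is such a month, so the second Friday window there would carry the previous month's updates.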
Creating a snapshot in Hyper-V before applying updates allows you to roll back updates in case of broken functionality. When everything’s fine, you can ‘flatten’ the snapshot by deleting it, shutting down the virtual machine and allowing sufficient time for the disk changes to be merged into the main VHD.
Using snapshots may not be a good idea in combination with certain workloads (read: Active Directory Domain Controllers) or availability needs (with large updates the virtual machine may need to be off for a long period of time).
Below are five of my best practices for updating virtual environments. They help you control the updates to your virtual server environment, control the downtime and address issues with rogue updates:
- Distinguish a virtualization layer, a virtual guests layer and a workload layer. Plan an update strategy per layer.
- Don’t install updates automatically unless it makes sense (it rarely does).
- Use Windows Server Update Services whenever possible.
- Test or delay updates.
- Plan maintenance windows.