Virtualizing domain controllers used to be a big no-no…

So, I’ve been away from blogging for a while. Work has kept me busy during the day and my little daughter, who arrived last year, has done the same at night 🙂

But for this post I’d like to tackle the issue of virtualizing a domain controller.

In the past this could have rather unforeseen and dramatic consequences if you were not aware of how to do it properly. Most businesses I’ve visited since virtualization began to ramp up have virtualized as many servers as possible, and what better candidates are there than domain controllers, which most of the time don’t use the resources allocated to them?
Most of these businesses have also adapted their backup solutions to the new scenario, and one of the most common approaches is based on some form of virtual machine snapshotting.
Well, that works okay for a file server, print server or other applications that tolerate a crash-consistent restore. But what happens if you restore a domain controller from a snapshot?

Let’s start with some background on how domain controllers work. As we all know, they run Active Directory, and each one holds a copy of the Active Directory database. Each copy of the database is assigned a value known as an InvocationID, and every update (that does not require the involvement of an FSMO role holder) is made against the local database and stamped with the next Update Sequence Number (USN) before being replicated to the domain controller’s partners. Together, the InvocationID and the USN uniquely identify a change from that database, and replication partners use them to determine which updates they still need to process.

Now, take the scenario from before where a domain controller is restored from a snapshot. What happens then?
Well, the restored domain controller doesn’t know that it has been rolled back, so it will attempt to sync with its partners as usual. That is where the problems start: the USNs it now hands out have already been acknowledged by the other domain controllers, so they see nothing new and the changes are never replicated.
And since changes are still made against the local Active Directory database, the domain controllers drift further and further out of sync over time. An example (at the mild end) is a user changing his/her password on the restored domain controller. The change doesn’t sync to the others, so you then have one password on some systems and another on others, depending on which DC they validate against.
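To make the rollback problem concrete, here is a minimal sketch in Python. It is a toy model and not how Active Directory replication is actually implemented: the `DomainController` class, its `write` and `pull_from` methods and the change descriptions are all made up for illustration. The only idea it borrows from the text is that a partner remembers the highest USN it has seen per InvocationID and only asks for anything above that.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Toy model of a DC: an InvocationID, a USN counter and local changes."""
    name: str
    invocation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    highest_usn: int = 0
    changes: dict = field(default_factory=dict)    # usn -> description
    seen_from: dict = field(default_factory=dict)  # partner InvocationID -> highest USN seen

    def write(self, description):
        """Originating write: stamp the change with the next local USN."""
        self.highest_usn += 1
        self.changes[self.highest_usn] = description

    def pull_from(self, partner):
        """Fetch only the partner's changes above the USN already seen for its InvocationID."""
        watermark = self.seen_from.get(partner.invocation_id, 0)
        new_usns = sorted(usn for usn in partner.changes if usn > watermark)
        if new_usns:
            self.seen_from[partner.invocation_id] = new_usns[-1]
        return [partner.changes[usn] for usn in new_usns]


dc1 = DomainController("DC1")
dc2 = DomainController("DC2")

for i in range(1, 4):
    dc1.write(f"change {i}")
snapshot = (dc1.highest_usn, dict(dc1.changes))  # snapshot taken at USN 3

dc1.write("change 4")
dc1.write("change 5")
first_sync = dc2.pull_from(dc1)  # DC2 has now seen USN 1 to 5 from DC1

# Restore the snapshot: the USN counter goes back to 3, the InvocationID stays the same.
dc1.highest_usn, dc1.changes = snapshot[0], dict(snapshot[1])
dc1.write("password change")     # re-uses USN 4
missed = dc2.pull_from(dc1)      # DC2 only asks for USNs above 5

print(first_sync)
print(missed)
```

In this model the second pull returns an empty list: DC2 has already seen USN 5 from that InvocationID, so the password change stamped with the re-used USN 4 never leaves DC1.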

This is really not a desired scenario, but thankfully Microsoft has done something about this…

In Windows Server 2012 Microsoft made the domain controller virtualization-aware (on supported hypervisors, that is). This means that the domain controller now knows it is virtual and can take steps to prevent the scenario described above.
So how is this done…

Well, Microsoft introduced a new identifier, the VM-Generation ID, which a supported hypervisor exposes to the virtual machine. The domain controller stores a copy of it in Active Directory, in the msDS-GenerationId attribute on its own domain controller object.
The Windows operating system then compares the value presented by the hypervisor with the value the server has stored locally. If they do not match, the domain controller assumes it has been restored and takes action to prevent the scenario above.
In that case the domain controller resets its InvocationID, which prevents the re-use of USNs, and discards its issued RID pool, which prevents duplicate security identifiers from being handed out. It then marks its local SYSVOL share for a non-authoritative restore.
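The check and the three safeguards can be sketched roughly like this in Python. It is only an illustration of the logic described above, under my own assumptions: the `VirtualDC` class, the `check_generation_id` method, the RID range and the "gen-A"/"gen-B" values are invented for the example and are not real Windows APIs or values.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualDC:
    """Toy model of a virtualized DC and the state the safeguards touch."""
    stored_generation_id: str                        # copy kept in the local AD database
    invocation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    rid_pool: Optional[range] = range(1100, 1600)    # hypothetical issued RID pool
    sysvol_nonauthoritative: bool = False

    def check_generation_id(self, hypervisor_generation_id):
        """Compare the hypervisor's value with the stored one; return True if safeguards ran."""
        if hypervisor_generation_id == self.stored_generation_id:
            return False                             # normal boot, nothing to do
        self.invocation_id = str(uuid.uuid4())       # new InvocationID: old USNs are not re-used
        self.rid_pool = None                         # discard the RID pool, a new one must be requested
        self.sysvol_nonauthoritative = True          # SYSVOL is re-synced from a partner
        self.stored_generation_id = hypervisor_generation_id
        return True


dc = VirtualDC(stored_generation_id="gen-A")
old_invocation_id = dc.invocation_id

normal_boot = dc.check_generation_id("gen-A")    # IDs match, no action
after_restore = dc.check_generation_id("gen-B")  # hypervisor reports a new ID after a snapshot restore

print(normal_boot, after_restore, dc.invocation_id != old_invocation_id)
```

The design point is that the domain controller never has to detect the rollback from its own data. It only needs the hypervisor to present a different identifier whenever the virtual machine has been reverted.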

This picture shows how the process works:

Virtualization Safeguards During Normal Boot

The requirements for using this cool new feature are the following:

  • Supported hypervisor platform (Hyper-V 2012 or newer, vSphere 5.02, ESXi 5.1)
  • Windows Server 2012 domain controller
  • PDC emulator running Windows Server 2012

 

VMware – possible data corruption in virtual machine…

I came across this article on the VMware support forums, and even though I haven’t encountered the error myself I thought I’d post it anyway so that as many people as possible get this information.

Symptoms

On a Windows Server 2012 virtual machine using the default E1000E network adapter and running on an ESXi 5.0 or 5.1 host, you experience these symptoms:

  • Data corruption may occur when copying data over the network.
  • Data corruption may occur after a network file copy event.

Cause

The root cause of this issue is currently under investigation.

Please read this KB from VMware on how to avoid this issue if you are running ESXi 5.0 or 5.1 and have Windows Server 2012 VMs.

DNS Best Practice Analyzer error…

At a customer site we have, after careful consideration, enabled the Best Practice Analyzer monitor in Operations Manager. I say careful consideration because I always tell my customers that this monitor will generate a lot of work for them, and sure enough it happened here as well.

The customer was busy cleaning up the errors, but kept getting one that he couldn’t figure out:

DNS servers on <network adapter name> should include the loopback address, but not as the first entry

Problem:
The network adapter <network adapter name> does not list the local server as a DNS server; or it is configured as the first DNS server on this adapter.

Impact:
If the loopback IP address is the first entry in the list of DNS servers, Active Directory might be unable to find its replication partners.

Resolution:
Configure adapter settings to add the loopback IP address to the list of DNS servers on all active interfaces, but not as the first server in the list.

The customer insisted that he had made sure that the server’s local IP and the loopback IP were listed last in the DNS server order, as shown below:

DNS BPA error

So, I took a look at the server and sure enough the server order was correct… in the IPv4 settings, that is. Looking at the IPv6 settings (IPv6 is something the customer hasn’t deployed), the address ::1 was for some reason listed among the DNS servers.

Removing this entry and setting the adapter to retrieve its IPv6 DNS servers automatically from DHCP fixed the BPA error.
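If you want to check this yourself, the rule is easy to express in a few lines. The Python sketch below is not how the BPA evaluates the rule internally. It simply applies the same "loopback present, but not first" check to each address family separately, which is where the customer got caught. The function name `check_dns_order` and the 10.0.0.x addresses are made up, and I am assuming ::1 was the only IPv6 entry.

```python
import ipaddress


def check_dns_order(dns_servers):
    """Check one ordered DNS server list against the 'loopback, but not first' rule."""
    addresses = [ipaddress.ip_address(server) for server in dns_servers]
    loopback_positions = [i for i, addr in enumerate(addresses) if addr.is_loopback]
    if not loopback_positions:
        return "loopback missing"
    if loopback_positions[0] == 0:
        return "loopback listed first"
    return "ok"


# Hypothetical adapter: IPv4 is configured as the customer described,
# IPv6 has a stray ::1 entry (assumed here to be the only one).
adapter = {
    "IPv4": ["10.0.0.11", "10.0.0.10", "127.0.0.1"],
    "IPv6": ["::1"],
}

results = {family: check_dns_order(servers) for family, servers in adapter.items()}
print(results)
```

Checking both families like this shows a correct IPv4 list next to an IPv6 list that still breaks the rule, which is exactly what was hiding on this server.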