Wednesday, March 26, 2014

DNS, Replication, and Intersite Trust Issues

Are you seeing a lot of strange replication and trust issues between domains in your company? We recently started migrating users from a company we purchased into our corp domain. Everything appeared to be going fine, and then one day it all went to hell.

The trust between our domains went down, Group Policy went completely out of sync, and we got lots of "can't contact a logon server" errors.

I can't say for sure any one of these caused the others, but it was chaos for a few days.

Here are the main things we fixed, and what you can look at if any of this sounds familiar to you.

Domain Trust: This was going up and down for us. It would hold steady for a week, then go down and come back up. The trust itself doesn't throw any big obvious errors unless you are looking for them, so double-check your trusts and validate them to make sure everything is up.
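
If you would rather check the trust on a schedule than wait for it to drop again, one option is to wrap `nltest /sc_verify` in a small script. This is only a rough sketch: the function names are mine, and the parser assumes the usual nltest layout where each status line reads `Status = <code> ...` with 0 meaning success. Check it against the output on your own servers before relying on it.

```python
import re
import subprocess


def build_trust_check_command(trusted_domain: str) -> list:
    """Build the nltest command that verifies the secure channel to a domain."""
    return ["nltest", f"/sc_verify:{trusted_domain}"]


def trust_looks_healthy(nltest_output: str) -> bool:
    """Return True only if every 'Status = N' code in the output is 0.

    Assumption: nltest prints lines such as
    'Trust Verification Status = 0 0x0 NERR_Success'.
    No status lines at all is treated as a failure.
    """
    codes = re.findall(r"Status\s*=\s*(\d+)", nltest_output)
    return bool(codes) and all(code == "0" for code in codes)


def run_trust_check(trusted_domain: str) -> bool:
    """Run nltest (Windows only, from a domain-joined machine) and parse it."""
    result = subprocess.run(
        build_trust_check_command(trusted_domain),
        capture_output=True,
        text=True,
        check=False,
    )
    return trust_looks_healthy(result.stdout)
```

Calling `run_trust_check("corp.example")` (a placeholder domain name) from a scheduled task and alerting on `False` would have caught our up-and-down trust a lot sooner than users did.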

DNS: This is usually the root of most issues. In our case we had switched from conditional forwarders to secondary zones at some point. Make sure you remember to allow zone transfers in the source domain, or this won't work. Also, specifically if you are using Server 2008 and pushing to Server 2012, make sure you check your _msdcs folder. We had to call Microsoft about this one, and it solved a ton of our issues. The _msdcs zone is not automatically created, and it has its own zone transfer check box, so you have to create it yourself and make sure you allow transfers on it too.
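
A quick way to tell whether the _msdcs data actually made it across is to ask the other side's DNS server for the domain controller locator SRV records. Here is a minimal sketch that builds the record names and the matching `nslookup` commands. The helper names, the example domain, and the server IP are placeholders of mine, not anything from our environment.

```python
def dc_locator_srv_names(domain: str) -> list:
    """Return the domain-scoped SRV names that live under _msdcs."""
    domain = domain.strip(".")
    prefixes = [
        "_ldap._tcp.dc",      # any domain controller
        "_kerberos._tcp.dc",  # Kerberos KDCs
        "_ldap._tcp.pdc",     # the PDC emulator
    ]
    return [f"{prefix}._msdcs.{domain}" for prefix in prefixes]


def build_srv_lookup_command(record_name: str, dns_server: str) -> list:
    """Build an nslookup command that queries one SRV record on one server."""
    return ["nslookup", "-type=SRV", record_name, dns_server]


def build_all_lookups(domain: str, dns_server: str) -> list:
    """Build one nslookup command per locator record for the given domain."""
    return [
        build_srv_lookup_command(name, dns_server)
        for name in dc_locator_srv_names(domain)
    ]
```

Run each command against the DNS server that holds the secondary zone. If the plain domain name resolves but these _msdcs names come back empty, the _msdcs zone is the likely culprit.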

DFS Replication: At one point, one of our servers was shut down and left a "dirty" database for replication. This halted all replication on the domain, as Server 2012 does not have automatic recovery enabled by default. Check your event log under DFS Replication; you should see a warning along with the full command that needs to be run to force replication to start again.
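
On the servers I have seen, that warning is DFS Replication event 2213, and the command in it is a `wmic ... call ResumeReplication` line with the affected volume GUID filled in. If you export the event text, a small helper like the one below can pull the command out so you don't mistype the GUID. This is a sketch under assumptions: the function name is mine, and it assumes the command sits on a single line of the message, which may not hold for every export format.

```python
from typing import Optional


def extract_resume_command(event_message: str) -> Optional[str]:
    """Return the 'wmic ... ResumeReplication' line from the event text.

    Assumption: the command appears on one line of its own in the message.
    Returns None if no such line is found.
    """
    for line in event_message.splitlines():
        candidate = line.strip()
        lowered = candidate.lower()
        if lowered.startswith("wmic ") and "resumereplication" in lowered:
            return candidate
    return None
```

Read the command before you run it from an elevated prompt, and back up the replicated folders first, since resuming after a dirty shutdown can overwrite files if the server has fallen behind.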