Checking Recursion
Recently, a couple of customer systems have had various issues with Microsoft DNS on Windows Server 2003. Either DNS spits out invalid data, or it just hangs and times out when you ask it to query for external domains.
Usually, this is rectified by restarting the DNS service. But since the service is still running, how do you know when it has stopped working? You could just rely on the several hundred calls to the helpdesk, but ideally you want to know there's a problem before it gets to that point.
I decided there must be a better way, and wrote a small batch file to query the DNS server with nslookup.exe, and parse the output. If the server got it right, then good. If not, then take some form of action such as restarting the service.
This works fine when the DNS service is returning invalid data (e.g. it thinks every record resolves to 0.0.0.0), but not for spotting failed lookups of external domains. This is because the DNS server caches the record. If you keep asking it for the IP address of www.example.com, it will cache that record for the duration of the TTL, and happily return the data to you without performing any further recursive lookups.
Fortunately, there's a much-maligned feature in DNS called wildcard zones. Whilst these can be a pain, they're certainly useful here.
If you query a wildcard zone, any name will match it. MS DNS caches only the particular name that you queried.
Let's say you have a zone called example.com, which is configured with a wildcard record. Requests for foo.example.com, bar.example.com and anything.example.com will all resolve. Anything under example.com will resolve to whatever you configured the wildcard as. If you query for north.example.com and get a response of 192.168.99.59, then north.example.com will be cached. If you later query for south.example.com, it hasn't been cached, even though it is part of the same wildcard zone, and so a new recursive query needs to be made.
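To make the caching behaviour concrete, here's a toy model of it in Python (not batch, and not how MS DNS is actually implemented). The ToyCache class, the wildcard_upstream function and the 300 second TTL are all made up for illustration; the only point is that the cache is keyed on the exact name queried.

```python
import time


class ToyCache:
    """Toy model of a caching resolver: answers are cached per exact query name."""

    def __init__(self, upstream, ttl_seconds=300.0, clock=time.monotonic):
        self.upstream = upstream          # callable name -> address, stands in for recursion
        self.ttl_seconds = ttl_seconds    # assumed TTL, purely illustrative
        self.clock = clock
        self.entries = {}                 # name -> (address, expiry time)
        self.recursive_lookups = 0

    def resolve(self, name):
        key = name.lower().rstrip(".")
        cached = self.entries.get(key)
        if cached is not None and cached[1] > self.clock():
            return cached[0]              # served from cache, no recursion needed
        self.recursive_lookups += 1       # cache miss: a fresh recursive lookup
        address = self.upstream(key)
        self.entries[key] = (address, self.clock() + self.ttl_seconds)
        return address


def wildcard_upstream(name):
    # Hypothetical wildcard zone: every name gets the same answer.
    return "192.168.99.59"


cache = ToyCache(wildcard_upstream)
cache.resolve("north.example.com")
cache.resolve("north.example.com")    # same name again: answered from cache
cache.resolve("south.example.com")    # new name: recursion happens again
print(cache.recursive_lookups)        # prints 2
```

Three queries, but only two recursive lookups: the repeat of north.example.com never leaves the cache, which is exactly why a fixed test name can't tell you whether recursion still works.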
So, our checking batch file looks like this:
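REM A simple script to determine if the DNS server is responsive, and restart it if it isn't.
REM This needs to run as an account which has permission to stop/restart the DNS Server service.
REM Populate the NSBOX variable with the IP address (not hostname - for obvious reasons) of the server you want to check.
SET NSBOX=192.168.0.1
REM Query a random name in a wildcard zone, so the answer cannot come from the cache.
REM /L /C: makes findstr match the address literally, so the dots are not treated as wildcards.
nslookup.exe test%RANDOM%.%RANDOM%-%RANDOM%.ws %NSBOX% | findstr /L /C:"64.70.19.33"
REM findstr sets ERRORLEVEL to 0 if the address was found, and to 1 or higher if it was not.
IF ERRORLEVEL 1 GOTO PROBLEM
GOTO FINE
:PROBLEM
echo %TIME% %DATE% Restarting DNS Service on %NSBOX% due to lack of response >> dnscheck.log
REM The eventcreate command must stay on a single line.
eventcreate /L APPLICATION /ID 367 /SO DNSCheck-Script /T WARNING /D "DNSCheck Script is restarting the DNS Server Service, as it has failed to correctly resolve a hostname."
sc.exe \\%NSBOX% stop dns >> dnscheck.log
REM sleep.exe is not part of a default install; it comes with the Windows Server 2003 Resource Kit Tools.
sleep 5
sc.exe \\%NSBOX% start dns >> dnscheck.log
exit
:FINE
exit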
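For anyone who'd rather read the decision logic outside of batch, here's roughly the same check sketched in Python. It's only a sketch: the lookup argument is a hypothetical callable standing in for the nslookup call, and the three fake lookups below are invented to mimic the situations described earlier (working, invalid data, hanging).

```python
def dns_looks_healthy(lookup, probe_name, expected_address):
    """Return True if the server answered correctly, False if it needs a restart.

    `lookup` is a hypothetical callable (name -> list of address strings) that
    stands in for the nslookup call; a hang is modelled as an OSError.
    """
    try:
        addresses = lookup(probe_name)
    except OSError:
        return False                      # a timeout counts as a failure
    # Compare whole addresses rather than substrings, so that
    # 164.70.19.33 is not mistaken for 64.70.19.33.
    return expected_address in addresses


# Hypothetical stand-ins for the three situations.
def healthy_lookup(name):
    return ["64.70.19.33"]


def invalid_data_lookup(name):
    return ["0.0.0.0"]


def hanging_lookup(name):
    raise TimeoutError("no response from server")


for lookup in (healthy_lookup, invalid_data_lookup, hanging_lookup):
    ok = dns_looks_healthy(lookup, "test123.456-789.ws", "64.70.19.33")
    print(lookup.__name__, "fine" if ok else "restart")
```

Only the first case counts as fine; wrong data and a timeout both lead to the restart branch, just as a findstr miss does in the batch file.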
Run this as a scheduled task every 5 minutes (possibly 4). Don't run it too often, or you'll find that it hasn't finished starting up the DNS service by the time it is scheduled to run again, and so the second instance of the script will begin to restart the service whilst the first is still trying to start it!
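If you're worried about overlapping runs, one common guard is a lock file that only one instance can create. The batch file above doesn't do this; what follows is a Python sketch of the idea, and the dnscheck.lock name and temporary directory are just assumptions for the example.

```python
import os
import tempfile


def try_acquire_lock(lock_path):
    """Create the lock file atomically; return True if this run now owns it."""
    try:
        # O_EXCL makes the call fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False                      # another run is still in progress
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))    # record the owner for troubleshooting
    return True


def release_lock(lock_path):
    os.remove(lock_path)


# Hypothetical lock location, created fresh for this demonstration.
lock_path = os.path.join(tempfile.mkdtemp(), "dnscheck.lock")

first = try_acquire_lock(lock_path)       # first run gets the lock
second = try_acquire_lock(lock_path)      # overlapping run is turned away
release_lock(lock_path)
third = try_acquire_lock(lock_path)       # next scheduled run succeeds again
release_lock(lock_path)
print(first, second, third)               # prints True False True
```

The catch is that a run which dies without releasing the lock will block every later run, so a real version would want to treat a sufficiently old lock file as stale.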
Also note the sleep 5 statement in there. That's to stop sc.exe from trying to start the service before it has finished shutting down, because sc.exe returns as soon as the stop request has been sent. You could swap sc.exe for a net.exe stop DNS command, but I wanted to log the output, and sc.exe gives more information (e.g. the PID).
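A fixed five second pause is a guess at how long the shutdown takes. An alternative is to poll the service state until it reports that it has stopped, with a timeout as a safety net. Here's a Python sketch of that idea; get_state is a hypothetical callable (in practice it might wrap something like sc.exe query), and the state names and timings are assumptions for illustration.

```python
import time


def wait_for_state(get_state, wanted, timeout_seconds=30.0, poll_seconds=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns `wanted`; give up after the timeout."""
    deadline = clock() + timeout_seconds
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            return False                  # still not there: let the caller decide
        sleep(poll_seconds)


# Hypothetical sequence of states reported while the service shuts down.
states = iter(["STOP_PENDING", "STOP_PENDING", "STOPPED"])

# sleep is replaced with a no-op so the demonstration runs instantly.
stopped = wait_for_state(lambda: next(states), "STOPPED", sleep=lambda seconds: None)
print(stopped)                            # prints True
```

The start command would then only be issued once this returns True, and a False result is a good reason to log an error instead of blindly carrying on.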
In the example given above, the query is made for a random string in the .ws TLD. This is because the entire .ws TLD is a wildcard zone: if a site does not exist, the wildcard match is returned. Currently this is 64.70.19.33. You're better off querying your own wildcard zone if you can, but this isn't always possible. With your own wildcard zone, you're in a better position to know that you won't be querying for anything that might actually exist. As it is, it is unlikely that somebody will have registered a domain like 27517-189.ws, and even less likely that they will have added a host to it called test32737.27517-189.ws. But keep in mind that it is possible, and you may, just possibly, get a false alarm: the name resolves to a real address rather than the wildcard one, and the script restarts a perfectly healthy DNS service.
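For reference, here's how the same style of probe name could be built in Python. It mirrors the test%RANDOM%.%RANDOM%-%RANDOM% pattern from the batch file (in cmd.exe, %RANDOM% gives a number from 0 to 32767); the random_probe_name function and the dnscheck.example.com zone are hypothetical names used only for this sketch.

```python
import random


def random_probe_name(zone="ws", rng=random):
    """Build a name like test32737.27517-189.ws that is unlikely to be cached."""
    # Three independent numbers in the same range as cmd.exe's %RANDOM%.
    a, b, c = (rng.randint(0, 32767) for _ in range(3))
    return f"test{a}.{b}-{c}.{zone}"


print(random_probe_name())                        # a fresh name under .ws
print(random_probe_name("dnscheck.example.com"))  # hypothetical wildcard zone of your own
```

Pointing the zone argument at a wildcard zone you control removes the risk of hitting a name that somebody else has really registered.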
There's more information about DNS wildcards on Wikipedia.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales License.