I have a challenge. I have a network with a primary (HQ) location (US-East), three regional locations (US-Central, US-West, and Canada), and about 350 branch locations. Most of the branches are connected to HQ via MPLS, and the regional locations also connect to HQ. Some of the branches connect to their closest regional site.
We have a set of data files (document templates, for example) that are replicated to a server in every location and used by a Kix-based application. All files are maintained at the central HQ, including archives and new (development-stage) files. All active files are replicated to the regional and remote branch sites. The development files are not replicated at all, and only archives one year old or newer are replicated to the regional sites. (Older files are pruned from the regional sites by the replication service.)
If I am at a branch, I need to be able to query my local server for the file I want. If it isn't there, I need to locate the next closest regional server, and then the master server at HQ if the data isn't in the regional server.
Since branch offices don't have "servers", a share is arbitrarily defined on the manager's workstation and the replication service is installed by the setup process.
I'm looking for thoughts/ideas on how to solve the following requirements:
During installation of the server/replication service on a workstation, that workstation can identify itself via some central process. The method to access the central repository, both for update and query, should be able to easily traverse subnets (and firewalls).
The Kix script placed on a client cannot use any local configuration file or hard-coded paths to locate the data files.
The client should try to use the nearest server first, moving to the regional and finally the primary server if it can't locate what it wants.
The client should never use a peer-site server (i.e., a branch client cannot query a server in another branch).
I have a working solution at this moment, but A) wonder if there are better methods, and B) want to know if other admins have similar situations.
We had a similar issue when I worked in Alaska. We got around it by defining environment variables on the workstations that contained the Local Resource name, the Regional Resource name, and the HQ Resource name.
As part of the installation of any new workstation, we would automatically launch a BAT file during setup that allowed users to "program" which environment variables they got.
Example
Set LR=\\LocalMachine\Share
Set RR=\\RegionalServer\Share
Set HR=\\Headquaters\Share
Extremely oversimplified, but it allowed us to use the %XX% variables in our scripts when defining drives, printers, etc.
Does your AD structure follow the HQ -> Region -> Branch model? If so, you could just walk up the AD structure.
If not, how about using a simple group hierarchy?
A "branch" group whose description defines the locally elected branch server(s). This group is a member of:
A "regional" group whose description defines the next tier of server(s). This group is a member of:
An "HQ" group whose description defines the top level of server(s)
You determine the order of hosts to try by walking up the group tree.
You don't *have* to populate the branch group with all your machines - if you have a reasonable OU naming standard you can give the branch level group a name that is easy to deduce.
You can create as many tiers as you want of course.
The description that defines the server to use (or another AD field, if there is something more appropriate) can be set programmatically, so you can do lots of tricks with it, such as updating the description when the service is installed or rotating multiple servers for simple load balancing.
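To make the group walk concrete, here is a minimal sketch in Python (not KiXtart, purely for illustration). It assumes the group tree has already been read from AD into a dictionary; the group names, server names, and the comma-separated description format are all hypothetical.

```python
# Hypothetical stand-in for the AD groups: each entry holds the group's
# description (server list) and the single parent group it is a member of.
GROUPS = {
    "Branch-0042": {"description": "BR0042-WS01", "member_of": "Region-Central"},
    "Region-Central": {"description": "CENTRAL-SRV1, CENTRAL-SRV2", "member_of": "HQ"},
    "HQ": {"description": "HQ-SRV1", "member_of": None},
}


def server_order(group_name, groups=GROUPS):
    """Return servers to try, closest tier first, by walking up the group tree."""
    order = []
    seen = set()  # guards against an accidental membership loop
    while group_name is not None and group_name not in seen:
        seen.add(group_name)
        group = groups[group_name]
        # Assumption: the description holds a comma-separated server list.
        order.extend(s.strip() for s in group["description"].split(",") if s.strip())
        group_name = group["member_of"]
    return order


print(server_order("Branch-0042"))
```

Starting from the branch group, the walk yields the branch server, then the regional servers, then HQ, and it never visits a peer branch because it only follows the "member of" link upward.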
Well, both reasonable ideas, although both rely on configuration settings that are per machine. I don't see how I could handle mobile users that travel from site to site, particularly when two sites that are physically close are in different telecom regions, thus in different regional networks. That's the real problem that I see, since AD is often split on admini-political boundaries that have little to do with the WAN connections.
Here's what I'm doing (so far), and it seems to be working out OK.
I define custom SRV records in DNS. A SRV record provides service information - a FQDN host name, service name, port number, and arbitrary weight and priority values. Take a look at your AD-DNS and you'll find SRV records representing your DCs, the PDCe, GC, LDAP, and several other network resources.
I create a custom SRV record called "_swdist" under the "_tcp" group when the application server service is installed. The installation script asks the user for the size of the network that this host serves, so a branch office (1024 addresses) would get a "22" representing 22 bits of netmask. This value is placed in the Weight field of the SRV record, and the priority is set to 2 for a branch, 1 for a region, and 0 for the master. The one assumption is that the network is hierarchical within regions (that is, branches within a region are all subnets of the regional network).
I use a UDF that takes 3 parameters - IP address, "Short" flag, and Type. If you specify the type, the IP and flag are ignored, and all SRV records matching the defined type are returned. This allows you to locate all DCs, GC servers, and the like. If the Short flag is true and type is null, the standard processing is short-circuited and all _SWDIST type records are returned without sorting or prioritization. If only the IP is specified, a list of all _SWDIST records is collected. The network of each server is determined based on the Weight/Netmask value. The IP address is compared to this range and - if it matches - the record is added to a "consideration" list. After the complete list of records is processed, the consideration list is sorted based on the number of subnet bits - a greater number being more specific (thus closest). Note that the netmask value in the SRV record is not necessarily that of the actual server's subnet - it instead represents the "supernet" of networks served.
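As a rough illustration of what the installer puts into the record, here is a small Python sketch (not the actual Kix installation script). The domain name, the port value, and the function names are placeholders I made up; only the priority mapping and the use of Weight for the netmask bits come from the description above.

```python
# Priority values as described above: 0 = master, 1 = region, 2 = branch.
ROLE_PRIORITY = {"master": 0, "region": 1, "branch": 2}


def addresses_served(netmask_bits):
    """Number of IPv4 addresses covered by a netmask of the given length."""
    return 2 ** (32 - netmask_bits)


def build_swdist_record(target, role, netmask_bits, port=0, domain="example.com"):
    """Assemble the SRV fields; port and domain are placeholders, not real values."""
    if not 0 <= netmask_bits <= 32:
        raise ValueError("netmask bits must be between 0 and 32")
    return {
        "name": f"_swdist._tcp.{domain}",
        "priority": ROLE_PRIORITY[role],
        "weight": netmask_bits,  # Weight field carries the netmask length
        "port": port,
        "target": target,
    }


record = build_swdist_record("br0042-ws01.example.com", "branch", 22)
print(record, addresses_served(record["weight"]))
```

Note that this deliberately repurposes the Weight and Priority fields; a standard SRV-aware client would read them as load-balancing hints, so only the custom lookup code should consume these records.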
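The IP-only path of that lookup can be sketched as follows. This is Python rather than the Kix UDF, and it assumes the SRV records have already been fetched and each target resolved to an address; the record layout, host names, and addresses are invented for the example.

```python
import ipaddress

# Hypothetical, already-resolved _swdist records. "weight" is the netmask
# length of the network each server serves, as described above.
RECORDS = [
    {"target": "hq-master", "address": "10.0.0.5", "priority": 0, "weight": 8},
    {"target": "region-central", "address": "10.1.0.5", "priority": 1, "weight": 16},
    {"target": "branch-a", "address": "10.1.4.10", "priority": 2, "weight": 22},
    {"target": "branch-b", "address": "10.1.8.10", "priority": 2, "weight": 22},
]


def candidate_servers(client_ip, records):
    """Return records whose served network contains client_ip, most specific first."""
    client = ipaddress.ip_address(client_ip)
    considered = []
    for rec in records:
        # strict=False masks off the host bits of the server's own address.
        served = ipaddress.ip_network(f"{rec['address']}/{rec['weight']}", strict=False)
        if client in served:
            considered.append(rec)
    # More netmask bits means a smaller, closer network.
    return sorted(considered, key=lambda r: r["weight"], reverse=True)


for rec in candidate_servers("10.1.5.77", RECORDS):
    print(rec["target"], rec["weight"])
```

A client at 10.1.5.77 falls inside branch-a's /22, the region's /16, and the master's /8, but not inside branch-b's /22, so the peer branch is never offered as a candidate.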
The application processes the list, trying the closest server first. If the resource is unavailable, the next (regional) server is queried if a regional (Priority 1) server exists. Since network architecture could preclude defining regional servers by subnet, regional servers could exist without a SRV record. Before checking the master server directly for the resource, a lookup table on the master server is queried to determine if a regional server exists without a SRV record (exception process). If so that server is queried. If it doesn't exist, or does not contain the needed resource, the resource is obtained directly from the HQ Master server.
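The fallback order can be sketched like this, again in Python and under assumptions: the candidate list is ordered closest first, the exception table is consulted only when no Priority 1 record was found (my reading of the process above), and `has_resource`, `exception_lookup`, and the sample data are hypothetical stand-ins for the real file checks.

```python
def locate_resource(resource, candidates, exception_lookup, master, has_resource):
    """Return (server_to_use, servers_tried) following branch, region, master order."""
    tried = []
    regional_in_srv = False
    for rec in candidates:
        if rec["priority"] == 0:
            continue  # the master is always the last resort, handled below
        if rec["priority"] == 1:
            regional_in_srv = True
        tried.append(rec["target"])
        if has_resource(rec["target"], resource):
            return rec["target"], tried
    if not regional_in_srv:
        # Exception process: a regional server that has no SRV record.
        extra = exception_lookup()
        if extra is not None:
            tried.append(extra)
            if has_resource(extra, resource):
                return extra, tried
    tried.append(master)
    return master, tried


# Hypothetical file inventory used in place of real share lookups.
STORES = {
    "branch-a": {"active.dot"},
    "region-central": {"active.dot", "archive-recent.dot"},
}


def in_store(server, resource):
    return resource in STORES.get(server, set())


SRV_LIST = [
    {"target": "branch-a", "priority": 2},
    {"target": "region-central", "priority": 1},
    {"target": "hq-master", "priority": 0},
]

print(locate_resource("archive-recent.dot", SRV_LIST, lambda: None, "hq-master", in_store))
```

Passing the exception lookup as a callable keeps the query to the master's table lazy, so it only happens when the SRV records did not supply a regional server.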
It might sound complicated, but it's actually quite simple in practice. Creation of the SRV record is automatic during the installation with just 3 questions to the installer - 1. Regional or Slave server? 2. Create SRV record? 3. Size of network served (bits) or list of CIDR networks served?
Question 3 changes based on how Q2 is answered. If a SRV record is created, you can specify a single netmask bits value. If no SRV record is created, you can specify a list of CIDR networks that this system will serve, and the central "exceptions" table is updated. Of course, if a Slave server is created, the SRV record is always created and Q2 is skipped.
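The way the three questions interact can be summarized in a short Python sketch. The function name, argument names, and return format are made up for illustration; the actual installer is a Kix script and is not shown in this thread.

```python
import ipaddress


def plan_install(role, create_srv=True, netmask_bits=None, cidr_networks=None):
    """Decide what the installer should do from the answers to Q1, Q2, and Q3."""
    if role not in ("regional", "slave"):
        raise ValueError("role must be 'regional' or 'slave'")
    if role == "slave":
        create_srv = True  # Q2 is skipped: a slave always gets a SRV record
    if create_srv:
        # Q3 asks for a single netmask length, stored in the Weight field.
        if netmask_bits is None or not 0 <= netmask_bits <= 32:
            raise ValueError("a netmask length between 0 and 32 is required")
        return {"action": "create_srv", "weight": netmask_bits}
    # Q3 asks for a list of CIDR networks for the central exceptions table.
    if not cidr_networks:
        raise ValueError("at least one CIDR network is required")
    networks = [str(ipaddress.ip_network(c)) for c in cidr_networks]
    return {"action": "update_exceptions", "networks": networks}


print(plan_install("slave", netmask_bits=22))
print(plan_install("regional", create_srv=False, cidr_networks=["10.2.0.0/16", "10.3.0.0/16"]))
```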
With this method, I can walk into a branch with my laptop, run the application, and it will automatically connect to the local, regional, or HQ server where the resource exists. No custom configuration or settings on the client system. I can even have the app installed on a flash drive and run it without local installation - there's no software or configuration prerequisites.
I'll post my UDF and some example code in a few days. Of course, you'd need to create a few SRV records to really test the functionality, but custom SRV records can be created and deleted without any impact on the environment. For testing, I created a master on one subnet, and the regional and slave on my PC subnet (pointing to other valid PCs that had no resources). The resources can have arbitrary netmasks, unrelated to their local subnet size, so it's easy to test the hierarchy.
Well, both reasonable ideas, although both rely on configuration settings that are per machine. I don't see how I could handle mobile users that travel from site to site
Good point. If you have a large enough peripatetic staff base then that would be a problem. If you were going down the group route then you'd base your "branch" group's name on the local (masked) subnet and create the hierarchy as before.
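Deriving the group name from the masked subnet could look something like this Python sketch. The "Branch-" prefix, the name format, and the fixed prefix length are assumptions for the example, not an established naming standard.

```python
import ipaddress


def branch_group_name(client_ip, prefix_len, prefix="Branch-"):
    """Build a group name from the network the client is currently on."""
    # strict=False masks the host bits, leaving only the network address.
    net = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return f"{prefix}{net.network_address}-{net.prefixlen}"


print(branch_group_name("10.1.5.77", 22))
```

Because the name comes from the address the machine currently holds, a travelling laptop picks up the group of whichever branch it is plugged into, with no per-machine setting to update.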
Here's the UDF I created to query the SRV records. It's also useful for locating DC, PDC, GC and similar AD records. I'll post it to the UDF forum after further testing and possibly some feedback.