DHCP is a critical service that needs to be thoroughly integrated into a practical and functional network design. I'm going to provide you with an overview of how to build a redundant, easy-to-manage DHCP infrastructure with modern technology. Because it is impossible to talk about DHCP design without talking about the underlying network infrastructure, I will start by covering some basic TCP/IP network design issues.
You should have a thorough understanding of TCP/IP networking and subnetting concepts, as well as Cisco routing and switching configuration, to fully comprehend all of the material in this article. However, even if you don’t fully meet these prerequisites, you can still read this article and get a basic understanding of DHCP planning and design that can help you work better with network architects and IT consultants.
Designing your TCP/IP network
A basic understanding of TCP/IP networking can help your design avoid the biggest factors in network congestion. In this section, I'll also give you some tips for creating a "binary friendly" design and working with layer 3 switches.
Performance and sizing of subnets
In order to design a high-performance and low-congestion network, you must understand the enemies of network performance. In the past, the biggest enemy of network performance was data collisions caused by the use of Ethernet repeaters (a.k.a. hubs). With hubs, anytime data is transmitted by one computer to another, the data is repeated to every single port of the hub, which results in excess traffic on each one of those network nodes.
This cause of excess traffic is mostly a thing of the past, because Ethernet switches have replaced Ethernet hubs at the core of most networks. Data collisions have all but become moot on modern Ethernet networks because Ethernet switches isolate traffic between two nodes, while keeping all other ports clear and open for other communication.
The new king of congestion is the broadcast storm. Computers (especially ones running broadcast-heavy protocols such as NetBEUI) have a nasty habit of calling out to every host in the broadcast domain, which forces an Ethernet switch, which normally likes to keep traffic isolated, to flood that frame to every port in the same VLAN. Even worse, sometimes every node on that subnet has to respond to the sender, amplifying the original broadcast many times over.
Unfortunately, this scenario sometimes puts us back into the same predicament that Ethernet hubs had to deal with constantly. The only way to combat this traffic problem is to keep the number of hosts on a single broadcast domain to a minimum. That means probably no more than 128 nodes on a single TCP/IP subnet. I have seen sites with thousands of computers on a single subnet, and I can tell you it isn't pretty when monitoring their broadcast storms. In fact, I have seen setups so bad that people were kicked out of their terminal server sessions a dozen times a day because of network instability caused by broadcast storms.
Designing a clean, “binary nice” subnet
In this design example, I'll start with the premise that you have a single LAN site at a single location. While it's possible to run DHCP over WANs, it isn't considered best practice, so you will stick to a single LAN. The site will have up to 1,000 users with 1,000 computers, broken down into /24 VLANs (256 addresses each, 254 of them usable by hosts) with no more than 100 users per VLAN and room to spare. (VLANs are created by logically segmenting a network with a managed layer 2 or layer 3 switch.) This means you will require a minimum of 10 VLANs on this site.
Additionally, because you'll want to be able to summarize this site into a single supernet when routing, you'll round up to the next “nice” binary number, 16. You'll use the private class A scheme of 10.x.x.x for your company, so for this site, you will run the entire site under the network ID of 10.0.0.0/20. For those of you new to this naming convention, this is the abbreviated terminology for the Network ID of 10.0.0.0 with subnet mask of 255.255.240.0, which defines all IP addresses ranging from 10.0.0.0 to 10.0.15.255.
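The arithmetic behind that /20 can be checked with a short Python sketch using the standard ipaddress module. The variable names are mine, chosen only for illustration; the numbers come straight from the design above.

```python
import ipaddress

users = 1000
max_users_per_vlan = 100

# Minimum number of VLANs (ceiling division), then the next power of two.
vlans_needed = -(-users // max_users_per_vlan)             # 10
vlans_allocated = 1 << (vlans_needed - 1).bit_length()     # 16

# Each VLAN is a /24; 16 of them need 4 extra bits, which gives a /20.
borrowed_bits = vlans_allocated.bit_length() - 1           # 4
site = ipaddress.ip_network(f"10.0.0.0/{24 - borrowed_bits}")

print(site)                      # 10.0.0.0/20
print(site.netmask)              # 255.255.240.0
print(site.network_address)      # 10.0.0.0
print(site.broadcast_address)    # 10.0.15.255

# The sixteen /24 VLAN subnets that make up the site.
vlan_subnets = list(site.subnets(new_prefix=24))
print(vlan_subnets[0], vlan_subnets[-1])   # 10.0.0.0/24 10.0.15.0/24
```

Rounding up to 16 is what makes the site expressible as one prefix; with exactly 10 subnets you would need at least two summary statements to cover it.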
By using “binary nice” numbers like 2, 4, 8, 16, 32, and so on, you can define the entire site by the single network ID of 10.0.0.0/20. The reason for using the single ID isn't solely aesthetic; it greatly simplifies routing and security rules because you can define the entire network with a single statement. This not only simplifies management, but also improves performance and reduces the chance of mistakes. Some of you at this point may be balking at the idea of running 10 separate subnets for only 1,000 users, but bear with me; it is not that difficult to handle if you use the right technology. Also keep in mind that there are 65,536 /24 subnets in the 10.0.0.0/8 class A private network. This means that you can have 4,096 of these sites with 16 subnets each. Obviously, the next LAN locations of similar size will be defined as 10.0.16.0/20, 10.0.32.0/20, 10.0.48.0/20, and so on.
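If you want to verify those counts, or see why one statement is enough to describe a site, here is a minimal Python sketch. It assumes Python 3.7 or later for subnet_of; the variable names are illustrative only.

```python
import ipaddress
from itertools import islice

private_block = ipaddress.ip_network("10.0.0.0/8")

# How many /24 subnets and how many /20 sites fit in 10.0.0.0/8.
subnet_count = 2 ** (24 - private_block.prefixlen)    # 65536
site_count = 2 ** (20 - private_block.prefixlen)      # 4096

# The first few site supernets, in order.
first_sites = [str(n) for n in islice(private_block.subnets(new_prefix=20), 4)]
print(first_sites)
# ['10.0.0.0/20', '10.0.16.0/20', '10.0.32.0/20', '10.0.48.0/20']

# A single /20 statement covers every VLAN subnet of the first site.
site = ipaddress.ip_network("10.0.0.0/20")
covered = all(v.subnet_of(site) for v in site.subnets(new_prefix=24))
print(covered)   # True

# A subnet from the next site is outside that statement.
print(ipaddress.ip_network("10.0.16.0/24").subnet_of(site))   # False
```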
Routing subnets with layer 3 switches
Now that you have the basic network laid out, you must build it. It's best to use a managed Ethernet layer 3 switch, such as a Cisco Catalyst 6500 series with Multilayer Switching Feature Card (MSFC), but a Cisco 3550-12G can be used instead for smaller networks or tighter budgets. (Note that Cisco isn't the only equipment that can do this job, but for the purposes of this article, I'll use Cisco equipment as an example). The 3550-12G makes for a great poor man’s core/distribution layer switch at one-tenth the cost.
Both of these switches can act as the core layer or core and distribution layers of the network. Then you can connect access layer switches, such as Cisco 2980 switches, to the 6500 via gigabit Ethernet uplinks. (You can use cheaper unmanaged switches for this, too, but understand that you can’t break them up into additional VLANs or have trunking support.)
Then you can distribute these access layer switches throughout the location so that the actual Cat 5e or Cat 6 copper runs to the clients are kept to a minimum length, which vastly reduces cabling cost in material and labor while increasing signal reliability. Once this two- or three-tier design is in place, you can proceed to configuring the switches.
The Cisco 2980 access layer switch has VLAN or bridge group capabilities, but has no routing capabilities of its own. For that, it can connect or trunk into the core switch using 802.1Q trunking over the gigabit uplink via Cat 6 copper or full-duplex fiber. The core switch using the 6500 MSFC or the 3550-12G can act as a massive VLAN router to handle all routing requests and can act as the default gateway for every VLAN on all tiers by configuring a single static routing table and/or a routing protocol such as EIGRP, RIP, or OSPF. It can act as the DHCP relay agent for all the VLANs, as well, and is definitely easier and cheaper than setting up at least 10 separate Windows or Linux boxes to act as DHCP relay agents.
For an example of six VLANs using six 2980 Layer 2 switches and a 3550-12G as the core/distribution layer switch, take a look at the diagram in Figure A.
Figure A: A 3550-12G acts as the core/distribution layer switch in this network.
A DHCP relay agent sits in place of an actual DHCP server in a TCP/IP subnet. It basically extends the reach of the DHCP server without the need for multiple DHCP servers on each subnet by acting as the server’s helper agent in a remote subnet. DHCP relay does not manage IP addresses itself, but relays the DHCP request to the DHCP server on behalf of the client, obtains the IP address, and then hands out the IP addresses to the asking client on behalf of the DHCP server.
The only other alternative (besides separate DHCP servers) to using DHCP relay when you have multiple subnets is setting up a single DHCP server with multiple Ethernet ports sitting on each VLAN, but that configuration has some serious limitations in scalability. On Cisco layer 3 switches, DHCP relay can easily be achieved with a single command of ip helper-address 10.0.14.255 entered into each VLAN interface, as shown in Figure B.
10.0.14.255 will be the broadcast address of the VLAN that will contain your DHCP servers. You can use a specific IP address here instead of a broadcast address, but that would mean having only one active DHCP server, or else you must cluster two or more DHCP servers on a single IP address. For our example, Figure B shows configurations with VLAN definitions (a.k.a. bridge group), default gateways, and DHCP relay configurations for Cisco or IEEE standard configurations.
Figure B: IEEE standard configuration on a Cisco 2948 Layer 3 switch used as a core/distribution layer switch.
Note that VLAN 14 is the only bridge group interface that does not need the helper-address because it contains the DHCP server itself, and also note that VLAN 11 through 15 will be used for spare, DMZ, or server farms. I didn’t have a 3550-12G with Gigabit Ethernet handy, so I used a 2948-L3 with Fast Ethernet instead as the core switch for this example.
Note that some of the Cisco L3 switches use a different type of command line interface. The following is a command example with a Cisco 6509 MSFC L3 module:
interface Vlan1
 description Subnet 1
 ip address 10.0.1.1 255.255.255.0
 ip helper-address 10.0.14.255
 no ip redirects
 no ip directed-broadcast
This looks quite a bit different from the 2948-L3, but still uses the same DHCP relay command. The VLAN command accomplishes the same thing as the BVI command, but it is a little easier with the 6509-based CLI (command line interface) because you don’t need to declare the IEEE bridge protocol. The Cisco 2948-L3 CLI must manage the routing as well as the switching and port configurations. The 6509 MSFC module is more of a dedicated routing and management module with the physical switch ports handled by a separate CLI. You can consult your switch manual or Cisco’s Web site for more information on your particular hardware.
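Typing the same stanza for every client VLAN invites typos, so you may prefer to generate it. The Python sketch below renders MSFC-style configuration text for VLANs 1 through 10. It rests on a few assumptions of mine rather than anything mandated by Cisco: that VLAN n maps to 10.0.n.0/24, that the gateway is the .1 address (as in the Subnet 1 example), and that each stanza opens with an interface Vlan<n> line. The function name is hypothetical. Review the output against your own hardware's syntax before pasting it anywhere.

```python
import ipaddress

HELPER_ADDRESS = "10.0.14.255"   # broadcast address of the DHCP server VLAN


def render_client_vlan(vlan_id: int) -> str:
    """Return an MSFC-style interface stanza for one client VLAN."""
    subnet = ipaddress.ip_network(f"10.0.{vlan_id}.0/24")   # assumed mapping
    gateway = subnet.network_address + 1                    # assumed .1 gateway
    lines = [
        f"interface Vlan{vlan_id}",
        f" description Subnet {vlan_id}",
        f" ip address {gateway} {subnet.netmask}",
        f" ip helper-address {HELPER_ADDRESS}",
        " no ip redirects",
        " no ip directed-broadcast",
    ]
    return "\n".join(lines)


# One stanza per client VLAN; VLAN 14 (the DHCP servers) is deliberately
# left out because it needs no helper address.
configs = [render_client_vlan(n) for n in range(1, 11)]
print(configs[0])
```

The server VLAN is excluded on purpose: it gets no helper address, and its directed-broadcast setting deserves separate thought because the relayed requests are aimed at its broadcast address.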
While it is possible to use a Windows or Linux server as a DHCP relay agent, it would be overkill to dedicate 15 (or more) separate machines to do the job of a single command on your layer 3 switch. Note that without this technique of using the L3 switch, it would be very impractical to implement this degree of TCP/IP segmentation on a LAN. You would also need 15 separate servers for DHCP relay agents and 15 traditional routers to join the 15 VLANs. The point is, take the easy route and use a layer 3 switch at the heart of your network. It opens up all sorts of possibilities and can greatly reduce the hardware necessary to build a robust and effective network infrastructure.
DHCP redundancy and configuration
Set up two non-overlapping DHCP servers
The DHCP servers will reside in the subnet of 10.0.14.0/24 along with many of your other servers. Since DHCP is an extremely low activity service, I recommend that you host the DHCP servers on your Windows NT or Windows 2000 servers. You can even let DHCP piggyback on a domain controller, file server, DNS server, WINS server, or another basic network service. Wherever you decide to put DHCP, you will need to find two separate servers for it.
Next, simply install the DHCP service and proceed to configure each server to serve only half the subnet with non-overlapping scopes. (It is also good to cluster your DHCP servers, but that requires Windows 2000 Advanced Server, which may not be an option for everyone.) The first DHCP server will be configured with a scope of host numbers 10-109, and the second DHCP server will host 110-219. This leaves hosts 1-9 reserved and 220-254 for static IP addresses, which can be used for things like printers. This is what is called a 50/50 configuration, and you may also hear recommendations for an 80/20 configuration where IP addresses are a bit scarcer.
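To make the split concrete, here is a small Python sketch that derives both servers' ranges for each client VLAN and confirms that they never overlap. The 10.0.n.0/24 numbering for VLAN n and the names used here are my assumptions for illustration.

```python
import ipaddress

SERVER_A_HOSTS = range(10, 110)    # host numbers 10-109
SERVER_B_HOSTS = range(110, 220)   # host numbers 110-219


def scope_for(vlan_id: int, hosts: range):
    """Return (first, last) address of a scope in VLAN n's 10.0.n.0/24."""
    base = ipaddress.ip_address(f"10.0.{vlan_id}.0")
    return base + hosts[0], base + hosts[-1]


scopes_a = {f"VLAN{n}": scope_for(n, SERVER_A_HOSTS) for n in range(1, 11)}
scopes_b = {f"VLAN{n}": scope_for(n, SERVER_B_HOSTS) for n in range(1, 11)}

print(scopes_a["VLAN1"])   # (IPv4Address('10.0.1.10'), IPv4Address('10.0.1.109'))
print(scopes_b["VLAN1"])   # (IPv4Address('10.0.1.110'), IPv4Address('10.0.1.219'))

# The two servers must never be able to hand out the same address.
overlap = set(SERVER_A_HOSTS) & set(SERVER_B_HOSTS)
print(overlap)             # set()

# Host numbers that neither server manages (gateway, static devices, etc.).
unmanaged = sorted(set(range(1, 255)) - set(SERVER_A_HOSTS) - set(SERVER_B_HOSTS))
print(unmanaged[0], unmanaged[8], unmanaged[9], unmanaged[-1])   # 1 9 220 254
```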
I also recommend not using DHCP reservations, because that makes the management of DHCP servers extremely messy by fragmenting the scopes. I would much rather assign static addresses from the 220-254 range (you can make this range as large as needed). Because the DHCP relay agent is forwarding to the broadcast address where these two DHCP servers reside, it is basically a first-to-respond, first-served environment.
In this example, it doesn’t matter, since all of our users can fit in a single DHCP server with room to spare. However, with a larger number of clients, if the two DHCP servers have an equal load and are equal in speed, users will end up half and half on each DHCP server.
Building DHCP scopes and setting the scope and server options
In this example, you will use a Windows 2000 server for setting up DHCP scopes. On these DHCP servers, you'll need to create 10 new scopes using the Create Scope wizard. During the creation of these scopes, simply name them VLAN1 through VLAN10 and enter the corresponding IP ranges.
When creating each scope, enter only the default gateway; don’t enter any other DHCP options at the scope level. Instead, set "server" options, which apply to all the scopes. (I often see IT pros get confused over this issue.)
It's possible to put any type of DHCP attribute (such as default gateway, DNS servers, or WINS servers) in either Scope Options or Server Options (formerly known as Global Options under NT4). The best practice is to put the default gateway under Scope Options and put all other options, such as WINS and DNS, under Server Options. Then the server options will automatically be inherited by all of the scopes, saving you a lot of manual entry and possibility for errors. To configure the server options in Windows 2000, open the DHCP applet, right-click Server Options, and select Configure Options (see Figure C).
Figure C: Set Server Options for DHCP scopes in Win2K.
Once you select an option, you can fill in configuration details in the area below the options in the window. In addition to the options shown in Figure C, you can scroll down and select additional options, such as:
- 006 for your DNS servers
- 015 for your default domain suffix
- 044 for your WINS servers
- 046 for your WINS/NBT node type (0x8, hybrid)
After you set the server options, you should repeat this procedure on the second DHCP server, doing everything the same. The only difference is that the host range will be 110-219, instead of 10-109 as on the first DHCP server.
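The way server options flow down into each scope can be pictured as a simple dictionary merge, with anything set at the scope level taking precedence. The Python sketch below is only a model of that behavior, not the Windows DHCP API; the server addresses and the example.com suffix are made-up placeholders.

```python
# Server-wide options, keyed by DHCP option code (placeholder values).
SERVER_OPTIONS = {
    6: ["10.0.14.10", "10.0.14.11"],    # DNS servers (hypothetical addresses)
    15: "example.com",                  # DNS domain suffix (placeholder)
    44: ["10.0.14.10", "10.0.14.11"],   # WINS servers (hypothetical addresses)
    46: 0x8,                            # WINS/NBT node type: hybrid
}


def effective_options(scope_options: dict) -> dict:
    """Merge server options with one scope's options; the scope wins on conflict."""
    return {**SERVER_OPTIONS, **scope_options}


# Each scope defines only option 003 (router), its own default gateway.
vlan1 = effective_options({3: ["10.0.1.1"]})
vlan2 = effective_options({3: ["10.0.2.1"]})

print(vlan1[3], vlan1[6])   # ['10.0.1.1'] ['10.0.14.10', '10.0.14.11']
print(vlan2[3], vlan2[46])  # ['10.0.2.1'] 8
```

Keeping everything except the gateway at the server level means a DNS or WINS change is made once per server instead of once per scope.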
At this point, some of you astute readers may be wondering how to actually bind all the different scopes to their respective network IDs. The answer is surprisingly simple: You do nothing. When you created the scopes, you had to define the separate IP ranges of all the corresponding scopes. That alone is enough configuration to match up the scopes with the subnets they will serve.
When the DHCP server receives the forwarded request from the DHCP relay agent, it examines the relay agent's IP address (which the agent records in the giaddr field of the request), matches it to the scope that serves that subnet, and returns an IP configuration from that scope to the relay agent. The relay agent then passes that configuration on to the client that made the original DHCP request.
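That matching step amounts to a subnet lookup. Here is a minimal Python sketch of the idea, assuming VLAN n is 10.0.n.0/24; it illustrates the logic only and is not how any particular DHCP server is implemented. The function and variable names are hypothetical.

```python
import ipaddress
from typing import Optional

# Scope name for each client subnet the server knows about.
SCOPES = {
    ipaddress.ip_network(f"10.0.{n}.0/24"): f"VLAN{n}" for n in range(1, 11)
}


def select_scope(relay_address: str) -> Optional[str]:
    """Pick the scope whose subnet contains the relay agent's address."""
    addr = ipaddress.ip_address(relay_address)
    for subnet, name in SCOPES.items():
        if addr in subnet:
            return name
    return None   # no scope serves that subnet, so no lease is offered


print(select_scope("10.0.3.1"))    # VLAN3
print(select_scope("10.0.12.1"))   # None
```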
Finally, after all that, be sure you activate your DHCP servers and authorize them by right-clicking the DHCP server and choosing Authorize. Once authorized and activated, you have just set up two DHCP servers to serve 10 separate subnets with the aid of a single layer 3 switch. Note that this type of infrastructure is extremely scalable and could just as easily serve 1,000 scopes if needed. With lease times measured in days, a DHCP server only has to do about one transaction per user per week, so even 1,000 scopes is not a lot of work for a modest 500-MHz server.
Be aware that this type of architecture absolutely mandates a good DNS and WINS name resolution infrastructure. You cannot rely on the old broadcast discovery techniques for name resolution, as you could when all your clients lived in one flat subnet. That is actually a performance advantage, and name resolution no longer depends on the luck of a broadcast reaching the right host. Rest assured that having a disciplined TCP/IP name resolution infrastructure will pay great dividends when all the inconsistencies and mysteries of legacy-style Windows networking disappear.
Using MAC address “security” on DHCP
You can set up a basic level of "security" for your network by issuing IP addresses only to pre-reserved MAC addresses. I say that a bit sarcastically, because it is "security through obscurity." The method can only be used for basic security because it is based on the honor system. Hackers can spoof MAC addresses on any network adapter within seconds. Your MAC address is what you declare it to be. The ease of forgery is also the reason why MAC addresses can’t really secure wireless.
The other problem with this security scheme is that even if you don’t assign someone an IP address, that doesn’t mean they can’t simply type in an IP address manually and still participate on your network. In addition, maintaining a list of 12-digit hex MAC addresses gets to be quite cumbersome for a thousand users. This technique generally keeps the non-technical person from connecting a rogue machine to your network, but it really has no security capabilities beyond that. Real security needs to be handled at the switch level with 802.1x and EAP.
802.1x port-based access control and EAP
Some advanced switches, such as the Cisco Catalyst 6500, support 802.1x port-based access control and extensible authentication protocol (EAP). Basically, this means no authentication, no access. The Ethernet port remains closed until authenticated successfully over EAP. Unlike using MAC reservations on the DHCP server, you can’t just forge the MAC address or even manually enter an IP address—either hack is useless when 802.1x/EAP is employed on the access switch.
When 802.1x is employed, a client connecting to a port on the switch must support the 802.1x protocol. Currently, Windows XP is the only operating system that natively supports 802.1x, but Microsoft is promising 802.1x support for legacy operating systems including Windows 98, NT, and 2000 in the near future.
Basically, when the 802.1x-capable client connects to the switch, it must send EAP credentials to the switch. The switch then forwards the EAP message to a RADIUS (Remote Authentication Dial-In User Service) server. If the RADIUS server accepts the credentials, it will respond with an EAP success message to the switch. Only then will the switch transition the port to an open state and permit DHCP requests and full network participation. (This same RADIUS infrastructure can also be used to provide enterprise grade wireless security.) For more information on 802.1x with Cisco switches, see Cisco's configuration guide for port-based authentication.
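The "no authentication, no access" behavior is easiest to see as a tiny state machine. The Python sketch below is a toy model under my own assumptions: the RADIUS server is replaced by a hypothetical lookup function, the class and method names are invented, and real EAP exchanges involve several more messages than a single identity and secret.

```python
from enum import Enum
from typing import Callable


class PortState(Enum):
    UNAUTHORIZED = "unauthorized"
    AUTHORIZED = "authorized"


# Hypothetical stand-in for a RADIUS server's user database.
RADIUS_USERS = {"alice": "correct-horse"}


def radius_stub(identity: str, secret: str) -> bool:
    """Pretend RADIUS check: accept only a known identity with the right secret."""
    return RADIUS_USERS.get(identity) == secret


class SwitchPort:
    def __init__(self, radius_check: Callable[[str, str], bool]) -> None:
        self.state = PortState.UNAUTHORIZED   # every port starts closed
        self._radius_check = radius_check

    def receive_eap(self, identity: str, secret: str) -> PortState:
        """Relay the credentials to RADIUS and open the port only on success."""
        accepted = self._radius_check(identity, secret)
        self.state = PortState.AUTHORIZED if accepted else PortState.UNAUTHORIZED
        return self.state

    def forwards(self, traffic: str) -> bool:
        """EAP is always accepted; everything else needs an authorized port."""
        return traffic == "eap" or self.state is PortState.AUTHORIZED


port = SwitchPort(radius_stub)
dhcp_before = port.forwards("dhcp")          # False: DHCP is blocked
port.receive_eap("alice", "correct-horse")
dhcp_after = port.forwards("dhcp")           # True: the port is now open
print(dhcp_before, dhcp_after)               # False True
```

Note that in this model a forged MAC address or a hand-typed IP address changes nothing, because neither one is part of the decision to open the port.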
All the old concepts and ideas on hubs, switches, routers, and DHCP servers have been revolutionized with the latest technologies in Cisco routing and switching and 802.1x/EAP security. The infrastructure that I have shown you how to build in this article is faster, more scalable, and more secure than what has been used in the past. Not only are you able to create a more manageable and robust DHCP and network infrastructure, but you're able to do it with less money, equipment, and time. It's simply a matter of taking advantage of what new technologies have to offer.