Understanding the Windows 2000 Distributed File System

You add hard drives to your servers, and your users fill them up. How do you add storage with minimal impact on your network? In this Daily Drill Down, Jim Boyce offers a solution—the Windows 2000 Distributed File System.

If you’re familiar with the UNIX networking world, you’re probably familiar with the Distributed File System (DFS), an Open Systems standard that defines the capability to build a homogeneous logical file system from disparate systems. In other words, you can build a single logical file structure that appears as one file system to clients but that can actually reside on multiple computers. A primary benefit that DFS offers is the ability to simplify network access by clients, presenting shared resources under a single, common namespace. The clients don’t need to know that the files actually reside somewhere else.

Windows 2000 offers two technologies that bring the same capability and benefit to Windows 2000 platforms. The Windows 2000 Distributed File System (Dfs—to differentiate it from Open Systems’ DFS) does for Windows 2000 platforms what DFS does in the UNIX world. Windows 2000’s mounted volumes offer a similar yet less powerful option for achieving much the same result on a local file system. In this Daily Drill Down, I’ll give you a detailed look at how both features can simplify your life and improve usability for your clients.

Understanding reparse points and directory junctions
NTFS version 5.0 introduces a new feature called reparse points that Microsoft uses to build other features into the file system, including both mounted volumes and Dfs. Reparse points are objects in the NTFS file system that carry special attribute tags that can trigger drivers other than NTFS to process the data referenced by the reparse point. Reparse points enable these external drivers to add functionality to the NTFS file system without requiring that NTFS be redesigned to accommodate the new functionality. In effect, reparse points make NTFS extensible.

You can think of a reparse point as a flag that notifies NTFS that the data referenced by the flag needs to be handled by another driver. When Windows 2000 encounters a reparse point, it passes the reparse point’s attribute tag back up the file system I/O stack. The attribute uniquely identifies the purpose of the reparse point, and the installable file system drivers in turn examine the attribute to determine whether they are responsible for processing the data referenced by the reparse point. When a match is found, the driver processes the data according to its function and purpose.
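
The tag-matching step described above can be sketched as a small simulation. This is an illustrative model only, not the way the Windows I/O manager is actually implemented: the FilterDriver class, the driver names, and the dispatch function are hypothetical, while the two tag values are the ones defined for mount points and HSM in the Windows SDK headers.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional

# Reparse tag values as defined in the Windows SDK headers.
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
IO_REPARSE_TAG_HSM = 0xC0000004


@dataclass(frozen=True)
class FilterDriver:
    """Hypothetical stand-in for an installable file system driver."""
    name: str
    tags: FrozenSet[int]
    handle: Callable[[str], str]


def dispatch(stack: List[FilterDriver], tag: int, data: str) -> Optional[str]:
    """Offer the tag to each driver in turn; the first owner processes the data."""
    for driver in stack:
        if tag in driver.tags:
            return driver.handle(data)
    return None  # no driver claimed the tag


# Illustrative driver stack (names are assumptions, not real driver binaries).
stack = [
    FilterDriver("remote-storage", frozenset({IO_REPARSE_TAG_HSM}),
                 lambda data: f"recall offline file: {data}"),
    FilterDriver("mount-manager", frozenset({IO_REPARSE_TAG_MOUNT_POINT}),
                 lambda data: f"redirect to volume: {data}"),
]
```

Calling dispatch(stack, IO_REPARSE_TAG_MOUNT_POINT, "D:") hands the data to the hypothetical mount-manager entry, and an unknown tag falls through every driver and returns None.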

Data encryption is a good example of the use of reparse points. NTFS begins reading a file and comes to a reparse point, which it passes back up the I/O stack. The attribute tag for the reparse point indicates that the following data is encrypted, so the Encrypting File System (EFS) takes over and reads the encrypted data. NTFS itself doesn’t have to understand or handle encryption; it only needs to be able to pass the location of the information to EFS so it can process the data. Hierarchical Storage Management (HSM) is another feature implemented through reparse points. The reparse point marks the location of an off-line file, enabling NTFS to pass the information to Remote Storage Services (RSS) to retrieve the off-line data when it needs to be restored.

Directory junctions are another feature added in NTFS 5.0. These objects mark NTFS directories with a surrogate name. The directory junction (implemented as a reparse point) grafts the surrogate name onto the original pathname, enabling redirection to take place. The result is that both local volumes and remote shares can be mapped to local NTFS folders, making possible the two features covered in this Daily Drill Down: mounted volumes and Dfs.

Using mounted volumes for local drives
Mounted volumes are a new feature in Windows 2000 that relies on reparse points to do its job. Mounted volumes enable you to mount a local volume to an empty, local NTFS folder, providing essentially the same benefit for local volumes that Dfs provides for network shares. They let you build a homogeneous local file system structure from disparate local volumes.

Mounted volumes have several useful implications. First, they enable you to apply quotas selectively. Normally, quotas apply across an entire volume. However, you might want to apply a different quota level (or no quota at all) to a given set of folders. For example, assume you want to apply one set of quotas to the entire volume with two exceptions: the \Users and \Downloads folders, each of which needs its own set of quotas. In this scenario, you could achieve the different quota levels using three physical volumes: one primary volume, a second mounted at the \Users folder, and a third mounted at the \Downloads folder. Figure A illustrates the physical layout and the way the volumes are mapped into the directory structure of the first volume.

Figure A
Mounted volumes give you the flexibility to locate shares on different drives.

Another situation in which mounted volumes are useful involves extending your apparent disk size without replacing the disk. For example, assume you currently have one 4-GB volume in a system and it’s nearly out of available space. The applications in the Program Files folder alone take up 2 GB of space. You need to add several applications but don’t have the space. So you install a new 12-GB drive as a second volume in the system. Next, you move the contents of the existing C:\Program Files folder to the new volume (leaving the Program Files folder empty), and then map the new volume into the C:\Program Files folder. As far as your applications or users are concerned, the Program Files folder still resides on drive C. The net effect is that you now have lots of additional space (a net of 10 GB) in C:\Program Files for more applications. Plus, the physical drive C: volume now has an additional 2 GB of space (which was originally used by C:\Program Files) for adding other OS components, documents, and so on.

The advantage is that you didn’t have to replace the existing drive with a larger one, which would require either cloning your current Windows 2000 installation to the new drive or reinstalling. Instead, you perform a simple file-move operation followed by a directory mount. Rather than spend several hours achieving the desired results, you spend an hour or so, including the time spent installing the new drive. Plus, your total disk space is greater because you retained the old drive.
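
The space arithmetic in this example can be checked with a few lines of Python. This is a rough sketch that assumes nominal capacities and ignores file system overhead; the function name is hypothetical.

```python
from typing import Tuple


def space_after_remount(new_volume_gb: float, moved_data_gb: float) -> Tuple[float, float]:
    """Return (free GB under the mounted folder, GB freed on the original volume)."""
    free_under_mount = new_volume_gb - moved_data_gb  # new drive minus the moved files
    freed_on_original = moved_data_gb                 # space the moved files used to occupy
    return free_under_mount, freed_on_original


# The article's numbers: a 12-GB drive receiving 2 GB of Program Files data.
free_under_mount, freed_on_c = space_after_remount(12, 2)
```

With these inputs, the mounted folder ends up with 10 GB free and drive C regains 2 GB, matching the figures in the text.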

Mounted volumes offer two other important benefits. First, they enable you to simplify what might otherwise be a confusing file system structure. Rather than have, for example, three different volumes on the system, each referenced by a different drive letter, you can combine all of those volumes under a single file system structure that appears to reside on one logical volume. Second, mounted volumes enable you to overcome the 26-volume limitation imposed by using drive letters, since mounted volumes don’t require drive letters.

You could achieve many of the same results by creating a volume set, which enables you to combine multiple physical volumes into a single logical volume. The disadvantage to using volume sets, however, is that they must be created on dynamic disks. Creating new volume sets with basic disks is no longer supported as it was in Windows NT. So you’d have to convert your existing basic volumes to dynamic volumes—not a problem per se but a possible complication, particularly if you have a multiboot system with other operating systems that won’t recognize dynamic disks. Plus, mounted volumes give you the additional advantages already discussed—such as selective quotas—that volume sets don’t provide.

Creating a mounted volume
Creating a mounted volume is an easy process. You can use an existing volume (including devices such as CD-ROM drives) or you can install a new device. Once the device is in and working, create a partition as needed and then format the volume. You don’t need to assign a drive letter to the volume.

Next, create an empty folder on the NTFS volume where you want to mount the other volume. The mounted volume will appear as the contents of this folder. The mounted volume need not be an NTFS volume, but the empty folder where it is mounted must be NTFS.

Finally, open the Disk Management node of the Computer Management console. Right-click the volume to be mounted and choose Change Drive Letter And Path. Click Add and select Mount In This NTFS Folder. Specify the path to the empty NTFS folder or click Browse to browse for it. Click OK, click Close, and then close the Computer Management console. To verify that the volume is mounted, open My Computer, and then open the folder where the volume is mounted. A drive icon should have replaced the folder icon.
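
If you prefer the command line, Windows 2000 also includes the mountvol utility, which takes the mount folder followed by a volume name of the form \\?\Volume{GUID}\. The sketch below only builds and validates such a command line; it does not run it. The function name is hypothetical, and the GUID in the example is a placeholder (running mountvol with no arguments lists the real volume names on a system).

```python
import re
from typing import List

# Matches a volume name such as \\?\Volume{01234567-89ab-cdef-0123-456789abcdef}\
_VOLUME_NAME = re.compile(
    r"\\\\\?\\Volume\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}\\"
)


def build_mountvol_command(mount_folder: str, volume_name: str) -> List[str]:
    """Build (but do not execute) a mountvol command that mounts a volume in a folder."""
    if not _VOLUME_NAME.fullmatch(volume_name):
        raise ValueError(f"not a volume name of the form \\\\?\\Volume{{GUID}}\\: {volume_name!r}")
    if not mount_folder:
        raise ValueError("mount folder must not be empty")
    return ["mountvol", mount_folder, volume_name]


# Placeholder GUID, for illustration only.
example_volume = "\\\\?\\Volume{01234567-89ab-cdef-0123-456789abcdef}\\"
example_command = build_mountvol_command("C:\\Program Files", example_volume)
```

On a Windows system, the resulting list could be passed to subprocess.run from an account with administrative rights. As with the Disk Management steps, the target folder must be empty and reside on an NTFS volume.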

If you want to apply quotas to the mounted volume, open the Disk Management node of the Computer Management console. Right-click the mounted volume and choose Properties. Use the Quota tab to configure quotas on the volume. If the volume is identified by a drive letter, you can also get to the property sheet through My Computer.

Building a Distributed File System
Now that you understand what mounted volumes can do for you, the advantages of Dfs should be a little easier to grasp. Dfs does for the network what mounted volumes do for the local file system: it enables you to build a homogeneous file system structure using volumes from various computers across the network and present it to users across the LAN as a single file system, simplifying shared resource access. Rather than having to open several share points to gain access to the resources they need, users can open a single, familiar share point and have access to all resources. As far as the user is concerned, all the resources are located under that one share. In actuality, those shared resources could be located on several different servers scattered across multiple continents. Figure B illustrates a sample Dfs namespace created from shares on multiple computers.

Figure B
Here’s a sample Dfs namespace created from multiple shares.

Dfs also provides flexibility for rearranging the file system. Dfs uses link tracking to manage objects in the Dfs namespace, making it possible to move objects without breaking the logical link. You can therefore move a folder that is part of a Dfs namespace from one server to another without affecting what the users see. The move is completely transparent to users. Or you might want to increase the storage capacity of a given portion of the Dfs namespace. So you install the new drive, copy the existing data to the new drive, and then modify the link to point to the new volume. The data remains up for all but a few seconds while you modify the link. This ability to move links can be an advantage with Web servers as well, enabling you to move portions of a Web site from one server to another without taking down the site.

Availability, fault tolerance, and load balancing are other important advantages offered by Dfs. Because Dfs can publish the Dfs topology in Active Directory (AD), the Dfs topology is visible to all users across the domain (subject to access permissions and policies). In addition, AD-integration ensures replication of the Dfs structure across all DCs in the domain, providing a high degree of fault tolerance. If one server goes down, users can continue to access the Dfs roots from other DCs.

For load balancing, Dfs enables you to bring multiple replicas of a given share under a common share point, which associates multiple shares under the same share name. In other words, a single share point in the Dfs namespace can point to multiple folders, each on a different server. This helps reduce the load on any given server because the shares are allocated randomly.

To the user, the shared resources come from the same location each time, but they might actually come from different servers. You can create Dfs roots in the AD or create stand-alone Dfs roots. The latter option does not offer the replication and redundancy provided by AD-integrated Dfs roots.

Understanding Dfs structure and topology
A Dfs namespace is a collection of shared resources that exist under a Dfs root. The Dfs root serves essentially as a container for the namespace, much like the root folder of a volume serves as the entry point to the volume’s folders or a network share serves as the entry point to the share’s subfolders. Unlike the root folder of a volume that contains subfolders, the Dfs namespace contains links to the local and remote shares that make up the namespace. To the user, the links appear as folders under the Dfs root share.

One server functions as a host server for a Dfs namespace and in the current Dfs implementation can host one Dfs root. Other servers can host root replicas of the Dfs root to provide redundancy so that the Dfs root is always available, even if the host server is down. Remember that the Dfs root is essentially a container of links, so creating a root replica doesn’t take a huge amount of space—you’re duplicating the structure of the Dfs root (the links), not the data referenced by the links. Also, a given server can host one root or one root replica, but not both.

Users access shared resources in a Dfs namespace in the same way that they access individual network shares. The Dfs root functions as a regular share, and users access the Dfs namespace through that share name. For example, assume you’ve created a Dfs root named Documents on a server named DocServer. Users could browse to the root of the Dfs namespace through the UNC path \\DocServer\Documents. Users then see subfolders under that share that represent the links you’ve created in the Dfs root to shared folders that reside on the hosting server, other servers across the network, or even shares on client computers.
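
To make the redirection concrete, here is a minimal sketch of how a path under the Dfs root could be translated into the share that actually holds the data. It models only the idea of a link table; the link names (Reports, Templates), the target servers, and the function name are assumptions for illustration, and real Dfs referrals involve more than a dictionary lookup.

```python
from typing import Dict


def resolve_dfs_path(root: str, links: Dict[str, str], path: str) -> str:
    """Translate a path under the Dfs root into a path on the link's target share."""
    root = root.rstrip("\\")
    prefix = root.lower() + "\\"
    if not path.lower().startswith(prefix):
        raise ValueError(f"{path!r} is not under the Dfs root {root!r}")
    remainder = path[len(prefix):]
    link_name, _, rest = remainder.partition("\\")
    # Windows share and folder names are compared without regard to case.
    targets = {name.lower(): target for name, target in links.items()}
    if link_name.lower() not in targets:
        raise KeyError(f"no Dfs link named {link_name!r}")
    target = targets[link_name.lower()].rstrip("\\")
    return target + ("\\" + rest if rest else "")


# Hypothetical links under the article's \\DocServer\Documents root.
links = {
    "Reports": "\\\\Srv2\\Reports",
    "Templates": "\\\\Srv3\\Tpl",
}
resolved = resolve_dfs_path("\\\\DocServer\\Documents", links,
                            "\\\\DocServer\\Documents\\Reports\\q1.doc")
```

In this sketch, the user keeps working with the \\DocServer\Documents path while the file is actually served from the hypothetical \\Srv2\Reports share.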

A Dfs link connects a virtual folder name in the Dfs namespace with a shared folder, called a Dfs replica or Dfs shared folder. In effect, the replicas function as subdirectories under the Dfs root. Each link can associate multiple replicas with a given part of the namespace, essentially mapping multiple shares to the same name. Dfs randomly presents to the user one of the replicas under a selected link. When a server is down and its shared resources are unavailable, Dfs automatically selects another replica and the client continues working. Figure C illustrates a simple Dfs root with a link containing multiple replicas.

Figure C
A Dfs root can have a link containing multiple replicas.
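
The random selection and failover behavior described above can be modeled in a few lines. This is a simplified sketch rather than the actual Dfs referral logic: the function name, the replica paths, and the availability check are all hypothetical.

```python
import random
from typing import Callable, Optional, Sequence


def pick_replica(replicas: Sequence[str],
                 is_available: Callable[[str], bool],
                 rng: Optional[random.Random] = None) -> Optional[str]:
    """Try the replicas of one link in random order; return the first reachable one."""
    rng = rng or random.Random()
    candidates = list(replicas)
    rng.shuffle(candidates)  # random order spreads clients across servers
    for replica in candidates:
        if is_available(replica):
            return replica
    return None  # every replica of this link is unreachable


# Hypothetical replicas of a single link, with one server down.
replicas = ["\\\\Srv1\\Docs", "\\\\Srv2\\Docs", "\\\\Srv3\\Docs"]
down = {"\\\\Srv2\\Docs"}
chosen = pick_replica(replicas, lambda replica: replica not in down)
```

Because the order is shuffled on each call, repeated requests spread across the available servers, and a replica on a server that is down is simply skipped.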

It’s important to understand that the possibility for multiple replicas in a link doesn’t mean that Dfs automatically replicates the same data to multiple shares. While in most cases you would add replicas in a given link that all pointed to different copies of the same data, the replicas don’t have to point to duplicate data. You can create multiple replicas under a single link that all point to different content. If you want to ensure that all replicas offer the same data, implement a means of replicating the data from one server to another. Domain-based replicas offer this replication, as you’ll learn later.

Dfs root types
As mentioned previously, you can create AD-integrated (domain-based) and stand-alone Dfs roots. A domain-based Dfs root must be hosted on a domain member server or DC. Dfs automatically publishes the Dfs root topology in the AD, which provides topology replication to the DCs in the domain. Creating a domain-based Dfs root does not by itself provide replication of individual folders, however, but only replicates the Dfs root structure across the domain. You need to configure content replication separately. We’ll discuss replication in an upcoming Daily Drill Down.

A stand-alone Dfs root is stored outside the AD and doesn’t provide the replication and related advantages offered by a domain-based root. You can create a stand-alone Dfs root on stand-alone, member, and domain controller servers.

You can create root replicas only within the framework of the AD, meaning that you can create root replicas of domain-based Dfs roots but can’t create root replicas of stand-alone roots. Each server can host only one root, so a server that hosts its own Dfs root can’t host root replicas. Conversely, servers that host root replicas can’t host their own Dfs roots. A domain can host multiple Dfs roots, although each server in the domain can host only one.
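
These hosting rules are easy to get wrong when planning a larger deployment, so it can help to check a plan against them. The sketch below encodes only the rules stated above (root replicas require a domain-based root, and each server hosts at most one root or root replica). The DfsRoot class, the server names, and the function name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DfsRoot:
    """Hypothetical description of a planned Dfs root."""
    name: str
    domain_based: bool
    host: str
    replica_hosts: List[str] = field(default_factory=list)


def check_hosting(roots: List[DfsRoot]) -> List[str]:
    """Return a list of rule violations; an empty list means the plan is consistent."""
    problems: List[str] = []
    hosted: Dict[str, str] = {}  # server name (lowercased) -> root it already hosts
    for root in roots:
        if root.replica_hosts and not root.domain_based:
            problems.append(f"{root.name}: stand-alone roots cannot have root replicas")
        for server in [root.host] + root.replica_hosts:
            key = server.lower()
            if key in hosted:
                problems.append(f"{server}: already hosts a root or replica of {hosted[key]}")
            else:
                hosted[key] = root.name
    return problems


# A domain may hold several roots, as long as each server hosts only one.
plan = [
    DfsRoot("Documents", domain_based=True, host="DocServer", replica_hosts=["Srv2"]),
    DfsRoot("Software", domain_based=True, host="Srv3"),
]
issues = check_hosting(plan)
```

For the plan shown, the check returns an empty list. Adding Srv2 as the host of a third root, or giving a stand-alone root a replica, would each produce one reported problem.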

Creating a root replica creates a logical relationship between two or more roots, hosted on different servers, that are referenced by the same name in the Dfs namespace. You can, however, create additional directory replicas within a root replica. This means that you start out with an exact copy of the domain-based root, but changes introduced in any root replica can make it different from other replicas of the same root.

For example, you might create physical folders in the root share of the root replica, but those folders won’t exist on the Dfs root from which the replica was created. The bottom line is that the existence of a root replica doesn’t guarantee that the replica is identical to the domain-based root from which it is created. You need to provide a means of replication and synchronization to ensure that your root replicas are identical from one replica to the next.

Clients supported by Dfs
There are two separate aspects of client support for Dfs: replica hosting and browsing. Replica hosting refers to a client’s ability to host a Dfs replica (a shared folder that appears in the Dfs namespace within a Dfs link). Any shared folder can be a Dfs replica since the Dfs root simply associates a link in the Dfs namespace to the shared folder and provides redirection to that share when clients access it through the Dfs namespace. Therefore, Windows 9x, Windows NT, and Windows 2000 clients can all host Dfs replicas without any special software since the computer hosting the share doesn’t have to be Dfs-aware.

To browse a Dfs namespace, however, clients must be Dfs-aware. Any Dfs-aware client can browse and use either a domain-based root or a stand-alone root. Windows 98 and Windows NT 4.0 with Service Pack 4 or higher include support for Dfs. Windows 95 does not include Dfs support, but you can add it by installing the Dfs client on each Windows 95 computer that needs to access Dfs. You can download the Windows 95 client from Microsoft’s Web site.

File and print sharing has matured quite a lot. No longer are you restricted to storing files on individual disk drives or servers. Using Windows 2000’s distributed file storage and mounted volumes support, you can scale up your network without your users being affected by the changes. In this Daily Drill Down, I showed you how both features can simplify your workday and improve usability for your clients.
