Creating a searchable intranet or Web site is the focus of many IT groups interested in improving communications in their environments. After all, what’s the point of placing information on your Web site if nobody can find it? Microsoft Indexing Service 3.0 indexes the contents of folders on a Windows 2000 machine to enable free-text searching of the contents, making it much easier for users to find information on your site. In this Daily Drill Down, I’ll show you how to set up and administer Indexing Service 3.0.
How Indexing Service works
Indexing Service works by building a searchable database from the various documents on the system. You can quickly search this database using the Windows search tool, the Indexing tool, or a Web browser. The databases that Indexing Service builds are called catalogs.
When you install Indexing Service, it builds a catalog containing the contents of all documents on the system. If IIS is installed, Indexing Service builds an additional catalog containing the contents of the default Web site.
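To make the idea of a catalog more concrete, here is a toy sketch in Python of an inverted index, the general technique behind free-text search. It is only an illustration and not how Indexing Service is actually implemented; the function names and sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_catalog(documents):
    """Map each lowercase word to the set of document names containing it."""
    catalog = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            catalog[word].add(name)
    return catalog


def query(catalog, term):
    """Return the sorted names of documents that contain the term."""
    return sorted(catalog.get(term.lower(), set()))


# Hypothetical sample documents standing in for files on disk.
docs = {
    "example.txt": "This is a sample TechRepublic text document",
    "notes.txt": "Nothing relevant here",
}
catalog = build_catalog(docs)
print(query(catalog, "TechRepublic"))  # ['example.txt']
```

The lookup is fast because the work of reading every document happens once, when the catalog is built, rather than on every search.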
Indexing Service has the same base system requirements as Windows 2000, but the memory you need grows with the number of documents you index. For smaller installations, those with fewer than 100,000 documents, 64 MB of RAM will suffice. The recommended RAM by document count is as follows:
- Fewer than 100,000 documents: 64 MB
- 100,000 to 250,000 documents: 128 MB
- 250,000 to 500,000 documents: 128 to 256 MB
- More than 500,000 documents: 256 MB
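If you want to fold the sizing guidance above into a planning script, it can be sketched as a small lookup. The function name is hypothetical, and the handling of the exact boundary values (250,000 and 500,000 documents) is an assumption, because the ranges in the list touch at those points.

```python
def recommended_ram_mb(document_count):
    """Return the recommended RAM in MB as a (low, high) tuple."""
    if document_count < 100_000:
        return (64, 64)
    if document_count < 250_000:
        return (128, 128)
    # Assumption: exactly 250,000 and 500,000 fall into the 128 to 256 MB range.
    if document_count <= 500_000:
        return (128, 256)
    return (256, 256)


for count in (50_000, 300_000, 600_000):
    print(count, recommended_ram_mb(count))
```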
In addition to sufficient RAM, the service requires enough disk space to store the various catalogs it creates. This varies widely, based on the number and size of the documents. With disk space as inexpensive as it is, this generally isn't an issue. Finally, when you're indexing a Web site served by IIS, you should not store the physical catalog file inside the Web tree.
By default, Indexing Service indexes the following folders and puts the results into the catalog:
- C:\Documents and Settings
The Web catalog indexes the following by default:
- C:\Program Files\Common Files\system\msadc
In all of the defaults above, keep in mind that if you've installed services into nonstandard locations, then the locations above will change as well.
Separate service, please
While you can use Indexing Service to index documents for your IIS-enabled Web site, the service is separate from IIS and needs to be installed separately. To begin the installation, choose Start | Programs | Administrative Tools | Configure Your Server. When the Windows 2000 Configure Your Server window appears, click the Advanced link. Next, click Optional Components. Finally, click Start The Windows Components Wizard. You’ll then see the wizard appear, as shown in Figure A.
Figure A: Choose Indexing Service from the list of components.
Select Indexing Service and click Next. Follow the instructions on the remaining installation screens to complete the installation. You'll need your Windows 2000 Server CD, but you won't have to reboot to finish the installation.
Managing the service
You manage the Indexing Service like many other Windows 2000 components—via Computer Management. This tool is available at Start | Programs | Administrative Tools | Computer Management.
When the Computer Management console starts, as shown in Figure B, use the tree in the left pane to navigate to Computer Management | Services And Applications | Indexing Service. From here, you’ll do all of your management chores for the Indexing Service.
Figure B: Indexing Service is administered via Computer Management.
Whenever I install a new service, I like to make sure that it works as advertised before I put it into use. Testing Indexing Service is very easy. As an example, I've created two documents, both with identical contents. They both have the word TechRepublic inside them, but one is saved as a text document, while the other is a .doc file saved with WordPad. I've saved these two documents in C:\Documents and Settings.
To test the search service, open Computer Management, expand Services And Applications | Indexing Service | System, and double-click Query The Catalog in the list of options. For this example, enter the free-text query term “TechRepublic” and click Search.
As you can see in the search results shown in Figure C, the query located a total of four documents. Two of the results were documents unrelated to this test—other articles I’m writing for TechRepublic. Clicking on one of the results will open that particular document.
Figure C: The search returned a total of four documents.
When I click on the second result, example text document.txt, I get the following results:
This is a sample TechRepublic text document
As advertised, the word TechRepublic does indeed appear in this document.
While this is a simple test of the service, it's sufficient to prove that Indexing Service works as it's supposed to.
Adding a new directory to an existing catalog
As I mentioned, there are two default catalogs in Indexing Service: the system catalog and the Web catalog. Each catalog indexes the files in specific directories. It's more than likely that you'll eventually need to add directories to this list as you add new files to your system. To add a directory, right-click the Directories folder under System and choose New | Directory from the shortcut menu.
The Add Directory dialog box in Figure D allows you to add a new directory to the catalog. You can use the Browse button to browse the contents of your computer or the network to find the directory if you can’t think of it off the top of your head.
Figure D: You can add a directory to an existing catalog.
The UNC path box allows you to index a share located on another computer in the local catalog. Imagine how useful this would be when you have documents stored across multiple servers. You can set up a single catalog that searches the contents of all documents on all servers.
Once you choose a directory to add to a catalog, choose OK. It will take a few minutes for the additional directory to be scanned into the index, so don't expect search results on this new information to be immediately available.
Adding a new catalog
You can create a new catalog that is separate from the others by right-clicking Indexing Service and choosing New | Catalog from the shortcut menu. If you have the option of using one large catalog over multiple smaller ones, then why would you want to add the administrative overhead of multiple catalogs?
First, like any database, the larger a catalog gets, the slower queries against it will be. Second, Indexing Service doesn't handle share permissions as one would expect. For local documents, NTFS permissions are fully respected, and users won't see results for documents that they don't have access to. For remote shares, however, users will be able to see results that include documents they don't have access to. By creating separate catalogs, you can restrict which directories on the remote machine the user will see.
To show you how new catalogs work, I'll create a new catalog named New. Before you create a catalog, you naturally need content to index. I've created a directory named Test on C: and copied sample documents to it.
After creating the directory and placing content in it, you can begin to create the catalog. Choose New | Catalog to see the Add Catalog dialog box shown in Figure E.
Figure E: Adding a new catalog to the Indexing Service is easy.
After you've provided the parameters of the new catalog (its name and the location of the database), click OK to save it. You'll receive a message indicating that the catalog will remain off-line until you restart Indexing Service. To restart the service, right-click Indexing Service and click Stop. Then right-click Indexing Service again and click Start.
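If you would rather script the restart than click through the console, the stop and start steps can be sketched in Python as a thin wrapper around the Windows net command. This assumes that cisvc is the short service name under which Indexing Service is registered, that a Python 3 interpreter is available on the machine, and that the script runs with administrative rights; the function names are hypothetical.

```python
import subprocess

# Assumption: "cisvc" is the short service name for Indexing Service.
SERVICE_NAME = "cisvc"


def restart_commands(service=SERVICE_NAME):
    """Build the two 'net' commands that stop and then start a service."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SERVICE_NAME):
    """Run the stop and start commands in order; raise if either one fails."""
    for command in restart_commands(service):
        subprocess.run(command, check=True)


# Show what would be executed without touching the service.
for command in restart_commands():
    print(" ".join(command))
```

Calling restart_service() on the server performs the actual restart; the loop at the end only prints the commands so that you can review them first.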
Next, you need to give the new catalog some directories to index, because the previous step gave it a home but no contents. Right-click Directories under the new catalog and choose New | Directory, then add the directory using the method I described above.
After restarting the service and giving it time to index the documents, you’ll get the results you expect when you perform a sample query against the new catalog.
Checking Indexing Service status messages
As with all Windows services, Indexing Service writes messages to the Windows event log. In addition, the Indexing Service main screen gives you a quick overview of the current status of the service as it relates to all catalogs, as shown in Figure F.
Figure F: The Indexing Service main screen shows the status of each catalog.
This status screen also shows you how large the catalog is, the location of the catalog, and the total number of indexed documents. The last column shows you the status. Some of the possible status messages you’ll see from Indexing Service are:
- Indexing Paused (High I/O): Indexing paused because of a high level of input/output (I/O) activity. Close some applications to reduce the I/O activity.
- Indexing Paused (Low Memory): Indexing paused because of low memory. Close some applications to make more memory available.
- Indexing Paused (Power Management): Indexing paused to save battery power.
- Indexing Paused (User Active): Indexing paused to minimize interference with user activity.
- Merge: A merge is in progress.
- Master Merge (Paused): Merge paused because of low resource availability.
- Stopped: Indexing of the catalog has been stopped.
- Query Only: The index is available only for querying.
- Recovering: Indexing Service is recovering from an abrupt shutdown.
- Scan Required: One or more documents need to be indexed. If this message remains for more than a few seconds, check the event log.
- Scanning: One or more directories are being scanned for new or modified documents.
- Scanning (NTFS): One or more NTFS volumes are being scanned for new or modified documents.
- Starting: Indexing Service is starting up.
- Started: Indexing Service has started.
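If you record these status strings in your own monitoring notes or scripts, a small helper can split each one into a state and a detail and look up the action suggested in the list above. This is a minimal sketch: the function names are hypothetical, and it assumes the status text has been copied from the console exactly as shown.

```python
import re

# Actions taken from the status descriptions in the list above.
SUGGESTED_ACTIONS = {
    "Indexing Paused (High I/O)": "Close some applications to reduce the I/O activity.",
    "Indexing Paused (Low Memory)": "Close some applications to make more memory available.",
    "Scan Required": "If this remains for more than a few seconds, check the event log.",
}


def parse_status(status):
    """Split 'State (Detail)' into (state, detail); detail is None if absent."""
    text = status.strip()
    match = re.fullmatch(r"(.+?)\s*\((.+)\)", text)
    if match:
        return match.group(1), match.group(2)
    return text, None


def suggested_action(status):
    """Return the suggested action for a status, or None if there is none."""
    return SUGGESTED_ACTIONS.get(status.strip())


print(parse_status("Indexing Paused (Low Memory)"))
print(suggested_action("Scan Required"))
```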
Using Indexing Service for IIS Web searches
You can use Indexing Service to search Web sites for specific information. For this purpose, you can create Web catalogs that will index all of the contents under a specific Web site. To set up a catalog specifically for a Web site, create a new catalog and place it outside the Web tree for the site that you wish to index. If you place it inside the Web tree, you run the risk of the system thinking that the catalog database is unavailable and using significant CPU resources as a result.
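Before you create a Web catalog, you can sanity-check the placement rule above with a few lines of Python. This sketch only compares path strings and never touches the file system; the function name and the example paths are hypothetical, so substitute the actual home directory of your site.

```python
from pathlib import PureWindowsPath


def is_inside_web_tree(catalog_dir, web_root):
    """Return True if catalog_dir is the web root or lies anywhere below it."""
    catalog = PureWindowsPath(catalog_dir)
    root = PureWindowsPath(web_root)
    # PureWindowsPath comparisons ignore case, as Windows paths do.
    return catalog == root or root in catalog.parents


# Hypothetical locations: the first is inside the site, the second is not.
print(is_inside_web_tree(r"C:\Inetpub\wwwroot\catalogs", r"C:\Inetpub\wwwroot"))  # True
print(is_inside_web_tree(r"D:\Catalogs\Web", r"C:\Inetpub\wwwroot"))  # False
```

A catalog location that returns False here is a safe choice as far as this rule is concerned.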
Once you've created the catalog, right-click it and choose Properties. On the Tracking tab on the Properties page, you can indicate which Web site is to be indexed by choosing it from the drop-down list next to WWW server. You're also able to index newsgroups on your server in the same manner.
Once this is complete, stop and restart the Indexing Service for the new catalog to become active. You can now write ASP scripts that will take advantage of the search capabilities of your Web site.
Seek and ye shall find
Indexing Service provides a powerful search feature for your LAN and your Internet users. Be careful to create catalogs outside your Web tree and to check frequently for security updates from Microsoft that relate to this service.