How do I... Modify Word documents using C#?

Accessing Word components from C# isn't quite as straightforward as many other features of C# and the .NET Framework. That said, it's not rocket science either. You simply need to know what to reference and how to use the components. Zach Smith lays out exactly what you need to do.

This blog post is also available in PDF form in a TechRepublic download, which includes a sample Visual Studio project file with all the necessary code.

Referencing the Word assemblies

The first step is to reference the correct assemblies for Word. The name of the library varies with the version of Word you have installed; in my case, it is version 11 (Word 2003), listed as "Microsoft Word 11.0 Object Library."

The Add Reference dialog is shown in Figure A with the correct reference highlighted.

Figure A: Word Reference
Note: The Word library is found under the COM tab of the Add Reference dialog.

After you click OK, you should see the following under References in your C# project:

  • Microsoft.Office.Core
  • VBIDE (This is used for Visual Basic For Applications functionality)
  • Word

Using the Word components

Once you have the references set up, you can begin using the Word components. However, these components are a little tricky to deal with and can act in unexpected ways. They work by starting an instance of Word under the current user session and giving you access to Word's functionality. You can even see the Word instance running if you instruct it to be visible. This is both intriguing and scary: on the one hand, you have access to a wealth of Word functionality, but on the other you're taking up valuable RAM with every new instance of Word you create.

For this reason, it's best to use these components in an environment where you don't expect a lot of heavy use, or where you know only one person will be using it at one time. For instance, a Web server is not a good place to use this functionality. However, using this in a client-based application would be fine.
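Because every Word.Application object is backed by a real Word process, it helps to shut that process down explicitly when you are done. The following is a minimal sketch of that idea, not code from the original download. It assumes the interop assembly exposes the Microsoft.Office.Interop.Word namespace; if Visual Studio generated its own wrapper, the namespace is simply Word and the alias can be dropped. The class name is hypothetical.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
// Assumption: the Primary Interop Assembly is referenced. With a wrapper
// generated by Visual Studio the namespace is just "Word"; remove this alias then.
using Word = Microsoft.Office.Interop.Word;

class WordLifetimeSketch   // hypothetical name
{
    static void Main()
    {
        object missing = Missing.Value;

        // Declared as _Application so that Quit resolves to the method,
        // not to the Quit event that Word.Application also exposes.
        Word._Application app = new Word.Application();
        try
        {
            // Make the automated instance visible so you can watch it work.
            app.Visible = true;

            // ... work with app.Documents here ...
        }
        finally
        {
            // Always shut the instance down, even after an exception;
            // otherwise a Word process can be left running in the background.
            object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
            app.Quit(ref saveChanges, ref missing, ref missing);
            Marshal.ReleaseComObject(app);
        }
    }
}
```

Leaving Visible at its default of false is usually preferable outside of debugging, since the user cannot then interfere with the automated instance.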

Okay, now let's get to the code. To use the Word components, you will follow these basic steps:

  1. Instantiate a new Word.Application object.
  2. Create a new Word.Document object (or open an existing document, as the example does).
  3. Call Word.Document.Activate to make sure our document has focus.
  4. Do some action on the document (we'll replace some text in the example).
  5. Save the document using Word.Document.Save or Word.Document.SaveAs.
  6. Close the document.

The code shown in Figure B and available in a Visual Studio project file in the download version of this document demonstrates these steps. This code opens an existing document, replaces several tags (as if the document is a letter), inserts some text before and after the existing content, and then saves the document.

Figure B: Code Listing 1
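The original listing is not reproduced in this text, so the following is only a rough sketch of the six steps, written under a few assumptions: the file path is hypothetical, the Microsoft.Office.Interop.Word namespace is assumed (use plain Word if your wrapper was generated by Visual Studio), and Documents.Open is assumed to take the 16 parameters found in Word 11 and later. Tag replacement is left out here because it goes through the FindAndReplace helper discussed below.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
// Assumption: PIA namespace; with a generated wrapper the namespace is "Word".
using Word = Microsoft.Office.Interop.Word;

class LetterSketch   // hypothetical name
{
    static void Main()
    {
        object missing = Missing.Value;
        object path = @"C:\Temp\Letter.doc";   // hypothetical document path

        // Step 1: start Word. The underscore interfaces avoid the
        // method/event name clash on Quit and Close.
        Word._Application app = new Word.Application();
        Word._Document doc = null;
        try
        {
            // Step 2: open the existing document (file name plus 15 defaults).
            doc = app.Documents.Open(ref path,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing);

            // Step 3: give the document focus.
            doc.Activate();

            // Step 4: modify the document. "\r" is Word's paragraph mark.
            doc.Content.InsertBefore("Text inserted before the letter.\r");
            doc.Content.InsertAfter("\rText inserted after the letter.");

            // Step 5: save in place.
            doc.Save();
        }
        finally
        {
            // Step 6: close the document and shut Word down. Changes were
            // already saved above, so suppress any save prompt here.
            object noSave = Word.WdSaveOptions.wdDoNotSaveChanges;
            if (doc != null)
            {
                doc.Close(ref noSave, ref missing, ref missing);
            }
            app.Quit(ref noSave, ref missing, ref missing);
            Marshal.ReleaseComObject(app);
        }
    }
}
```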

You'll notice the "ref missing" parameter in several of the Word object calls. This is due to the Word components being accessed through COM Interop. We need to use the missing variable to indicate to the components that we want to use the default value for that particular parameter.
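As an illustration of how real values and defaults mix in one call, here is a small sketch. It assumes missing is set to System.Reflection.Missing.Value, which is the usual convention; the class and method names are hypothetical. Each argument has to be stored in an object variable first, because the interop methods take their parameters by reference.

```csharp
using System.Reflection;
// Assumption: PIA namespace; with a generated wrapper the namespace is "Word".
using Word = Microsoft.Office.Interop.Word;

static class MissingSketch   // hypothetical name
{
    // Closes a document without saving. Only the first parameter gets a real
    // value; the other two fall back to Word's defaults.
    public static void CloseWithoutSaving(Word._Document doc)
    {
        object missing = Missing.Value;

        // Real values must be boxed into an object before being passed by ref.
        object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;

        doc.Close(ref saveChanges, ref missing, ref missing);
    }
}
```

Compilers from C# 4.0 onward let you omit ref and skip optional arguments on COM interface calls, so this pattern matters mostly for older projects such as the one described here.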

Another thing to point out is the FindAndReplace function, which is actually a helper function that wraps around the Word.Application.Selection.Find.Execute method. Figure C shows the code of the FindAndReplace function.

Figure C: Code Listing 2

Since there is no text selected, the Find method defaults to searching the entire document.
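Since the listing itself is missing from this text, the following is one plausible shape for such a helper rather than the author's exact code. It assumes the 15-parameter form of Find.Execute exposed by Word 11 and later, and the option choices (case-sensitive match, replace all, wrap around the document) are illustrative assumptions. The class name is hypothetical.

```csharp
using System.Reflection;
// Assumption: PIA namespace; with a generated wrapper the namespace is "Word".
using Word = Microsoft.Office.Interop.Word;

static class WordTextHelper   // hypothetical name
{
    // Replaces every occurrence of findText with replaceText.
    // Returns true if Word found at least one match.
    public static bool FindAndReplace(Word._Application app,
                                      string findText, string replaceText)
    {
        object missing = Missing.Value;
        object find = findText;
        object replaceWith = replaceText;
        object matchCase = true;
        object forward = true;
        // Continue past the end of the document so the search covers all of it.
        object wrap = Word.WdFindWrap.wdFindContinue;
        object replace = Word.WdReplace.wdReplaceAll;

        return app.Selection.Find.Execute(
            ref find,          // FindText
            ref matchCase,     // MatchCase
            ref missing,       // MatchWholeWord
            ref missing,       // MatchWildcards
            ref missing,       // MatchSoundsLike
            ref missing,       // MatchAllWordForms
            ref forward,       // Forward
            ref wrap,          // Wrap
            ref missing,       // Format
            ref replaceWith,   // ReplaceWith
            ref replace,       // Replace
            ref missing,       // MatchKashida
            ref missing,       // MatchDiacritics
            ref missing,       // MatchAlefHamza
            ref missing);      // MatchControl
    }
}
```

A call might look like WordTextHelper.FindAndReplace(app, "<name>", "Jane Doe"), where "<name>" stands in for whatever tag format your letter template uses.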

Other functionality

Take note of the number of parameters you can send to the Find.Execute method shown above. As I said before, the Word components expose a wealth of functionality, probably more than you'll ever need.

Some of the more useful functionality is listed below:

  • CheckSpelling — Runs spell checker on the document
  • DowngradeDocument — Downgrades a document so it can be opened in previous versions of Word
  • FitToPages — Decreases the font size so that the document will fit into a certain number of printed pages
  • Password — Defines a password for the document
  • PrintOut — Prints the document
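Two of these members are simple enough to show in a short sketch. The class and method names below are hypothetical, and the code assumes a document that is already open. CheckSpelling and PrintOut are left out because they take a long list of optional parameters, each passed with the same ref missing pattern.

```csharp
// Assumption: PIA namespace; with a generated wrapper the namespace is "Word".
using Word = Microsoft.Office.Interop.Word;

static class OtherFeaturesSketch   // hypothetical name
{
    // Shrinks the text slightly and sets a password on an open document.
    public static void ShrinkAndProtect(Word._Document doc, string password)
    {
        // FitToPages takes no arguments; Word's documentation describes it
        // as reducing the font size just enough to save one page.
        doc.FitToPages();

        // Password is a write-only property. The password is required
        // the next time the saved document is opened.
        doc.Password = password;

        // Persist both changes.
        doc.Save();
    }
}
```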
