
The StarOffice font deuglification project, part 2: Displaying smart quotes in imported Word documents

In part two of his three-part series, Bryan Pfaffenberger explains why Word's special characters won't display correctly in StarOffice, and he provides instructions for modifying StarOffice so that it can display Microsoft's proprietary character set.


StarOffice is supposed to do an excellent job of opening Microsoft Word documents, and it does, up to a point. Most Word users leave the program’s “smart quotes” feature enabled, and when you open these documents in StarOffice, the result can be summed up in one word: ugly.

In place of Word’s “smart quotes,” apostrophes, and em dashes, StarOffice displays ugly little boxes or nothing at all. The same goes for other special characters, including many that are often used in European languages. Worse, StarOffice marks each of the box-containing words as a spelling mistake. Thinking that they’d be able to work with Word documents seamlessly, users find themselves spending inordinate amounts of time retyping all of the missing quotation marks and apostrophes, and they’re not very happy to do it. They’ll wonder if Sun (and you) have been overselling StarOffice’s compatibility with Word.

If the smart quotes problem is bugging your StarOffice users, you’ll be glad to know that there’s a workaround. It requires a bit of effort, but the results are well worth it. I’m typing this document right now within StarOffice, and I’ve batted it back and forth to Word a few times; on either end, I’m seeing smart quotes with no problems at all.

What’s involved? The trick lies in translating and modifying the TrueType fonts that Microsoft Word users are most likely to include in their smart-quote-infested documents, including Times New Roman (the default Word font), Arial, and the TrueType symbol font. In short, you’ll convert these fonts to Type 1 fonts, and you’ll install the Microsoft font encoding that enables your X server to display the smart quotes on-screen. You’ll then install the printer versions of these Type 1 fonts in StarOffice, just as you’d install any new Type 1 font. The result? Imported Word documents display beautifully. You can print them, too, as long as you’re using a PostScript printer.

This Daily Drill Down also shows you how to modify StarOffice so that Word-compatible smart quotes are enabled when you create new StarOffice documents. When these documents are exported to Word, the smart quotes will look just as good in Word as they do in StarOffice. (Note that the following instructions apply to StarOffice version 5.1a, the version that’s currently available from Sun.)

Why won’t Word’s special characters display correctly?
Let’s start with an important point: StarOffice’s difficulties with Word’s special characters aren’t StarOffice’s fault; they’re Microsoft’s. For reasons known only to them, the Redmondians chose to implement character sets within Windows that are essentially a superset of the character set standards to which the rest of the world adheres. If this sounds like yet another instance of Microsoft’s “embrace and extend” policy, you may be on to something.

For StarOffice users, switching to Microsoft’s non-standard character sets poses something of a dilemma. If you modify StarOffice to use the Microsoft character sets, as this Daily Drill Down suggests, you’ll be able to exchange documents seamlessly with Microsoft Word users, but your documents will look awful when they’re opened by other StarOffice users who haven’t modified their fonts. Of course, that’s precisely the consequence of Microsoft’s “embrace and extend” policy: you can experience pain if you use something other than Microsoft’s standards and less pain (I’m reluctant to say “pleasure”) when you give in to them. Still, the reality is simple: Your StarOffice users will need to exchange documents with the outside world, and the outside world uses Microsoft Word.

Fortunately, there’s a workaround. Microsoft’s proprietary extensions to the standard ISO8859-1 (Latin 1) character set aren’t hard-wired into TrueType fonts. Instead, they’re generated by an encoding, which is a character map that, among other things, tells the font server where to place each character in the numerical sequence of codes. For U.S. systems, Microsoft Word uses the Microsoft CP-1252 encoding by default. Other similarly proprietary encodings are used for Microsoft systems sold elsewhere. (“CP,” incidentally, is short for “code page,” and it refers to the proprietary character mappings that Windows uses in the various configurations that it tailors to U.S. and foreign markets.)

So, why can’t you just tell your X server to work with the Microsoft CP encoding that’s used to prepare a Word document? X servers are hard-wired to work with a variety of encodings. However, as you might imagine, Microsoft’s proprietary encodings aren’t among them. Fortunately, X is also designed to work with additional encodings. The trick lies in making the needed encoding file—a file with the *.enc extension—available in the default encoding directory. (On Red Hat systems, this directory is /usr/share/enscript.) You then prepare the fonts directory with the mkfontdir command’s -e switch, which generates a file called encodings.dir. This file alerts the X server that more encodings are available, and it also tells the server where to find these encodings.

Getting StarOffice to work with the Microsoft font encodings isn’t quite as simple; you’ll need to pull off a few tricks. Please note that the procedure discussed here isn’t supported officially by Sun Microsystems and may suddenly stop working if you install a new version of StarOffice. (The steps that follow work with StarOffice version 5.1a, which is available for download.)
If you’ve already configured your system to work with TrueType fonts, be aware that the modifications discussed in this Daily Drill Down won’t work. In fact, StarOffice won’t run. (You’ll get one of those lovely “unrecoverable error” messages.) The problem lies in the fact that StarOffice can’t work with both a *.pfa and a *.ttf screen font mapping at the same time. Before you proceed with the instructions given here, remove the TrueType fonts that you would like to modify (such as Times New Roman, Arial, and Symbol) so that they aren’t available to the X server (or X font server).
Shopping list
To pull off the tricks that are detailed in this Daily Drill Down, you’ll need some goodies. Take some time now to download them (and install them, if necessary) so that they’re ready for your use.
  • Microsoft encoding: By default, X servers are hard-wired to work with a number of font encodings, but Microsoft’s isn’t among them. Fortunately, X servers are also programmed to load additional encodings. You can obtain the necessary encoding file by downloading and untarring xfsft, a TrueType font server for Linux. Don’t install the font server. You just need the encoding files. Note: don’t confuse xfsft with xfstt (which ends in “tt” instead of “ft”); they’re two very different programs.
  • TrueType fonts: If you don’t have a licensed copy of Windows 98, you can obtain some free TrueType fonts from Microsoft's Typography site. Among these are Arial and Times New Roman, which are the default fonts for recent versions of Microsoft Word. Unfortunately, these fonts download as self-extracting archives. You'll need to download them to a Windows system, extract them, and copy the resulting *.ttf files to your Linux system.
  • TrueType to Type 1 converter: The utility of choice is ttf2pt1. Download and install it so that you can execute the program from any directory. (On Red Hat systems, you can accomplish this task by copying ttf2pt1 to /usr/bin.)
  • Type 1 utility for creating fonts.scale: The utility you need is type1inst. Download and install this vital utility so that you can execute it from any directory.

Overview
The rest of this Daily Drill Down describes, in detail, the steps that you’ll follow to modify StarOffice so that it can display Microsoft’s proprietary character set. It’s a bit complex—hey, this is Linux—but here’s a brief overview:
  1. So that X can display the Microsoft fonts properly, you’ll need to obtain and install an encoding file that describes Microsoft’s proprietary coding scheme. It’s called Microsoft CP-1252 (for U.S. systems). You can obtain this file by downloading and untarring (but not installing) xfsft, a TrueType font server for Linux, and then copying the necessary encoding files to the default encoding directory on your system (such as /usr/share/enscript). If you’re using StarOffice outside the U.S., note that the xfsft download contains additional Microsoft CP encodings that may prove more suitable. If you’re working with Word documents encoded with one of the other encodings, you should be able to substitute one of the alternative encodings for CP-1252 and get good results. (Note, though, that I’ve restricted my testing to CP-1252.)
  2. Next, you’ll copy TrueType fonts to your system, translate the uppercase file names to lowercase (if necessary), and convert the fonts to *.pfb files. You’ll use the utility called type1inst to create the file called fonts.scale; you’ll then modify the contents of this file so that the only available encoding is Microsoft CP-1252. Finally, you’ll run mkfontdir with the -e switch, which enables you to specify the directory where the needed encodings are found. The result of these steps will be a directory containing *.ttf, *.pfb, and *.afm files for each font. The directory also contains a version of the fonts.scale file that contains only the Microsoft font encoding.
  3. Now that you’ve created and modified the Type 1 fonts, you’ll install them in StarOffice.
  4. The next step involves making the modified Type 1 fonts available to the X server or X font server. To do so, you’ll need to create a new directory for the required *.pfa files and run ttf2pt1 with the -e switch.
  5. Configure StarOffice so that it uses Microsoft’s character mappings for the inserted smart quotes when you create new StarOffice documents. Note, though, that this step will ensure that your StarOffice documents have the same deficiencies as Word documents when they’re opened by unmodified copies of StarOffice—little, ugly boxes everywhere. See what I mean about “embrace and extend”?

Installing the encoding file
  1. Untar the copy of xfsft that you downloaded. (See “Shopping list” above for information on where to obtain xfsft.) Note: Don’t confuse xfsft with xfstt. They’re very different programs.
  2. Switch to the xfsft directory that tar created (just below the one from which you ran the program). Switch to the encodings subdirectory.
  3. Copy all the encodings (*.enc) to the default encodings directory on your system (such as /usr/share/enscript).
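
The three steps above boil down to one copy operation. Here is a minimal sketch, assuming the untarred xfsft tree has an encodings subdirectory holding the *.enc files. The function name install_encodings and both arguments are illustrative and not part of xfsft.

```shell
# Copy every *.enc file from an unpacked encodings directory into the
# directory that holds your system's encoding files.
# Both arguments are assumptions; substitute your own paths.
install_encodings() {
    src=$1    # e.g. the "encodings" subdirectory of the untarred xfsft tree
    dest=$2   # e.g. /usr/share/enscript, the directory used in this article
    mkdir -p "$dest" || return 1
    count=0
    for f in "$src"/*.enc; do
        [ -e "$f" ] || continue   # skip the unmatched glob when no *.enc exist
        cp "$f" "$dest"/ || return 1
        count=$((count + 1))
    done
    echo "$count"   # report how many encoding files were copied
}
```

From inside the xfsft directory, you would call it as install_encodings ./encodings /usr/share/enscript. Writing to a system directory will normally require root privileges.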

Converting TrueType fonts to Type 1 fonts
Although StarOffice can display TrueType fonts in their native format, the program cannot print them, at least not without modifications that will take yet another Daily Drill Down to explain. (Part 3 of this series will deal with TrueType fonts in the StarOffice environment.) By far, the easiest way to deal with TrueType fonts for StarOffice lies in converting them to Type 1 fonts, which you’ll learn to do in this section. You’ll also learn how to modify the fonts.scale file so that StarOffice and X are forced to use Microsoft’s encoding scheme.
  1. Log on with the user account that’s normally used to run StarOffice.
  2. Within the StarOffice fonts directory (Office51/fonts), create a new directory called msfonts.
  3. Copy the core TrueType fonts that are normally used in the Word documents you receive (including Arial, Courier, Times New Roman, and Symbol) to the directory you just created (Office51/fonts/msfonts).
  4. If necessary, convert the font file names to lowercase. You can use the following simple shell script:
    $ for file in *.TTF
    > do
    >   name=$(echo "$file" | tr 'A-Z' 'a-z')
    >   mv "$file" "$name"
    > done
    To use this shell script, type the first line (for file in *.TTF) and press [Enter]. Press [Enter] after typing each line. Don't type the leading $ and > characters; they're the prompts that the shell displays.
  5. Most of the TrueType fonts that you’ll want to use come in at least three versions: regular (medium weight), bold, and italic. For Arial, the corresponding file names are arial.ttf (medium weight), arialbd.ttf (bold), and ariali.ttf (italic). You’ll need to convert and install all three versions of the font.
  6. First, make the *.afm and *.pfb files with ttf2pt1’s -b switch. To avoid having to type the command repeatedly for each font in the directory, you can use the following simple shell script:
    $ for file in *.ttf
    > do
    >   name=$(basename "$file" .ttf)
    >   ttf2pt1 -b "$file" "$name"
    > done
  7. In the same directory, type type1inst and press [Enter]. This utility creates the fonts.scale file.
  8. In a text editor that has search and replace capabilities, open the file fonts.scale, which type1inst just created. You'll see a list of font names, such as the following:
    georgia.pfb -misc-georgia-medium-r-normal--0-0-0-0-p-0-adobe-fontspecific
  9. For each font name, carefully erase the encoding at the end of the name (iso8859-1 or, as in the example above, adobe-fontspecific) and substitute microsoft-cp1252 in its place. You can use the text editor’s search and replace capabilities for this purpose.
  10. When you’ve finished changing the font's encoding, save fonts.scale and exit the text editor.
  11. Type mkfontdir, followed by -e and the location of the default encoding directory (for example, mkfontdir -e /usr/share/enscript), and press [Enter]. This command will create a modified fonts.dir file and the encodings.dir file (which tells the X server where the non-standard encodings can be found).
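
If you'd rather not edit fonts.scale by hand, sed can make the substitution from steps 8 through 10. The following is only a sketch; the function name set_cp1252_encoding is made up for illustration. It assumes that each font entry ends with its encoding, and it handles both iso8859-1 and the adobe-fontspecific ending shown in the sample line above.

```shell
# Rewrite the encoding at the end of every font entry in a
# fonts.scale-style file read from stdin, writing the result to stdout.
# Only lines ending in -iso8859-1 or -adobe-fontspecific are changed;
# the leading font-count line and all other lines pass through untouched.
set_cp1252_encoding() {
    sed -e 's/-iso8859-1$/-microsoft-cp1252/' \
        -e 's/-adobe-fontspecific$/-microsoft-cp1252/'
}
```

To use it, keep a backup and write the result back: cp fonts.scale fonts.scale.orig, then set_cp1252_encoding < fonts.scale.orig > fonts.scale. Check the result in a text editor before you run mkfontdir.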

Installing the fonts in StarOffice
Now that you’ve created the *.pfb and *.afm versions of the TrueType fonts you wanted to install, you’re ready to install them into StarOffice.
  1. Log on with the user account that’s active when StarOffice is used.
  2. Switch to the Office51/bin directory, type ./spadmin, and press [Enter]. You'll see the Printer Installation dialog box.
  3. Click Add Fonts. You'll see the Font Path dialog box.
  4. Click Browse. You'll see the Select Directory dialog box.
  5. In the Directory textbox, type the full pathname of the new fonts directory (such as /home/linda/Office51/fonts/msfonts). Or use the directory navigation window to locate and select this directory.
  6. Click OK. You'll see the directory you've selected in the Font Path dialog box.
  7. Click OK to close the Font Path dialog box. The utility scans for new fonts and displays them in a list. Click OK to install these fonts. Next, you'll see a dialog box informing you, incorrectly, that the new font directories will be added to the default X font path. Click OK to dismiss this dialog box.
  8. Important: Click Edit Font Attributes. You can’t print with the fonts that you've added until you convert all the metric files for printing purposes.
  9. In the Fonts dialog box, click Convert All Metric Files.
  10. Click Close until you see the terminal window again.
  11. Restart X.

Making the fonts available to the X server
To make the fonts that you’ve modified available to the X server, do the following:
  1. In the Office51/fonts/msfonts directory, run ttf2pt1 again to make the *.pfa files that the X server needs. To do this, run ttf2pt1 with the -e switch. For example, to create arial.pfa, type ttf2pt1 -e arial.ttf arial and press [Enter]. To save wear and tear on the fingers and wrist, you can use the following simple shell script:
    $ for file in *.ttf
    > do
    >   name=$(basename "$file" .ttf)
    >   ttf2pt1 -e "$file" "$name"
    > done
  2. Make a directory called Office51/fonts/msfonts/pfa.
  3. Copy all of the *.afm and *.pfa files that you just created to the new directory.
  4. Switch to the directory to which you just copied the files (Office51/fonts/msfonts/pfa).
  5. Type type1inst and press [Enter].
  6. In a text editor that has search and replace capabilities, open the file fonts.scale, which type1inst just created.
  7. For each font name, carefully erase the encoding at the end of the name (iso8859-1 or adobe-fontspecific) and substitute microsoft-cp1252 in its place. You can use the text editor’s search and replace capabilities for this purpose.
  8. When you’ve finished changing the font's encoding, save fonts.scale and exit the text editor.
  9. Type mkfontdir -e /usr/share/enscript and press [Enter]. If you stored the encodings files in a different directory, use that directory’s name instead of /usr/share/enscript.
  10. Now, tell the X server or X font server where the new font directory is located. Start your favorite text editor and open /etc/X11/XF86Config (for systems that serve fonts with the X server) or /etc/X11/fs/config (for systems that serve fonts with xfs).
  11. Add the directory that you just created to the appropriate area (the FontPath area of XF86Config or the catalogue area of /etc/X11/fs/config). Carefully type the directory name on its own line and type a trailing comma (unless the directory you are adding is the last in the list). Save the file and exit the editor.
  12. Restart X. When you’ve logged on again, open a terminal window, type xlsfonts | less and press [Enter]. Make sure that your X server is making the new fonts available. Also, make sure that they’re listed with the correct encoding (microsoft-cp1252). If you’re running GNOME or KDE, double-check the installation by using the supplied font utility. Make sure that the new fonts are available and that they display properly on-screen within these utilities.
  13. After you’ve restarted X and logged on to the user account where StarOffice is installed, start StarOffice and open a Word document that contains smart quotes and special characters. If you’ve followed the above steps correctly and you’re displaying a Word document that contains the fonts you’ve installed, you should see the smart quotes and other special characters on-screen. (If the author of the Word document used TrueType fonts other than the ones that you’ve installed, StarOffice isn’t really displaying them; rather, it’s using a bitmapped font that’s “close” to the font that’s not installed. To see the smart quotes and special characters, reformat the document with one of the converted TrueType fonts that you’ve installed on your system and StarOffice.)
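
The xlsfonts check in step 12 is easier if you filter the output instead of paging through it. Here is a small sketch; the function name list_cp1252_fonts is illustrative, and it assumes that the font names end with the encoding, as the fonts.scale entries do.

```shell
# Read X font names (one per line) on stdin and print only those whose
# encoding is microsoft-cp1252. Like grep, it exits with a non-zero
# status when no such font is found.
list_cp1252_fonts() {
    grep -e '-microsoft-cp1252$'
}
```

After you restart X, run xlsfonts | list_cp1252_fonts. If nothing is printed, the X server isn't serving the converted fonts with the Microsoft encoding, so recheck fonts.scale, the mkfontdir step, and the font path entry.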

If you’re still seeing funny little boxes or blanks where quotation marks and apostrophes should be, you might have made a mistake when you followed the above steps. Try printing the document. If the special characters show up in the printout, then there’s something wrong with the screen fonts. Make sure that you followed these steps correctly. If this action doesn’t solve the problem, open a new text document in StarOffice and see if the new fonts are available in the font selection box. If not, then they aren’t properly installed in StarOffice. It’s possible that the fonts were corrupt or that you omitted one of the steps involved in translating them. Try again.

Telling StarOffice which characters to use for smart quotes
If you’d like to be able to compose documents in StarOffice and use Microsoft’s special characters, you can save these documents with the Word export filters and give them to Word users. The smart quotes and other special characters will look just fine in Word, even though you composed the document in StarOffice. The trick lies in telling StarOffice which smart quotes and apostrophes to use.

To tell StarOffice which characters to use for smart quotes and apostrophes, do the following:
  1. Click Tools on the menu bar and select AutoCorrect/AutoFormat.
  2. On the AutoCorrect dialog box, select the Custom Quotes tab.
  3. In the Single Quotes area, select Replace.
  4. Next to Start Quote, click the button that shows the currently selected single quote character. You'll see a character selection box.
  5. Choose character number 145.
  6. Repeat the above steps to choose the characters for end single quotes (146), start double quotes (147), and end double quotes (148).
  7. Choose OK.
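
Why 145 through 148? In CP-1252, those four codes are the opening and closing single and double curly quotes. In ISO 8859-1, the same positions fall in a range of unprintable control codes, which is why an unmodified system shows boxes. The sketch below shows the mapping. It assumes that your system's iconv recognizes the name CP1252, and the function name cp1252_quotes is made up for illustration.

```shell
# Decode bytes 145-148 (octal 221-224) as CP-1252 and print them as UTF-8.
# Assumes an iconv that accepts the encoding name CP1252.
cp1252_quotes() {
    printf '\221\222\223\224' | iconv -f CP1252 -t UTF-8
}
```

In a UTF-8 terminal, running cp1252_quotes should print the four curly quotes in the order that you selected them in the Custom Quotes tab: start single, end single, start double, end double.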

In part 3 of this series, you’ll learn how to make TrueType fonts work with StarOffice. Believe it or not, it’s possible—and you’ll love the results.

Bryan Pfaffenberger, a UNIX user since 1985, is a University of Virginia professor, an author, and a passionate advocate of Linux and open source software. A Linux Journal columnist, his recent Linux-related books include Linux Clearly Explained (Morgan-Kaufmann) and Mastering Gnome (Sybex; in press). His hobbies include messing around with his home LAN and sailing the southern Chesapeake Bay. He lives in Charlottesville, VA. If you’d like to contact Bryan, send him an e-mail.

The authors and editors have taken care in preparation of the content contained herein, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for any damages. Always have a verified backup before making any changes.
