
How to find and convert StarOffice files

Marco Fioretti offers one solution for finding and converting old StarOffice files into formats that Apache OpenOffice and LibreOffice 4.x can access.

 


The 4.x versions of Apache OpenOffice (AOO) and LibreOffice (LO) can no longer read or write the .sdw texts, .sdc spreadsheets, and .sdd presentations created with their direct ancestor, StarOffice. Even the most recent versions of the 3.x series may need an optional filter to do it. People have already started to ask, "How can I recover my .sd* files?!?" Right now, the easy answer they get is, "Install the filter, or a whole older version, then convert all those files with the Converter Wizard." This, however, may often be an incomplete answer.

Installing binary packages of an extra, older version may confuse users, and it will become harder to do this at every system upgrade. Installing from sources? Even worse. Besides, how does one find the files to convert? And what if your boss tells you, "Recover all our vital StarOffice documents from this pile of backup DVDs, right now — even those with wrong extensions!"?

Here is one answer, which I used this very week in a similar situation. It may even be more future-proof, in and beyond 2014, than the advice, "Open all the files you can find with an older AOO/LO. If you can install it, of course."

First, I searched for an older version of AOO/LO that could run off a USB key, because such portable builds are compiled to depend as little as possible on system libraries. The first one I found was LibreOffice Portable 3.3.4 for Windows. I only use Linux, but that package runs adequately (more on this later) inside Wine without any configuration. Since I had little time, I settled for it.

Finding and preparing the files

I created two empty folders called starofficeconversion and starofficeconversion_result. The first would contain copies of all and only the files to convert, still ordered in the same folders and subfolders as in the original archive. Working in this way, I would not damage the original files in case something went wrong. The second folder would store only the converted versions.

If I had been absolutely sure that all the files had a .sdw, .sdc or .sdd extension, this command (run in the top folder of the archive) would have been enough:

#> find . -type f | grep -E -i '\.(sdw|sdc|sdd)$' > list_1

The command would save their names and relative paths into a text file called list_1. However, I could not rule out that some files were misnamed. Luckily, the Linux "file" command analyzes a file and outputs a description of its type. Running the file command on some of the sdw/sdc/sdd files found in the first step, I noticed that the one thing their descriptions had in common was the string "Composite Document File V2 Document." Hence, I stored the descriptions of all the files in or below the top directory:

#> find . -type f -exec file {} \; > /tmp/filetypes.log

The resulting file, /tmp/filetypes.log, had this format:

./2001/catalog_sample.txt: ASCII text 
./2001/contract.sdw: Composite Document File V2 Document, Little Endian...

I then saved the file names I wanted (that is, the part before the colon of each line containing the string above) in a separate list:

#> grep 'Composite Document File V2 Document' /tmp/filetypes.log | cut -d: -f1 > list_2
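One caveat with cut -d: -f1 is that it truncates any path that itself contains a colon. The following is a minimal sketch of a more tolerant extraction, assuming the log format shown above. The function name staroffice_candidates is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the path part of every log line whose
# description starts with "Composite Document File V2 Document".
# $1 = log written by: find . -type f -exec file {} \;
staroffice_candidates() {
    # Cut at the colon that directly precedes the description, so that
    # colons inside the path itself are left alone.
    sed -n 's/: *Composite Document File V2 Document.*$//p' "$1"
}
```

With this in place, staroffice_candidates /tmp/filetypes.log > list_2 would stand in for the grep and cut pipeline.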

Finally, I merged the two lists, removing the duplicates:

#> sort -u list_1 list_2 > final_list

This allowed me to put all those files in a tar archive and extract the content inside the starofficeconversion folder:

#> tar cf starofficefiles.tar -T final_list 
#> cd starofficeconversion 
#> tar xf ../starofficefiles.tar 
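The same copy can also be done without the intermediate tar file, with a quick count check at the end. This is only a sketch, assuming GNU or BSD tar, an empty destination folder, and file names without newlines. The name copy_listed_files is hypothetical.

```shell
# Hypothetical helper: copy the files named in a list into a work folder,
# keeping their relative paths, then compare how many were listed and copied.
# $1 = list of relative paths (one per line), $2 = destination folder.
# Run it from the top folder of the archive, where the paths are valid.
copy_listed_files() {
    list=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Pack and unpack in one pipeline; -T reads the names from the list.
    tar cf - -T "$list" | tar xf - -C "$dest" || return 1
    wanted=$(grep -c . "$list")
    copied=$(find "$dest" -type f | wc -l | tr -d ' ')
    echo "listed: $wanted, copied: $copied"
    # Succeed only if every listed file arrived in the destination.
    [ "$wanted" -eq "$copied" ]
}
```

A call such as copy_listed_files final_list ../starofficeconversion returns a non-zero status when the two counts differ, which is a cheap way to notice a file that went missing along the way.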

Converting the files

Having all my files in one place, I fired up my vintage copy of LibreOffice in Wine and told its Converter Wizard (Figure A) to save — inside starofficeconversion_result — OpenDocument versions of all the StarOffice files it would find in starofficeconversion. 

Figure A: The Converter Wizard.

I had to uncheck the text templates box and disable logging, otherwise the Wizard wouldn't continue. Apart from that, the conversion worked fine. As you can see in Figure B, "Gemelli alla Rucola e Prosciutto Crudo," along with many other family recipes not available anywhere else, is now safe from oblivion.

Figure B: The conversion process completed.

Final tips... and requests

For the record, I had to disable the log file — otherwise, the Converter macros in that LibreOffice would fail with this error:

Basic Runtime error. 
An exception occurred. 
Type:com.sun.star.uno.RuntimeException 
Message: [msci_uno bridge error] unexpected C++ exception occurred!.

Here are a couple of things you should know before trying this at home. First, the "Composite Document File V2 Document" string is also present in the description of many non-StarOffice files, but these are almost always recognizable by their different extensions. For example, my first list_2 had entries like "test.beh" and "trial.draw." If you don't want to duplicate and process those files too, remove their names from list_2 before merging the lists, with the following command:

#> grep -E -v -i '\.(beh|draw)$' list_2 > list2_tmp
#> mv list2_tmp list_2
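To decide which extensions to exclude, it may help to see which ones actually occur in list_2. Here is a minimal sketch, assuming one path per line. The name count_extensions is a hypothetical helper, not an existing command.

```shell
# Hypothetical helper: count how often each file extension occurs in a list
# of paths, most frequent first. Extensions are compared in lower case.
# Names with no dot in their last path component are skipped.
count_extensions() {
    sed -n 's/.*\.\([^./]*\)$/\1/p' "$1" |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}
```

Running count_extensions list_2 prints one line per extension with its count. Anything other than sdw, sdc, or sdd is a candidate for the exclusion pattern.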

Second, the AOO/LO Converter recreates all the necessary parts of the original directory tree in the target folder. In my case, starofficeconversion/Documents/recipes/gemelli_rucola.sdw generated starofficeconversion_result/marco/Documents/recipes/gemelli_rucola.odt. This is very good, but wherever they end up, the converted copies all belong to the user who launched the Converter. If that's a problem for you, ask me for an extra post on how to fix it.
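Because the result tree mirrors the source tree, a rough check for files the Converter skipped is possible. The sketch below rests on several assumptions: .sdw becomes .odt (as in the example above), .sdc becomes .ods, .sdd becomes .odp, and a counterpart is any file in the result folder with the same base name. The name list_unconverted is hypothetical.

```shell
# Hypothetical helper: list source files with no converted counterpart.
# $1 = folder with the StarOffice files, $2 = folder with the results.
# Assumed mapping: sdw -> odt, sdc -> ods, sdd -> odp.
list_unconverted() {
    find "$1" -type f | while IFS= read -r src; do
        case $src in
            *.[sS][dD][wW]) new=odt ;;
            *.[sS][dD][cC]) new=ods ;;
            *.[sS][dD][dD]) new=odp ;;
            *) continue ;;   # misnamed files are not checked here
        esac
        base=${src##*/}
        target=${base%.*}.$new
        # Note: -name treats its argument as a glob pattern, so base names
        # containing *, ? or [ may match more than the exact name.
        if [ -z "$(find "$2" -type f -name "$target" | head -n 1)" ]; then
            printf '%s\n' "$src"
        fi
    done
}
```

For instance, list_unconverted starofficeconversion starofficeconversion_result prints the files that still need attention and prints nothing when every file has a match. Matching by base name alone is a deliberate simplification, since the result folder may add a leading folder level, as in the example above.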

I'll close this post with a couple of requests. Please offer your assistance and/or suggestions in the discussion thread below.

  • A command-line friendly, stand-alone converter made by removing all the unnecessary code from those earlier AOO/LO releases would be great. Is any developer interested?
  • StarOffice was easy, but I still need a similar fix for FrameMaker files. Thanks in advance to whoever will help me to find it. 
 

About

Marco Fioretti is a freelance writer and teacher whose work focuses on the impact of open digital technologies on education, ethics, civil rights, and environmental issues.
