Software

A formula-less technique for parsing Excel data

Excel's Text to Columns feature parses data quickly and almost effortlessly.
Whether you're supporting a legacy application or importing foreign data, you probably have to parse entries occasionally. For instance, you might have to parsing first and last names. You could use a complex formula to separate the names, but depending on the condition of the data, one formula might not be enough. I like to use the easiest method possible, which in this case, is the Text to Columns feature. To use Text to Columns to parse existing data, do the following:

  1. Select the column or data you want to parse.
  2. Choose Text to Columns from the Data menu. In Excel 2007, click the Data tab and choose Text To Columns in the Data Tools group.
  3. Click the Delimited option and click Next.
  4. Click the Space option in the Delimiters section. Now, this won't always be the case, but a single space character usually separates first and last names. Take a minute to view the results in the Data Preview pane. If you don't get the expected results, try another delimiter.
  5. If everything looks good, click Finish.

It's a one-time deal, so if you have to clean up data on a regular basis, a formula or even a function might be a more efficient solution.

About

Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.

3 comments
dqkennard
dqkennard

You show right in the example one of the ways this can be a problem: the third record has a middle initial where the last name should go. I break apart names all the time, and a set of formulae is almost always necessary, usually preceded by a manual skim-through to see what kind of stuff might be there. In this case, there's obviously a mix of records with or without middle name. A scan might reveal that there are some "Jr." or other suffix words included (which makes more difficult the use of the "last" word as the last name. Etc. The bottom line is that rarely do I see name data clean enough to use this technique -- and even the example data doesn't meet that standard.

Yangtze
Yangtze

Your explanation assumes that there is a space between the first and last names. If that is the case, then a modification of the following formula would work. I use this formula to extract a Unit ID number from our data often. Many of my coworkers adopted it. =left(cell ref,find(" ",cell ref)-1) the " " would locate the first space from the left in the cell with the concatenated first and last names. If you're extracting a number, as I do often, modify the formula: =value(left(cell ref,find(" ",cell ref)-1)) If your data has a set pattern, such as the assumed space in the example, I think this formula is easier than using the parsing feature.

ssharkins
ssharkins

The example data was purposely iffy to show the feature's limitations. However, when it works, it works great -- no need to throw the baby out with the bathwater! ;)

Editor's Picks