Tutorial: Excel parsing expressions

Parsing data can be a difficult task, and it gets even harder if the entries aren't consistent. Use Excel's string functions to parse inconsistent data.

You probably wouldn't store first and last names in the same cell, but you might have to work with a legacy workbook that does. Or, you might import data from a foreign source where the names are combined into one field. Fortunately, Excel has several string functions, Right(), Left(), Find(), Len(), and Mid() that can parse the name components into separate entries.

First, the easy part; parse the component to the left using the simple expression:

=LEFT(A2,FIND(" ",A2)-1)

It makes no difference whether the component is the first or last name. In the case of Robin Banks, the FIND() function returns the value 6, but the expression subtracts 1 from the results. Consequently, the expression extracts the first five characters. If you want to extract the space character, omit the -1 component.

The inconsistency of the entries—some have middle initials and some don't—makes extracting the last name, a bit more complicated. You might try the following expression, but as you can see, it doesn't work as expected:

=RIGHT(A2,LEN(A2)-FIND(" ",A2,FIND(" ",A2)+1))

If the entry doesn't contain two space characters, the second FIND() returns an error value. Use the following expression instead:

=IFERROR(RIGHT(A2,LEN(A2)-IFERROR(FIND(" ",A2,FIND(" ",A2)+1),FIND(" ",A2))),A2)

IFERROR() handles the errors, but the logic is similar.

There's one last step—returning  the middle initial:

=MID(A2,FIND(" ",A2)+1,IFERROR(FIND(" ",A2,FIND(" ",A2)+1)-FIND(" ",A2)-1,0))

If there's no middle initial, this expression returns an empty string instead of an error.

TechRepublic's Microsoft Office Suite newsletter, delivered every Wednesday, is designed to help your users get the most from Word, Excel, and Access. Automatically sign up today!

It's worth mentioning that the Text To Columns feature is an expression-less solution if the entries are consistent. In addition, to learn more about using string functions, read Save time by using Excel's Left, Right, and Mid string functions. Finally, IFERROR() is new to Excel 2007. The logic for these expressions is the same in 2003, but use ISERROR() to handle the error values.


Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.

Editor's Picks