Software

Using Excel's Find and Mid to extract a substring when you don't know the start point

Here's how to use Excel's Find function in conjunction with the Mid function to locate and extract a string, regardless of how many characters the source string contains.

In “Save time by using Excel’s Left, Right, and Mid string functions,” I showed you how to extract substrings from a text entry in a spreadsheet cell. To review, the Left function returns the first (leftmost) n characters from a string and the Right function returns the rightmost n characters.

The Mid function requires two arguments—a start point and the n number of characters to return. In the first lesson, the start point for the string we wanted to extract from the middle of the string was constant. In other words, we knew we wanted to “grab” two characters out of the middle of our sample string, and we knew those two characters always started in the fourth position of the string.

Knowing that the middle string always starts in a certain position makes it easy to use the Mid function. But what do you do when the substring you want to extract could start anywhere within the string? As long as there’s something in the string that identifies the start of the substring, you can use the Find function with the Mid function to accomplish the mission.

Use Find to generate the start point for the Mid function call
Here’s a simple example to demonstrate how this works. Suppose you have a series of variable-length strings like those shown in Figure A. The only thing that’s consistent in those strings is the presence of the dash (or hyphen). Your challenge: You want to compose a formula that returns the first two characters following the dash.

Figure A
You can use the Mid and Find functions to extract the first two characters following the dash in each of these entries.


Here’s the trick: You use the Find function to return the position of the dash in the string, and you use the Find function itself as the Mid function’s first argument.

The Find function takes the form:
=Find(thing_to_find,source)

As Figure B shows, the function =Find(“-“,A2) returns the value 6, since the dash appears in the sixth position of the string in A2. We copied this Find function down the column so you can see that it returns a different value each time, depending on where the dash appears in the strings in column A.

Figure B
The Find function returns the position of the dash in each of the strings in column A.


In order to extract the first two characters that follow the dash, you’ll use the Find function to locate the dash and add 1 to that result. (The Find function returns the location of the dash; you want to start extracting characters beginning at the next character, the one following the dash.)

So our formula looks like this: =Mid(A2,Find(“-“,A2)+1,2). That formula says, in English, “Look at A2; determine the position of the dash in A2; add 1 to that value; then return the next two characters.” Figure C shows our results.

Figure C
The Mid function relies on the Find function to tell it where the dash appears in the source string. You add 1 (circled in red) to the Find function’s result to start the extraction in the correct place.


Only the beginning
We’ve just scratched the surface of what you can do by combining the Find and Mid functions. In a future article, we’ll show you how to use a variation of this technique to transpose first and last names to format them as Last, First.
To comment on this article or to share your favorite Excel tip, please post a comment below or drop us a note.
6 comments
ALey85
ALey85

Hi All, 


I need to pull out information from very long strings and they can be quite inconsistent in the amount of characters. However the one thing that is consistent is that they are separated by a /  and the number of slashes is the same 

eg. I need to pull "test_A_v01","test_B_v01 out from the below strings :

example1/worksheet/test_A_v01/number9/value6

example1/worksheet/test_B_v01/number9/value6


Is there any easy way to do this?


Thank you!

jsnwilson
jsnwilson

What if I have a string of text and need to extract certain portions that may be repeated?  For instance, my text may read "#123456, #234567 and #345678" and I want the results to read as "#123456", "#234567" and "#345678" in separate columns?  Is there a way to use formulas to pull the first "#" and number in column B, the second "#" and number in column C, etc.?

Darbaby55
Darbaby55

I am reversing first name lastname usign this formula. =MID(C3&", "&C3,FIND(" ",C3)+1,LEN(C3)).  It is truncating the first name.  Seee below Can you help?

Kelley Henry             Henry, Kelle

Kelly Sevine              Sevine, Kell

tlcooper
tlcooper

what if i want it to find the 2nd space in a string and start at that point and go the length until it finds the 3rd space? for example, "corp cab I29 06/p7/b4" all i want to extract out of that is the I29. or it could be "corp cab ab12 08/i2/b8" and i want to extract ab12.

tlcooper
tlcooper

What if I want it to find the 2nd space in the string and start at that point and go the length until it finds the 3rd space?

XaeroA
XaeroA

@tlcooper

I'd suggest nesting find statements as follows so:

=MID(A1,FIND(" ",A1,FIND(" ",A1)+1)+1,((FIND(" ",A1,(FIND(" ",A1,FIND(" ",A1)+1)+1))-(FIND(" ",A1,FIND(" ",A1)+1)+1))))

It's a little long winded so my apologies for that, but it should do the job just dandy.

Side note, if your text doesn't contain 3 space this will return an incorrect value or an error (#VALUE! I believe) so be careful of that