TR Dojo: Three ways to find duplicates in Excel
March 17, 2011, 8:26am PDT | Length: 00:06:27
Bill Detwiler shows you how to find duplicate data in Excel by using advanced filters, conditional formatting, and counting formulas. Once you’ve watched this TR Dojo video, you can find a link to the original TechRepublic article and print the tip from our TR Dojo Blog.
No messages found
No messages found
Log in to display your contacts' posts.
Once logged in, adding contacts is simple. Just mouse over any member's photo or click any member's name then click the "Follow" button. You can easily manage your contacts within your account contacts page.
Transcript
>>Bill Detweiler: Finding all the duplicate information in an Excel spreadsheet can be a challenge and you'll often need to use more than one technique to get the job done. Well, I'm Bill Detweiler, during this episode of TR Dojo I'll show you three ways to find duplicates in Microsoft Excel.
Music There are many different types of duplicate data that can occur within an Excel spreadsheet. You can have duplicate entries confined within a single column or spread out across multiple columns. You can even have whole records repeated within the sheet and unfortunately there isn't one function or filter that will find every type of duplicate entry. Instead Susan Harkins, TechRepulic's Microsoft Office blogger, suggests that you match the right detection technique to the specific type of duplicate you're trying to find. For example, let's say you have a spreadsheet where the individual records are located in rows and for whatever reason there are a lot of duplicate records. Now using Excel's advanced filter options you can easily filter out all the duplicates. So first, open the spreadsheet and select any cell inside the record set then from the data menu choose Filter and select Advanced Filter. When the advanced filter dialog box appears, select Copy to Another Location in the action section and enter a copy range in the Copy To control. Now make sure that the Unique Records Only option is checked and click OK. Excel will copy a filtered list of unique records to the range you specified in the Copy To control. Now at this point you can replace the original record set with the new duplicate free list. While this quick advanced filter trick is great at finding entire records that have been repeated within a spreadsheet locating duplicate entries within a single column or span across multiple columns is a bit more complicated. For locating duplicates within a single column Susan recommends using Excel's conditional formatting feature and here's how you do it. Select the first status cell within the column that you'll be searching. In this example cell A2. Now choose Conditional Formatting from the format menu. In the formula control enter a formula like this: Equal, count if and then in parenthesis A:A,A2 and in parenthesis and then greater than one. Now remember you'll need to tweak this formula to fit your spreadsheet. Now click the Format button and specify the appropriate format. For instance, click the Font tab and choose Red from the color control, then click OK and click OK again to return to the worksheet. With the cell still selected click the Format Painter and then select the remaining cells in the column. The conditional formatting feature will apply your new format to any value in column A that's been repeated. One thing to note about this method; if you want Excel to select only the copies leaving the first occurrence of the value unaltered, enter the formula: Equal, count if and then in parenthesis $A$2:$A2, A2 and in closed parenthesis and then greater than one instead of the formula that I specified earlier. Define duplicate data that's spread across multiple columns Susan suggests a combination of two expressions. One that concatenate the columns you're comparing and a second to count the duplicates. For example, if you wanted to find duplicates of both first and last names in a spreadsheet like the ones shown here, you could enter the following formula in cell D2 to concatenate the first and last name values: Equal A2 and B2. Now you could insert a space character between the two names if you like but it really isn't necessary. Then copy and paste that formula into the column's other cells to capture the remaining records. Next, moving over a column in cell E2 enter the formula shown here. Now remember you'll need to tweak both formulas to work with your specific spreadsheet and just like the first formula you'll need to copy and paste it down the column. Each of the duplicates should now be marked by the word "duplicate" in column E. Now if you want to get really fancy you can even combine methods two and three. For example, look at the record in row six of this sample spreadsheet. This record duplicates the first name Susan but not the last name. The conditional formatting which we applied via method two highlights the first name because it's a duplicate in column A. However, the formula in column E applied through the third method doesn't identify the combined value of columns A and B as a duplicate because the first and last name together is a unique value. Well, that does it for this episode. For more Microsoft Office tips and tricks checkout our Office blog or subscribe to our Office newsletter and as always for more teachings on your path to becoming an IT ninja, visit TRDojo.TechRepublic.com. Sign up for our weekly newsletter or follow me on Twitter. Thanks for visiting the TR Dojo.
Silence While this quick advanced filter trick is a great way...ah...now let's do that one again. Well, this quick advanced filter trick is great at finding entire records that have duplicated...records that have been duplicated...blah, blah...sea, she sells seashells by the seashore, Susan suggests expressions. Well, that does it for this episode. Ahh, I was rewriting the things. Alright, that's okay, I can pick that up later. Think I can remember that sentence.
Music
==== Transcribed by Automatic Sync Technologies ====



