Editor's note: In the video, Brandon Vigliarolo uses Microsoft Office 365 and walks through the steps of finding, identifying, and removing duplicate data in Excel. The following tutorial by Susan Harkins was originally published in January 2009.
In the duplicate world, definition means everything, because what counts as a duplicate depends on the context of its related data. Duplicates can occur within a single column, across multiple columns, or across complete records. No single feature or technique will find duplicates in every case.
To find duplicate records, use Excel's easy-to-use Filter feature as follows:
- Select any cell inside the recordset.
- From the Data menu, choose Filter and then select Advanced Filter to open the Advanced Filter dialog box.
- Select Copy To Another Location in the Action section.
- Enter a copy range in the Copy To control.
- Check Unique Records Only and click OK.
Excel will copy a filtered list of unique records to the range you specified in Copy To. At this point, you can replace the original recordset with the filtered list (the copied list) if you want to delete the duplicates.
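For readers who think in code, the idea behind Unique Records Only can be sketched in Python. This is only an illustration of the logic, not how Excel implements it: the function name `unique_records` and the sample rows are hypothetical, and the sketch compares values exactly, which may differ from Excel's own matching rules.

```python
def unique_records(rows):
    """Return rows with later exact repeats dropped, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # the whole record is the comparison key
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data, not the article's example worksheet.
records = [
    ("Susan", "Harkins"),
    ("Bill", "Jones"),
    ("Susan", "Harkins"),  # complete duplicate of the first record
    ("Susan", "Smith"),    # same first name only, so it is kept
]
filtered = unique_records(records)
```

Here `filtered` plays the role of the copied list: a record is dropped only when every field matches an earlier record.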
Finding duplicates in a single column or across multiple columns is a bit more difficult. Use conditional formatting to highlight duplicates in a single column as follows:
- Using the example worksheet, select cell A2. When applying this to your own worksheet, select the first data cell in the list (column).
- Choose Conditional Formatting from the Format menu.
- Choose Formula Is from the first control's drop-down list.
- In the formula control, enter =COUNTIF(A:A,A2)>1.
- Click the Format button and specify the appropriate format. For instance, click the Font tab and choose Red from the Color control and click OK. At this point, the Conditional Formatting dialog box should resemble the following figure:
- Click OK to return to the worksheet.
- With cell A2 still selected, click Format Painter.
- Select the remaining cells in the list (cells A3:A5 in the example worksheet).
The conditional format will highlight any value in column A that's repeated. If you want Excel to highlight only the copies, leaving the first occurrence of the value unaltered, enter the formula =COUNTIF($A$2:$A2, A2)>1 in step 4.
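The difference between the two COUNTIF formulas can be sketched in Python. This is an illustration under stated assumptions, not Excel's implementation: the function names and sample values are hypothetical, and the sketch lowercases values because COUNTIF ignores case when matching text.

```python
def flag_all_repeats(values):
    """Mirror =COUNTIF(A:A,A2)>1: True for every value that occurs more than once."""
    keys = [v.casefold() for v in values]
    return [keys.count(k) > 1 for k in keys]


def flag_later_copies(values):
    """Mirror =COUNTIF($A$2:$A2,A2)>1: count only from the top of the list
    down to the current row, so the first occurrence is never flagged."""
    keys = [v.casefold() for v in values]
    return [keys[: i + 1].count(k) > 1 for i, k in enumerate(keys)]


# Hypothetical column of first names, not the article's example worksheet.
first_names = ["Susan", "Bill", "susan", "Mary"]
all_flags = flag_all_repeats(first_names)
copy_flags = flag_later_copies(first_names)
```

The second function works because its range grows one row at a time, just as the mixed reference $A$2:$A2 does when the format is painted down the column.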
The conditional format works well for a single column. To find duplicates across multiple columns, use two expressions: one to concatenate the columns you're comparing and a second to count the duplicates. For example, to find duplicates of both first and last names in the example worksheet, enter a formula such as =A2&B2 in cell D2 to concatenate the first and last name values.
You could insert a space character between the two names if you liked, but it isn't necessary. Copy the formula to accommodate the remaining list items.
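Next, in cell E2, enter a formula that counts how often each combined value occurs, such as =COUNTIF(D:D,D2)>1, and copy it to accommodate the remaining list. A formula of this form returns TRUE for any row whose combined value appears more than once in column D.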
Notice that the worksheet has a new record (row 6). This record duplicates the first name, Susan, but not the last name. The conditional format highlights the first name because it's a duplicate in column A. However, the formula in column E doesn't identify the combined values across columns A and B as a duplicate because the first and last names together aren't duplicated.
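The two-step approach, building a combined key and then counting it, can be sketched in Python as follows. The function name `flag_combined_duplicates` and the sample rows are hypothetical, and the sketch uses a (first, last) tuple as the key instead of a concatenated string, which is a substitution made here for clarity.

```python
from collections import Counter


def flag_combined_duplicates(rows):
    """Flag rows whose first AND last names together occur more than once."""
    # Step 1: build one comparison key per row (the helper-column idea).
    keys = [(first.casefold(), last.casefold()) for first, last in rows]
    # Step 2: count each key and flag the rows whose key repeats.
    counts = Counter(keys)
    return [counts[k] > 1 for k in keys]


# Hypothetical sample data, not the article's example worksheet.
people = [
    ("Susan", "Harkins"),
    ("Bill", "Jones"),
    ("Susan", "Harkins"),  # both names repeat, so this pair is flagged
    ("Mary", "Lee"),
    ("Susan", "Smith"),    # first name repeats, but the pair does not
]
flags = flag_combined_duplicates(people)
```

As in the worksheet, a repeated first name alone does not trigger the flag; only a repeated combination does.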
Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.