There are lots of ways to find duplicates in the same set of records. Using the COUNTIFS() function, you can compare different data sets for duplicate records.
The 2010 entry, Use formulas to find multi-column duplicates in Excel, combines IF() and COUNTIF() to find duplicate records within the same data set. However, using the COUNTIFS() function, you can also compare two data sets for duplicate records.
For instance, the two data sets shown below share only one duplicate record, row 4. The other records share common values in columns A and B, but not C. To quickly expose any duplicates, you can use COUNTIFS() to compare both data sets.
The COUNTIFS() function uses the following syntax to specify multiple criteria to determine which values in a range to count:
COUNTIFS(countrange1, criteria1, [countrange2, criteria2]...)
In this example, countrangex will refer to a column in one data range and criteriax will refer to the first corresponding data field in the same column in the other data range, as follows:
=COUNTIFS($A$9:$A$13,A2,$B$9:$B$13, B2, $C$9:$C$13,C2)
This example includes all three columns for a full-record comparison, but you could compare partial records by including only the columns you want to compare. In addition, if the data sets are on different sheets, be sure to include the sheet identifier, appropriately.
Now, enter the above function in D2 and copy to the remaining rows in the same data set. This function returns 1 when the corresponding record in the second data set is a duplicate of the first data set. When the rows aren't duplicates, the function returns 0.
You can use this column to filter the data set, as follows:
- Click the Data tab and then click Filter in the Sort & Filter group to display dropdowns for each column. In Excel 2003, choose Filter from the Data menu, and then select AutoFilter.
- Click the dropdown for column A.
- To see the duplicate record(s), select 0, and click OK.
Similarly, you'd deselect 1 to display your non-duplicate records and select Select All to view them all.
This method is extremely flexible because the duplicates don't have to be in the same row in both data sets. The duplicates can be anywhere within the data set and this function will still find them.
To help you learn this technique, sample Excel worksheets are available.