Use formulas to find multi-column duplicates in Excel

Duplicates in the same column are easy to find by sorting or filtering. Duplicates that span multiple columns require a bit of setup, but the solution's not difficult to implement.

Finding duplicate values in the same column is easy; you can sort or apply a filter depending on the circumstances. Finding duplicates that span multiple columns is a tad more difficult. A sort can work, but then you have to find the duplicate values. So while it's better than no solution at all, it's not a good solution. You probably want a solution that kind of screams out at you Here I am! I'm a duplicate! Let's take a quick look at a simple example. The sheet below contains a column of dates and a column of initials. A few dates are repeated and a few initial sets are repeated; they represent duplicates within those columns. However, we're interested in records that repeat the same date and the same initials. That's what I mean by a multi-column duplicate. It's fairly easy to spot the duplicates, rows 8 and 9, in such a simple sheet, but what if you had hundreds or thousands of rows to check? An AutoFilter will work, but it's a vulnerable solution. In this case, there are five distinct dates. That means a user has to review at least five sets of records to find duplicates. Even then, you have to trust your user to actually spot them. It's not a good solution, especially if you have lots of data. You might try an advanced filter or even conditional formatting, but both would require so much hoop-jumping that I'm not sure the end result would be worth the fuss, when compared to the easy-to-implement solution that follows:
  • Concatenate the columns you're checking.
  • Use CountIf() to count the number of combined values.
Let's enhance the sheet above to see how this works:

  1. In cell C2 enter the formula =A2&B2. (You can combine more columns.)
  2. Copy the formula to C3:C9. Excel uses each date's serial value, but that won't interfere with the technique. However,  if your values contain times, it might, depending on how the time values were entered.

  1. In cell D2 enter the following formula: =IF(COUNTIF($C$2:C2,C2)>1, "Here I am! I'm a duplicate!","Original")
  2. Copy the formula to D3:D9. At this point, finding multi-column duplicates is as easy as sorting by column D (although this example doesn't require any sorting).

The IfCount() function counts the number of times the concatenated values occur within the extending range. If the count is greater than 1, the formula returns the string "Here I am! I'm a duplicate!"; when the count isn't greater than 1, the formula returns the string "Original." Only the first occurrence will be identified as Original. This technique easily adapts to include additional columns. Simply add each column to the concatenating formula (column C in this example). Of course, there are other ways to identify multi-column duplicates in Excel. If you'd like to share a more efficient solution or a more complex problem, please start a conversation below.


Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.

Editor's Picks