Counting duplicate values is a subject that begs definition. What constitutes a duplicate? Within Excel, you can have duplicate values within the same column or you can have duplicate records — a row where every value in the record is repeated.In this article, we'll focus on duplicate values within the same column. Once you start counting duplicates, you'll often discover that you need more. For instance, you might need the opposite — how many unique values are in the column.
There are several ways to count duplicate values and unique values. You can work with most any dataset or download the .xlsx or .xls demonstration file (although specific instructions for the .xls format aren't included in this article). To read about finding (as opposed to counting) duplicates in Excel, check out my previous article, "How to find duplicates in Excel."
Figure A shows a COUNTIF() function that gets the job done:
COUNTIF() counts duplicate species.
We're not counting the number of actual duplicates but rather the number of times the value occurs within the given range. If we were counting strictly duplicates, we wouldn't include the first occurrence of the value.
As you can see, the function returns the correct count, and it's a quick fix. The main problem with this structure is the B:B reference. It's great if you add and delete records, but it'll also evaluate non-contiguous values. If you enter one of the values below to the dataset, the function will add it to the count, as shown in Figure B, even if it's not in the actual data range.
The reference evaluates non-contiguous values.
Removing the duplicates from view
The COUNTIF() is adequate, but you might want a list of unique values rather than the full dataset. In this case, you can use a PivotTable, as shown in Figure C. To create this view, do the following:
- Click a Species value (any cell in B2:B5).
- Click the Insert tab and then click PivotTable in the Tables group.
- Accept all the default values in the resulting dialog — simply click OK.
- In the PivotTable Fields list (to the right), drag the Species field to the Row Labels and the Values sections.
Dragging the same to both sections forces a unique count.
One of the first two solutions will probably satisfy most situations, but you could also use Excel's Subtotal feature, which evaluates data by groups. However, it's not as convenient, because you'll have to sort the data first. If that's not a problem, or it's what you're doing anyway, you might want to consider using the Subtotal feature:
- First, sort the data by clicking inside the column you want to sort. Then, click Sort & Filter in the Editing group (on the Home tab), and choose Sort A To Z from the drop-down list.
- With the data sorted, click the Data tab.
- In the Outline group, click Subtotal.
- In the resulting dialog, set the parameters for your data. In this case, select the Species field and specify the Count function (Figure D).
- Click OK.
This feature will insert a subtotaling row below each group. In this case, it also displays an optional grand total for the column (Figure E).
There's now a subtotaling row below each group.
To learn more about this feature, read the following articles:
- 10+ tips for working with Excel's Subtotal feature
- Harness the power of Subtotal in Excel to count grouped items
- Display multiple subtotaling functions in Excel
Counting unique values
The flip side of counting duplicates might be to count the number of unique values. The traditional method is to use the SUMPRODUCT() function. This solution has been around for a long time, and I can't take credit for it. To the best of my knowledge, Excel still doesn't have a built-in function for counting unique values. When counting unique values, use the following expression:
Figure F shows this function at work in our example data... sort of.
Return the number of unique values in a column.
As you can see, the function
returns 4 and there are 3 unique values. The problem is the column reference. There's nothing wrong, but there are actually 4 unique values in column B, because the function evaluates the entire column — including the string Species in B1. If you can delete the header text, this expression works. If you can't, subtract 1 from the final count, as shown in Figure G.
Refining the expression.
Duplicates and more
Within the context of duplicates, definitions aren't the same. In this case, we used a function and two built-in features to count the number of times a value is repeated in the same range. Then, we used an expression to return the number of unique values in the same range.
Send me your question about Office
I answer readers' questions when I can, but there's no guarantee. When contacting me, be as specific as possible: For instance, "Please troubleshoot my workbook and fix what's wrong" probably won't get a response, but "Can you tell me why this formula isn't returning the expected results?" might. I'm not reimbursed by TechRepublic for my time or expertise, nor do I ask for a fee from readers. You can contact me at email@example.com.
Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.