Counting duplicates is a common Excel task that's easily met with a simple COUNT() function or a filtered data set. However, if the counts matter more than the values themselves, you might find the task a bit harder. For instance, you might need to know how many times any value occurs once, twice, three times, four times, and so on. In other words, you need a conditional count of a simple count. Fortunately, the problem only sounds convoluted. In this article, I'll show you a simple solution that involves a helper column and a simple matrix. If you need an expression solution, I'll show you that too.
I'm using Excel 2016 on a Windows 10 64-bit system, but this solution will work in versions 2007 through 2016. The COUNTIFS() function isn't supported by Excel 2003 or earlier. You can work with any data set or download the demonstration .xlsx file.
Setting things up
You might be wondering what I mean by conditionally counting—Excel's COUNTIFS() function does exactly that. But in this case, we're conditionally counting counts, not values. Figure A shows a simple dataset. The first step is to count the number of times a value occurs in the same data set. This can be accomplished easily enough with a simple COUNT() function in the form:
where column references the column that contains the values you're counting and value is a cell in the same column. The counting function in column C:
uses a column reference (B:B) to count the number of times each value occurs in column B:
- Mike and Susan have only one record in the data set.
- John and George have two records in the data set.
- Dawn has five records in the data set.
Use COUNT() to return the number of times the same value occurs.
The column reference is a powerful tool, but it does limit us a bit: You can't use the empty cells in column B below or above the actual data set for anything else. Now, let's suppose knowing how many records per person isn't your goal—you want to know how many times any record is repeated. In other words, you need to know how many records occur once, twice, thrice, and so on.
Conditionally counting the count
In a simple data set, you can look at the results of the COUNT() function and quickly determine that:
- Two people, Mike and Susan, have one record each.
- Two people, John and George, have two records.
- No one has three or four records.
- One person, Dawn, has five records.
- No one has more than five records.
I could easily omit the names and the results would be just as important—we're after the conditional count of the count, not the actual values being counted. I mention them only to clarify the results.
Figure B shows our matrix, which uses the COUNTIFS() function to conditionally count the results of the COUNT() function in column C.
A simple matrix counts the results of the COUNT() function.
This function takes the form:
where countrange is the range with the values you're counting and criteria is the condition values must meet to be included in the count.
The COUNTIFS() function in row 3:
uses the header values from the simple matrix as criteria. That result is then divided by criteria.
However, the matrix fails to give a complete picture. We can clearly see that at least one value has five records but the matrix doesn't reflect that.
An expression solution
The matrix is only as good as its header row. We can extend that row, but how far will be enough? A more dynamic solution is to use the same expression to refer to the COUNT() results in column C. Figure C shows the expression:
which uses the result of the COUNT() function as criteria.
The matrix expression is more dynamic when referring to the count values.
By referring to the actual results, the expression guarantees that all possibilities are accounted for. The matrix could do this too if you knew ahead of time how far to extend it. Fortunately, ESP isn't a requirement for conditionally counting counts.
On the down side, the expression in column D might confuse users who don't understand its purpose. Column C displays the count and column D displays the number of times that "count" is repeated. Table A gives a quick review.
I suggest that you not include the counting columns when sharing the data with others. This solution helps you glean information that might be meaningful if presented in another format.
This column was inspired by a reader. I chose this route because, in my mind at least, it was quick, and the built-in features I often consider, such as filtering and Subtotals, presented too many hoops to jump through. Do you have a different solution? If so, please share it in the comments section below.
Send me your question about Office
I answer readers' questions when I can, but there's no guarantee. Don't send files unless requested; initial requests for help that arrive with attached files will be deleted unread. You can send screenshots of your data to help clarify your question. When contacting me, be as specific as possible. For example, "Please troubleshoot my workbook and fix what's wrong" probably won't get a response, but "Can you tell me why this formula isn't returning the expected results?" might. Please mention the app and version that you're using. I'm not reimbursed by TechRepublic for my time or expertise when helping readers, nor do I ask for a fee from readers I help. You can contact me at email@example.com.
More Office tips and how-to's
Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.