Microsoft

How to find duplicates in Excel

You'll need more than one trick up your sleeve to find duplicates in Excel.

This post was originally published in January 2009.

In the duplicate world, definition means everything. That's because a duplicate is subjective to the context of its related data. Duplicates can occur within a single column, across multiple columns, or complete records. There's no one feature or technique that will find duplicates in every case.

To find duplicate records, use Excel's easy-to-use Filter feature as follows:

  1. Select any cell inside the recordset.
  2. From the Data menu, choose Filter and then select Advanced Filter to open the Advanced Filter dialog box.
  3. Select Copy To Another Location in the Action section.
  4. Enter a copy range in the Copy To control.
  5. Check Unique Records Only and click OK.

january2009blog6fig1.jpg

Excel will copy a filtered list of unique records to the range you specified in Copy To. At this point, you can replace the original recordset with the filtered list (the copied list) if you want to delete the duplicates.

january2009blog6fig2.jpg

Finding duplicates in a single column or across multiple columns is a bit more difficult. Use conditional formatting to highlight duplicates in a single column as follows:

  1. Using the example worksheet, select cell A2. When applying this to your own worksheet, select the first data cell in the list (column).
  2. Choose Conditional Formatting from the Format menu.
  3. Choose Formula Is from the first control's drop-down list.
  4. In the formula control, enter =COUNTIF(A:A,A2)>1.
  5. Click the Format button and specify the appropriate format. For instance, click the Font tab and choose Red from the Color control and click OK. At this point, the Conditional Formatting dialog box should resemble the following figure:

january2009blog6fig3.jpg

  1. Click OK to return to the worksheet.
  2. With cell A2 still selected, click Format Painter.
  3. Select the remaining cells in the list (cells A3:A5 in the example worksheet).

january2009blog6fig4.jpg

The conditional format will highlight any value in column A that's repeated. If you want Excel to highlight only the copies, leaving the first occurrence of the value unaltered, enter the formula =COUNTIF($A$2:$A2, A2)>1 in step 4.

The conditional format works great for a single column. To find duplicates across multiple columns, use two expressions: One to concatenate the columns you're comparing; a second to count the duplicates. For example, if you wanted to find duplicates of both first and last names in the example worksheet, you'd enter the following formula in cell D2 to concatenate the first and last name values:

=A2&B2

You could insert a space character between the two names if you liked, but it isn't necessary. Copy the formula to accommodate the remaining list items.

january2009blog6fig5.jpg

Next, in cell E2 enter the following formula and copy it to accommodate the remaining list:

=IF(COUNTIF(D$2:D$7,D2)>1,"Duplicated","")

january2009blog6fig6.jpg

Notice that the worksheet has a new record (row 6). This record duplicates the first name, Susan, but not the last name. The conditional format highlights the first name because it's a duplicate in column A. However, the formula in column E doesn't identify the combined values across columns A and B as a duplicate because the first and last names together aren't duplicated.

About

Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.

44 comments
sasatechno
sasatechno

Thanks for sharing this information this is best for understanding.

thanhonap
thanhonap

[onaprsc.com.vn] believe that This is an indispensable resource with any computer in the world. Excell has been improved through many versions and becoming more user-friendly and user utility@ http://onaprsc.com.vn

RyanCarlos
RyanCarlos

Thanks for sharing as i was not knowing about it.

mtuppen
mtuppen

As for people's comments so far.. there will always be more than one way of "skinning a cat" or in this case finding duplicates in Excel. To me the important part is understanding how you get there rather than the end result because you can use that knowlege for other things. The countif function is one that I'll most likely use again after reading this article.

laman
laman

Why don't we use the built-in feature in Office 2007 or 2010? There has been several articules showing how to do it already. I have found it intesting why all the sudden this old tips should be re-surfaced, and people liking it.

Mark.Mathews
Mark.Mathews

I work with large data sets in excel. If I'm looking for a duplicate, usually I just need to find it in place without filtering. First I sort the data on the column, click the first cell in the column, then run a small utility macro. Sub NextAlike() ' This macro moves the active cell down until it finds a duplicate. varSwitch = False Do Until varSwitch = True Item = ActiveCell ActiveCell.Offset(1, 0).Range("A1").Select NextItem = ActiveCell If NextItem = Item Then ActiveCell.Offset(-1, 0).Range("A1").Select varSwitch = True End If Loop End Sub

Zahra B.
Zahra B.

Just use the function in Excel (at least in 2010+ versions)... Maybe this column needs a bit of an update (at least the pictures?).

eclark
eclark

In Excel versions 2007 and later, it is much easier to find duplicates using Conditional, Formatting, Highlight Cells Rules, Duplicate Values. After select the row(s) or column(s) desired and using the process above, use the column filter and filter for cell color or whatever you criteria indicated.

jdishun
jdishun

Excel add-in ASAP has a function to detect duplicates in a column - plus many other useful functions. I've used it for years. Go to asap-utilities.com. It's free.

excelinexcel
excelinexcel

Or use the build-in feature; Remove Duplicates. You can find this feature on the Data tab under Data Tools.

kalakala1
kalakala1

I used to find duplicated manually. what a waste of time! Thanks for this tutorial! my shed plans review

karendavis
karendavis

In Accounting it is easier to find a $10,000 error than a 10 cent error. Thanks for the taking the time to share some tips with us all.

insuranceman1
insuranceman1

Thanks for the helpful hints. This looks like a complicated yet worthwhile solution to my current solution-- the incredibly ineffective "CTRL + F" search-and-delete.

chase.foster
chase.foster

I like the first method of filtering out the duplicates, but just to make sure, does this keep ONE of the duplicates?? (As opposed to deleting them both). That would mean I'd have to do a V-Lookup between the filtered and the unfiltered list? I hope not...

pavithra2503
pavithra2503

Hello all, can you please guide me an excel problem where i actually struck. actually i have x number of persons with y available dates. each person gave me their available dates to set up an meeting. now i want to display one specified date where all persons are available and arrange a meeting. how can i do that in excel. please help me to solve the problem.

smitten111
smitten111

I have an excel worksheet (mac). Within the worksheet I want to find matching entries in columns F and G which i know are there but the spreadsheet is large. I need all other data in the worksheet to maintain their row integrity when the match is made eg. F3 data is same now as G3 data but all other cells have kept their original data when the match was made.... not explaining this too well...

rarg
rarg

I recently needed to make sure that I had three codes in a row of data with no duplicates, so I used the following formula (a1, a2, and a3 being the codes): =IF(SUM(COUNTIF(C2:H2,"a1")=1, COUNTIF(C2:H2,"a2")=1, COUNTIF(C2:H2,"a3")=1 < SUM(COUNTIF(C2:H2,"a1"), COUNTIF(C2:H2,"a2"), COUNTIF(C2:H2,"a3")), "err", SUM(COUNTIF(C2:H2,"a1"), COUNTIF(C2:H2,"a2"), COUNTIF(C2:H2,"a3"))) This gives me a count (up to 3) of unique codes and an "err" if a duplicate code is found. It might not be very pretty or elegant, but it works.

Mark.Mathews
Mark.Mathews

Ugh! Too much work. I prefer to let the computer do the work for me. I sort first then use custom macros: Sub NextAlike() ' This macro moves the active cell down until it finds a duplicate. a = 0 Do Until a = 1 Item = ActiveCell ActiveCell.Offset(1, 0).Range("A1").Select NextItem = ActiveCell If NextItem = Item Then ActiveCell.Offset(-1, 0).Range("A1").Select a = 1 End If Loop End Sub If I want to delete duplicates: Sub DeleteDups() ' Deletes entire rows that have duplicate data in the current column. ' Safety Recommendation: Do not setup a toolbar button for this macro. DialogStyle = vbYesNo + vbCritical + vbDefaultButton2 Title = "Deleting Duplicates" Msg = "Are you sure you want to delete entire rows?" deleteDupRows = MsgBox(Msg, DialogStyle, Title) If ActiveCell.Value "" And deleteDupRows = vbYes Then Do While ActiveCell.Value "" noDups = ActiveCell.Value ActiveCell.Offset(1, 0).Range("A1").Select If ActiveCell.Value = noDups Then DelStartRow = ActiveCell.Row Do While ActiveCell.Value = noDups ActiveCell.Offset(1, 0).Range("A1").Select Loop DelEndRow = ActiveCell.Row - 1 Range(Cells(DelStartRow, 1), Cells(DelEndRow, 1)).EntireRow.Delete GoBack = (DelEndRow - DelStartRow + 1) * -1 ActiveCell.Offset(GoBack, 0).Range("A1").Select End If Loop ElseIf deleteDupRows = vbYes Then Message = "You must start from a cell containing data." X = MsgBox(Message, vbOKOnly, "Empty Cell") End If End Sub

paofficer
paofficer

I have a similar problem. I have two columns. I want two columns to become one but not in the same way. If columns look like: A B Cat Dog Dog Cat I want C to look like: Cat Dog Dog Cat Then I want to find the duplicates in one column

kkroon_ftb
kkroon_ftb

With a bit of care, I've gotten pretty good results with Excel 2007's built-in Remove Duplicates command. Data tab > Data Tools > Remove Duplicates.

lkeppel
lkeppel

Sort on most unique field you have. Place focus in the first cell of this field. Select Conditional Format-->New Rule. From Select a Rule Type, select format unique or duplicate values. Make sure duplicate is in the drop down box Click on the format button and choose a color. Now everything easily stands out.

naoufel.mami
naoufel.mami

Hi, It's not commun to use such function with excel I prefer to use MSACCESS to find duplication it's much better. I hope this help Rgds Naoufel

mattohare
mattohare

This would require you to sort the data. If that is an option, this might be easier for some. I insert a column to the right of the one with potentially duplicate data. (Say potential duplicates are in column A, I insert a blank column at B.) First row is usually headings that will not match the data itself. (I'll change the heading for A if I have to.) Let's assume for this that the data starts in row 2 with headings in row 1. I put a formula in cell B2 as follows: =if((a2=a1),"Dup","-") I get 'Dup' if there is a duplicate, '-' otherwise. I hope this helps.

ssharkins
ssharkins

There's a note at the beginning of the article that identifies it as an old article from 2009. I didn't repost it, Techrepublic.com did -- once they buy these tips, they own them and can reuse as they see fit. Finding duplicates is almost a non-issue in the ribbon versions -- much easier, depending on how you need to see the data and what you want to do with it once you find the duplicates. However, the methods discussed might still be valuable when the new duplicate-finding features aren't adequate. But, by all means, use the easiest method!

sid
sid

This is really a simple issue. Once you learn how! Create a new column to the right of the last column currently used by your worksheet. I use Seq (Sequence) for the column name. Starting with a value of 1001 for the first row, fill the column to the last row of your worksheet. If your worksheet has more than 10,000 rows, you may want to start your numbering at 10,001. I use Quattro Pro more than Excel and also work with big files that are poorly developed so I employ extra checks that may be unnecessary. Once the sequence column is sequentially populate, you can make complete hash of any worksheet (as long as you maintain column integrity) and restore it to its original sequence by sorting the worksheet on the "Seq" column.

Mark.Mathews
Mark.Mathews

paofficer Try a custom macro: Sub CatDog() Row=1 Do While Cells(Row,1) "" If Cells(Row,1) = "Cat" or Cells(Row,2) = "Cat" Then [desired action] End If Row = Row + 1 Loop End Sub Hint: If you don't know how to write the action you want Excel to do, just record it as a new macro. Then cut and paste in Tools, Macros, Visual Basic Editor.

dogknees
dogknees

If your data is already in Excel and isn't well structured, getting it into Access in a form that is easy to work with is non-trivial. Similarly, finding duplicates across a row is challenging, to say the least, in Access.

mattohare
mattohare

A lot of offices will not install it for employees, but they still need to get the work done. In my case, I often have data scrubbing that must happen in Excel, because some data will not go into a database field until it's been through its own scrubbing.

dkidd23
dkidd23

I do something similiar except in my formula, I put =if((A2=A1),0,1) I then drag the formula down, copy it then paste special and select 'values'. I then do an ascending sort on my new values. All the "0" are now at the top of the page, easier to select and I do a Ctrl-X to cut them out then past them in a new sheet labeled duplicates just in case there are values in some of the other columns that are needed. I've actually written a macro that does all this for me since I frequently have to check for email/phone number duplicates. It also removes all formatting from phone numbers as well and applies phone number formatting to the cells with phone and fax numbers. I think I may do an article on the Macro during the holidays and post it. It is a very useful tool for scrubbing data for mailing lists.

damiross
damiross

Mattohare uses the same method as I do. However, I use 1 if a dup is found and blank if no dupe. Because in most cases I need to manually delete the duplicates after inspecting the 2 rows of data, I sum the column with the formula in it. This allows me to see how many more I have to go.

bkfriesen
bkfriesen

Well done. I will definitely be using this.

Clanphier
Clanphier

You can easily create a Pivot table using the row where potential duplicates are as the Row Labels and "Count Of.." as a summation value. The result is the number of occurances of each value in the selected column. If you want, you can then cut and paste special (values only) the Pivot table which will allow you to sort from high to low or vice versa.

bestbizbroker
bestbizbroker

Hi dkidd23 - I'm eager to learn more about how excel can find and seperate or delete duplicates from a phone/address list. I went through 5k records before seeking out an easier way... so you can see I'm a little slow. Even worse, I know nothing about using macros. Any help is greatly appriciated.

TobiF
TobiF

Just noted my approach was already described below.

mattohare
mattohare

But there were times when the autofill didn't go all the way down. I had to know that the formula was actually there.

Editor's Picks