Data Management

Eliminate duplicate records with this built-in Access query

Effectively maintaining an Access database requires the regular removal of duplicate records. The Find Duplicates Query wizard handles this chore quickly and easily.

When maintaining an Access database, it's a good idea to regularly remove duplicate records. Duplicate records can pop up even with proficient data entry processes. For example, the same customer might be listed twice under two different account numbers. Or, two different customer records might have the same address in their Street Address fields, only one of which is correct. Thankfully, record removal doesn’t have to be a long and tedious process; you can find these discrepancies in seconds using the Find Duplicates Query wizard.

Using the wizard
To show how this technique works, we will use the Find Duplicates Query wizard to determine which customers in the Customers table, shown in Figure A, have the same address assigned to their Street Address fields. (Note that while the sample table used here has less than 20 records, the same technique can be applied to a database of any size.)

Figure A


Begin by clicking Queries under Objects in the Database Window. Then, click on the New button in the toolbar to obtain the dialog box shown in Figure B.

Figure B


Select Find Duplicates Query Wizard, and then click OK to activate the first screen of the wizard, as shown in Figure C.

Figure C


In this screen, the Customers table is selected as the table to be searched for duplicates. Click Next to continue.

Figure D


The screen shown in Figure D is used to select the field we want to search for duplicate values. For this example, Street Address is chosen, as we want to eliminate duplicate mailings to the same address. Clicking Next brings up the screen shown in Figure E, where we select the fields to be displayed along with the Street Address field in the query results.

Figure E


In the final screen, shown in Figure F, we enter a name for the query, and then click Finish.

Figure F


The results shown in Figure G indicate that two different customers have been assigned the same street address, while two other customers have each been assigned two different customer IDs. After further research, we determine that the house number for Bernadette Williamson should be 384, not 834. We can now correct the Customers table by manually deleting the duplicate records and making the necessary corrections to Williamson's street address—right from the query results table.

Figure G


Deleting duplicates with Append Query
In the above example, only two duplicate records were found. But what if the Customers database consisted of thousands of records, and after running the Find Duplicates Query wizard, the results showed hundreds of duplicate records? Manually deleting all those duplicates from the query results table would be highly impractical. Instead, you can use Append Query to have Access delete them automatically.

First, create a copy of the structure of the table that contains the duplicates. Click on the table name Customers in the Database Window, and then click the Copy button in the toolbar. Next, click the Paste button, which will display the Paste Table As dialog box shown in Figure H. Enter a name for the copy of the table structure, as shown. Under options, select Structure Only. Click OK to create the blank Customers Without Duplicates table.

Figure H


Open the Customers Without Duplicates table in Design View and change its primary key to Street Address, as shown in Figure I. Making the Street Address field the primary key field will prevent Access from copying records to the new table that have duplicate street addresses.

Figure I


We are now ready to create an Append Query against the original table. Create a query in design view for Customers. Drag the asterisk (*) to the query design grid to include all fields from the original table. Then select Append Query from the query-type drop-down list, as shown in Figure J.

Figure J


In the Append dialog box, select the blank database Customers Without Duplicates, as shown in Figure K.

Figure K


Click the Run button. In the dialog box that asks whether you wish to append the records to the new file, click Yes. A dialog box similar to the one shown in Figure L will appear, indicating that some records could not be copied because there were duplicate values in the primary key field, in this case, the Street Address fields. Click Yes.

Figure L


The query results table will have only one record for each street address. When you are satisfied that the Customers Without Duplicates table is correct, you may delete the original table.

When to use the wizard
As shown in this article, the Find Duplicates Query wizard works well when it is used to check for duplicate entries in individual fields and when the changes are small enough to be handled manually; otherwise, it is more practical to use the Append Query when you need to delete large numbers of duplicate records. Nevertheless, to keep your data clean, it’s a good idea to run the Find Duplicates Query wizard periodically; if you do, you may never need to run an Append Query.

Calling all Access masters
If you consider yourself an Access 2000 master, it's time to ante up. Post a comment to this article and share your best Access tips and tricks.

 
1 comments
vankan0
vankan0

This doesn't work any more in acces 2010, since it won't let you define a primary key in a field that contains duplicate values. It automatically sets Index to indexed (no duplicates) and you cannot change that without removing the primary key.