In Microsoft Excel, I have used many different solutions to compare lists. Sometimes the solution finds duplicates across one or more columns; sometimes the solution focuses on finding unique values. But what if you want to compare two sets of data in Excel to determine if a record is missing from one of the sets?
SEE: Microsoft Excel: Become an expert with this online training (TechRepublic Academy)
For example, in Excel, you want to compare a set of incoming purchase orders to a comprehensive list of distributed purchase orders. Fortunately, it’s not as hard as you might think. In this tutorial, I’ll show you how to use VLOOKUP() in Excel to find missing records in one data set when compared to a comprehensive list.
I’m using Microsoft 365 desktop on a Windows 10 64-bit system, but you can use any version and Excel for the web supports VLOOKUP(). For your convenience, you can download the demonstration .xlsx and .xls files.
What’s the problem we’re trying to solve in Excel?
If your company supplies ongoing support, you’re probably familiar with the purchase order system. When a client calls and makes a request, you grab a blank purchase order, fill in the details and move on to satisfying the request. When the employee fulfills the request, they complete the purchase order form and return it to the system. I’ll use the term open to describe a purchase order in play and closed to describe a completed purchase order. You can think of the closed list as a subset of the open list. With most systems online now, the paper purchase order is a thing of the past, but it provides us with a simple example.
Figure A shows two Table objects. PODistributed, to the left, tracks the purchase orders distributed to staff — the open purchase orders. POReceived, to the right, is a list of purchase orders and their return date, meaning the job is done — these are closed purchase orders.
As is, there’s no reconciliation between the two Tables. We need a way to highlight the open purchase orders when there’s no matching number in the closed list, meaning that purchase order and the client request is still in play. Even with such short lists, you’re likely to make a mistake if you just wing it visually.
Now that you understand what’s required, let’s add a VLOOKUP() function in Excel to expose the open purchase orders.
SEE: 98 Excel tips every user should master (TechRepublic)
How to use VLOOKUP() to compare two sets of data in Microsoft Excel
The VLOOKUP() function has been the cornerstone of many Excel apps. Because Tables don’t support dynamic array functions, such as the newer XLOOKUP(), this solution uses the older VLOOKUP().
Excel’s VLOOKUP() function uses the following syntax:
VLOOKUP(lookupvalue, tablearray, colindex, [rangelookup])
The lookupvalue argument identifies the value to match, tablearray identifies the range to look for lookupvalue, colindex identifies the returned value when the function finds a match, and the optional rangelookup argument lets you specify an exact match. When using FALSE for this optional argument, VLOOKUP() will return a #N/A value if lookupvalue has no match, and that’s what we’re counting on.
To continue, enter one of the following functions into D3 in the PO Distributed sheet, the open list, and copy to the remaining cells:
=VLOOKUP([@[PO Number]],'PO Received'!$B$3:$B$13,1,FALSE)
=VLOOKUP('PO Distributed'!B3,'PO Received'!$B$3:$B$13,1,FALSE)
Use the latter if you’re not using a Table object. The !$B$3:$B$13 reference in both must be absolute.
In Excel, the Table object will automatically expand (Figure B). There are several records where the function did not find a match in the list of closed purchase orders to the main list of purchase orders. These records are still open. Knowing which purchase orders is important, especially when it’s been open for a long time.
The function is a bit complex, so let’s take a look at how it evaluates using the record in row 4:
=VLOOKUP(‘PO Distributed’!B3,’PO Received’!$B$3:$B$13,1,FALSE)
The array of closed purchase orders doesn’t include the value 102, so the FALSE argument forces the function to return #N/A. If the value 102 were in the array, the function would return 102.
You could easily stop if you know what the #N/A values mean; however, we can add a conditional formatting rule based on the results of Excel’s VLOOKUP() function to highlight records where the VLOOKUP() function returns this error. To add the conditional formatting rule, follow these steps.
- Select B3:D19, which are the open purchase orders in the PODistributed Table. Don’t include the header cells.
- On the Home tab, click Conditional Formatting and then choose New Rule from the dropdown.
- In the resulting dialog, click the Use a Formula to Determine Which Cells to Format option in the top pane.
- In the bottom pane, enter the function =ISNA($D3) and click Format. The column reference must be absolute ($D).
- Click the Fill tab, select red and then click OK once. Figure C shows the function and the format.
- Click OK to return to the sheet. The result is a visual focus on the open purchase orders.
You might decide to hide the helper column, but I recommend that you not do so. It’s easy to forget hidden values down the road, which can make dependency errors hard to troubleshoot. Because both lists are Tables, the system will update automatically as you enter new records in the two lists.