Software

​Office Q&A: A dynamic age-grouping solution for Excel

In this month's Office Q&A, we combine a matrix and lookup function to create a dynamic age-grouping value.

hero

Vincent asked for help grouping records by age groups. Each person has an age, but he wants to group them by age groups and not by specific ages. Each group is defined by a beginning and ending age. For instance, a child who's 1, 2, or 3 would fall into the 1-3 group. It's a fairly standard need with the group's range controlled by a custom business rule. He knows he can enter the group values manually, but he'd rather not, and that's reasonable.

If you try to write a single expression to do this, you'll end up with a headache. I prefer a simpler approach. In this article, we'll build a matrix that uses a concatenating expression to build each age group. Then, we'll use a VLOOKUP() function to return the appropriate group for each person.

I'm using Excel 2016 on a Windows 10 64-bit system. This solution will work in earlier versions, but the Table object and IFERROR() aren't available in Excel 2003 or earlier. This technique is compatible with 365's browser version. You can work with your own data or download the demonstration .xlsx file.

SEE: Five ways to skip content when checking a Word document for spelling errors

Where to begin

The first step is to determine the range for each group. Vincent's groups span three years: 1 to 3, 4 to 6, 7 to 10, and so on. As you can see in Figure A, that's where we'll start. The value in H1 represents the range for each age group. You can update this value to change the matrix as needs arise.

Figure A

2016febqaa.jpg
The input value in cell H1 is the anchor for the age group matrix we'll build.

The matrix

You could fill the entire matrix with literal values, but I promise you, if you do, someone will change the rules and you'll have to modify the matrix, cell by cell. By using expressions that reference an input value, you can change the entire matrix by changing only that value.

In most cases, the first group will begin with age 1 (or 0). It's the only literal value you'll use in this matrix—enter 1 in cell F4. Now, let's create the matrix:

  1. Enter the following expression in cell G4 (Figure B):

    =$F4+$H$1 
  2. Enter the following expression in H4 to return your first group, 1-3:

    =$F4&"-"&$G4
  3. Enter the following expression in cell F5 (Figure C) to return the first age in the second group:

    =$G4+1 
  4. Copy the expressions in G4:H4 to row 5 (Figure D).

Figure B

2016febqab.jpg
This expression returns the end age value for each age group.

Figure C

2016febqac.jpg
The first age in the next group.

Figure D

2016febqad.jpg
Copy the expressions.

Convert to a Table

The matrix, as is, will work, but a Table is better because a Table is dynamic. That detail will matter later. Let's convert the matrix into a Table object as follows:

  1. Click anywhere inside the matrix.
  2. Click the Insert tab.
  3. Click Table in the Tables group.
  4. In the resulting dialog, check the My Table Has Headers option (Figure E), if necessary.
  5. Click OK. The resulting Table is shown in Figure F.

Figure E

2016febqae.jpg
Specify whether the data set has a header row.

Figure F

2016febqaf.jpg
Convert the matrix into a Table object.

Completing the matrix is as simple as extending the Table. Grab the bottom-right corner (circled in Figure F) and drag it down. How far down is up to you. Figure G shows a matrix that accommodates groups up to age 66.

Figure G

2016febqag.jpg
Extend the Table to complete the matrix.

Return the age groups

With the matrix in place, complete the solution by using the following VLOOKUP() function in column D:

=VLOOKUP($C4,Table3[#All],3)

This function has a few challenges, as you can see in Figure H:

  • There's no match for Sylvia, so the function returns the nearest group, 64-66.
  • Heather doesn't have an age, so the function returns an error.
  • Avery's age of 0 returns an error.

Figure H

2016febqah.jpg
The expression isn't perfect.

Meet the challenges

The first problem, Sylvia, might be hard to detect in a large data set so it's important to consider the possibility in the development stage. In this case, what is the oldest age you'll need to accommodate? There's no one-size-fits-all answer—you need to know your data. For this example, we can quickly extend the Table and VLOOKUP() function, as shown in Figure I, by extending the Table. Earlier, I told you the Table would come in handy. If this weren't a Table, you'd have to update the VLOOKUP() functions in column D to reference the new rows in the matrix.

Figure I

2016febqai.jpg
Extending the Table solves the first problem.

If you need to consider the oldest age, you should also consider the youngest age. The matrix starts at 1, so the blank and 0 values return errors. You can mask them by adding IFERROR() to the expression, as shown in Figure J:

=IFERROR(VLOOKUP($C4,Table3[#All],3),"")

Figure J

2016febqaj.jpg
Use IFERROR() to handle the error values.

You could argue that the addition of IFERROR() doesn't handle the errors very well, and I would agree. The function masks the errors, but it doesn't do so in a meaningful way. If it's adequate for your purposes, there's no harm in this simple approach.

Knowing your data and how others will use it is the key. Most likely the blank can be handled with a business rule; simply don't allow blanks. Easier said than done, I know. You could employ data validation and/or conditional formatting to catch blanks to eliminate them.

That leaves us with the 0 value. Should the first age value in the matrix be 0 and not 1? Changing it changes the matrix and data set, as you can see in Figure K, but that change might be appropriate. This ability was built in on purpose to make the sheet easy to adapt when the rules change—that rule being the group's range of years.

Figure K

2016febqak.jpg
Changing the anchor age value in the matrix changes everything.

How you accommodate 0, if it's a valid age value, will depend on how you're using the data. For now, Vincent has a solution that returns the age group labels (in column D) that he can use in ways that are meaningful to him. How would you accommodate a 0 or .5 age value if the first matrix value must be 1? Share your suggestions in the Comments section below.

Send me your question about Office

I answer readers' questions when I can, but there's no guarantee. Don't send files unless requested; initial requests for help that arrive with attached files will be deleted unread. You can send screenshots of your data to help clarify your question. When contacting me, be as specific as possible. For example, "Please troubleshoot my workbook and fix what's wrong" probably won't get a response, but "Can you tell me why this formula isn't returning the expected results?" might. Please mention the app and version that you're using. I'm not reimbursed by TechRepublic for my time or expertise when helping readers, nor do I ask for a fee from readers I help. You can contact me at susansalesharkins@gmail.com.

Also read...

About Susan Harkins

Susan Sales Harkins is an IT consultant, specializing in desktop solutions. Previously, she was editor in chief for The Cobb Group, the world's largest publisher of technical journals.

Editor's Picks

Free Newsletters, In your Inbox