Returning the top or bottom n records of a data set in Excel isn’t difficult, and there are many ways to do so. You can filter, use expressions, or even a PivotTable; your route might depend on how you’re using the results. But now, thanks to the new dynamic array functions, you can use one expression to return as many columns and rows of the source data as you like. In this article, I’ll talk a bit about these functions. Then, we’ll apply them to return the top and bottom n records.
SEE: 83 Excel tips every user should master (TechRepublic)
I’m using Microsoft 365 (desktop) on a Windows 10 64-bit system. Both dynamic array functions in this solution are available in Microsoft 365 and Excel 2021, Excel for the web, Excel for iPad and iPhone, Excel for Android tablets and phones. For your convenience, you can download the demonstration .xlsx file. This article assumes that you have basic Excel skills, but even a beginner should be able to follow the instructions to success.
In the past, returning the top or bottom n required a bit of work and some specialized knowledge. You could use an advanced filter, an expression, or even a PivotTable. The beauty of the new dynamic array functions is that you don’t alter the source data; these functions create a new data set.
If you’re not familiar with the older ways to return the top or bottom n records, you might read some of the following articles are on this subject:
- How to display the top n records in an Excel PivotTable using built-in options
- How to highlight the top n values in a Microsoft Excel sheet
- Sum the top n values in a range using dynamic table behavior in Excel
- We need a better solution: How would you extract the top n records in Excel?
They’re not necessarily outdated, especially if you want to work with the source data in place. If you want to work with a new data set, the new functions are the solution.
Figure A shows a simple data set as a Table. We want to return the top and bottom n values in the Value column. A combination of the functions reviewed in the next section will return n records. The sheet’s name is Data; you’ll need that in a bit.
About the functions in Excel
We’ll use three functions: SORT(), SEQUENCE() and INDEX(). The first two are dynamic array functions and fairly new to Microsoft 365. INDEX() has been around for a long time, and you might already be familiar with it. But first, just what is a dynamic array function?
If you’ve ever entered an expression using Ctrl + Shift + Enter, then you’re already familiar with how Excel used to work with dynamic arrays. Thanks to the new dynamic array feature, these types of expressions can be much easier to create and maintain. The results spill into the cells below, filling as many as necessary to complete the expression’s calculations. That’s called the spill range. If you see a spill error, then the range needed to fulfill the function isn’t available.
Now, onto the functions.
SORT() returns a sorted array using the following syntax:
where array is the only required argument and identifies the range to sort. The optional arguments follow:
- sort_index: A numeric offset value that identifies the row or column to sort by
- sort_order: The number 1 for ascending sort or -1 for descending, with 1 being the default
- by_col: The logical values TRUE for a row sort and FALSE for a column sort, with TRUE being the default
In order to return the top or bottom n records, the data set must be sorted, and we’ll use the SORT() function instead of a manual route.
SEQUENCE() returns a series of values using the following syntax:
where rows is required and specifies the number of rows to fill. The optional arguments follow:
- column: the number of columns to return
- start: the first number in the sequence
- step: the amount to increment by
In its simplest form, you could use this function to return a series of fixed values, but it truly shines when you want to return all of the columns in the source data set. By combining the two, you can return a full sorted data set.
The last function, INDEX(), returns a value or the reference to a value from a Table or range using the syntax:
INDEX(array, row_num, [column_num])
where array is required and references a range or an array constant. If array contains only one row or column, the corresponding row_num or column_num argument is optional. If array has more than one row and more than one column, and only row_num or column_num is used, INDEX() returns an array of the entire row or column in array. The last two arguments may or may not be required:
- row_num is required unless column_num is present. It selects the row in array from which to return a value. If row_num is omitted, column_num is required.
- column_num is optional and selects the column in array from which to return a value. If column_num is omitted, row_num is required.
Now, let’s start using these functions, starting with SORT()
How to use SORT() in Excel
To return the top or bottom n values of any data set, you need a sorted data set. Thanks to SORT() a manual sort isn’t necessary. To see how SORT() works, let’s use it to return a sorted data set of the example data set (Figure A). First, you need to copy the column headers to another area of the sheet, or even another sheet. I’m going to use another sheet, so you can see how easily this is achieved. Copy the header cells B2:F2 into a second sheet. Then enter into B3 (of the second sheet) the following function:
Figure B shows the source data sorted by the Value field. Here’s how the arguments in the SORT() function work:
- Data! is the sheet name where the source data is.
- B3:F13 is the original data set.
- 2 is the index value, which specifies the Value column—the second column in B3:F13. By changing the index value from 2 to 1, 3, 4, or 5, you can display one or all of the columns.
Wasn’t that easy?
If you’re selecting these references, Excel will display structure referencing because the source data is a Table. You’ll want to format the results because the SORT() function doesn’t. That’s a bit of a nuisance when trying to use this in a dashboard setting. You’ll want to add a VBA procedure that applies the formatting. This is the one shortcoming I’ve run into.
That’s a simple sort, but to achieve a top or bottom data set, you’ll need to add SEQUENCE() and INDEX().
The SORT() function can easily return the full data set in ascending or descending order. We want to also limit the number of records returned, so we’ll add SEQUENCE() and INDEX() in the form
The arguments for the first SEQUENCE() function will be input values, so the user can change the number of records returned. That means we need input cells. Use Figure C as a guide to set up the top and bottom ranges and the input cell range. Because I inserted rows at the top to accommodate the input cells, the range references you’ve been seeing will be different, so don’t let that confuse you.
Let’s start with the top data set, which requires a descending sort, expressed by -1 in the SORT() function’s sort_order argument. In H6 enter the function
It will return a calculation error because there’s no value in I2—the input cell. Enter 3 in I2 to return the data set shown in Figure D. You already know how the SORT() function works; in this case, it serves as the INDEX() function’s array argument. SEQUENCE(I2) is the row_number argument and returns 3 because the input value in I2 is 3. Consequently, the returns a descending data set with three rows. SEQUENCE(1,5) specifies columns 1 through 5 in the source data set. All put together SORT() returns a fully sorted data set, but the two SEQUENCE() functions limit it to three rows and includes all five columns.
The bottom function works similarly, but it omits the sort_order argument because ascending (1), is the default. In addition, it references I3 as the input cell. You could use only one input cell and have them both reference it, but this way both data sets can be a different number of rows.
In N6 enter the function
The nuts and bolts are essentially the same. The default sort_order argument, not entered, returns an ascending order, so you get the lowest values at the top of the sort. By referencing I3, you determine how many rows to return. Figure E shows the input value of 4. Consequently, the combo function returns all five columns for the first four rows—the bottom four values in Value.
At first, everything seems a little more complex that you might like, but once you become familiar with the new dynamic array functions, the easier your solutions will be for you.
Earlier I mentioned that these functions might return an error if the spill range isn’t available. If this happens, select the full spill range and remove everything—data, formats, everything and see if that doesn’t help. Of special note is merged cells. For some reason removing them doesn’t completely clear them. You may have to move the entire works to a new sheet.
The only limitation I’ve noticed is the functions’ failure to maintain formatting. In a future article, I’ll share a VBA procedure that applies the appropriate formatting for you when you change an input cell.
Subscribe to the Developer Insider Newsletter
From the hottest programming languages to commentary on the Linux OS, get the developer and open source news and tips you need to know. Delivered Tuesdays and Thursdays