How to Find and Return Duplicate Values with the Excel Index Function

The Excel index function can return a value or a reference from a range of cells or an array based on a given row and column number. However, what if you want to use the index function to find multiple matches or duplicate values in a range? In this article, we will explore some methods to do that with the help of other functions such as small, if, row, countif, and aggregate.

The index function has two syntax forms: array and reference. The array form is more commonly used and it looks like this:

=INDEX(array, row_num, [column_num])

The array is the range of cells or an array constant that contains the values you want to return. The row_num is the row position in the array that you want to return. The column_num is the optional column position in the array that you want to return. If the array has a single column, you can omit the column_num and the index function returns the value in the given row. If the array has several columns and you omit the column_num or set it to 0, the index function returns the entire row as an array.

The match function is often used with the index function to return a value based on a lookup value. The match function looks like this:

=MATCH(lookup_value, lookup_array, [match_type])

The lookup_value is the value that you want to find in the lookup_array. The lookup_array is the range of cells or an array that contains the values you want to search. The match_type is an optional argument that specifies how you want to match the lookup_value. It can be 1, 0, or -1. If you use 0, the match function will return the relative position of the first exact match of the lookup_value in the lookup_array. If there are multiple matches, the match function will only return the first one.
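The first-match-only behavior described above can be illustrated with a small Python sketch. It is an emulation for illustration only, not Excel itself: the helper names excel_match_exact and excel_index are hypothetical, and the data is the sales table used in the examples later in this article.

```python
def excel_match_exact(lookup_value, lookup_array):
    """Emulate MATCH(lookup_value, lookup_array, 0): 1-based position of the first exact match."""
    for position, value in enumerate(lookup_array, start=1):
        if value == lookup_value:
            return position
    return None  # Excel would show #N/A here


def excel_index(column, row_num):
    """Emulate INDEX on a single-column array with a 1-based row_num."""
    return column[row_num - 1]


salespeople = ["Peter", "John", "Mary", "Lisa", "David", "James", "Anna"]
sales = [541, 538, 523, 498, 541, 538, 523]

position = excel_match_exact(541, sales)
first_hit = excel_index(salespeople, position)
print(position, first_hit)  # prints: 1 Peter
```

The value 541 also appears in the fifth row (David), but a plain index and match combination never reaches it, which is the gap the methods below are meant to close.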

Procedures

To use the index function with duplicate values, we need to modify the row_num argument of the index function to return multiple positions instead of one. There are several ways to do that, but we will focus on three methods in this article:

  • Method 1: Using index, small, if, and countif functions
  • Method 2: Using index, row, and small functions
  • Method 3: Using index, row, and aggregate functions

Each method has its own advantages and disadvantages, and we will explain them in detail in the following sections.

Method 1: Using index, small, if, and countif functions

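This method uses the if function to build an array of the positions where the lookup_array matches the lookup_value, and the small function to pick the k-th of those positions. In the example below, the countif function supplies k by counting how many times the value has occurred so far. The formula looks like this: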

=INDEX(return_array,SMALL(IF(lookup_array=lookup_value,ROW(lookup_array)-ROW(INDEX(lookup_array,1,1))+1),k))

The return_array is the range of cells or an array that contains the values you want to return. The lookup_array is the range of cells or an array that contains the values you want to search. The lookup_value is the value that you want to find in the lookup_array. The k is the rank or the position of the duplicate value that you want to return. For example, if you want to return the second duplicate value, you use k=2.

This formula is an array formula, which means that you need to press Ctrl+Shift+Enter to enter it in a cell. In Excel 365 and Excel 2021, which support dynamic arrays, pressing Enter is enough. You can then copy the formula down or across to return more duplicate values.

Let’s see how this formula works with an example.

Example

Suppose you have a table of sales data like this:

Salesperson  State  Sales
Peter        CA     541
John         NY     538
Mary         TX     523
Lisa         FL     498
David        CA     541
James        NY     538
Anna         TX     523

The table occupies A1:C8, with the salesperson names in A2:A8 and the sales values in C2:C8. The sales values you want to look up are listed in D2:D8, and you want the index function to return the matching salesperson name in column E. Because the sales values contain duplicates, you need the method 1 formula, with column F supplying the k argument.

The formula in cell E2 is:

=INDEX($A$2:$A$8,SMALL(IF($C$2:$C$8=D2,ROW($C$2:$C$8)-ROW(INDEX($C$2:$C$8,1,1))+1),F2))

This is an array formula, so you need to press Ctrl+Shift+Enter to enter it. Then you can copy the formula down to cell E8.

The formula in cell F2 is:

=COUNTIF($D$2:D2,D2)

This is a standard formula that keeps a running count of how many times the sales value has appeared in column D up to the current row. You can copy the formula down to cell F8.
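As a rough illustration of what the expanding range $D$2:D2 does when the formula is filled down, here is a short Python sketch. The function name running_count is a hypothetical helper, and the list stands in for the lookup values in column D.

```python
def running_count(values):
    """Emulate =COUNTIF($D$2:D2, D2) filled down: each result counts how many
    times the current value has appeared from the first row up to this row."""
    seen = {}
    counts = []
    for value in values:
        seen[value] = seen.get(value, 0) + 1
        counts.append(seen[value])
    return counts


lookup_values = [541, 538, 523, 498, 541, 538, 523]  # column D
ranks = running_count(lookup_values)                   # column F
print(ranks)  # prints: [1, 1, 1, 1, 2, 2, 2]
```

The first occurrence of each sales value gets 1 and the second gets 2, which is exactly the k that the small function needs for that row.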

The result is:

Sales  Salesperson  Rank
541    Peter        1
538    John         1
523    Mary         1
498    Lisa         1
541    David        2
538    James        2
523    Anna         2

As you can see, the formula returns the salesperson name for each sales value, even if there are duplicates. The rank column shows the position of the duplicate value.

Formula Breakdown

Let’s break down the formula in cell E2 to see how it works.

  • The index function returns a value from the return_array based on the row_num argument, which is the small function in this case.
  • The small function returns the k-th smallest value from an array of values, which is the if function in this case.
  • The if function returns an array of row numbers where the lookup_array matches the lookup_value. For example, if the lookup_value is 541, the if function returns {1;FALSE;FALSE;FALSE;5;FALSE;FALSE}.
  • The row function returns the row number of a reference. In this case, it returns the row numbers of the lookup_array minus the row number of the first cell of the lookup_array plus one. This is to adjust the row number to match the position in the array. For example, if the lookup_array is C2:C8, the row function returns {2;3;4;5;6;7;8} minus {2;2;2;2;2;2;2} plus {1;1;1;1;1;1;1}, which is {1;2;3;4;5;6;7}.
  • The countif function counts the number of occurrences of the lookup_value in the range D2:D2, which is 1 in this case. This is the k argument of the small function, which means that we want to return the first smallest value from the if function array.
  • The small function returns the first smallest value from the if function array, which is 1 in this case. This is the row_num argument of the index function, which means that we want to return the value from the first row of the return_array.
  • The index function returns the value from the first row of the return_array, which is Peter in this case.
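The steps above can be traced with a minimal Python sketch. It only emulates the evaluation order of the Excel formula; the helper names (if_positions, small_ignoring_logicals, kth_match) are hypothetical and chosen for this illustration.

```python
def if_positions(lookup_array, lookup_value):
    """Emulate IF(lookup_array=lookup_value, relative_position):
    matching rows give their 1-based position, the others give False."""
    return [
        i if value == lookup_value else False
        for i, value in enumerate(lookup_array, start=1)
    ]


def small_ignoring_logicals(values, k):
    """Emulate SMALL(values, k): logical values in the array are skipped."""
    numbers = sorted(v for v in values if v is not False)
    return numbers[k - 1] if 1 <= k <= len(numbers) else None  # None ~ #NUM!


def kth_match(return_array, lookup_array, lookup_value, k):
    """Emulate INDEX(return_array, SMALL(IF(...), k))."""
    row_num = small_ignoring_logicals(if_positions(lookup_array, lookup_value), k)
    return None if row_num is None else return_array[row_num - 1]


salespeople = ["Peter", "John", "Mary", "Lisa", "David", "James", "Anna"]
sales = [541, 538, 523, 498, 541, 538, 523]

print(if_positions(sales, 541))               # [1, False, False, False, 5, False, False]
print(kth_match(salespeople, sales, 541, 1))  # Peter
print(kth_match(salespeople, sales, 541, 2))  # David
```

Asking for a third match of 541 returns None in this sketch, which corresponds to the #NUM! error Excel shows when k exceeds the number of matches.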

Method 2: Using index, row, and small functions

This method is based on the idea of multiplying each row position by the result of the comparison, so that matching rows keep their position and non-matching rows become zero, and then using the small function to return the position of the duplicate values. Because the zeros sort ahead of the matches, the k argument has to be shifted past them. The formula looks like this:

=INDEX(return_array,SMALL((ROW(lookup_array)-ROW(INDEX(lookup_array,1,1))+1)*(lookup_array=lookup_value),k+ROWS(lookup_array)-COUNTIF(lookup_array,lookup_value)))

The return_array is the range of cells or an array that contains the values you want to return. The lookup_array is the range of cells or an array that contains the values you want to search. The lookup_value is the value that you want to find in the lookup_array. The k is the rank or the position of the duplicate value that you want to return. For example, if you want to return the second duplicate value, you use k=2. The term ROWS(lookup_array)-COUNTIF(lookup_array,lookup_value) is the number of non-matching rows, which is added to k so that the small function skips the zeros.

This formula is also an array formula, which means that you need to press Ctrl+Shift+Enter to enter it in a cell. You can then copy the formula down or across to return more duplicate values.

Let’s see how this formula works with an example.

Example

Suppose you have the same table of sales data as before:

Salesperson  State  Sales
Peter        CA     541
John         NY     538
Mary         TX     523
Lisa         FL     498
David        CA     541
James        NY     538
Anna         TX     523

The table occupies A1:C8, the sales values you want to look up are listed in D2:D8, and you want the index function to return the matching salesperson name in column E. Because the sales values contain duplicates, you need the method 2 formula, with column F supplying the k argument.

The formula in cell E2 is:

=INDEX($A$2:$A$8,SMALL((ROW($C$2:$C$8)-ROW(INDEX($C$2:$C$8,1,1))+1)*($C$2:$C$8=D2),F2+ROWS($C$2:$C$8)-COUNTIF($C$2:$C$8,D2)))

This is an array formula, so you need to press Ctrl+Shift+Enter to enter it. Then you can copy the formula down to cell E8.

The formula in cell F2 is:

=COUNTIF($D$2:D2,D2)

This is a standard formula that keeps a running count of how many times the sales value has appeared in column D up to the current row. You can copy the formula down to cell F8.

The result is:

Sales  Salesperson  Rank
541    Peter        1
538    John         1
523    Mary         1
498    Lisa         1
541    David        2
538    James        2
523    Anna         2

As you can see, the formula returns the salesperson name for each sales value, even if there are duplicates. The rank column shows the position of the duplicate value.

Formula Breakdown

Let’s break down the formula in cell E2 to see how it works.

  • The index function returns a value from the return_array based on the row_num argument, which is the small function in this case.
  • The small function returns the k-th smallest value from an array of values, which is the product of the adjusted row numbers and the comparison operator in this case.
  • The row function returns an array of row numbers of the lookup_array. Subtracting the row number of the first cell of the lookup_array and adding one converts them to positions in the array. For example, if the lookup_array is C2:C8, the row function returns {2;3;4;5;6;7;8} and the adjusted positions are {1;2;3;4;5;6;7}.
  • The comparison operator returns an array of TRUE or FALSE values based on whether the lookup_array matches the lookup_value. For example, if the lookup_value is 541, the comparison operator returns {TRUE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE}.
  • The product of the adjusted positions and the comparison operator returns an array of numbers where the matching values are the positions and the non-matching values are zero. For example, if the lookup_value is 541, the product returns {1;0;0;0;5;0;0}.
  • The countif function in F2 counts the number of occurrences of the lookup_value in the range D2:D2, which is 1 in this case. The expression ROWS($C$2:$C$8)-COUNTIF($C$2:$C$8,D2) counts the non-matching rows, which is 7 minus 2, or 5. The k argument of the small function is therefore 1 plus 5, or 6.
  • The small function returns the sixth smallest value from the product array, which is 1 in this case because the five zeros come first. This is the row_num argument of the index function, which means that we want to return the value from the first row of the return_array.
  • The index function returns the value from the first row of the return_array, which is Peter in this case.
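Here is a minimal Python sketch of the multiply-then-small idea. It is an emulation rather than Excel itself. It shows that the non-matching rows turn into zeros that sort ahead of the matches, so k has to be shifted by the number of non-matching rows before the k-th match can be read off. The function name kth_match_by_product is a hypothetical helper, and the data is the sales table above.

```python
def kth_match_by_product(return_array, lookup_array, lookup_value, k):
    # position * (lookup_array = lookup_value): TRUE acts as 1, FALSE as 0
    products = [
        i * (value == lookup_value)
        for i, value in enumerate(lookup_array, start=1)
    ]
    # Rows that did not match contribute zeros, which SMALL would return first
    non_matches = products.count(0)
    ordered = sorted(products)
    shifted_k = k + non_matches  # skip past the zeros
    if k < 1 or shifted_k > len(ordered):
        return None  # Excel would return #NUM! here
    row_num = ordered[shifted_k - 1]
    return return_array[row_num - 1]


salespeople = ["Peter", "John", "Mary", "Lisa", "David", "James", "Anna"]
sales = [541, 538, 523, 498, 541, 538, 523]

products_541 = [i * (v == 541) for i, v in enumerate(sales, start=1)]
print(products_541)          # [1, 0, 0, 0, 5, 0, 0]
print(sorted(products_541))  # [0, 0, 0, 0, 0, 1, 5]; the first real match is 6th
print(kth_match_by_product(salespeople, sales, 541, 1))  # Peter
print(kth_match_by_product(salespeople, sales, 541, 2))  # David
```

Without the shift, the smallest value in the product array is always 0 whenever at least one row fails to match, so the unshifted k would never reach a real position.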

Method 3: Using index, row, and aggregate functions

This method is based on the idea of using the aggregate function to perform array operations and return the position of the duplicate values. The formula looks like this:

=INDEX(return_array,AGGREGATE(15,6,(ROW(lookup_array)-ROW(INDEX(lookup_array,1,1))+1)/(lookup_array=lookup_value),k))

The return_array is the range of cells or an array that contains the values you want to return. The lookup_array is the range of cells or an array that contains the values you want to search. The lookup_value is the value that you want to find in the lookup_array. The k is the rank or the position of the duplicate value that you want to return. For example, if you want to return the second duplicate value, you use k=2.

This formula is a standard formula, which means that you do not need to press Ctrl+Shift+Enter to enter it in a cell. You can then copy the formula down or across to return more duplicate values.

Let’s see how this formula works with an example.

Example

Suppose you have the same table of sales data as before:

Salesperson  State  Sales
Peter        CA     541
John         NY     538
Mary         TX     523
Lisa         FL     498
David        CA     541
James        NY     538
Anna         TX     523

The table occupies A1:C8, the sales values you want to look up are listed in D2:D8, and you want the index function to return the matching salesperson name in column E. Because the sales values contain duplicates, you need the method 3 formula, with column F supplying the k argument.

The formula in cell E2 is:

=INDEX($A$2:$A$8,AGGREGATE(15,6,(ROW($C$2:$C$8)-ROW(INDEX($C$2:$C$8,1,1))+1)/($C$2:$C$8=D2),F2))

This is a standard formula, so you just need to press Enter to enter it. Then you can copy the formula down to cell E8.

The formula in cell F2 is:

=COUNTIF($D$2:D2,D2)

This is a standard formula that keeps a running count of how many times the sales value has appeared in column D up to the current row. You can copy the formula down to cell F8.

The result is:

Sales  Salesperson  Rank
541    Peter        1
538    John         1
523    Mary         1
498    Lisa         1
541    David        2
538    James        2
523    Anna         2

As you can see, the formula returns the salesperson name for each sales value, even if there are duplicates. The rank column shows the position of the duplicate value.

Formula Breakdown

Let’s break down the formula in cell E2 to see how it works.

  • The index function returns a value from the return_array based on the row_num argument, which is the aggregate function in this case.
  • The aggregate function performs array operations and returns a value based on the function_num, options, array, and k arguments. In this case, the function_num is 15, which means that we want to return the k-th smallest value from the array. The options is 6, which means that we want to ignore any error values in the array. The array is the quotient of the row function and the comparison operator. The k is the rank or the position of the duplicate value that we want to return.
  • The row function returns an array of row numbers of the lookup_array minus the row number of the first cell of the lookup_array plus one. This is to adjust the row number to match the position in the array. For example, if the lookup_array is C2:C8, the row function returns {2;3;4;5;6;7;8} minus {2;2;2;2;2;2;2} plus {1;1;1;1;1;1;1}, which is {1;2;3;4;5;6;7}.
  • The comparison operator returns an array of TRUE or FALSE values based on whether the lookup_array matches the lookup_value. For example, if the lookup_value is 541, the comparison operator returns {TRUE;FALSE;FALSE;FALSE;TRUE;FALSE;FALSE}.
  • The quotient of the row function and the comparison operator returns an array of numbers where the matching values are the row numbers and the non-matching values are error values. For example, if the lookup_value is 541, the quotient returns {1;#DIV/0!;#DIV/0!;#DIV/0!;5;#DIV/0!;#DIV/0!}.
  • The countif function counts the number of occurrences of the lookup_value in the range D2:D2, which is 1 in this case. This is the k argument of the aggregate function, which means that we want to return the first smallest value from the quotient array.
  • The aggregate function returns the first smallest value from the quotient array, ignoring any error values. In this case, it returns 1. This is the row_num argument of the index function, which means that we want to return the value from the first row of the return_array.
  • The index function returns the value from the first row of the return_array, which is Peter in this case.
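The breakdown above can be sketched in Python as follows. This is an illustrative emulation under stated assumptions: None stands in for the #DIV/0! error, and the helper names (aggregate_small_ignore_errors, kth_match_aggregate) are hypothetical.

```python
def aggregate_small_ignore_errors(values, k):
    """Emulate AGGREGATE(15, 6, values, k): k-th smallest value, skipping errors."""
    numbers = sorted(v for v in values if v is not None)
    return numbers[k - 1] if 1 <= k <= len(numbers) else None  # None ~ #NUM!


def kth_match_aggregate(return_array, lookup_array, lookup_value, k):
    quotients = []
    for i, value in enumerate(lookup_array, start=1):
        try:
            # position / TRUE keeps the position; position / FALSE divides by zero
            quotients.append(i / (value == lookup_value))
        except ZeroDivisionError:
            quotients.append(None)  # stands in for #DIV/0!
    row_num = aggregate_small_ignore_errors(quotients, k)
    return None if row_num is None else return_array[int(row_num) - 1]


salespeople = ["Peter", "John", "Mary", "Lisa", "David", "James", "Anna"]
sales = [541, 538, 523, 498, 541, 538, 523]

print(kth_match_aggregate(salespeople, sales, 541, 1))  # Peter
print(kth_match_aggregate(salespeople, sales, 541, 2))  # David
```

Turning non-matches into errors rather than zeros is what lets the k-th smallest value be read off directly, since the errors are dropped instead of being sorted ahead of the matches.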

Comparison of Methods

The three methods that we have discussed in this article have their own advantages and disadvantages. Here is a summary of them:

Method    Pros                                                  Cons
Method 1  Works with any version of Excel                       Requires array entry (Ctrl+Shift+Enter)
Method 2  Works with any version of Excel; no if function       Requires array entry (Ctrl+Shift+Enter)
Method 3  Does not require array entry                          Requires Excel 2010 or later

Depending on your situation and preference, you can choose the method that suits you best. However, if you want a more robust and flexible solution, you might want to consider using a pivot table or a filter to extract the duplicate values from your data.
