A distinct count is a way of counting the number of different or unique values in a data set. For example, if you have a list of names, a distinct count will tell you how many different names there are in the list, regardless of how many times each name appears.
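As a quick illustration outside Excel, the idea can be sketched in a few lines of Python. The list below is made up for the example; the point is only the difference between counting every entry and counting each different value once.

```python
# A small, made-up list of names; the duplicates are intentional.
names = ["Alice", "Bob", "Alice", "Bob", "Charlie"]

total_count = len(names)          # counts every entry, duplicates included
distinct_count = len(set(names))  # counts each different name once

print(total_count, distinct_count)
```

A set keeps only one copy of each value, so its length is the distinct count.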
A pivot table is a tool that allows you to summarize and analyze data in Excel. You can use a pivot table to create reports, charts, and dashboards based on your data. A pivot table can also perform calculations on the data, such as sum, average, count, etc.
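To do a distinct count in a pivot table, you use an option called Distinct Count, which is available in Excel 2013 and later versions. The option is offered only when the pivot table is built on the Data Model, that is, when you check Add this data to the Data Model while creating the pivot table. It counts the number of unique values in a field that you add to the Values area of the pivot table.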
However, if you want to do a distinct count based on unique values across multiple columns, you need to use a different approach. This is because the Distinct Count option only works on one field at a time, and it does not consider the combination of values in different fields.
For example, suppose you have a data set like this:
Name | Color | Shape |
---|---|---|
Alice | Red | Circle |
Bob | Blue | Square |
Alice | Green | Triangle |
Bob | Blue | Square |
Charlie | Yellow | Star |
If you want to count the number of unique combinations of Name and Color, you cannot use the Distinct Count option on either field alone. Distinct Count on Name returns 3 (Alice, Bob, Charlie), because it ignores Color and treats Alice with Red and Alice with Green as the same value. Distinct Count on Color returns 4 (Red, Blue, Green, Yellow), which matches the number of combinations here only by coincidence, because no color is shared by two different names. If Charlie's color were Blue, Distinct Count on Color would return 3 while the number of combinations would still be 4.
The correct answer for the distinct count based on Name and Color is 4, because there are 4 different combinations of Name and Color in the data set:
- Alice, Red
- Bob, Blue
- Alice, Green
- Charlie, Yellow
To get this answer, you need to use a different method, which I will explain in the next section.
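Before turning to Excel, the expected numbers can be cross-checked outside it. The following is a minimal Python sketch, not part of the Excel workflow; it assumes the five sample rows above and compares the single-column distinct counts with the distinct count of (Name, Color) pairs.

```python
# The five sample rows as (Name, Color, Shape) tuples.
rows = [
    ("Alice", "Red", "Circle"),
    ("Bob", "Blue", "Square"),
    ("Alice", "Green", "Triangle"),
    ("Bob", "Blue", "Square"),
    ("Charlie", "Yellow", "Star"),
]

# Single-column distinct counts look at one field and ignore the other.
distinct_names = {name for name, _color, _shape in rows}
distinct_colors = {color for _name, color, _shape in rows}

# The multi-column distinct count treats each (Name, Color) pair as one value.
distinct_pairs = {(name, color) for name, color, _shape in rows}

print(len(distinct_names))   # 3
print(len(distinct_colors))  # 4
print(len(distinct_pairs))   # 4
```

The Color count equals the pair count only because of how this particular sample is built; in general, neither single-column count can stand in for the pair count.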
Procedures
There are two main methods to do a distinct count in a pivot table based on unique values across multiple columns in Excel:
- Method 1: Using a helper column in the source data
- Method 2: Using Power Pivot and Data Model
Method 1: Using a helper column in the source data
This method adds a new column to the source data that combines the values of the columns that you want to count. The helper column itself works in any version of Excel. The Distinct Count option that you then apply to the new column in the pivot table requires Excel 2013 or later and a pivot table built on the Data Model.
Here are the steps to follow:
- In the source data, add a new column to the right of the existing columns. Give it a name, such as Combined.
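- In the first row of the new column, enter a formula that concatenates the values of the columns that you want to count, separated by a delimiter, such as a comma. Choose a delimiter that does not occur in the data, so that two different pairs cannot produce the same combined text. The formula below uses structured references, which assume the source data is formatted as an Excel table; in a plain range, use cell references instead, for example =A2&","&B2 if Name is in column A and Color is in column B. For example, if you want to count the unique combinations of Name and Color, you can use this formula: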
= [@[Name]]&","&[@[Color]]
This formula will create a new value that combines the Name and Color values in the same row, separated by a comma. For example, for Alice, Red, the formula will return Alice,Red.
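- Fill the formula down the entire column. If the source data is an Excel table, the formula normally fills the whole column automatically. Otherwise, drag the fill handle, or double-click it at the bottom-right corner of the cell to fill down quickly.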
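- Create a pivot table from the source data. You can use the Insert > Pivot Table command to do this. In the Create Pivot Table dialog box, check the option Add this data to the Data Model; without it, the Distinct Count option used in a later step is not offered.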
- In the Pivot Table Fields pane, drag the Combined field to the Values area. By default, Excel will use the Count function to summarize the data.
- Right-click on any cell in the Values area, and choose Summarize Values By > Distinct Count. This will change the calculation to count the number of unique values in the Combined field.
- Optionally, you can drag another field to the Rows or Columns area to create a more detailed report. For example, you can drag the Shape field to the Rows area to see the distinct count of Name and Color for each Shape.
Here is an example of the pivot table created using this method:
Shape | Distinct Count of Combined |
---|---|
Circle | 1 |
Square | 1 |
Star | 1 |
Triangle | 1 |
Grand Total | 4 |
As you can see, the pivot table shows the correct answer for the distinct count based on Name and Color, which is 4. It also shows the breakdown by Shape, which can be useful for further analysis.
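The advantage of this method is that it is simple: the helper column is an ordinary worksheet formula, and no DAX is needed. The disadvantages are that it requires modifying the source data, which may not be desirable or possible in some cases, and that the Distinct Count option still requires Excel 2013 or later with the data added to the Data Model.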
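The logic of the helper column method can also be sketched in Python, as a rough analogue rather than a description of how Excel works internally. The sketch assumes the five sample rows, builds the Combined value with the same comma delimiter, and then counts distinct Combined values per Shape and overall.

```python
from collections import defaultdict

# The five sample rows, one dict per row.
rows = [
    {"Name": "Alice", "Color": "Red", "Shape": "Circle"},
    {"Name": "Bob", "Color": "Blue", "Shape": "Square"},
    {"Name": "Alice", "Color": "Green", "Shape": "Triangle"},
    {"Name": "Bob", "Color": "Blue", "Shape": "Square"},
    {"Name": "Charlie", "Color": "Yellow", "Shape": "Star"},
]

# Step 1: the helper column, Name and Color joined by a comma.
for row in rows:
    row["Combined"] = row["Name"] + "," + row["Color"]

# Step 2: collect the distinct Combined values for each Shape.
combined_by_shape = defaultdict(set)
for row in rows:
    combined_by_shape[row["Shape"]].add(row["Combined"])

per_shape = {shape: len(values) for shape, values in sorted(combined_by_shape.items())}

# Step 3: the grand total is the distinct count over all rows.
grand_total = len({row["Combined"] for row in rows})

print(per_shape)
print(grand_total)
```

The two Square rows share the value Bob,Blue, which is why Square counts as 1 and the grand total is 4 rather than 5.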
Method 2: Using Power Pivot and Data Model
This method works in Excel 2013 and later versions, and it involves using Power Pivot and Data Model features to create a measure that calculates the distinct count based on multiple columns. Then, you can use the measure in the pivot table without modifying the source data.
Here are the steps to follow:
- Create a pivot table from the source data. You can use the Insert > Pivot Table command to do this.
- In the Create Pivot Table dialog box, make sure to check the option Add this data to the Data Model. This will enable the Power Pivot and Data Model features for the pivot table.
- Click OK to create the pivot table.
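- In the Pivot Table Fields pane, right-click on the table name, and choose Add Measure. This will open the Measure dialog box, where you can create a custom calculation for the pivot table. In Excel 2013, measures are called calculated fields and are created from the Power Pivot tab instead.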
- In the Measure dialog box, give the measure a name, such as Distinct Count. You can also choose a format and a description for the measure.
- In the Formula box, enter a formula that counts the distinct combinations of values across the columns. The DAX function DISTINCTCOUNT accepts only a single column, so it cannot take Name and Color together. Instead, build a table of the distinct combinations with SUMMARIZE and count its rows with COUNTROWS. The general pattern is:
=COUNTROWS(SUMMARIZE(table, table[column1], table[column2], ...))
SUMMARIZE returns one row for each distinct combination of values in the listed columns, and COUNTROWS counts those rows. For example, if you want to count the unique combinations of Name and Color, you can use this formula:
=COUNTROWS(SUMMARIZE('Table1', 'Table1'[Name], 'Table1'[Color]))
This formula counts the number of distinct Name and Color pairs in the Table1 table, which is the source data for the pivot table.
- Click OK to create the measure. The measure will appear in the Pivot Table Fields pane, under the Measures group.
- Drag the measure to the Values area of the pivot table. This will show the distinct count based on the columns that you specified in the formula.
- Optionally, you can drag another field to the Rows or Columns area to create a more detailed report. For example, you can drag the Shape field to the Rows area to see the distinct count of Name and Color for each Shape.
Here is an example of the pivot table created using this method:
Shape | Distinct Count |
---|---|
Circle | 1 |
Square | 1 |
Star | 1 |
Triangle | 1 |
Grand Total | 4 |
As you can see, the pivot table shows the same answer as the previous method, which is 4. It also shows the breakdown by Shape, which can be useful for further analysis.
The advantage of this method is that it does not require modifying the source data, and it allows you to create more complex calculations using the Power Pivot and Data Model features. The disadvantage is that it only works in Excel 2013 and later versions, and it may require some familiarity with the Power Pivot and Data Model features.
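One way to picture how a measure behaves is as a function that is evaluated again for every cell of the pivot table, each time seeing only the rows that belong to that cell. The Python sketch below is a loose analogy under that assumption, not DAX and not Excel's actual engine; the function name distinct_name_color is hypothetical. It leaves the source rows untouched and counts pairs directly, with no helper column.

```python
# The five sample rows; nothing is added to them.
rows = [
    {"Name": "Alice", "Color": "Red", "Shape": "Circle"},
    {"Name": "Bob", "Color": "Blue", "Shape": "Square"},
    {"Name": "Alice", "Color": "Green", "Shape": "Triangle"},
    {"Name": "Bob", "Color": "Blue", "Shape": "Square"},
    {"Name": "Charlie", "Color": "Yellow", "Shape": "Star"},
]


def distinct_name_color(visible_rows):
    """Count distinct (Name, Color) pairs among the rows passed in."""
    return len({(r["Name"], r["Color"]) for r in visible_rows})


# One evaluation per Shape, restricted to that Shape's rows.
report = {}
for shape in sorted({r["Shape"] for r in rows}):
    report[shape] = distinct_name_color([r for r in rows if r["Shape"] == shape])

# The grand total is evaluated over all rows, not summed from the lines above.
report["Grand Total"] = distinct_name_color(rows)

for label, value in report.items():
    print(label, value)
```

Using a tuple as the key avoids the need for a delimiter, which is one practical difference from the concatenated helper column.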
Comprehensive explanation
To understand how these methods work, it is helpful to think in terms of a composite key: a value built from several columns that identifies one combination of their values. In the data set below, the Name column alone cannot identify a combination, because Alice appears with two different colors. The Color column alone is not reliable either, because nothing prevents two different names from sharing a color. The pair of Name and Color does identify each combination. Note that the pair is not unique per row: Bob and Blue appear together in two rows, and that repetition is exactly what a distinct count must collapse into one.
Name | Color | Shape |
---|---|---|
Alice | Red | Circle |
Bob | Blue | Square |
Alice | Green | Triangle |
Bob | Blue | Square |
Charlie | Yellow | Star |
To do a distinct count based on multiple columns, we need a single value per row that represents the combination of those columns. Then, we can count how many different such values occur.
The first method uses a helper column in the source data to create the unique identifier. The helper column uses a formula that concatenates the values of the columns that we want to count, separated by a delimiter. For example, if we want to count the unique combinations of Name and Color, we can use this formula in the helper column:
= [@[Name]]&","&[@[Color]]
This formula will create a new value that combines the Name and Color values in the same row, separated by a comma. For example, for Alice, Red, the formula will return Alice,Red.
The helper column will look like this:
Name | Color | Shape | Combined |
---|---|---|---|
Alice | Red | Circle | Alice,Red |
Bob | Blue | Square | Bob,Blue |
Alice | Green | Triangle | Alice,Green |
Bob | Blue | Square | Bob,Blue |
Charlie | Yellow | Star | Charlie,Yellow |
As you can see, the Combined column holds one value per row, and rows with the same Name and Color get the same text. The column has five values but only four distinct ones, because Bob,Blue appears twice.
To count the number of unique values in the Combined column, we can use the Distinct Count option in a pivot table that is built on the Data Model. This option will count the number of different values in a field that we add to the Values area of the pivot table.
To use the Distinct Count option, we need to right-click on any cell in the Values area, and choose Summarize Values By > Distinct Count. This will change the calculation to count the number of unique values in the Combined field.
The result will be the distinct count based on Name and Color, which is 4. We can also drag another field to the Rows or Columns area to create a more detailed report. For example, we can drag the Shape field to the Rows area to see the distinct count of Name and Color for each Shape.
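One caveat of the concatenation approach is the choice of delimiter. If the delimiter can also occur inside the data, two different pairs may produce the same combined text and be counted once. The sketch below uses two hypothetical pairs, chosen purely to show the collision; they do not come from the sample data.

```python
# Hypothetical values picked so that a comma delimiter collides.
pairs = [("A,B", "C"), ("A", "B,C")]

# Joined with a comma, both pairs become the text "A,B,C".
with_comma = {first + "," + second for first, second in pairs}

# Joined with a character that is absent from the data, they stay different.
with_pipe = {first + "|" + second for first, second in pairs}

print(len(with_comma))  # 1
print(len(with_pipe))   # 2
```

In Excel the same reasoning applies to the helper column formula: a delimiter such as | is safer than a comma only if it does not occur in the columns being combined.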
The second method uses Power Pivot and Data Model features to create a measure that calculates the distinct count based on multiple columns. A measure is a custom calculation that we can create for the pivot table using the Data Analysis Expressions (DAX) language.
To create a measure, we need to right-click on the table name in the Pivot Table Fields pane, and choose Add Measure. This will open the Measure dialog box, where we can enter a name, a format, a description, and a formula for the measure.
The formula for the measure cannot rely on DISTINCTCOUNT alone, because that DAX function accepts only a single column. Instead, it combines two DAX functions: SUMMARIZE, which returns a table with one row for each distinct combination of values in the listed columns, and COUNTROWS, which counts the rows of that table. The general pattern is:
=COUNTROWS(SUMMARIZE(table, table[column1], table[column2], ...))
This pattern returns the number of distinct combinations of values in the specified columns. For example, if we want to count the unique combinations of Name and Color, we can use this formula:
=COUNTROWS(SUMMARIZE('Table1', 'Table1'[Name], 'Table1'[Color]))
This formula counts the number of distinct Name and Color pairs in the Table1 table, which is the source data for the pivot table.
To use the measure in the pivot table, we need to drag it to the Values area. This will show the distinct count based on the columns that we specified in the formula. We can also drag another field to the Rows or Columns area to create a more detailed report. For example, we can drag the Shape field to the Rows area to see the distinct count of Name and Color for each Shape.
The result will be the same as the previous method, which is 4. However, this method does not require modifying the source data, and it allows us to create more complex calculations using the Power Pivot and Data Model features.
Example
To illustrate these methods with a more realistic example, let’s suppose we have a data set of sales transactions, like this:
Order ID | Customer ID | Product ID | Quantity | Price |
---|---|---|---|---|
1001 | C001 | P001 | 2 | 10 |
1002 | C002 | P002 | 3 | 20 |
1003 | C003 | P003 | 4 | 30 |
1004 | C001 | P002 | 5 | 20 |
1005 | C002 | P003 | 6 | 30 |
1006 | C003 | P001 | 7 | 10 |
1007 | C001 | P003 | 8 | 30 |
1008 | C002 | P001 | 9 | 10 |
1009 | C003 | P002 | 10 | 20 |
We want to count the number of unique customers who bought each product. Across the whole table, that means counting the distinct combinations of Customer ID and Product ID.
Using the first method, we can add a helper column to the source data that combines the Customer ID and Product ID values, separated by a delimiter, such as a comma. For example, we can use this formula in the helper column:
= [@[Customer ID]]&","&[@[Product ID]]
This formula will create a new value that combines the Customer ID and Product ID values in the same row, separated by a comma. For example, for customer C001 and product P001, the formula will return C001,P001.
The helper column will look like this:
Order ID | Customer ID | Product ID | Quantity | Price | Combined |
---|---|---|---|---|---|
1001 | C001 | P001 | 2 | 10 | C001,P001 |
1002 | C002 | P002 | 3 | 20 | C002,P002 |
1003 | C003 | P003 | 4 | 30 | C003,P003 |
1004 | C001 | P002 | 5 | 20 | C001,P002 |
1005 | C002 | P003 | 6 | 30 | C002,P003 |
1006 | C003 | P001 | 7 | 10 | C003,P001 |
1007 | C001 | P003 | 8 | 30 | C001,P003 |
1008 | C002 | P001 | 9 | 10 | C002,P001 |
1009 | C003 | P002 | 10 | 20 | C003,P002 |
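As you can see, every row in this data set has a different combination of Customer ID and Product ID, so the Combined column has nine distinct values, one per order. If a customer ordered the same product again, the repeated value would still be counted only once.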
To count the number of unique values in the Combined column, we can create a pivot table from the source data, and use the Distinct Count option on the Combined field in the Values area. We can also drag the Product ID field to the Rows area to see the distinct count of customers for each product.
The pivot table will look like this:
Product ID | Distinct Count of Combined |
---|---|
P001 | 3 |
P002 | 3 |
P003 | 3 |
Grand Total | 9 |
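As you can see, the pivot table shows the correct answer for the distinct count based on Customer ID and Product ID, which is 9. The grand total is 9 rather than 3 because it counts distinct customer and product pairs, not distinct customers. The breakdown by Product ID shows that each product was bought by 3 different customers.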
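The numbers in this pivot table can be cross-checked with a short Python sketch. It assumes the nine sample orders (Quantity and Price are left out because they do not affect the count) and mirrors the helper column by joining Customer ID and Product ID with a comma.

```python
from collections import defaultdict

# The nine sample orders as (Order ID, Customer ID, Product ID).
orders = [
    (1001, "C001", "P001"),
    (1002, "C002", "P002"),
    (1003, "C003", "P003"),
    (1004, "C001", "P002"),
    (1005, "C002", "P003"),
    (1006, "C003", "P001"),
    (1007, "C001", "P003"),
    (1008, "C002", "P001"),
    (1009, "C003", "P002"),
]

# Helper value per order, grouped by product.
combined_by_product = defaultdict(set)
for _order_id, customer_id, product_id in orders:
    combined_by_product[product_id].add(customer_id + "," + product_id)

per_product = {p: len(values) for p, values in sorted(combined_by_product.items())}
grand_total = len({c + "," + p for _order_id, c, p in orders})

print(per_product)
print(grand_total)
```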
Using the second method, we can create a pivot table from the source data, and add the data to the Data Model. Then, we can create a measure that counts the distinct combinations of Customer ID and Product ID. Because the DAX function DISTINCTCOUNT accepts only a single column, the measure counts the rows of a SUMMARIZE table instead. For example, we can use this formula for the measure:
=COUNTROWS(SUMMARIZE('Table1', 'Table1'[Customer ID], 'Table1'[Product ID]))
This formula counts the number of distinct Customer ID and Product ID pairs in the Table1 table, which is the source data for the pivot table.
To use the measure in the pivot table, we need to drag it to the Values area. We can also drag the Product ID field to the Rows area to see the distinct count of customers for each product.
The pivot table will look like this:
Product ID | Distinct Count |
---|---|
P001 | 3 |
P002 | 3 |
P003 | 3 |
Grand Total | 9 |
As you can see, the pivot table shows the same answer as the previous method, which is 9. It also shows the breakdown by Product ID, which can be useful for further analysis.
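It can help to see where a pair-based count and a plain distinct count of customers agree and where they differ. The sketch below is a Python analogy under the assumption that each pivot cell is computed from only its own rows; the function names are hypothetical. On this sample, both counts give 3 for every product, but they differ in the grand total.

```python
# The nine sample orders as (Order ID, Customer ID, Product ID).
orders = [
    (1001, "C001", "P001"), (1002, "C002", "P002"), (1003, "C003", "P003"),
    (1004, "C001", "P002"), (1005, "C002", "P003"), (1006, "C003", "P001"),
    (1007, "C001", "P003"), (1008, "C002", "P001"), (1009, "C003", "P002"),
]


def distinct_pairs(visible):
    """Distinct (Customer ID, Product ID) pairs among the visible orders."""
    return len({(c, p) for _o, c, p in visible})


def distinct_customers(visible):
    """Distinct Customer IDs among the visible orders."""
    return len({c for _o, c, _p in visible})


# One line per product, then the grand total over all orders.
results = {}
for product in sorted({p for _o, _c, p in orders}):
    visible = [o for o in orders if o[2] == product]
    results[product] = (distinct_pairs(visible), distinct_customers(visible))
results["Grand Total"] = (distinct_pairs(orders), distinct_customers(orders))

for label, (pairs, customers) in results.items():
    print(label, pairs, customers)
```

Within a single product the product value is fixed, so counting pairs and counting customers are the same thing. Over the whole table they answer different questions: 9 distinct pairs versus 3 distinct customers.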
Other approaches
There are some other possible approaches to do a distinct count in a pivot table based on multiple columns in Excel, such as:
- Using a formula in a regular table or worksheet to calculate the distinct count, and then using the result as the source data for the pivot table.
- Using a Power Query query to create a new table that combines the values of the columns that you want to count, and then using the new table as the source data for the pivot table.
- Using a SQL query to create a new table that combines the values of the columns that you want to count, and then using the new table as the source data for the pivot table.
However, these approaches may require more steps, skills, or tools than the methods that I have explained above. Therefore, I recommend using either the helper column method or the measure method, depending on your version of Excel and your preference.
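For readers curious about the SQL route mentioned above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name sample and its column names are assumptions made for the example; a real setup would query whatever database holds the source data, and the exact SQL dialect may differ.

```python
import sqlite3

# The five Name / Color / Shape sample rows.
rows = [
    ("Alice", "Red", "Circle"),
    ("Bob", "Blue", "Square"),
    ("Alice", "Green", "Triangle"),
    ("Bob", "Blue", "Square"),
    ("Charlie", "Yellow", "Star"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (name TEXT, color TEXT, shape TEXT)")
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)

# Distinct (name, color) pairs over the whole table.
(pair_count,) = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT name, color FROM sample)"
).fetchone()

# The same count broken down by shape.
per_shape = conn.execute(
    "SELECT shape, COUNT(*) "
    "FROM (SELECT DISTINCT shape, name, color FROM sample) "
    "GROUP BY shape ORDER BY shape"
).fetchall()

conn.close()

print(pair_count)
print(per_shape)
```

SELECT DISTINCT plays the role of the helper column here: it reduces the rows to one per combination before they are counted.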