A scatter chart, also known as a scatter plot or scatter graph, is a type of chart that displays the relationship between two numerical variables using dots or markers. Each dot represents one observation from the data set, and its position on the horizontal and vertical axes corresponds to the values of the two variables.
A scatter chart can be useful for exploring the correlation, trend, or outliers among the data points. It can also show how the data points are distributed or clustered in different groups or categories.
In this article, we will learn how to create a scatter chart with multiple series in Excel, using some simple formulas. We will also explain the basic theory behind this method, and provide a scenario with real numbers to illustrate the process and the result.
Basic Theory
To create a scatter chart with multiple series in Excel, we need to format the data in a specific way. The basic idea is to give each group or category a numeric proxy, create one column per proxy, and use a formula to copy each observation's value into the column of its own group.
For example, suppose we have the following data set, which shows the height and weight of 12 students from four different classes: A, B, C, and D.
Class | Height (cm) | Weight (kg) |
---|---|---|
A | 160 | 50 |
A | 165 | 55 |
A | 170 | 60 |
B | 155 | 45 |
B | 160 | 50 |
B | 165 | 55 |
C | 150 | 40 |
C | 155 | 45 |
C | 160 | 50 |
D | 145 | 35 |
D | 150 | 40 |
D | 155 | 45 |
To create a scatter chart with four series, one for each class, we assign a numeric proxy to each class: 1 for class A, 2 for class B, 3 for class C, and 4 for class D. In the table below, each proxy is entered in a new row above the rows of the class it represents.
Class | Height (cm) | Weight (kg) |
---|---|---|
1 | ||
A | 160 | 50 |
A | 165 | 55 |
A | 170 | 60 |
2 | ||
B | 155 | 45 |
B | 160 | 50 |
B | 165 | 55 |
3 | ||
C | 150 | 40 |
C | 155 | 45 |
C | 160 | 50 |
4 | ||
D | 145 | 35 |
D | 150 | 40 |
D | 155 | 45 |
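The proxy assignment can also be sketched outside Excel. The following Python snippet is only an illustration of the idea, not part of the Excel workflow; the names `build_proxies`, `proxy_of`, and `proxy_column` are made up for this example. It numbers the classes 1, 2, 3, ... in order of first appearance and derives a proxy value for every data row.

```python
# Class labels in the order they appear in the data set above.
classes = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"]


def build_proxies(labels):
    """Map each distinct label to 1, 2, 3, ... in order of first appearance."""
    proxies = {}
    for label in labels:
        if label not in proxies:
            proxies[label] = len(proxies) + 1
    return proxies


proxy_of = build_proxies(classes)
print(proxy_of)  # {'A': 1, 'B': 2, 'C': 3, 'D': 4}

# One proxy per data row, i.e. what a helper column of proxies would contain.
proxy_column = [proxy_of[label] for label in classes]
print(proxy_column)  # [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```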
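Next, we use a formula to copy each weight value into the column of its proxy. The formula is:
=IF($A2=E$1, $C2, NA())
This formula compares the entry in column A of the current row with the header in row 1 of the current column. If they are equal, it returns the weight in column C. If not, it returns NA(), which Excel displays as #N/A (not available) and which a scatter chart does not plot. The comparison is literal, so column A and the headers in E1:H1 must hold the same kind of value: with the proxies 1 to 4 as headers, column A of each data row has to contain that row's proxy number; alternatively, the headers can be the class letters A to D. The mixed references matter too: $A and $C keep the columns fixed, and E$1 keeps the header row fixed, as the formula is copied.
We can enter this formula in cell E2, and then drag it to the right and down to fill the rest of the cells, as shown below.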
Class | Height (cm) | Weight (kg) | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|---|
1 | ||||||
A | 160 | 50 | 50 | NA | NA | NA |
A | 165 | 55 | 55 | NA | NA | NA |
A | 170 | 60 | 60 | NA | NA | NA |
2 | ||||||
B | 155 | 45 | NA | 45 | NA | NA |
B | 160 | 50 | NA | 50 | NA | NA |
B | 165 | 55 | NA | 55 | NA | NA |
3 | ||||||
C | 150 | 40 | NA | NA | 40 | NA |
C | 155 | 45 | NA | NA | 45 | NA |
C | 160 | 50 | NA | NA | 50 | NA |
4 | ||||||
D | 145 | 35 | NA | NA | NA | 35 |
D | 150 | 40 | NA | NA | NA | 40 |
D | 155 | 45 | NA | NA | NA | 45 |
The result is a table that has the weight values for each class in separate columns, with NA (displayed by Excel as #N/A) in the cells that do not belong to that class. Each of the four new columns becomes one series in the scatter chart.
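The logic of the IF/NA formula can be sketched in a few lines of Python. This is a rough equivalent under stated assumptions, not what Excel does internally: each row is assumed to carry the numeric proxy of its class so that the comparison with the header can succeed, `None` stands in for #N/A, and the names `rows`, `headers`, and `spread` are hypothetical.

```python
NA = None  # stands in for Excel's #N/A in this sketch

# (proxy, height_cm, weight_kg) for the 12 students; proxy 1..4 = class A..D.
rows = [
    (1, 160, 50), (1, 165, 55), (1, 170, 60),
    (2, 155, 45), (2, 160, 50), (2, 165, 55),
    (3, 150, 40), (3, 155, 45), (3, 160, 50),
    (4, 145, 35), (4, 150, 40), (4, 155, 45),
]
headers = [1, 2, 3, 4]  # the values assumed to sit in E1:H1


def spread(rows, headers):
    """For every row, keep the weight only in the column whose header
    equals the row's proxy, like =IF($A2=E$1, $C2, NA())."""
    return [
        [weight if proxy == header else NA for header in headers]
        for proxy, _height, weight in rows
    ]


table = spread(rows, headers)
for line in table:
    print(line)
```

Each printed row contains exactly one weight and three `None` entries, mirroring the table above.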
Procedures
To create a scatter chart with multiple series in Excel, we can follow these steps:
- Select the height values in column B, and then hold Ctrl and select the proxy columns from E to H. This will select the data that we want to use for the scatter chart.
- Go to the Insert tab on the ribbon, and click on the Insert Scatter (X, Y) or Bubble Chart button in the Charts group. Choose the Scatter option from the drop-down menu. This will insert a scatter chart in the worksheet, based on the selected data.
- To format the scatter chart, we can use the chart tabs that appear on the ribbon when the chart is selected (Chart Tools in older versions of Excel, Chart Design and Format in newer ones), or right-click a chart element and choose its Format command. For example, we can change the title, the axis labels, the legend, the colors, the markers, the gridlines, and the data labels of the chart. We can also move or resize the chart as needed.
- The final result is a scatter chart that plots weight against height, with one series per class, each shown in its own color. In this sample data the classes overlap: for example, the point (160, 50) occurs in classes A, B, and C, so those markers sit on top of each other.
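The selection made in the first step gives the chart one shared X column (height) and four Y columns (one per proxy). The following Python sketch shows how such a layout turns into four separate series once the missing values are skipped. It is an illustration only: `None` stands in for #N/A, and the names `y_columns` and `to_series` are invented for this example.

```python
# Shared X values: the height column (B).
heights = [160, 165, 170, 155, 160, 165, 150, 155, 160, 145, 150, 155]

# Y values: the four proxy columns (E to H), with None where Excel shows #N/A.
y_columns = {
    1: [50, 55, 60] + [None] * 9,
    2: [None] * 3 + [45, 50, 55] + [None] * 6,
    3: [None] * 6 + [40, 45, 50] + [None] * 3,
    4: [None] * 9 + [35, 40, 45],
}


def to_series(xs, ys):
    """Pair X and Y values row by row, dropping rows whose Y is missing."""
    return [(x, y) for x, y in zip(xs, ys) if y is not None]


series = {proxy: to_series(heights, column) for proxy, column in y_columns.items()}
for proxy, points in series.items():
    print(proxy, points)
```

Every series ends up with exactly the three points of its own class, which is why each class can be drawn in its own color.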