Scatter plots (also called scatter graphs) are similar to line graphs. A line graph uses a line on an X-Y axis to plot a continuous function, while a scatter plot uses dots to represent individual pieces of data. In statistics, these plots are useful for seeing whether two variables are related to each other. For example, a scatter plot can suggest a linear relationship (i.e., the points fall roughly along a straight line).

Scatter plot suggesting a linear relationship.

Scatter plots are also called scatter graphs, scatter charts, scatter diagrams and scattergrams.

Correlation in Scatter Plots

The relationship between variables is called correlation; correlation is just another word for “relationship.” For example, how much you weigh is related (correlated) to how much you eat. There are two types of correlation: positive correlation and negative correlation. If the data points trend upward from low x- and y-values to high x- and y-values, the variables are positively correlated, as in the graph above. If the points start at high y-values and fall to low y-values as x increases, the variables are negatively correlated.
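
The direction of a correlation can also be checked numerically. The sketch below is one possible way to do it, assuming the Pearson correlation coefficient as the measure (the text above does not name a specific one). The function names and the sample numbers are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def direction(r):
    """Label the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical data, invented for this example only.
hours_exercised = [1, 2, 3, 4, 5]
fitness_score = [52, 55, 61, 64, 70]        # rises as x rises
resting_heart_rate = [80, 76, 73, 69, 66]   # falls as x rises

print(direction(pearson_r(hours_exercised, fitness_score)))
print(direction(pearson_r(hours_exercised, resting_heart_rate)))
```

A coefficient near +1 or -1 means the dots fall close to a straight line; a value near 0 means there is little linear relationship.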

You can think of positive correlation as two things increasing together. For example, the more you exercise, the better your cardiovascular health. “Positive” doesn’t necessarily mean “good”! More smoking is associated with a higher chance of cancer, and the more you drive, the more likely you are to be in a car accident.

3D Scatter Plot

A 3D scatter plot is a scatter plot with three axes. For example, the following 3D scatter plot shows student scores in three subjects: Reading (y-axis), Writing (x-axis) and Math (z-axis).

3d scatter plot

Student A scored 100 in Writing and Math and 90 in Reading, and student B scored 50 in Writing, 30 in Reading and 15 in Math. 3D plots are fairly easy to make by hand for a few points, but once you get into larger data sets, you’ll want to use technology. Unfortunately, Excel doesn’t have an option to create these charts. Statistical programs commonly available through colleges and universities (like SAS) can create them. There are quite a few free options available, but I recommend:

Plotly: an easy way to create a 3D chart online.

Gnuplot: downloadable program. Easy to use compared to other programs.

R: Also a download. Has a fairly steep learning curve, but handles most statistical computations. If you want a general stats package (as opposed to one that will just create charts), this is the best option.
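
Whichever program you choose, each student has to become one (x, y, z) point before plotting. Here is a minimal sketch of that step, using the axis assignment from the example above (Writing on x, Reading on y, Math on z). The `Point3D` type and the `to_point` helper are hypothetical names introduced for this illustration.

```python
from typing import NamedTuple


class Point3D(NamedTuple):
    x: float  # Writing score
    y: float  # Reading score
    z: float  # Math score


def to_point(scores):
    """Map a dict of subject scores onto the three plot axes."""
    return Point3D(x=scores["writing"], y=scores["reading"], z=scores["math"])


# Scores for students A and B, taken from the example in the text.
students = {
    "A": {"writing": 100, "reading": 90, "math": 100},
    "B": {"writing": 50, "reading": 30, "math": 15},
}

points = {name: to_point(scores) for name, scores in students.items()}

for name, p in points.items():
    print(f"Student {name}: x={p.x}, y={p.y}, z={p.z}")
```

The resulting tuples can then be handed to a plotting tool as three columns of coordinates.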

What is a Bubble Chart?

Bubble plot showing Medicare amounts per service/specialty. Image: CMS.gov.

A bubble chart is a way to show how variables relate to each other. It is similar to a scatter chart, only instead of dots there are different-sized bubbles.

Bubble charts are a good choice if your data has 3 series/characteristics with an associated value; in other words, you need:

a category with values for your x-axis,

a category with values for your y-axis, and

a category with values for sizing your bubbles.

They are often used for financial purposes and for use with quadrants.

Types of Bubble Chart

In its most basic form, larger bubbles indicate larger values. The placement of the bubble on the x-axis and y-axis gives you information about what the bubble represents. This chart shows length of investment (x-axis), price at time of purchase (y-axis) and the relative size of the investment today.

bubble plot 2
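
To make larger bubbles stand for larger values, the values have to be converted to bubble sizes. The sketch below assumes one common convention, which the text above does not specify: the bubble's area (not its radius) is proportional to the value, so the radius grows with the square root. The function name, the `max_radius` parameter and the sample figures are illustrative assumptions.

```python
from math import sqrt


def bubble_radii(values, max_radius=30.0):
    """Scale non-negative values to radii so that area is proportional to value.

    The largest value gets max_radius; the others shrink by the square root
    of their share of the largest value.
    """
    if not values:
        return []
    if any(v < 0 for v in values):
        raise ValueError("bubble sizes need non-negative values")
    largest = max(values)
    if largest == 0:
        return [0.0 for _ in values]
    return [max_radius * sqrt(v / largest) for v in values]


# Hypothetical current investment sizes, invented for this example.
investment_today = [40000, 10000, 2500]

for value, radius in zip(investment_today, bubble_radii(investment_today)):
    print(f"value={value}, radius={radius:.1f}")
```

Scaling the radius directly would make a value four times as large look sixteen times as large, because the eye compares areas.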

Color coded bubble plots use color to sort the bubbles into categories. For example, I might want to sort my investment chart into stocks, bonds, and mutual funds:

bubble chart 3

A cartogram is a bubble plot of a map, where the x-axis and y-axis are longitude and latitude. The size of the bubble could indicate population, number of oil rigs, natural weather events, or some other type of geographical data.

cartogram

The charts are sometimes referred to by dimensions:

Two-dimensional charts have x-values and y-values only. They are equivalent to a scatter plot.

Three-dimensional charts have the x-y axes and bubble size.

Four-dimensional charts have x-y axes, bubble size and color.
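
The dimension count is simply the number of variables each bubble encodes. As a rough sketch, the record below stores x and y for every bubble and treats size and color as optional; the `Bubble` class and the `chart_dimensions` function are hypothetical names, not part of any charting library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bubble:
    x: float
    y: float
    size: Optional[float] = None   # third variable, if present
    color: Optional[str] = None    # fourth variable (a category), if present


def chart_dimensions(bubbles):
    """Count the variables encoded: x and y always, plus size and color if used."""
    dims = 2
    if any(b.size is not None for b in bubbles):
        dims += 1
    if any(b.color is not None for b in bubbles):
        dims += 1
    return dims


# Hypothetical data points, invented for this example.
scatter = [Bubble(1, 2), Bubble(3, 4)]
sized = [Bubble(1, 2, size=10), Bubble(3, 4, size=20)]
colored = [Bubble(1, 2, size=10, color="stocks"), Bubble(3, 4, size=20, color="bonds")]

print(chart_dimensions(scatter), chart_dimensions(sized), chart_dimensions(colored))
```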

A scatter plot gives you a visual idea of what is happening with your data. Scatter plots are similar to line graphs; the main difference is that a line graph has a continuous line while a scatter plot has a series of dots. Scatter plots are the foundation for simple linear regression, where we take the plotted data and try to build a usable model with a function. In fact, all simple linear regression is doing is trying to draw the straight line that comes closest to all of those dots.
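
To show what drawing a line through the dots means in practice, here is a minimal sketch of a straight-line fit. It assumes ordinary least squares, the usual method for simple linear regression; the function name `fit_line` and the sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are identical; the slope is undefined")
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scatter plot data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

The fitted line rarely passes through every dot. It is the line that makes the total squared vertical distance to the dots as small as possible.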
