You read that title right. U.S. teen pot smoking is correlated with the number of honey-producing bee colonies. In this piece, Nathan Palmer uses this strange statistical fact to help us better understand correlations and causal relationships.
Did you know that the divorce rate in Maine correlates nearly perfectly with margarine consumption in the U.S.? It’s true. Furthermore, the more teens arrested for marijuana possession every year in the U.S., the fewer honey-producing bee colonies we have. That’s a fact! Most important to us here at SociologyInFocus, research indicates that the number of sociology PhDs awarded each year is correlated with the number of rocket ships we send into space each year (but only the noncommercial ones; I mean, why would rocket launches designed for commercial purposes have any effect on sociology, amirite?).
Wait, none of this makes any sense. Fake butter has nothing to do with divorce, pot-smoking teens aren’t killing honey bees, and sociology departments aren’t waiting for a space shuttle launch to award a PhD. I can explain everything, but first we need to talk about correlation and causation.
A correlation is a shared relationship between two variables. This is a lot easier to remember when we break the word down: co (meaning shared) and relation (meaning relationship). So, for instance, there is likely a correlation between the number of hours you study each day and your GPA. As the number of hours per day you study (variable A) increases, so too does your GPA (variable B). You should be able to think of dozens of correlations (e.g., the amount of money a person has correlates with the size of their home).
There are two types of correlations: positive and negative. When variable A and variable B both move in the same direction, the correlation is said to be positive. When they move in opposite directions, the correlation is said to be negative. Given that as the number of hours studying per day increases so too does a student’s GPA, this is a positive correlation. It’s likely that the number of absences a student has negatively correlates with that student’s grade in the class. That is, an increase in class absences correlates with a decrease in class grade. Therefore, class absences negatively correlate with class grade.
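To make positive and negative correlations concrete, here is a minimal sketch that computes the Pearson correlation coefficient, one common way to measure a correlation (the article itself does not name a specific measure, so that choice is an assumption). The coefficient runs from -1 to 1: values above zero indicate a positive correlation and values below zero a negative one. The function name `pearson_r` and all of the student numbers are made up for illustration; they are not real data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their averages.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return cov / sqrt(var_x * var_y)


# Hypothetical numbers for five imaginary students (not real data).
hours_studied = [1, 2, 3, 4, 5]
gpa = [2.1, 2.6, 2.9, 3.4, 3.8]

absences = [0, 2, 4, 6, 8]
class_grade = [95, 88, 80, 71, 65]

# Both variables rise together, so the coefficient is positive.
print("hours studied vs. GPA:", round(pearson_r(hours_studied, gpa), 2))

# One variable rises while the other falls, so the coefficient is negative.
print("absences vs. class grade:", round(pearson_r(absences, class_grade), 2))
```

Notice that the code only tells us how closely the two lists move together. It says nothing about whether one variable causes the other.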
There is an old saying in the sciences: “correlation does not equal causation.” The fact that two variables have a shared relationship does not mean that variable A caused variable B to happen (in science we call this a causal relationship). For something to be defined as a causal relationship, it must first pass a three-step test.
- Variable A and variable B must correlate with one another.
- Variable A must happen before variable B.
- We must find evidence to rule out possible alternative explanations.
If the first two criteria are met and we can’t think of any possible alternative explanation, then we can consider variable A to have caused variable B. To be clear, the three-step test quickly rules out pairs of variables that can’t be in a causal relationship, but “proving” that variable A caused variable B is a much more complex and involved process.
What Does It Mean That Fake Butter Is Correlated With Divorce?
Margarine has nothing to do with divorce in Maine. This is an example of how completely unrelated things can correlate with each other. When this happens, it’s often just a mathematical coincidence. But sometimes we find that two seemingly unrelated things are correlated, and that leads us to new scientific discoveries. For instance, energy consumption and the average global temperature must have seemed wholly unrelated at first. Today we know that the carbon dioxide released from burning coal and gas contributes to the greenhouse effect and causes climate change. When we discover that two variables are correlated, we have to evaluate the relationship with an open but critical mind.
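The “mathematical coincidence” point can be illustrated with a short simulation. This is a sketch under stated assumptions, not a reconstruction of how any real spurious correlation was found: it generates 200 series of 10 purely random numbers each (both sizes, the seed, and the function names are arbitrary choices made for this example), compares every pair, and reports the strongest correlation it finds. Because the numbers are random, any strong correlation that turns up is a coincidence by construction.

```python
import random
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(num_series=200, length=10, seed=0):
    """Find the most strongly correlated pair among unrelated random series."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Every series is independent random noise: nothing connects them.
    series = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(num_series)]
    # Compare every possible pair and keep the one with the largest |r|.
    i, j = max(
        combinations(range(num_series), 2),
        key=lambda pair: abs(pearson_r(series[pair[0]], series[pair[1]])),
    )
    return i, j, pearson_r(series[i], series[j])


i, j, r = strongest_chance_correlation()
print(f"Series {i} and series {j} correlate at r = {r:.2f} purely by chance")
```

With 200 series there are 19,900 pairs to compare, so some pair will almost always line up closely even though no series has anything to do with any other. Searching a large pile of unrelated statistics for matches works the same way.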
- Think of a causal relationship that was not discussed in this article. Describe both variable A and variable B and then run them through the three-step test described in this article.
- People often argue that something is causal when it couldn’t possibly pass the three-step test. For instance, when discussing race, it’s not uncommon for people to say, “did you ever think that talking about racial inequality creates racism and racial inequality?” Run this idea through the three-step test and demonstrate how this isn’t a causal relationship.
- If you eat 5,000 calories a day and then gain weight, can we say for certain that your high-calorie diet caused your weight gain? Think of at least one possible alternative explanation. That is, one factor that could explain a person gaining weight that is separate from their diet.
- Think of three examples of negative correlations. Explain them. Remember that a negative correlation has variables that go in opposite directions.
Thanks to Tyler Vigen for creating his Spurious Correlations website that served as the basis of this article.
I should be a responsible social scientist and tell you that I don’t have evidence on hand to show you that studying or skipping class affects classroom performance. These are hypothetical examples and despite the fact that they “make sense” we should withhold any conclusion until we have evidence.