Sometimes a sociologist’s mind wanders and she starts thinking about research methods. Here, Bridget Welch discusses a case in which that happened to her and helps you understand some fundamental research methods concepts.
Driving through the beautiful Appalachian mountains in Kentucky, I hit the Bermuda Triangle of radio reception. For miles, all I can get are stations playing gospel, which is not really up my alley. Then, a dead zone. It’s all static until I finally catch some kicky beats. I nod my head to the tunes for approximately 1.73 seconds, the length of time it takes me to realize it’s Blurred Lines. A song I boycott for OH SO MANY REASONS. I give up. The radio flips off. Stormageddon (that’s my nickname for my son) has paused in his attempt at world domination and has fallen asleep in his car seat. I got nothing to entertain me and over 500 miles to go. What to do?
I go through the usual. Think about work. Plan things that need to be done. Start calculating how many miles I’ll travel in the next hour, half hour, 20 minutes. [Please tell me I’m not the only one who does that.] When that’s all done, I do something we usually try to avoid: pay attention to the driving itself. Speeding up and slowing down happen a lot in the mountains (particularly when you drive an old mom van), and I shortly notice something odd. I’m using my GPS to determine the route. The GPS also estimates my speed, giving me two sources: the GPS and the classic speedometer. And what I notice is that my van estimates my speed at 4 miles per hour slower than the GPS. I speed up, I slow down… still 4 miles per hour off. And I think to myself, “Huh. One (or both) of these is a reliable but not valid measurement of speed.” Really. I really thought almost exactly that. Getting a PhD does things to you.
So, what the hell was I so geeked out about? To explain, let’s pretend you want to do a study on binge drinking (something you definitely shouldn’t do while driving).
Step one is to conceptualize what you mean by binge drinking. When I ask my students what they think binge drinking means, they usually say “drinking to get drunk.” But what is drunk? In actual research, conceptualization requires a lot of reading of previous research to discover the generally agreed-upon scientific way to define a concept. In this case, we would look to experts such as the National Institute on Alcohol Abuse and Alcoholism, who define binge drinking as “a pattern of drinking that brings blood alcohol concentration (BAC) levels to 0.08 g/dL.” This is a more specific way of saying “drinking to get drunk,” because it defines drunk (a blood alcohol level of 0.08 g/dL).
Step two is to operationalize (a word that your spell check will not like but is real) your conceptualization of binge drinking in a way that will allow you to observe it (or measure it) in your study. In a survey, this would be the item(s) or questions you include about a person’s binge drinking behavior. Our problem is that you can’t ask: “Do you have a pattern of drinking that brings blood alcohol concentration (BAC) levels to 0.08 g/dL?” Seriously, who would be able to answer that accurately? Instead, we would have to go back to the research and find that achieving this BAC “typically occurs after 4 drinks for women and 5 drinks for men—in about 2 hours.” So we can ask, “How many times in the last 30 days have you had 4 or more drinks (if you are a woman) or 5 or more drinks (if you are a man) in a two-hour period?”
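To make the operationalization concrete, here is a minimal sketch of how that survey rule could be written down as a coding rule. The function names and the diary entries are made up for illustration; only the 4-drink and 5-drink cutoffs and the two-hour window come from the definition quoted above.

```python
# Cutoffs from the rule of thumb quoted above:
# 4+ drinks (women) or 5+ drinks (men) in about 2 hours.
THRESHOLDS = {"woman": 4, "man": 5}


def is_binge_episode(gender, drinks, hours):
    """Return True if one drinking occasion meets the survey definition."""
    return hours <= 2 and drinks >= THRESHOLDS[gender]


def count_binge_episodes(gender, occasions):
    """Count qualifying occasions; `occasions` is a list of (drinks, hours)."""
    return sum(is_binge_episode(gender, d, h) for d, h in occasions)


# Hypothetical 30-day diary for one respondent (invented data).
diary = [(2, 1.0), (4, 2.0), (4, 3.5), (6, 1.5)]

print(count_binge_episodes("woman", diary))  # 2
print(count_binge_episodes("man", diary))    # 1
```

Notice that the same diary produces a different count depending on which cutoff applies. The rule is easy to apply consistently, but it says nothing yet about whether it captures a 0.08 BAC for any particular person.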
There are still a lot of problems with this question. What’s a drink? Is it a beer, a glass of wine, a shot of Everclear? Some women reach a 0.08 BAC on fewer than 4 drinks and some need more (the same goes for men at 5 drinks). How well are people going to be able to remember the past 30 days? And that’s not a complete list!
Here’s where reliability and validity come in. In research methods, reliability and validity refer to how well you are measuring your concept of interest. Reliability is the degree to which you get consistent results out of your measure. Validity is how well your measure actually captures what you want to capture.
Pretend the target represents binge drinking. Each shot represents a person in our sample. If you ask a person and they give you a perfectly accurate answer about their binge drinking, we hit the center of the target. The further we are from the center, the further we are from actually capturing the binge drinking that person did. This creates four different possibilities:
- Top left: the worst case. Each individual’s response lands somewhere in the upper two quadrants. Because of the wide spread (across two sectors), there is no consistency in the results you are getting, so the measure is not reliable. Neither is it valid, as it is not hitting the center.
- Top right: while you are rarely hitting the center, you would get a good estimate on average across all people, giving a valid group estimate. However, it is not reliable because the individual estimates are not consistent.
- Bottom left: here you have the opposite problem. You are getting consistent individual scores (reliable), but the group is skewed away from the center (not valid).
- Bottom right: here you have the best-case scenario, where individuals are consistently giving you answers that hit the center, meaning the measure is both reliable and valid.
Which case do we have with our measure of binge drinking? Likely it is a top right situation. Some men will need more than five drinks, some fewer (same for women and 4). But, overall, you should be estimating the group correctly.
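If the target picture is hard to hold in your head, a rough simulation can help. The sketch below is only an illustration under simplifying assumptions: it treats validity as how far the average shot lands from the true value (bias) and reliability as how tightly the shots cluster (spread). The true score, the bias and spread settings, and the function names are all invented for the example.

```python
import random
import statistics


def simulate(true_value, bias, spread, n=1000, seed=0):
    """Draw n measurements centered on true_value + bias with the given spread."""
    rng = random.Random(seed)
    return [rng.gauss(true_value + bias, spread) for _ in range(n)]


def summarize(shots, true_value):
    """Return (average error, standard deviation) of the measurements."""
    avg_error = statistics.mean(shots) - true_value
    return avg_error, statistics.stdev(shots)


TRUE = 5.0  # hypothetical true score, chosen only for illustration

# Invented settings: a large bias hurts validity, a large spread hurts reliability.
cases = {
    "neither reliable nor valid": dict(bias=3.0, spread=3.0),
    "valid on average, not reliable": dict(bias=0.0, spread=3.0),
    "reliable, not valid": dict(bias=3.0, spread=0.3),
    "reliable and valid": dict(bias=0.0, spread=0.3),
}

for name, params in cases.items():
    err, sd = summarize(simulate(TRUE, **params), TRUE)
    print(f"{name}: average error {err:+.2f}, spread {sd:.2f}")
```

In the "valid on average, not reliable" case the average error comes out close to zero even though the individual shots are scattered widely, which is the situation described for the binge drinking question.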
- Go back to the driving measure. Explain the conceptualization and operationalization we use to estimate our driving speed.
- Explain how the GPS and car measurements of that speed were reliable but not valid.
- How could I figure out which measurement (car or GPS — or BOTH) is not valid?
- Select a topic of interest for yourself. Conceptualize and operationalize the concept. What problems with reliability and validity do you think you might find?