One of the ways we can reason erroneously is by generalizing from a sample that is too small, unrepresentative of the whole, or both.
These fallacies are pretty self-explanatory:
*Small sample = To base a general conclusion on a sample that is too small relative to the population as a whole.
*Unrepresentative sample = To base a general conclusion on a sample that is unrepresentative of the population as a whole.
You’d be committing the small sample fallacy, for instance, if you concluded that Delaware is unusually rainy because the only two times you drove through the state it was raining.
Indeed, this points to perhaps the main source of the small sample fallacy, which is that we have a natural tendency to overrate the importance of things we experience personally.
I know in a sense this flies in the face of a piece of common wisdom, namely that we learn and grow through personal experience, that it’s important to get out and see and experience things for ourselves rather than rely on what we’re told via books, classrooms, the Internet, etc.
It’s certainly not that I disagree that it’s valuable to directly experience as much as you can of the world. But it’s also important to be aware of how personal experience can lead you astray in drawing conclusions about that world.
Let’s say you’ve owned two cars manufactured by Nissan in your life, and they were both lemons. Unreliable, constantly breaking down, always having to spend money on them, etc. Real headaches.
If you’re like most people, this is going to give you an impression of Nissan cars so negative that you almost certainly won’t buy another one. (Indeed, having even one lemon like that, let alone two, would cause most folks to swear off buying another.)
But think about what a trivial, minuscule amount of evidence you’d be basing your decision on. You’re familiar with two of the millions of cars Nissan has manufactured, and you’re treating that as if it’s significant evidence of the performance of the next one you would buy.
In the grand scheme of things, your measly two cars count no more and no less than two Nissans owned by your family, two owned by people you barely know, or two owned by complete strangers in Timbuktu. Yet the closer the experiences are to you, the more weight you’re likely to give them, because the experiences are more vivid and easier to call to mind. So the performance of the two cars you owned makes the strongest impression on you, followed by those you’re nearly as familiar with because they were owned by your family, followed by those you have only minimal connection with because they were owned by casual acquaintances, followed by those owned by the good people in Timbuktu.
But compare your position, based on two personal experiences, with that of, say, someone who read a study in a reputable consumer magazine that examined a large number of cars, obtained a huge number of service records for this and other makes, etc., and concluded that Nissans are unusually reliable vehicles.
Psychologically, you’ll probably be more confident of your anti-Nissan assessment based on your personal experience than he will be of his pro-Nissan assessment based on dry, printed facts and figures, because “Hey, I’ve owned one of those, two in fact, so I know what I’m talking about!” When in fact he’s actually got the better evidence.
Similarly, how much can you really learn about, say, the political mood of a city by traveling there in person and chatting with cab drivers and a few people you happen to come across, compared to the much more impersonal and less vivid evidence you could obtain by consulting scientific polls of the area?
The point being, by all means accumulate experiences and use that information in forming opinions, but recognize that the things you experience personally constitute a tiny, tiny drop in the bucket compared to all the available relevant evidence. So learn to also appreciate data you can obtain by means other than direct experience. Don’t ignore experience, just keep it in perspective and don’t overrate it.
Turning now to the unrepresentative sample fallacy, sometimes we might have a large quantity of evidence, but still not be able to reason reliably from it to a conclusion, because it is unrepresentative of the whole.
A long time ago, I saw an interview with Abigail “Dear Abby” Van Buren, the famous advice columnist. She claimed that living together before marriage was a very unwise decision, and she based her belief on the thousands of letters she’d received from people in that situation over the years, the overwhelming majority of whom had significant problems in their relationship.
But think about this. How representative of all cohabiting nonmarried couples are those who write to Dear Abby? She’s an advice columnist, a person people seek help from with their problems. Wouldn’t you think maybe people struggling with their relationships would be disproportionately likely to write to her? For her to generalize from that set of data is a straightforward instance of the unrepresentative sample fallacy.
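To see how lopsided such a mailbag could get, here is a minimal simulation sketch. Every number in it is a made-up assumption for illustration (the share of couples with serious problems, and how likely each kind of couple is to write to an advice columnist); none of it comes from real data about cohabiting couples or about Dear Abby’s mail.

```python
import random

def simulate_letters(n_couples=100_000, troubled_rate=0.20,
                     p_write_if_troubled=0.05, p_write_if_fine=0.002, seed=0):
    """Return (share of troubled couples overall, share among letter writers).

    All parameters are hypothetical: troubled couples are assumed to be
    far more likely to write in than couples who are doing fine.
    """
    rng = random.Random(seed)
    troubled_total = 0
    letters = 0
    troubled_letters = 0
    for _ in range(n_couples):
        troubled = rng.random() < troubled_rate
        troubled_total += troubled
        p_write = p_write_if_troubled if troubled else p_write_if_fine
        if rng.random() < p_write:
            letters += 1
            troubled_letters += troubled
    return troubled_total / n_couples, troubled_letters / letters

true_rate, rate_in_letters = simulate_letters()
print(f"troubled among all couples:    {true_rate:.0%}")
print(f"troubled among letter writers: {rate_in_letters:.0%}")
```

Under these assumed rates, only about one couple in five is actually struggling, yet the large majority of the letters come from struggling couples. The mailbag is big, but it tells you about who writes letters, not about cohabiting couples in general.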
In discussing these two fallacies with folks, including discussing how to apply them to evaluating polls, I’ve found that the small sample fallacy is more intuitive for most people. That is, they can more easily grasp why you need a large enough sample before drawing any conclusions than they can appreciate the importance of a sample being representative.
For instance, in critiquing polls, most laypeople are highly skeptical that the results really mean anything if the samples seem small to them, but they’re less likely to pick up on possible flaws in the representativeness of the sample. So they’ll scoff at a mayoral poll of only a few hundred people, or a presidential poll where, say, 1,500 people were polled in a nation of hundreds of millions, but they won’t think about the representativeness.
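In fact, mathematically, a poll needs a much smaller sample than most people realize. For a truly random sample, the margin of error depends almost entirely on the size of the sample, not on the size of the population it is drawn from, and at 1,500 respondents it is already only about two and a half percentage points either way (at the usual 95% confidence level). So generalizing from 1,500 people in a nationwide poll may not be a small sample fallacy after all. The representativeness is what’s key.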
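For anyone who wants to check the arithmetic, here is a small sketch using the standard textbook margin-of-error formula for a proportion from a simple random sample. The function name and its parameters are my own, and the formula assumes a genuinely random sample, which real polls only approximate.

```python
import math

def margin_of_error(n, p=0.5, z=1.96, population=None):
    """Approximate margin of error for a proportion from a simple random sample.

    p=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    If a population size is given, the finite population correction is applied.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

for n in (100, 400, 1_500, 6_000):
    print(f"n = {n:>5}: +/- {100 * margin_of_error(n):.1f} points")

# Same sample size, very different population sizes (a city vs. a nation):
print(f"{100 * margin_of_error(1_500, population=1_000_000):.2f}")
print(f"{100 * margin_of_error(1_500, population=300_000_000):.2f}")
```

Two things stand out when you run it. Quadrupling the sample from 1,500 to 6,000 only cuts the margin roughly in half, and a sample of 1,500 gives almost exactly the same margin whether the population is one million or three hundred million.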
I was taught an interesting analogy a long time ago that perhaps will render this less counter-intuitive.
Imagine you have a large container of soup on the stove, and you want to taste it to see if it’s seasoned the way you want. Now certainly you need to taste a large enough sample to get an idea, so a molecule or two of soup isn’t going to cut it. But once you’re above a certain minimum, increasing the sample size brings little improvement. What you can learn from sipping a part of one small spoonful is probably 99% as much as you could learn from consuming an entire large ladleful.
What matters, again, is the representativeness. Which in this case means you want your soup to be very well stirred before you taste it. You want to make sure the spices, the ingredients are present in your sample in comparable proportions to how they’re present in the whole container of soup.
In a poll, the sample size is the size of the spoon (which doesn’t have to be big at all), and the representativeness is how well it’s stirred (which is a lot more important).
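The spoon-versus-stirring point can be sketched as a toy simulation. Everything here is hypothetical: an imaginary electorate that is 40% urban and 60% rural, with 70% of urban voters and 40% of rural voters supporting some candidate. One poll takes a small spoonful from the whole, well-stirred pot; the other takes a huge ladleful from the urban side only.

```python
import random

rng = random.Random(1)

# Hypothetical electorate (assumed numbers, for illustration only).
URBAN_SHARE, URBAN_SUPPORT, RURAL_SUPPORT = 0.40, 0.70, 0.40
TRUE_SUPPORT = URBAN_SHARE * URBAN_SUPPORT + (1 - URBAN_SHARE) * RURAL_SUPPORT

def poll(n, urban_only=False):
    """Share of n respondents who support the candidate.

    urban_only=True models an unstirred pot: every respondent is urban.
    """
    yes = 0
    for _ in range(n):
        urban = urban_only or rng.random() < URBAN_SHARE
        yes += rng.random() < (URBAN_SUPPORT if urban else RURAL_SUPPORT)
    return yes / n

small_stirred = poll(1_500)                      # small spoon, well stirred
large_unstirred = poll(50_000, urban_only=True)  # big ladle, not stirred

print(f"true support:             {TRUE_SUPPORT:.1%}")
print(f"1,500 random respondents: {small_stirred:.1%}")
print(f"50,000 urban respondents: {large_unstirred:.1%}")
```

With these made-up numbers, the small random poll lands within a few points of the true figure, while the poll more than thirty times its size misses by nearly twenty points. Making the unstirred ladle bigger does nothing to fix that.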