Statistical significance tests are a bit like eating fish. Most of the time, you get a wonderful meal (assuming you like seafood!). But if you eat a lot of fish and are not choosy about where it comes from, you might end up with a memorable bellyache.
The “data fishing” exercise you commonly see in market research is a prime example of indiscriminate seafood consumption. “Data fishing” refers to the practice of applying statistical significance tests to every row of figures in the data tables, between every possible pair of columns in the crosstab. This is often taught to researchers as a good way to spot significant differences between subgroups. Any results with a significant indicator are then written into the final report with confidence – “yes, we tested it!”
What’s wrong with that? Just this – there’s a very high probability that one or more of those significant indicators is wrong. If only one fish in twenty is inedible, how many bad ones might there be in that net you just hauled up from the deeps, bulging with thousands of finny creatures? The answer, unfortunately, is quite a few.
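To put rough numbers on the fish-in-twenty question, here is a minimal Python sketch. It rests on two simplifying assumptions that are not part of the article: every test has the same one-in-twenty chance of a false alarm, and the tests are independent of one another (tests on overlapping crosstab columns usually are not, so treat the figures as a ballpark). The function names are made up for illustration.

```python
def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of num_tests independent tests raises a
    false alarm, assuming no real differences exist anywhere."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests, alpha=0.05):
    """Average number of false alarms across num_tests such tests."""
    return num_tests * alpha


if __name__ == "__main__":
    for n in (1, 20, 100, 1000):
        print(
            f"{n:>5} tests: "
            f"P(at least one bad fish) = {prob_at_least_one_false_positive(n):.3f}, "
            f"expected bad fish = {expected_false_positives(n):.1f}"
        )
```

Under these assumptions, a single test carries a 5% risk, but by twenty tests the chance of at least one false alarm is already well over half.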
Statistical significance testing, when done right, is a form of hypothesis testing. The researcher formulates a hypothesis about differences in key subgroup populations. A study is designed to collect the necessary data around the hypothesis. When differences in the data are observed, a test is carried out.
In market research, since the data is collected through samples, a statistical significance test is used to assess whether the differences observed in the data are due to a true difference in the population, or whether they are simply caused by random sampling error. Statisticians, being a cagey lot, will never give you a definite answer. Instead, their answers come in the form of probabilities. They may tell you that the difference is statistically significant at a confidence level of 95%. What this implies (but usually leaves unstated) is that when there is in fact no difference in the population, the test will still flag one about 5% of the time. If you go data fishing, and run thousands of these tests, your friendly neighbourhood statistician will cheerfully give you the wrong answer on about 5% of the tests where no real difference exists. And that can leave you with the equivalent of a lot of bellyaches.
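The effect can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not a description of any particular tabulation package: it invents two subgroups of 500 respondents each that share the same true rate of 30%, compares them with a pooled two-proportion z-test (one common choice for crosstab columns), and repeats the exercise 2,000 times. Every difference it flags is a false alarm by construction. The sample size, rate, seed and function names are all hypothetical.

```python
import math
import random


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # both groups all-yes or all-no: no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def count_false_positives(num_tests, sample_size=500, true_rate=0.3,
                          alpha=0.05, seed=1):
    """Run num_tests comparisons of two subgroups drawn from the SAME
    population and count how many are flagged as significant."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(num_tests):
        x1 = sum(rng.random() < true_rate for _ in range(sample_size))
        x2 = sum(rng.random() < true_rate for _ in range(sample_size))
        if two_proportion_p_value(x1, sample_size, x2, sample_size) < alpha:
            flagged += 1
    return flagged


if __name__ == "__main__":
    num_tests = 2000
    flagged = count_false_positives(num_tests)
    print(f"{flagged} of {num_tests} tests flagged a difference "
          f"({flagged / num_tests:.1%}) although none exists")
```

The exact count depends on the random seed, but it should land near 5% of the tests run, which is the bellyache rate the 95% confidence level promises.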
So what is a person to do? The key is to be choosy and limit your exposure. If you only need one or two fish, there’s no point casting a net for thousands. Conduct significance tests only on the key statistics you set out to prove and only among the subgroups of interest. If you receive a stack of data tables with significance tests everywhere, ignore the ones that you didn’t set out to prove, even if they have significant indicators beside them. And when you’re driving home tonight, if you pass a truck by the side of the road with a big sign advertising “Honest Joe’s Discount Bulk Seafood and Bait Depot”, don’t stop.
What’s the role of accidental discovery in all this, you might ask? If you come across a truly interesting result, the prudent approach is to design another study around the new hypothesis, and test it with the new data. Unless it is confirmed a second time, a surprising result coming from data fishing may well be just that, spurious. So be careful what you fish for.