Leonardo da Vinci’s “Vitruvian Man” and Matt Groening’s “Homer Simpson” have a lot in common. They are both famous drawings of human males. Both characters have (more or less) the same body parts. Yet they are very different in one critical aspect – proportion. Vitruvian Man illustrates an ideal male form. Homer illustrates… something else entirely.
When we select a random sample from a population, we often want it to have ideal proportions, like Vitruvian Man. We want the different groups of interest to appear in the sample in roughly the same proportions as they have in the population. In practice, our sample sometimes looks less like Vitruvian Man and more like Homer Simpson. The right parts are there, but in the wrong proportions. This can happen because of deficiencies in the sample frame or the fielding process (lower response rates among young men, for example). It can also happen on purpose – sometimes we want a Homer.
When we design a sample, we may want to focus on specific parts of the population. To do that, we select a disproportionately large sample from those parts of the frame that we are most interested in. These parts of the population end up exaggerated, like Homer’s head and hands. The rest of the population becomes disproportionately small (like Homer’s chest). To get an accurate picture of the entire population, we weight down the oversized parts of the sample and weight up the undersized parts. This is akin to stretching out Homer’s chest while shrinking his head and hands. This may sound like a roundabout way to approach things, but it provides extra detail for the portion of the population we are most interested in, while still providing results with realistic proportions.
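To make the stretching and shrinking concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, where each group's weight is its population share divided by its sample share. The group names, shares, and counts are made up for illustration, and `proportion_weights` is a hypothetical helper, not part of any library.

```python
def proportion_weights(population_share, sample_counts):
    """Return one weight per group: population share / sample share.

    Weights below 1 shrink an oversampled group; weights above 1
    stretch an undersampled one.
    """
    total = sum(sample_counts.values())
    return {
        group: population_share[group] / (count / total)
        for group, count in sample_counts.items()
    }


# Hypothetical design: the group we care about is 20% of the population,
# but we deliberately made it half of the sample.
population_share = {"target": 0.20, "rest": 0.80}
sample_counts = {"target": 500, "rest": 500}

weights = proportion_weights(population_share, sample_counts)

# Weighted counts fall back into population proportions.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}
print(weights)
print(weighted_counts)
```

In this made-up design the oversampled group gets a weight below 1 and everyone else gets a weight above 1, so the weighted sample has the same 20/80 split as the population while the raw sample still contains 500 interviews in the group of interest.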
In other cases, we set out to get a sample with Vitruvian Man proportions. But all too frequently, due to deficiencies in the sample frame or the fielding process, we end up with a Homer. Sampling from internet panels is particularly susceptible to this problem. Certain parts of the population are much less likely to be on the internet. They are also less likely to participate in internet panels or respond to survey invites. Even when a sample is designed to include them, they end up making up a smaller than desired proportion of the sample. Other parts of the population are more likely to be on the internet, to participate in panels and to respond to survey invites. These people make up a disproportionately large part of the sample. Weighting is again used to correct this problem.
This problem is not unique to internet panels, of course. With the incidence of mobile phone-only households skyrocketing and response rates on landlines plummeting, the sample frames available by phone suffer from comparable distortions. Again, people must resort to weighting to try to transform Homer into the elusive Vitruvian Man.
Weighting brings all the parts of a sample back to their correct proportion. Does this mean that our Homer-like sample (once corrected) is as good as another sample with Vitruvian proportions? The answer is no. Weighting makes the results more representative of the population, but it would be even better if we could somehow get the Vitruvian Man sample to begin with.
The amount of discrepancy between the sample and the population is measured by “weighting efficiency”. It represents the degree to which we have to stretch or shrink the various parts of our sample to match the population. A sample with correct proportions is considered most efficient (100%) because it can represent the total population without additional correction. A sample with 50% weighting efficiency is only as good as a Vitruvian sample of half the size. The variance of estimates from the 50% efficient sample would be roughly twice that of a same-sized sample that requires no weighting. Although weighting is useful for correcting sample proportions that don’t match the population, it decreases the efficiency of the sample and shrinks the effective sample size.
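The text above does not say how weighting efficiency is calculated. One widely used definition, assumed here, is Kish's approximation: the effective sample size is the squared sum of the weights divided by the sum of the squared weights, and efficiency is that figure divided by the actual sample size. A rough sketch in Python, with made-up weights and a hypothetical function name:

```python
def weighting_efficiency(weights):
    """Kish approximation (an assumption, not stated in the text).

    Returns (efficiency, effective_sample_size), where
    effective_sample_size = (sum of w)^2 / (sum of w^2).
    """
    n = len(weights)
    sum_w = sum(weights)
    sum_w_sq = sum(w * w for w in weights)
    n_eff = sum_w ** 2 / sum_w_sq
    return n_eff / n, n_eff


# A Vitruvian sample: everyone has the same weight, so nothing is lost.
even_efficiency, even_n_eff = weighting_efficiency([1.0] * 1000)

# A Homer sample: half the respondents weighted down, half weighted up.
homer_weights = [0.4] * 500 + [1.6] * 500
efficiency, n_eff = weighting_efficiency(homer_weights)

print(f"even weights:  efficiency {even_efficiency:.1%}, effective n {even_n_eff:.0f}")
print(f"Homer weights: efficiency {efficiency:.1%}, effective n {n_eff:.0f}")
```

With these illustrative weights, 1,000 interviews behave roughly like 735 unweighted ones. The more the weights vary, the further the efficiency drops.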
So while weighting is very useful, it does have side effects. You can stretch and squash Homer into a Vitruvian Man-like shape, but you pay the price in a larger margin of error. The best practice is to pull a sample that, when adjusted for response rate, will produce results that look like the Vitruvian Man.
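To put a number on that price, the sketch below uses the textbook margin of error for a proportion and swaps the actual sample size for the effective one. The 95% confidence level (z = 1.96), the worst-case proportion of 50%, the sample size of 1,000, and the function name are all assumptions chosen for illustration.

```python
import math


def margin_of_error(n, efficiency=1.0, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    The effective sample size (n * efficiency) replaces n, which is a
    common simplification rather than an exact variance calculation.
    """
    n_eff = n * efficiency
    return z * math.sqrt(p * (1 - p) / n_eff)


moe_vitruvian = margin_of_error(1000)               # no weighting needed
moe_homer = margin_of_error(1000, efficiency=0.5)   # 50% weighting efficiency

print(f"100% efficient: +/- {moe_vitruvian:.1%}")
print(f" 50% efficient: +/- {moe_homer:.1%}")
print(f"ratio: {moe_homer / moe_vitruvian:.2f}")
```

Under these assumptions, halving the efficiency doubles the variance but widens the margin of error by a factor of about 1.41 (the square root of 2), from roughly 3.1 to roughly 4.4 percentage points.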