Exactly what makes a chocolate chip cookie taste so good? Is it the chocolate chips, or is it the flour, the butter, or the sugar? How much does that pinch of salt contribute to overall goodness? Of course it's the combination of all these things. But if you want the best possible cookie, you might want to quantify the impact of each ingredient.
How would one go about that? By testing all possible combinations of ingredients. Bake a test batch without chocolate chips, another batch without the salt, and so on. Then eat them and see how good they are. This might take a lot of baking and a lot of cookie tasting (insert sound of my kids cheering here). The key is to run these tests systematically, making sure you cover each ingredient on its own and in combination with all the other ingredients.
We encounter similar problems in market research. We often want to determine what is "driving" consumer behavior. For example, we may want to know why a consumer decided to buy a certain cell phone. To figure out which attributes are most important, we must examine the data to find patterns in perceptions and behavior. For example, if consumers purchase the product regardless of whether they like or dislike one of its attributes, we conclude that this attribute is probably not an important driver of, or barrier to, purchase. In driver analysis, we examine the whole host of factors that may influence the purchase decision and rank them by their level of influence.
The challenge is that when we collect information on people's perceptions of a brand or attitudes toward something, we see a lot of overlap. Attributes like "tasty" and "delicious" get similar, but not identical, ratings. This phenomenon is called "multicollinearity". It is a problem for us, because we want to isolate the importance of being "delicious" relative to being "tasty". This is where game theorist Lloyd Shapley and our cookie experiments come in.
Shapley came up with a simple yet extremely elegant solution: test all possible combinations of ingredients (or attributes, or attitudes) and see what the difference is when an ingredient (or attribute) is included and when it is excluded. Shapley reasoned that the average difference between the times sugar is included and the times it is excluded, for example, reveals the contribution sugar makes. This average difference is what we call the Shapley Value. Applying this principle to driver analysis gives us Shapley Value regression. We use Shapley Value regression instead of other kinds of models because it does a much better job of untangling the difference between, say, "tasty" and "delicious".
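For readers who like to see the arithmetic, here is a minimal sketch of that idea in Python. Everything in it is an assumption made for illustration: the three ingredients, the made-up taste scores in SCORES, and the function name shapley_values are hypothetical, not real data or anyone's production method. The "average" is the standard Shapley weighting, which gives every possible order of adding ingredients equal weight. In a driver analysis, the score for each combination would typically be something like how well a model using only those attributes explains purchase, rather than a hand-entered number.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Return each player's Shapley value for a coalition value function.

    `value` takes a frozenset of players and returns that combination's score.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Share of all orderings in which exactly k others come before p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Difference between the batch with p and the batch without p.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical taste scores for every combination of three ingredients.
# These numbers are invented purely to illustrate the calculation.
SCORES = {
    frozenset(): 0.0,
    frozenset({"sugar"}): 4.0,
    frozenset({"chips"}): 3.0,
    frozenset({"salt"}): 0.0,
    frozenset({"sugar", "chips"}): 8.0,
    frozenset({"sugar", "salt"}): 5.0,
    frozenset({"chips", "salt"}): 3.0,
    frozenset({"sugar", "chips", "salt"}): 10.0,
}

values = shapley_values(["sugar", "chips", "salt"], lambda s: SCORES[s])
for ingredient, contribution in values.items():
    print(f"{ingredient}: {contribution:.2f}")
```

A useful property to check is that the contributions add up to the score of the full recipe (10 in this toy example), so the total "goodness" is split among the ingredients with nothing left over. Note also that the number of combinations doubles with every ingredient added, which is why real analyses with many attributes usually rely on software and, sometimes, approximation.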
While this may seem like a small thing, think about the best chocolate chip cookie you've ever tasted. Now think about the worst one. Does being able to understand the right balance of ingredients make a difference? It sure does. And not just to people who eat cookies, but to marketers, brand managers, product developers and anyone who wants to accurately understand which beliefs drive choice.
P.S. Hey: why "model" when you can just ask why?
You might ask why we should bother with any of this data analysis. After all, couldn't we simply ask people why they buy certain products and not others? For the answer to that question (which is "no"), check out this piece, "Don't ask Why? The Answer Is Not What You Think It Is." (Synopsis: people tell you what they think you want to hear.)