Luohan Academy

A Reverse Adverse Selection Effect Enabled by Big Data

Michael Spence is a Nobel laureate and a Professor of Management from Stanford University. His research focuses on economic policy in emerging markets, the economics of information, and the impact of leadership on economic growth.

In the third Luohan Academy Frontier Dialogue, Professor Spence shared his opinions on how big data and AI could positively impact the informational structure of markets by reducing pooling equilibria and creating more precise and effective product differentiation.


Speaker’s presentation by Michael Spence 

Thank you, Steve and thank you, Long. And greetings to everybody wherever you are. I got lucky this time, it's actually daytime. This is an amazing gathering of scholarly talent and practical experience, and it's a real privilege for me to be part of it. 

I want to take a minute and congratulate Chen Long and his colleagues for building a really important research institution in a startlingly small number of years. It's really an impressive achievement. There's prolific output, much of it available on the Luohan Academy website, and it includes two really major complementary reports. The first one was, "Digital Technology and Inclusive Growth." Then we have the one that's being released today that Steve mentioned, "Understanding Big Data: Data Calculus in The Digital Era."

The first documents, for Chinese e-commerce and fintech, the inclusive features of the growth patterns enabled by these digital technologies, specifically open, platform-centered and architected digital ecosystems.

This first report suggests, I think persuasively, that data-driven ecosystems may efficiently solve at least some coordination problems via connected market structures, with the connections being an intelligent data and information layer sitting over them.

It also suggests, persuasively I might add, that these ecosystems do this without requiring internalization within the firm.

So, I think that the Coase theorem, which tells us how to calculate where the boundaries of firms and markets occur, is probably still valid. But the boundaries themselves may be shifting in favor of markets in some environments. And that has all kinds of implications for innovation and entrepreneurial activity and so on. So, it's a kind of open door for very interesting research. This hypothesis is fundamental and deserves the in-depth research that it will undoubtedly stimulate.

The second report picks up the analysis, as I read it, in two really important ways.

The first one asks a centrally important question, which is, "Where exactly does the value creation enabled by digital data come from?" It's certainly not in stand-alone silos of data. Chen Long will say much more about this in his tour of the report, I'm sure. But for me, the short answer is that the value comes from flows and sharing, and from pooling in certain use cases. The value is then realized by processing data into valuable and actionable insights and predictions. And it does this in a stunningly wide range of applications or use cases. From science, to health, to education, to e-commerce, to finance. And I'm sure there will be many more as we go along this journey.

Second, the value-creating flows and sharing that go along with big data also create vulnerabilities and risks. And the potential for misuse, I might add, as judged by the prevailing social values and norms, which may differ across cultures. And so the second part of this report, and it's not necessarily sequential, asks the other really important question, which is, "What, in some detail, are the potential misuses of big data?" And, "How, via a whole variety of mechanisms including defensive individual behavior and learning, regulation, legal and technological solutions, and institutional trust-building mechanisms, can these vulnerabilities, risks, and so on, be mitigated in ways that, to the extent possible, minimally impair the enormous economic and social value potential of the digital technologies, and specifically big data?"

In the world, and in research, this is a work in progress. It will be a long time, if ever, before it is viewed as largely complete – in part because the technological underpinnings of the economy and society are changing so quickly. It is the quintessential moving target. This paper, as well as the discussion today and the wide range of research cited in the paper, will advance progress in this area.


There's no question we have a trust problem. I mean, we've all seen surveys. The last one I saw came from McKinsey. They surveyed one thousand North American individuals or households and asked them which areas of the economy that use and require personal data they trusted. The highest ones, and they weren't very high, were healthcare and finance. And then it fell off dramatically across a whole range of sectors. So, I think, we're not dealing with something that's unimportant. If this goes the wrong way and the trust deficits expand, that would significantly impair the benefits that we talked about before.


I'm going to use the rest of my time to talk about what intrigues me, but let me just mention one thing that I find ironic.

In the early days of the internet, much enthusiastic discussion focused on empowering the consumer via vast troves of previously inaccessible data/information that had the potential to reduce the informational asymmetries between consumers and suppliers of goods and services – with the deficit being on the consumer side. This was not wrong – but it was incomplete.

And the irony, I find, is that we now find ourselves in discussions about protecting consumers, meaning the buyers and users, from potential abuses by suppliers of services who hold vast troves of personal data, or abuses by third parties that find a way to gain access to the data.


So that's kind of where we are. I think they're both legitimate points of view. We know that big data and the internet have profound effects in multiple dimensions:

1) Democratizing access to information at negligible incremental cost – critically a function of the non-rival feature of data and information
2) Reducing the costs of remoteness – in terms of access to information and markets
3) A host of new services that have startlingly high value – from the research of Erik Brynjolfsson and his colleagues
4) We have new markets based on very low transaction costs associated with creating them.
5) One thing we don't talk about as much, but I think we have an explosion of entrepreneurial activity globally – in part because of low entry barriers in digital ecosystems
6) Increases in the efficiency and quality of the matching function. An example in the report is the recommendation engines, as I'll touch on in a minute. But I'm sure Al Roth and others may want to talk about that.

But what intrigues me and gets my attention is the impact of big data on the informational structure of markets, something I spent some time on during my academic career.

I think all of you on the call know what a pooling equilibrium is. It occurs in a range of markets where information is incomplete. And it occurs when there is real product/service differentiation on the selling side of the market that is undetectable on the buying side.

An example of it is adverse selection, a phenomenon you're all familiar with. And undetectable in this context means, roughly, that signaling and screening mechanisms that might close these informational gaps, to some extent, are either unavailable or have costs that are too large relative to the individual benefits of using them, so that they don't get used. Then you have a pooling equilibrium. And pooling equilibria include cases where no transactions occur, as in the complete collapse of the market in the adverse selection case.
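
The collapse described above can be illustrated with a minimal sketch of the textbook "lemons" setup. Everything in it is an assumption made for illustration and none of it comes from the talk: the function name `pooled_market`, the quality levels, and the rule that a seller values quality q at q while a buyer values it at `buyer_premium * q`. Buyers see only the average quality on offer, so the price is pooled.

```python
def pooled_market(qualities, buyer_premium, max_rounds=1000):
    """Find the pooled price when buyers cannot observe individual quality.

    Assumptions (illustrative, not from the talk): a seller values a unit of
    quality q at q, a buyer values it at buyer_premium * q, and buyers pay
    their valuation of the average quality still on offer.
    Returns (price, number_of_sellers_still_offering).
    """
    price = buyer_premium * max(qualities)  # most optimistic starting price
    offered = list(qualities)
    for _ in range(max_rounds):
        # Sellers whose own valuation exceeds the price withdraw.
        offered = [q for q in qualities if q <= price]
        if not offered:
            return 0.0, 0
        new_price = buyer_premium * sum(offered) / len(offered)
        if abs(new_price - price) < 1e-9:
            break  # price is consistent with the quality actually offered
        price = new_price
    return price, len(offered)


qualities = list(range(101))  # hypothetical quality levels 0, 1, ..., 100

# Low gains from trade: the market unravels until only quality 0 is offered.
print(pooled_market(qualities, buyer_premium=1.5))  # (0.0, 1)

# High gains from trade: every seller stays in at a single pooled price.
print(pooled_market(qualities, buyer_premium=2.5))  # (125.0, 101)
```

In the first case each price cut drives out the best remaining sellers, which lowers average quality and forces the next cut. In the second case the pool survives, but every seller receives the same price regardless of quality.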

You can also, and I mention this without being able to go into it in detail, have mixed pooling and separating equilibria in markets. And the pooling tends to occur at the low end of the market, which has adverse distributional effects. So, if you stand back from this, or at least when I stand back from this, and look at some of the use cases involving big data and deep learning algorithms, meaning algorithms that detect patterns in data that are otherwise basically invisible to us mortals, what you see is a partial undoing of pooling equilibria by reintroducing (statistically or probabilistically) product differentiation.

You can think of it as the reverse adverse selection effect.   

Big data, when in combination with artificial intelligence algorithms, is a new (and relatively low cost) signaling or screening option or mechanism, in markets with significant informational gaps and differentiation. This is the way I think about it.
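
One way to picture this probabilistic reintroduction of differentiation is a small Bayes-rule sketch. It is only an illustration under stated assumptions, not a model from the talk or the report: there are two types (high and low value), the algorithm emits a binary signal that matches the true type with probability `accuracy`, and the market pays expected value given the signal. The function name `signal_prices` and all numbers are hypothetical.

```python
def signal_prices(share_high, value_high, value_low, accuracy):
    """Compare one pooled price with prices conditioned on a noisy signal.

    Assumes 0 < share_high < 1 and 0 < accuracy <= 1, where accuracy is the
    probability that the signal matches the true type (hypothetical setup).
    Returns (pooled_price, price_if_signal_high, price_if_signal_low).
    """
    def expected_value(prob_high):
        return prob_high * value_high + (1 - prob_high) * value_low

    pooled = expected_value(share_high)

    # Bayes' rule: probability of the high type given each signal.
    true_pos = share_high * accuracy
    false_pos = (1 - share_high) * (1 - accuracy)
    false_neg = share_high * (1 - accuracy)
    true_neg = (1 - share_high) * accuracy

    prob_high_given_high_signal = true_pos / (true_pos + false_pos)
    prob_high_given_low_signal = false_neg / (false_neg + true_neg)

    return (
        pooled,
        expected_value(prob_high_given_high_signal),
        expected_value(prob_high_given_low_signal),
    )


# An uninformative signal (accuracy 0.5) leaves the pooling price in place;
# a more accurate one pulls the two prices apart toward the true values.
for accuracy in (0.5, 0.8, 1.0):
    pooled, high, low = signal_prices(0.5, 100.0, 20.0, accuracy)
    print(accuracy, round(pooled, 2), round(high, 2), round(low, 2))
```

The gap between the two signal-conditioned prices grows with accuracy, which is the sense in which better prediction partially undoes pooling. The sketch ignores the incentive and feedback effects on behavior.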

As in traditional signaling and screening mechanisms, it will have incentive effects that feed back on behavior – mostly positive I suspect. And that raises questions about the transparency of the algorithm, and questions having to do with whether or not people know what to respond to. By the way, since what I've said is fairly abstract, you can see this clearly in the first report, in the expansion of credit markets using big data and AI. Observing people who are essentially out of the market. And I think Bengt Holmstrom may want to talk further about that.

I expect to see an evolution similar to the one we've seen in credit markets occur in labor markets, where you have very detailed, specific skills and experience that you might be able to detect with enough data on people's history, work history, and so on. Or you might be able to detect characteristics of individuals that signal that they can acquire these skills quickly. So, I think that labor markets are another area where we might see this. And I don't think it's unreasonable to speculate that an expansion of incredible product differentiation will occur in a wide range of markets.

Since I'm nearing the end of my time, let me just say one more thing, because to me this is kind of an open-ended set of researchable subjects and we have some foundation in it: this is two-sided.

We normally think of the information deficit as being on the buying side of the market. For insurance we have to flip it over and think of the insurer as being a buyer of risk rather than a seller of insurance policies.

But you can see from this report’s very interesting discussion of recommendation engines, and the impacts of withdrawing them in controlled experiments, that the informational gaps and differentiation go both ways. Sellers may not, in fact do not, know the preferences of buyers. The recommendation engines are big data/AI use cases that deliver on the “know your customer” dimensions. In other words, they help sellers target. And if you take the targeting away, you basically get something – marketing, and advertising, and product recommendations – that is very inefficient, pretty close to useless.

So anyway, I just thought I'd introduce the discussion by saying I think there's a tremendous set of opportunities here to deepen our understanding. And I do think the informational structure of markets is set for major changes. It's not true that separating equilibria are always better. They have distributional consequences that most of us haven't explored very thoroughly. Except when we include people from the bottom up.

It's not clear, in the case of health insurance for example, that you want to dismantle the pooling equilibrium. Markets armed with these very powerful new tools, operating on genetic, behavioral, and relative data (data about your relatives), will dismantle these pooling equilibria. And that's not necessarily where we want to end up. In the brave new world of biomedical science, we're all going to have preexisting conditions. It's not difficult to imagine a world in which we end up being able to insure ourselves against everything except the risks that are particularly pertinent to each of us individually.

Let's move on. Thank you, Steve.

For more information, please visit Luohan Academy's youtube channel: Luohan Academy

