Luohan Academy

Digital Footprints and the Implications for Credits


Manju Puri, Professor of Finance at Duke University, presented at Luohan Academy's Frontier Dialogue. Sumit Agarwal, Professor of Finance at the National University of Singapore, discussed Manju Puri's presentation. The following text features transcribed excerpts. It has been lightly edited for clarity and length.


Speaker presentation by Manju Puri

It's a pleasure to be here with this global audience. Let me get started on fintech and finance. This is a very, very broad topic. It covers a number of wide-ranging areas, and the talks today are going to be great and will cover many aspects. In such a broad area, my talk is going to focus on digital footprints and the implications for credit.

When we talk about digital footprints, what do we mean? It can mean a number of things. You could have a bare-bones digital footprint, which is simply the information you leave behind when you go and access the web. You can have a deep digital footprint, where you mine someone's entire web presence, whether it's through social media such as Twitter or information on various forums. And there's a growing industry that is claiming, at least, to do alternative credit scoring through digital footprints.

How informative are digital footprints? How durable? There's both promise and concern. There are a number of promises, in particular access to credit for the unbanked. And there are some concerns: privacy concerns, discrimination concerns, etc.

What I'm going to start with is just the bare-bones digital footprint. These days, we're pretty aware that if we put something on Twitter or Facebook, it's publicly accessible. And people have become a little bit more careful as to what they put there.

But on a regular basis, all of us go to the web, access various internet sites, and register there. How much information does that simple act, accessing and registering on a website, leave behind? That's what we're going to look at: this shallow, or bare-bones, digital footprint.

Here's a picture. This is a publicly available picture of New York City. Now, if I were a woman from Mars and knew nothing about the US, let alone New York, what information could I gauge from this picture?

Spatial distribution of iOS, Android & Blackberry phones in New York City

Source: Gnip, MapBox, Eric Fischer, Data 2011-2013

In this picture, the red dots are iOS, the green dots are Android, and the purple dots are Blackberries. The red dots are concentrated in Manhattan. There's an interesting recent paper that looks at what has distinguished the rich from the poor through the ages. It used to be “do I have cable TV”, “do I have a BMW”. In today's world it's “do I have an iPhone”. So you can see here that the wealth is concentrated in Manhattan. Once you start going to New Jersey and Brooklyn, you see a lot more of these green dots. This is where there's less wealth. And in the heart of Manhattan, you see a few purple dots, and what are they? These are the Obamas of the world who refuse to give up their Blackberries or, more likely, bank-issued Blackberries.

This is simply one piece of information that you get when people access the web, and it's already telling us a lot about the wealth distribution in New York City. So the question here is: if we were to look at just the information that you leave behind, this and a few other variables, which together make up a shallow digital footprint, how much can we actually say about default rates, and how well can we assess your credit?

While there are a growing number of companies that do alternative credit scoring, the effectiveness of this is unknown. As a researcher and as a journal editor, I think it is important that we be able to access data and actually research, analyze, and quantify this if we're to take it into account in our discussions, in practice, and with regulators.

We were able to access data from an e-commerce website in Germany, which is very similar to Wayfair, where you browse and purchase goods. The default rate is roughly 3% annualized, so very similar to what you would see in other places. The digital footprint starts with three easily accessible variables. The first is simply the device type: do I use a desktop, a tablet, or a mobile? The second is the operating system: iOS, Android, or Windows. The third is the email provider. As I said, having an iPhone requires a certain amount of purchasing power. Research has shown it's a proxy for being in the top quartile of income. For email, you have free emails. You have old free emails like Yahoo and Hotmail. You have relatively newer free emails such as Gmail. And you have emails that you pay for such as AOL in Germany, and T-Online. So again, if you're paying for your email, that probably means you're a little more wealthy or that you have a higher income level.

The next four start with the channel through which you get there: do you actually type in where you want to go, or do you click on paid ads? Perhaps not surprisingly to people in marketing, people who click on paid ads are much more likely to default. We also look at checkout time. What time of day do you purchase: morning, noon, or night? People who do their purchasing between midnight and 6:00 AM are the most likely to default. We look at tracking and email error. When you put your email in, does it bounce? If you are so drunk you can't remember your email and it bounces, you actually are more likely to default.

We also look at whether your name is in the email. There's a growing amount of research in entrepreneurship that suggests putting your name on your company actually matters. This is what we find: if your first or last name is in the email, you're less likely to default. And then finally, we look at whether you type everything in lower case or, like civilized people do, actually capitalize the first letter of your name, etc. We find that people who type everything in lower case are actually more likely to default.

These are 10 simple variables that you leave behind when you simply access and register on a website. It turns out they have a lot of predictive power in terms of default rates. Here's a graph of the credit bureau score against default rates. You can see that the difference between the first and the last deciles of the credit bureau score is fairly significant.
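To make the variables above concrete, here is a minimal Python sketch of how a registration record might be turned into footprint features. It is an illustration under stated assumptions, not the method used in the study: the `Registration` fields, the provider buckets, and the feature names are all hypothetical, and the sketch covers only variables that can be derived from the fields shown (the tracking variable is left out).

```python
from dataclasses import dataclass

# Hypothetical provider buckets, loosely following the examples in the talk.
PAID = {"t-online.de", "aol.de"}
FREE_OLD = {"yahoo.com", "hotmail.com"}
FREE_NEW = {"gmail.com"}


@dataclass
class Registration:
    """One registration/checkout event (field names are assumptions)."""
    device: str          # "desktop", "tablet", or "mobile"
    os: str              # e.g. "ios", "android", "windows"
    email: str
    channel: str         # e.g. "direct" or "paid_ad"
    checkout_hour: int   # local hour of purchase, 0-23
    email_bounced: bool  # True if the address produced an error
    first_name: str
    last_name: str
    typed_name: str      # the name exactly as the customer typed it


def provider_type(email: str) -> str:
    """Bucket the email domain into paid / old free / new free / other."""
    domain = email.rsplit("@", 1)[-1].lower()
    if domain in PAID:
        return "paid"
    if domain in FREE_OLD:
        return "free_old"
    if domain in FREE_NEW:
        return "free_new"
    return "other"


def footprint_features(r: Registration) -> dict:
    """Derive simple footprint features from a single registration."""
    local_part = r.email.split("@", 1)[0].lower()
    names = [n.lower() for n in (r.first_name, r.last_name) if n]
    return {
        "device": r.device,
        "os": r.os,
        "email_provider": provider_type(r.email),
        "paid_ad": r.channel == "paid_ad",
        # Midnight to 6:00 AM, the window singled out in the talk.
        "night_checkout": 0 <= r.checkout_hour < 6,
        "email_error": r.email_bounced,
        "name_in_email": any(n in local_part for n in names),
        "all_lowercase": r.typed_name == r.typed_name.lower(),
    }
```

Features like these would then feed a default-prediction model; the sketch stops at feature construction because the talk does not describe the model itself.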

But suppose I take just two variables from the digital footprint and compare people who use a Mac and T-Online, a paid email, with people who use Android and an old free email. The difference in default rates here is higher than the difference in default rates between the first and the last deciles of the credit bureau score. It suggests that the digital footprint, even a bare-bones digital footprint, can actually be very powerful.

We then look at the area under the curve, which is a pretty standard way of looking at discriminatory power. The idea here is: how much of the defaults do you capture in, say, the lowest-rated 25%? If the score is completely random, you get the 45-degree line, which is 50/50 on whether you default or not. In general, research suggests that if you have 60% or better, that's a good score.
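As a small illustration of the measure just described, the area under the ROC curve equals the probability that a randomly chosen defaulter is scored as riskier than a randomly chosen non-defaulter, with ties counted as one half. The sketch below computes it directly from that definition in plain Python. The function name and the toy data are assumptions for illustration only; they are not from the study.

```python
def auc(risk_scores, defaulted):
    """Area under the ROC curve via pairwise comparison.

    risk_scores: higher means the model considers the borrower riskier.
    defaulted:   truthy if the borrower actually defaulted.
    """
    pos = [s for s, d in zip(risk_scores, defaulted) if d]
    neg = [s for s, d in zip(risk_scores, defaulted) if not d]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # defaulter correctly ranked as riskier
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (made up): five borrowers, two of whom defaulted.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
outcomes = [1, 0, 1, 0, 0]
print(f"AUC = {auc(scores, outcomes):.3f}")
```

A score that carries no information, for example the same value for everyone, gives 0.5, which corresponds to the 45-degree line; a score that ranks every defaulter above every non-defaulter gives 1.0. The pairwise loop is quadratic, which is fine for a sketch but not for large samples.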

If you look at just the credit bureau score, you get a good area under the curve of 66%. But when you look at the digital footprint, the area under the curve is actually at least as good: above 68%.

Think of all the work and effort that goes into making a credit bureau score: all of the people, having to track credit defaults, late interest payments, etc. And think of the amount of effort it takes to get the digital footprint. Very little. And yet even the shallow digital footprint is doing as well as the credit bureau score. When you add the two together, it does even better. You add another 5% or so of discriminatory power, which suggests that the two are complements, as opposed to substitutes.

Here's just a quick comparison with other studies. The one that is most informative perhaps is the comparison with the internal credit ratings of banks. There's a large literature saying banks are special: they collect all this information, and they have a lot of private information. And all that time, energy, and money is doing well. Banks' internal ratings add 8 to 12% over a credit bureau score.

This is good news for banks, but our digital footprint, the bare-bones, shallow digital footprint, is already doing half as well. And this is without mining for additional information. So clearly this is telling you that banks' competitive advantage is being threatened.

So far we have looked at the predictive power of the digital footprint for short-term loans for products purchased online. What about long-term loans, let's say mortgages? We don't have data on that, but we can do the next best thing, which is to see whether the digital footprint predicts your future credit score.

When we do that, we find that is indeed the case. Does that make sense? What is the credit bureau score? It is a historical analysis of everything that's happened, your defaults in the past. What is the digital footprint? It's capturing you as you are today. I might have had a great credit bureau score, but if I lost my job, fell into depression, started drinking, started getting up at 2:00 AM and randomly shopping at night, the digital footprint is going to capture that. It's going to take time for that to get incorporated into the credit bureau score. That's indeed what we see.

This has a number of implications. The first is on the information advantage of financial intermediaries like banks. One of the main advantages of banks has been that they have access to all this information about you, from your deposits, from prior lending, etc. Now it looks like the costless digital footprint can actually capture a lot of this information and start threatening banks' traditional information advantage.

This raises a number of questions. Can banks use such information? Would regulators allow them? Is this durable? How does this operate through the cycles? Would this work well in downturns too?

The next implication is access to credit for the unbanked. In our sample, when we look at people who do not have credit scores, we find the digital footprint actually works as well as, if not better than, it does for people who have credit scores. This gives us hope that we could use the digital footprint to actually give access to credit to those who are unbanked.

Finally, last but not least, there's the implication for consumers, firms, and regulators in the digital sphere. While on the one hand you might be able to bank the unbanked, what about privacy concerns? What about discrimination? There are a growing number of papers trying to look at what this means for discrimination. Could you back-channel information that you may not otherwise be able to use under fair lending laws? And it certainly poses a challenge to regulators as to how we should think about and regulate this space.

Overall, is the digital footprint useful for predicting payment behavior? Many companies seem to be using it. The research backs this: even a shallow digital footprint is extremely informative. It complements, as opposed to substitutes for, credit bureau scores, and it works equally well for people who don't even have credit bureau scores.

We haven't looked at deep digital footprints. We can clearly do a lot more. More work is needed, using both different data and different environments. And this has many, many implications, which are of great interest to the global community: for banks and their business model, for access to credit for the unbanked, and for regulators, consumers, and firms. What does this mean? How should we behave in this digital sphere?

Discussant presentation by Sumit Agarwal

I'm going to talk about FinTech and the Future of Finance. Basically, I'll just try to pick up where Manju is leaving off. I think the real impact of FinTech and finance is in developing countries as opposed to developed countries, mainly because consumers can leapfrog using this technology in developing countries. And this has huge implications for consumer welfare.

Think about, in the old days, landlines versus cell phones, or computers versus smartphones. Nowadays, think about e-wallets in small villages in India and Africa versus bank branches. More broadly, as Manju was saying, two billion people around the world don't have access to banking. 45 million people in America itself don't have access. In India, about 850 million people don't have access to credit. TransUnion says that giving access to even around 115 million of these people would be huge. But what these people do have, the majority of them, is a cell phone.

Manju talked about the dark side. I'm not going to go there. I'm not really going to discuss Manju's paper, but I'm going to pick up where Manju left off, which is shallow versus deep footprints. Second, I'm going to look at developing countries, where we don't really have credit scores, discuss a couple of ideas there, and see what is really happening in this space.

I'm looking at exactly the question of what happens if we have social and mobile footprints from mobile phones, and then trying to see if we can predict defaults using these alternative credit scores. This is exactly what Manju was alluding to: let's not use credit scores, let's see if we can find alternative forms of credit scores.

What we do is look at what is happening in India. We collected a massive data set from a FinTech company, covering around 360,000 loan applications. What we have is this deep footprint, a social footprint: the applications people have on their mobile device, things like Amazon, Flipkart, travel applications, dating applications, social media applications, and financial applications.

We also know whether you log in through LinkedIn or Facebook and, as Manju talked about, whether you use iOS, Android, or another type of phone. Then we also have the call log. So we know who you called, what SMSs you were sending, what time you were calling, and whether you were calling your mother or your friends. Can we use that to predict if you are a safer customer, especially using your call log information?

We are going to call this the deep social footprint, basically because of this call log data that we can utilize: the time of day, who you are calling, how many times you call that person. That is informative about your type, which is essentially whether you are less risky or not.
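As a rough illustration of the kind of call-log variables just mentioned (time of day, who you call, how often), here is a minimal Python sketch. It is not the feature set used in the study: the input format, the feature names, and the definition of "night" as midnight to 6:00 AM are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def call_log_features(calls):
    """Summarize a call log given as (contact_id, timestamp) pairs.

    Returns simple aggregates; every feature name here is hypothetical.
    """
    n = len(calls)
    if n == 0:
        return {"n_calls": 0, "n_contacts": 0,
                "top_contact_share": 0.0, "night_share": 0.0}

    per_contact = Counter(contact for contact, _ in calls)
    # Assumption: "night" means midnight up to (not including) 6:00 AM.
    night_calls = sum(1 for _, ts in calls if ts.hour < 6)

    return {
        "n_calls": n,
        "n_contacts": len(per_contact),
        # Share of all calls going to the single most-called contact.
        "top_contact_share": per_contact.most_common(1)[0][1] / n,
        "night_share": night_calls / n,
    }


# Made-up example log.
log = [
    ("mother", datetime(2020, 1, 1, 19, 0)),
    ("mother", datetime(2020, 1, 2, 19, 30)),
    ("friend_a", datetime(2020, 1, 3, 2, 15)),
    ("friend_b", datetime(2020, 1, 4, 12, 0)),
]
print(call_log_features(log))
```

Aggregates of this kind could sit alongside app-install and login variables as inputs to a default model.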

Using this data, we follow an approach similar to the one Manju described. There is a huge lift in the AUC of these credit scores, of up to 51%, using these mobile and social footprints. If we do this out of sample, we also get a big lift. So the social footprints have a lot of predictive power. That is one side of what I want to show: what can be done to advance exactly what Manju brought up.

Let me discuss one more idea: what can we learn from FinTech for migrant workers? My objective is to show the effect of FinTech in developing countries, and especially for poor people. Here we are looking at maids and foreign workers in Korea who now have mobile devices through which they can transfer money home. Earlier they would go to a bank or a credit union, and the credit union would charge them enormous amounts of money.

What this application does is allow these people not only to remit and transfer, but also to use a cancellation feature. I can say I want to transfer money today, but within 24 hours the application lets me cancel and transfer again, because exchange rates may fluctuate. So the maids pay a lot of attention to what the exchange rate is today and tomorrow.

Here you have two examples. One is a remittance you made on June 20th, 2019 at 2:00 PM. It went through because the exchange rate was still favorable to you the next day, so the money was transmitted.

There is another example, which you canceled. On the first day, you tried to make the payment, and then you realized the next day that the exchange rate was unfavorable to you, so you withdrew and canceled that transfer. And then you did a new transfer the next day.

Now, what we are trying to look at here is the social network that this FinTech application allows you to create. After I do the cancellation, do I then immediately SMS friends and tell them, "If you did your transfer yesterday, cancel it right now, because if you cancel it, you can get a better exchange rate today and transfer again"? What we find is that a lot of people, when they have a social network through this FinTech device, immediately use the cancellation function, and using it increases their welfare by around 13%.

These consumers in our data are very savvy and, obviously, very sensitive to exchange rates. As soon as the exchange rate changes by even 1%, they will cancel their transfer. Here are some regression results. In all the regressions, when you learn the cancellation feature faster, you use it more afterwards, and you pass it on more to your friends.
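The cancel-and-resend decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the app's actual rule: it assumes the transfer locks in the exchange rate at order time, ignores fees, and uses the 1% sensitivity mentioned in the talk as a default threshold. The function and parameter names are hypothetical.

```python
def should_cancel(rate_at_order, rate_now, threshold=0.01):
    """Return True if cancelling and re-sending looks worthwhile.

    Rates are home-currency units received per unit sent, so a higher
    rate is better for the sender. `threshold` is the minimum relative
    improvement that triggers a cancellation (1% by default).
    """
    if rate_at_order <= 0:
        raise ValueError("rate_at_order must be positive")
    improvement = (rate_now - rate_at_order) / rate_at_order
    return improvement >= threshold


def gain_from_resend(amount_sent, rate_at_order, rate_now):
    """Extra home-currency units received by re-sending at today's rate."""
    return amount_sent * (rate_now - rate_at_order)


# Made-up numbers: 1,000 units sent, rate improves from 50.0 to 51.0.
if should_cancel(50.0, 51.0):
    print("cancel and resend, gain:", gain_from_resend(1000, 50.0, 51.0))
```

With a rate that moves only from 50.0 to 50.2, the improvement is below the threshold and the original transfer would be left to go through.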

This is something like what Manju did a few years ago when she wrote the paper on bank runs. She was trying to see how information diffuses within a building, and here we are trying to show how, within a device, information diffuses to your friends when you tell them to use the cancellation feature now and not transfer their money.

Here are some results on how these networks get formed. Once somebody starts out, they tell their friends, and those friends tell their friends. And once you start using the cancellation feature within that network, you actually form bigger networks as a result, because you are transmitting that information and people are learning that they can get a better exchange rate because somebody else is doing it. They're teaching me. They're telling me when and how to cancel my transaction as well.

These are pretty much all the comments I have. I'm just trying to take forward where Manju left off in her presentation. There's a lot that can be done in payments, investments, and lending. The dark side is still a very big open question: the role of privacy and discrimination, which Manju talked about. I think the field is wide open, and that's why there is a huge number of people on this call. I hope other people will pick this up and see what else can be done.

For more information, please visit Luohan Academy's YouTube channel: Luohan Academy

