Have I Been Pwned, a free site that lets anyone search to see if their information was leaked in a data breach, is now just over 10 years old. We asked its creator and renowned security expert, Troy Hunt, whether the world has gotten any better at protecting itself from fraud and cyber attacks since he began the project.
Matt Davey, Chief Experience Officer at 1Password, chatted with Hunt on the Random But Memorable podcast about a variety of other topics including scraping (is scraping a data breach?) and the ethics of disclosure (has legislation like GDPR and CCPA made organizations more transparent about breaches?). Find answers to these questions and more by reading the interview highlights below or by listening to the full podcast episode.
Editor’s note: This interview has been lightly edited for clarity and brevity. The views and opinions expressed by the interviewee don’t represent the opinions of 1Password.
Matt Davey: How are things since we last caught up?
Troy Hunt: Very pwned. Everything is very pwned. There’s no bottom, is there? We just keep going and going. We were past 800 breaches the other day. I thought it was impressive when I started with five. So here we are, business as usual.
MD: Are you counting all the breaches that keep appearing in the press and thinking: “Oh my god, a billion credentials?”
TH: There’s been a lot in the press lately about U.S. social security numbers and the National Public Data (a background-check company) breach. That’s just one breach that’s in Have I Been Pwned.
What I’m not doing is getting sucked into this. For example, earlier this year there was “the mother of all breaches,” which was just a collection of different things. I’d rather see discrete incidents so that people can look at it and go: “Oh look, I was in the Adobe data breach. I know I need to go and have a chat with the folks at Adobe about how I feel about that and change the password I use there.” I don’t like these amorphous things where it’s like: “You’re in this collection of stuff, who knows what it was.”
MD: You recently celebrated 10 years of Have I Been Pwned. In that time, it has become an important tool for many individuals and organizations. How has the platform evolved since its creation and what impact do you think it’s had on public awareness of data breaches?
TH: It’s interesting that the platform itself really hasn’t changed architecturally until now. We’re just doing a rollover of the underlying database model. We got our first ever proper employee yesterday! The platform will still look the same on top, but now that we’ve actually got dev resources, we’ll be able to invest on top of that, hopefully even give it a little bit of a spruce up. A lot happens in 10 years.
“We got our first ever proper employee yesterday!”
In terms of data breaches, over the space of 10 years, it feels like the overarching picture hasn’t changed. We still have lots of breaches from lots of different places. All that’s different now is, I think, we’ve gone through cycles of different things that were getting breached. Around 2018 it felt like there was a lot of MongoDB being breached. More recently we’ve seen a lot of dependency pipeline breaches where there’s a compromised Snowflake. There’s a whole other story around how that actually happened but we see a vulnerability like that impact a whole bunch of customers downstream.
MD: Does anything surprise you anymore? Or have you become completely desensitized?
TH: I do see stuff occasionally where I think: “Wow, I’ve just learned about a whole subculture genre that I never knew existed.” And my eyes have been opened.
I’m also fascinated by how many government and corporate email addresses are in these breaches. You can’t help but go: “Have you not been paying attention?”
Putting aside whether or not you should be using these services, you definitely shouldn’t be using them with your work email address, and I’m fascinated that that continues to be a thing.
MD: Which breaches over the last 10 years really stand out to you?
TH: Ashley Madison. No hesitation. Ashley Madison was fascinating for so many reasons. I remember at that time – and this was 2015 – people were saying: “Is this going to be the wake-up call? Is this going to be the one where everyone goes, ‘Oh yeah, we should be more careful with our personal data and our cybersecurity?’”
Of course, it didn’t change anything. But what was so fascinating about it was the perfect combination of factors that resulted in massive media attention and a huge human impact. Just in the last year we’ve seen two different documentaries come out about this incident. It’s fascinating that it’s a mainstream thing. I’ve had random people that I know pop up and go: “Hey, I saw you on the Ashley Madison documentary thing.” It’s like: “Well, okay, normal people are watching this.”
I still haven’t seen anything that’s compared to that in terms of overall impact. We’ve seen much bigger breaches. We’ve seen breaches that have leaked arguably more sensitive information when it relates to things like health. But that’s just the one that captured everyone’s attention.
MD: Health information is probably more sensitive, but [Ashley Madison] was a service that customers didn’t want other people knowing they were using.
TH: I think because it was salacious that it got extra attention. Even before the breach, Ashley Madison had attracted a lot of ire from the masses about the ethics and morality of adultery, and it was an interesting thing to watch. It feels like American daytime talk show TV.
Of course, we learned so much as the incident unfolded about the mechanics of how the organization worked, about the bots, and the way their business model operated. That’s something I find fascinating about breaches in general. It peels back the veneer and you get to see what’s actually happening underneath.
MD: How do you verify and investigate a data breach? Has that process changed over the last 10 years?
TH: It hasn’t overtly changed. I’ve written before about how I verify data breaches but in a nutshell, being able to reliably attribute the source of an incident is enormously important for a couple of reasons.
One is that the individuals in a data breach really want to know who mishandled their data. Where do they go to complain, or get upset, or ask for their records to be deleted, or whatever it may be?
The other is, from my own self-preservation point of view, I want to get this right. I don’t want to go out there and say: “Hey, ACME Corp had a data breach” and then discover it wasn’t ACME Corp. Or it was someone completely different, and that then damages their organization. Verification and attribution to a source is important.
“I want to get this right.”
Here’s one of the easiest ways to do that. A data breach has got lots of email addresses in there. Let’s say it’s a million email addresses. There’s almost always Mailinator addresses in there. And Mailinator is a public mailbox. You can send a mail to troy[at]mailinator.com, and then go to Mailinator.com, and put Troy in there, and you’ll see the mailbox. No one creating a Mailinator account ever expects privacy, they expect it to be a public mailbox.
So let’s say I look into a data breach, and out of the million email addresses I’m able to grab several Mailinator addresses. I can then go to the password reset form on the service that the breach was alleged to have come from, and put one of those Mailinator addresses in, and then see the reset email go to the correct inbox. It’s like, wow, what are the chances?
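The first step Hunt describes, pulling a handful of public-mailbox addresses out of a large dump, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not his actual tooling: it assumes the dump has already been reduced to one email address per line, and the function name `find_public_inbox_addresses` and the `PUBLIC_INBOX_DOMAINS` set are hypothetical names introduced here.

```python
# Assumption: the breach data has already been reduced to a list of email addresses.
# PUBLIC_INBOX_DOMAINS is a hypothetical allow-list of public-mailbox domains.
PUBLIC_INBOX_DOMAINS = {"mailinator.com"}


def find_public_inbox_addresses(addresses, domains=PUBLIC_INBOX_DOMAINS, limit=5):
    """Return up to `limit` distinct addresses whose domain is a public mailbox."""
    found = []
    seen = set()
    for raw in addresses:
        address = raw.strip().lower()
        # Split on the last "@" so the domain is everything after it.
        local, sep, domain = address.rpartition("@")
        if not sep or not local:
            continue  # skip anything that does not look like local@domain
        if domain in domains and address not in seen:
            seen.add(address)
            found.append(address)
            if len(found) == limit:
                break
    return found


sample = [
    "alice@example.com",
    "Troy@Mailinator.com",
    "not-an-email",
    "bob@mailinator.com",
    "troy@mailinator.com",  # duplicate once case is normalized
]
candidates = find_public_inbox_addresses(sample)
print(candidates)
```

The resulting short list is what would then be tried, by hand, against the password reset form of the service the breach is alleged to have come from.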
I’ve got almost five million subscribers on Have I Been Pwned at the moment. That’s for the freebie ‘enter your email address’ service. I’ll let you know if you turn up somewhere. Occasionally I will reach out to a bunch of those that are in a new breach and say: “Look, you signed up to this service. I think you might have been in a breach. Can you help me verify if this is legitimate? Did you use that service? Did you put your email address in there?”
A lot of the time people have got receipts. They bought something, or they kept the welcome email. Or, particularly if disclosure is required, I’ll go to the company itself and say: “Look, I’ve got this data, someone sent it to me, I think it’s yours. You should know about it.” There’s nothing like confirmation from the organization itself to be completely confident of the source.
MD: In the beginning, did companies believe that disclosure was optional, or go: “Nah, we didn’t do it.” Have things got better behind closed doors?
TH: I look at disclosure in two parts. There’s my disclosure to the organization of letting them know, and that’s pretty much the same for me. I’d say it’s still painful. It’s still one of the hardest things I do.
And then there’s the disclosure of the organization to the impacted individuals. I do fear, just as a gut feel, that it is worse now than what it was. The sense I get around this is a combination of things – it seems to go from talking to an organization to talking to their lawyers very, very quickly.
What I mean by that is not necessarily me getting lawyer letters – that does happen every now and then, and we have a nice chat and so far, it’s been OK. But it seems to very quickly go into damage control on behalf of the organization. I suspect a lot of that is due to the prevalence of class actions that happen pretty much overnight now every time there’s a data breach. I feel that organizations are going into self-preservation mode to try and protect their interests, and if they’re public, the shareholders’ interests as well – to the detriment of the individuals in the breaches.
“Organizations are going into self-preservation mode to try and protect their interests.”
I’ve just seen so many incidents where these organizations are simply not notifying individuals. I lament the fact I feel this burden of responsibility. I have this data and millions of subscribers, and it’s up to me to let them know, and to do something that the organization should do. Really, the best outcome for Have I Been Pwned would be for it to be redundant, but we’re going in the opposite direction. Organizations aren’t doing disclosure at all in some cases.
MD: I’m really surprised about that. I would have thought it’s getting better.
TH: Some people would argue: “Well, there’s also more regulation.” Since I started this, we’ve had GDPR (the EU’s General Data Protection Regulation), the CCPA (California Consumer Privacy Act), a mandatory data breach disclosure scheme in Australia as well – different parts of the world have implemented regulatory controls. But what I think a lot of people don’t understand is that a lot of these regulatory controls don’t mandate disclosure to the individuals in a breach. They usually mandate disclosure to the local regulator.
I’ve seen so many examples. Europe is probably a particularly good one because people are like: “Let’s go and GDPR these guys.” I feel like it’s a verb. “We’re going to go and GDPR this company because they didn’t disclose to us.” But they disclosed to the local regulator and there are only certain conditions that need to be met in order for them to have to tell you as well. That I find is a bit of a shame. As a good faith thing, and frankly a corporate responsibility thing: if you’ve lost someone’s data, let them know. That doesn’t seem too hard to me.
MD: What are some of the most common vulnerabilities that currently lead to data breaches?
TH: Scraping has definitely been more prevalent in recent years.
I can think of multiple examples where there are services that would intentionally expose some data publicly. Then someone will go through, and they’ll enumerate it (systematically collect data on a large scale with automated tools). Instead of you using a service that has an API that pulls back someone’s profile, and you see just a little bit of data for that person, someone has now gone and pulled out millions, tens of millions, hundreds of millions of records by enumerating through this collection of API endpoints.
Many of these APIs will take an email address and come back and give you information about it. They’ll collect this huge amount of data and then go: “This is the ACME Corp scraped data breach.”
I’ve seen a lot more of that and I suspect part of it is because we have so many services that are, by design, exposing little bits and pieces of information. Let’s say, about someone’s profile. And there’s an argument that if it’s scraped data, it’s not even a breach, because the data was literally meant to be publicly accessible. I’ve written before about whether or not a scrape is a breach.
To my mind, in any case where there’s data that has been misused from the fashion in which it was provided and it was expected to be used, then that does constitute a breach. That’s something that we’ve seen many times, particularly with some of the big social platforms. That includes Facebook, X, LinkedIn – they’ve all got large scraped data breach corpuses in Have I Been Pwned now.
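The enumeration pattern described above can be made concrete with a small Python sketch. Everything in it is hypothetical: `lookup_profile` is an in-memory stand-in for a public profile-lookup API, not any real service, and the data is invented. The point is only to show how a lookup that is harmless one call at a time becomes a corpus when it is repeated across a long list of guesses.

```python
# Hypothetical in-memory stand-in for a public profile-lookup API.
# No real service, endpoint, or dataset is represented here.
_PROFILES = {
    "ann@example.com": {"name": "Ann", "city": "Oslo"},
    "raj@example.com": {"name": "Raj", "city": "Pune"},
}


def lookup_profile(email):
    """Intended use: one caller looks up one address and sees a small snippet."""
    return _PROFILES.get(email)


def enumerate_profiles(candidate_emails):
    """The scraping pattern: repeat the single lookup across many guessed addresses."""
    corpus = {}
    for email in candidate_emails:
        profile = lookup_profile(email)
        if profile is not None:
            corpus[email] = profile  # each hit adds one more record to the collection
    return corpus


guesses = ["ann@example.com", "nobody@example.com", "raj@example.com"]
scraped = enumerate_profiles(guesses)
print(len(scraped), "profiles collected from", len(guesses), "lookups")
```

No single call returns more than the service intended to show; it is the aggregated collection that differs from how the data was meant to be used.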
MD: I’d never really thought about that. It’s small bits of information that are bundled together and then become dangerous.
TH: LinkedIn is a great example because that was a massive scrape. The question is: If someone scrapes your data, and your personal attributes, and the things that, by design, you’ve given to LinkedIn to be discoverable by other people, but they’ve siphoned up your profile and millions of others’ personal profiles, would you want to know about that? Would that bother you? Most people say: “Yeah, that’s not what I gave my data for.” Of course, scraping is a complete violation of all the terms of service, but that’s not really what hackers worry about, is it? They just want the data.
MD: I think I’d be bothered by it because when you give information to LinkedIn, there is a purpose to that. I’d like to be aware when my information is used badly.
TH: Exactly. My fear is that we end up having so many different incidents that make a lot of noise that people become a little bit tired (“data breach fatigue” is the phrase I hear sometimes), and are then unwilling to act when there are incidents that happen that do really need their attention.
I think unless a data breach has some tangible impact on someone, that they have money lost, or their identity stolen, I suspect they’re starting to become a little bit nonchalant to it.
“I suspect [people] are starting to become a little bit nonchalant to it.”
MD: Is there anything in the current zeitgeist that excites, scares, or angers you when it comes to cybersecurity?
TH: Lack of disclosure. What angers me? I don’t need much time to think about that. The fascinating thing is, as Have I Been Pwned has become more mainstream and more accepted, I’ve spent a lot more time with people in law enforcement, and people in politics, and people who are making the regulations, and I’ve been interested to see their position on all these things. It’s so well aligned with ours. They’re like: “Yeah, of course people should be told about data breaches” and “Yeah, we need to clamp down on these organizations.” But I see very little actual change.
Maybe this is just part of that age-old problem of technology moving forward so quickly, and the law takes a long time to catch up to it, but it just feels like we are much further behind now than where we were before. Maybe that’s just reflective of the exposure I’ve got now, too – but there’s a gap that we need to try and fill.
“It just feels like we are much further behind now than where we were before.”
MD: For the average person who wants to improve their online security and privacy, do you have go-to practical steps that they can take to protect themselves against this?
TH: I still feel that having a password manager is the number one thing. The prevalence of reuse is nuts. We can trace back so many different incidents, whether it’s the reused password of the individual who’s the victim, it’s the business email compromise situation, or whether it’s a corporate account, which then has the keys to the cloud. So many of these things tie back to compromises of passwords that are just the bare-bones basics.
On top of that, the continual lack of multifactor authentication. That is still a stunning thing, particularly when we’ve got so many different ways of doing it now. We’ve got the emergence of passkeys, which really didn’t exist only a few years ago. That’s great.
We’ve got U2F (Universal 2nd Factor) keys, we’ve got authentication [apps], we have so many different ways of doing this now. It’s still all the same fundamentals, and even as we move forward into passwordless, regardless of which variety that we look at, we are still gathering passwords faster than we’re discarding them. This problem really still keeps growing and growing.
“Even as we move forward into passwordless, we’re still gathering passwords faster than we’re discarding them.”
I remember even 10 years ago, I’d do interviews and people would go: “Are we still going to have passwords in 10 years?” Now when I talk to people, I’m like: “Do you have more passwords now than what you had 10 years ago?” And everyone’s like: “Yeah, because the old ones don’t die. I’ve got more than I’ve ever had before.”
I still feel that this is just the absolute heart, and it’s the low-hanging fruit, too. We have easy solutions to this.
MD: What do you see as the biggest challenge in combating all of this in the next few years? Do you think the industry is prepared for it?
TH: I think so much of it is the usability factor, the human interaction side of things. I’ve found it fascinating over the years when I’ve written about security, to keep coming back to: why do certain things get traction, or why do certain things not get traction?
Well, because they’re consumable, because humans can use them. Or conversely, because humans struggle with them. Why does multifactor have such poor uptake when it is such an effective tool? Because humans don’t like that it gets in the way.
Why do many organizations not force two-factor on their customers? Because a lot of people don’t like it, and it creates a barrier to entry, and they lose customers.
I think we still have this big challenge of how do we make security implicit and acceptable to the masses? I wonder if part of that is that you’ve just got to get them young enough. I was talking about password managers a couple of days ago. I said: “Well, my 12-year-old daughter has been using one for years. She’s in our Family 1Password vault. She logs onto everything with unique passwords and she never thinks twice about it. She’s like, ‘It’s so easy.’”
“We still have this big challenge of how do we make security implicit and acceptable to the masses?”
Maybe we’ve got to get a foothold in the schools and start making security something that kids just grow up with and they don’t think twice about.
MD: Do you think that’s where you can be optimistic about the future of cybersecurity? For example, we have advancements in AI and deep fakes. Do you think it’s literally just a generational problem of awareness?
TH: It’s certainly part of it. I know it’s so cliche to say, but I guess older generations, who haven’t grown up with the technology as a native part of their everyday life, find it harder to get to grips with concepts than kids, who just live with it day in and day out.
I see my kids getting to grips with a lot of technology concepts easier than what I can do, because they’re the digital natives and they’ve just lived with it. I think that’s the opportunity. And then of course, as you’ve mentioned, things like AI are going to evolve very quickly. It will be a very different landscape when my kids are probably even just 10 years older, let alone when they’re my age. So that is a fascinating area, isn’t it?
MD: Where can listeners go to learn more about you and the projects that you’re working on?
TH: Everything’s on TroyHunt.com. All roads lead from there.