Scanning for breached accounts with k-Anonymity - Mozilla Security Blog
Categories: Privacy Security

Scanning for breached accounts with k-Anonymity

The new Firefox Monitor service will use anonymized range query API endpoints from Have I Been Pwned (HIBP). This new Firefox feature allows users to check for compromised online accounts while preserving their privacy.

An API request reveals sensitive data about the requesting party.

An API request can reveal subject identifiers like cookies, IP address, etc.

Anonymizing Account Identifiers

Operations like ‘search’ often need plaintext, or simply-hashed data. But, as Cloudflare has described in their own HIBP integration, searching with plain account data introduces privacy & security risks that allow an adversary, or even the service itself, to use the data to breach the searched account.

As an alternative, a user search client could download an entire set of data. Unfortunately this practice discloses all the service data to the client, which could abuse the data of all other users.

Anonymized Data Sharing

To mitigate these risks, Mozilla is working with Troy Hunt – creator and maintainer of HIBP – to use new hash range query API endpoints for breached account data in the Firefox Monitor project.

Hash range queries add k-Anonymity to the data that Mozilla exchanges with HIBP. Data with k-Anonymity protects individuals who are the subjects of the data from re-identification while preserving the utility of the data.

When a user submits their email address to Firefox Monitor, it hashes the plaintext value and sends the first 6 characters to the HIBP API. For example, the value “test@example.com” hashes to 567159d622ffbb50b11b0efd307be358624a26ee. We send this hash prefix to the API endpoint:

GET https://haveibeenpwned.com/api/breachedaccount/range/567159

The API responds with many suffixes and the list of breaches that include the full value:

[
  {
    "HashSuffix": "D622FFBB50B11B0EFD307BE358624A26EE",
    "Websites": [
      "LinkedIn"
    ]
  },
  {
    "HashSuffix": "0000000000000000000000000000000000",
    "Websites": [
      "Dropbox"
    ]
  },
  {
    "HashSuffix": "1111111111111111111111111111111111",
    "Websites": [
      "Adobe",
      "Plex"
    ]
  }
]

When Firefox Monitor receives this response, it loops thru the objects to find which (if any) prefix and breached account HashSuffix equals the the user-submitted hash value. The following pseudo code describes the algorithm in more detail:

if (fullUserHash === userHashPrefix + breachedAccount.HashSuffix)

Using the running example from above, for the first HashSuffix, the expression evaluates to:

if (‘567159D622FFBB50B11B0EFD307BE358624A26EE’ ===
    ‘567159’+‘D622FFBB50B11B0EFD307BE358624A26EE’)

Firefox Monitor discovers that “test@example.com” appears in the LinkedIn breach, but does not disclose plaintext or even hashes of sensitive user data. Further, HIBP does not disclose its entire set of hashes, which allows Firefox users to maintain their privacy, and protects breached users from further exposure.

Brute Force Attacks

Hashed data is still vulnerable to brute-force attacks. An adversary could still loop thru a dictionary of email addresses to find the plaintext of all the range query results. To reduce this attack surface, Firefox Monitor does not store the range queries nor any results in its database. Instead, it caches a user’s results in an encrypted client session. We also monitor our scan endpoint to prevent abuse by an adversary attempting a brute force breached-account enumeration attack against our service.

Helping Subjects of Data Breaches

HIBP contains billions of records of email addresses. Troy has done an outstanding job to raise awareness and educate users about breaches globally. Breached sites embrace HIBP, even self-submitting their breached data. HIBP is there to help victims of data breaches after things go wrong, and Firefox Monitor is extending that help to more people.