Inside Combolists: How Hackers Build Stolen Credential Data

Digital identity is now one of the easiest entry points into an organization, as data breaches often stem from weak, reused, or stolen credentials. Attackers exploit this through credential stuffing, deploying bots to test massive lists of leaked username:password pairs on login pages.

At the heart of these automated attacks lie combolists — vast, curated compilations of stolen email and password pairs traded and constantly updated on the dark web. Designed to boost login success rates, combolists serve as vital intel for Cyber Threat Intelligence (CTI) teams and security leaders.

In this blog, we examine combolists from a threat-intelligence viewpoint: what they are, how attackers create and refresh them, where they circulate, and what password reuse and exposure statistics reveal about the risk.

To understand their impact, let’s start with a closer look at what a combolist actually is and how attackers build them.

What is a combolist in credential stuffing?

A combolist is a curated file containing large volumes of email/username and password pairs, most commonly in user:pass format, designed to be fed directly into automated credential stuffing and account checking tools.
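For defenders, the same user:pass layout makes a recovered sample easy to triage. The snippet below is a minimal sketch of how a CTI analyst might check a sample for exposure of their own domain. The names `Credential`, `parse_combo_line`, and `exposed_for_domain` are hypothetical, and the sketch assumes one colon-separated pair per line.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Credential(NamedTuple):
    user: str
    password: str


def parse_combo_line(line: str) -> Optional[Credential]:
    """Parse one 'user:pass' line; return None if it is malformed."""
    line = line.strip()
    # Split on the first colon only, so passwords containing ':' stay intact.
    user, sep, password = line.partition(":")
    if not sep or not user or not password:
        return None
    return Credential(user, password)


def exposed_for_domain(lines: Iterable[str], domain: str) -> Iterator[Credential]:
    """Yield credentials whose username is an email address at `domain`."""
    suffix = "@" + domain.lower()
    for line in lines:
        cred = parse_combo_line(line)
        if cred and cred.user.lower().endswith(suffix):
            yield cred
```

Splitting on the first colon is a deliberate choice here: email addresses do not contain colons, while passwords sometimes do.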

Unlike simple password dictionaries, combolists contain real stolen credentials tied to real identities, dramatically increasing the chances of successful account takeover when attackers target popular services and enterprise portals.

In the wild, combolists often originate from multiple heterogeneous sources, including breach dumps, infostealer malware logs, and database exposures that are later cleaned and normalized by criminals.

Modern combolists are frequently rebuilt from fresh infostealer data and other recent leaks, and some public research has shown that these “fresh” lists can reach very high validity rates, making them far more dangerous than old, static dumps.

Combolists are widely traded on dark web forums and in closed Telegram channels dedicated to leaked credentials and credential stuffing tools, where attackers promote “fresh” stolen credential datasets to other criminals.

Figure 1: Combolist section in [Creacked.sh](http://creacked.sh/) forum (one of the largest forums for cracking and combo lists).
Figure 2: JoghodTeam as an example of threat actors selling new and fresh combolists on Telegram

These rebuilding cycles lay the groundwork for understanding how combolists are created, updated, and circulated across the dark web ecosystem.

How combolists are created and updated

Stolen credential data rarely exists as a neat, ready‑to‑use combolist; instead, it starts as raw breach dumps, infostealer logs, phishing kit outputs, or misconfigured database exposures leaking massive volumes of login information.

These raw sources are messy: they may mix multiple formats, contain duplicates, partial data, or even fake entries injected by defenders or by other criminals.
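CTI teams that ingest such samples face the same mess. As a rough sketch of the cleanup involved, the function below lowercases email addresses, drops partial entries, and removes duplicates. The `normalize` helper, the accepted separators (`:`, `;`, `|`), and the choice to keep only a SHA-256 digest of each password are all assumptions made for illustration. The digest serves only as a deduplication key so plaintext does not travel further down the pipeline; it is not a secure storage scheme.

```python
import hashlib
import re
from typing import Iterable, List, Tuple

# An email-like username, one separator character, then a non-empty password.
LINE_RE = re.compile(r"^([^:;|\s]+@[^:;|\s]+)[:;|](.+)$")


def normalize(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return unique (email, password_sha256) pairs from messy raw lines."""
    seen = set()
    cleaned = []
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # partial or malformed entry
        email = match.group(1).lower()
        digest = hashlib.sha256(match.group(2).encode("utf-8")).hexdigest()
        key = (email, digest)
        if key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned
```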

Threat actors and “combo makers” aggregate this material, often combining multiple breaches and stealer campaigns into a single pool of credential data before preparing specialized combolists for particular targets or tools.

Over time, this process has become its own underground service economy, with some actors specializing in collecting, cleaning, and refreshing stolen credential data sets that others later weaponize in credential stuffing operations.

Figure 3: An example of threat actors selling new and fresh combolists on Telegram in specific categories, from gaming “High Quality” lists to USA-targeted combolists

To understand these raw inputs better, we can break down the main data sources feeding modern combolists.

Combolist data sources

Public and private data breaches

  • When organizations are breached, usernames and passwords often end up for sale or free download, and criminals later merge many separate leaks into large combolists.
  • Well‑known mega‑collections (like “Collection #1–5” or “Anti‑Public”) were created exactly this way: combining credentials from many historic breaches into a single, more useful list for credential stuffing.

Infostealer malware logs

Infostealer families such as RedLine, Vidar, and Lumma (also known as LummaC2) steal browser‑saved logins, cookies, and autofill data from compromised endpoints, producing logs full of very fresh, actively used credentials.

These logs have become one of the main raw inputs for modern combolists, especially those circulating in underground markets focused on credential stuffing and account takeover.

From ULP Logs to Combolists

Threat actors and data brokers often convert raw stealer outputs into the ULP (URL:User:Pass) format, a structure that explicitly links each credential to a specific site or application. For example:

https://portal.example.com | user@example.com | Passw0rd!

Figure 4: An example ULP record from a ULP .txt file posted on Telegram

Because ULP entries encode the target URL, combolists built from this data tend to show much higher overlap with known stealer campaigns and deliver better hit‑rates than older, generic email:password lists collected from historic breaches.

These infostealer-derived credentials often form the freshest tier of combolist data before being merged with other stolen sources.
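Because each ULP record names its target, defenders can filter a recovered sample down to the hosts they own. The following is a minimal sketch under stated assumptions: `UlpRecord`, `parse_ulp`, and `host_of` are hypothetical names, and only two layouts are handled (pipe-separated, as in the example above, and colon-separated). URLs with explicit ports and other ambiguous lines may be misread or dropped.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class UlpRecord(NamedTuple):
    url: str
    user: str
    password: str


# Colon form: optional scheme, URL without further colons, user, then password.
_COLON_FORM = re.compile(
    r"^((?:[a-z][a-z0-9+.-]*://)?[^:\s]+):([^:\s]+):(.+)$", re.IGNORECASE
)


def parse_ulp(line: str) -> Optional[UlpRecord]:
    """Parse one ULP line in pipe or colon form; return None if unparseable."""
    line = line.strip()
    if "|" in line:
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) == 3 and all(parts):
            return UlpRecord(*parts)
        return None
    match = _COLON_FORM.match(line)
    return UlpRecord(*match.groups()) if match else None


def host_of(record: UlpRecord) -> str:
    """Return the lowercase hostname of the record's URL ('' if none)."""
    url = record.url if "://" in record.url else "//" + record.url
    return (urlsplit(url).hostname or "").lower()
```

A monitoring job could then compare `host_of(record)` against an inventory of corporate login portals and raise an alert on any match.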

Compromised websites and database dumps

Many threat actors deliberately target specific types of web applications so they can control both the geographic origin and business vertical of the credentials they collect in their combolists.

By focusing on websites that primarily serve a particular country/region or industry sector (e.g. streaming services, payment gateways, online gaming platforms, e-commerce, banking, government portals, etc.), attackers can later advertise much more valuable country‑specific or sector‑specific combolists.

These targeted lists are far more attractive to buyers who are looking for high‑value accounts such as banking credentials from a specific country, streaming/gaming accounts in a certain region, fresh e-commerce logins from wealthy markets, or government/corporate email:password pairs.

This targeted harvesting strategy sets the stage for the next phase: automating large-scale data extraction through reconnaissance and SQL injection tools.

How the Process Typically Works

1. Automated Reconnaissance

Attackers use advanced Google/Bing dorks, dorks for other search engines, Masscan, or custom scripts to harvest tens or hundreds of thousands of potentially vulnerable URLs that match their target criteria (language, country code in WHOIS, specific CMS/shopping cart software, etc.).

2. Mass SQL Injection Scanning & Exploitation

They then perform bulk automated SQL injection testing using one of the most infamous and enduring tools in the underground scene: SQLi Dumper (commonly referred to as SQL Dumper).

What is SQLi Dumper?

SQLi Dumper sits at the heart of this automation. It is a Windows‑based, GUI‑driven automation tool specifically built for detecting, exploiting, and extracting data from websites vulnerable to SQL Injection at massive scale.

Its extreme popularity in cybercrime communities has kept it relevant for years.
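Mass SQL injection tooling only works against queries that splice user input into the SQL text. As a defensive illustration, the sketch below uses Python's built-in sqlite3 module with a parameterized query. The `users` table and the `find_user` helper are hypothetical and stand in for whatever data layer a real application uses.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query."""
    # The '?' placeholder sends the value separately from the SQL text,
    # so input such as "' OR '1'='1" is treated as data, not as SQL.
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


# Hypothetical in-memory table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])
```

With this pattern, a classic injection string simply fails to match any row instead of altering the query.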

Figure 6: A threat actor posts a PoC for exploiting multiple databases from different countries

In the current underground ecosystem, SQL injection “dumpers” are no longer just tools for pulling database contents. Many cracked or “free” builds are quietly bundled with infostealers that harvest the operator’s own credentials and environment data.

Threat actors advertise these cracked SQLi utilities in forums and Telegram channels as a way to mass‑extract login tables for new combolists. In practice, however, many of these builds are trojanized — silently deploying infostealers that exfiltrate the operator’s own credentials while pretending to perform SQL dumping.

Figure 7: Cracked SQL Dumper on a hacking forum

This means a threat actor who thinks they are only pulling SQL credential dumps may also be leaking their own data.

As soon as a victim’s credentials are used by an inexperienced attacker, they may already be in another actor’s private list — ready for credential stuffing against banking, SaaS, or corporate SSO targets.

Figure 8: Technical analysis on Triage of the previous cracked SQL Dumper, which contains a stealer

This irony underscores how data theft within the underground is self‑perpetuating: attackers often end up victims of the same ecosystem they help feed.

Figure 9: Building Combolist Flow

How combolists power ATO attacks

Once stolen credentials are collected, cleaned, and assembled into combolists, they become the primary fuel for credential stuffing attacks. At this stage, leaked credentials transition from static data into active attack assets, enabling threat actors to convert simple email:password pairs into fully compromised user accounts at scale.

This phase represents a critical shift: attackers move from passive data possession to real-time account abuse, targeting sectors such as banking, streaming platforms, and online gaming, where account access can be rapidly monetized or abused.

Automation at Scale

Threat actors load combolists into specialized credential stuffing tools, often referred to as account checkers. These tools automate the process of replaying stolen credentials against login endpoints with minimal human involvement.

Commonly used tools include:

  • Sentry MBA
  • OpenBullet
  • Storm
  • Snipr
  • Checkers

These platforms are purpose-built to handle high-volume login attempts efficiently while minimizing detection.

Core Components of Credential Stuffing Operations

Credential stuffing tools typically rely on three essential components:

  1. Combolist (Wordlist): A structured list of stolen username:password combinations.
  2. Site-Specific Configuration (Config): Custom logic that defines how the target website’s authentication flow works, including request formatting and success/failure detection.
  3. Proxy Infrastructure: A rotating pool of proxies or bot infrastructure used to distribute login attempts and evade rate-limiting, IP blocking, and basic bot defenses.

Together, these components enable attackers to test millions of credentials rapidly, identifying valid accounts that can then be exploited for fraud, resale, or further compromise.

Among these tools, OpenBullet’s extensibility and the ecosystem around its configs have made it a de facto standard within many credential stuffing operations.

Configs as Encoded Authentication Logic

In tools such as OpenBullet and SilverBullet, a config is not an exploit or a vulnerability by itself.

It is a machine-readable recipe that precisely describes how a particular website’s login/authentication flow works, so the tool can replay that exact sequence automatically, at high speed and large scale.

Think of it like a very detailed instruction manual that tells the software: “Do these HTTP requests in exactly this order, with these headers, this formatting, and these extracted values, and this is how you’ll know whether the credentials worked.”

How this logic is built in practice (examples drawn from real SilverBullet / OpenBullet config makers):

Before writing a single line of code, the attacker must understand how the target communicates.

  • Traffic analysis & payload mapping: Capture the real login POST request in the browser’s Network tab, then replace the actual username and password with the tool’s standard variables, for example login=<USER>&password=<PASS>&remember_me=on (a classic simple example).

Figure 10: The attacker navigates to the login page (https://target.com/login) and opens the browser’s Developer Tools

Figure 11: The attacker logs in as a normal user and captures a successful login

This is the critical step where a manual login is converted into an automated weapon.
The attacker creates a REQUEST block and replaces the specific dummy data with the tool’s global variables.

Figure 12: The attacker copies login data from the browser into the config maker

The attacker runs the tool’s Debugger with invalid data to trigger a rejection.
They inspect the HTML source code returned by the server to find a unique error string.

  • Identified String: “The username or password you entered was not found in our database.”

The attacker tests credentials to study how the server responds in each case.

Figure 13: Debugging section in the cracking tool

After this, the attacker must identify success (the redirect).

For a successful login, the server’s behavior changes.

The guide notes that instead of returning the login page HTML, the server redirects the user to the dashboard.

  • Trigger: The URL changes to https://target.com/dashboard.

The attacker configures a Success Keychain. Crucially, they verify the <ADDRESS> (the URL bar) rather than the source code. If the URL contains “dashboard,” the account is flagged as a “Hit” and saved.

High-Value Targets Don’t Give In That Easily

A basic HTTP POST config fails against most modern, high-value targets. Attackers must escalate to what underground communities call “Advanced Logic”: dynamic handling for the defensive layers that sit in front of the login flow, such as the two described below.

The CSRF Problem. Many login pages embed a hidden, randomly generated CSRF token that changes with every page load, making static credential replays impossible. To bypass this, attackers insert a PARSE block before the login request. Using Left/Right (LR) string logic, the tool scrapes the live HTML response to extract the token in real time — saving it to a variable (e.g., <csrf>) and injecting it dynamically into the POST body, so each attempt looks like a fresh, legitimate session.

The JSON Problem. SPAs and mobile backends often skip HTML forms entirely, authenticating via JSON APIs and returning an Authorization Bearer Token (access_token) instead of a session cookie. Attackers re-engineer their configs with JSON parsing blocks to extract this token, then inject it as an Authorization: Bearer <tk> header in all subsequent requests — enabling them to query account data such as balances, subscription tiers, or stored payment methods to assess resale value.

How Organizations Defend Against Credential Stuffing

Understanding how attacks are constructed makes the defensive countermeasures intuitive. Each layer of the attack chain has a corresponding control that breaks it.

1. Multi-Factor Authentication (MFA)

MFA is the single most effective control against credential stuffing. Even if an attacker obtains a valid username and password from a combolist, a second factor — TOTP code, push notification, or hardware key — makes the stolen credential worthless. Organizations should enforce MFA on all externally facing portals, especially SSO, email, and financial systems.
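To make the TOTP option concrete, here is a minimal sketch of server-side code generation and verification following RFC 6238, using only the Python standard library. The function names and the one-step drift window are illustrative assumptions; a production system would typically rely on a maintained library and add replay protection and attempt throttling.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for time `at` (default: now)."""
    counter = int((time.time() if at is None else at) // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                window: int = 1, step: int = 30) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )
```

Even when a combolist entry is valid, the login still fails unless the attacker can also produce the current code derived from a secret that never appears in the leak.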

2. Credential Breach Monitoring

Integrating with breach intelligence feeds (such as the Have I Been Pwned API or commercial CTI platforms) allows organizations to proactively identify when employee or customer credentials appear in combolists. Accounts matching known-breached credentials can be force-reset before attackers ever attempt to use them.
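One way to wire such a check into a password-change or login flow is the Pwned Passwords range endpoint, which uses k-anonymity: only the first five characters of the password's SHA-1 hash leave your system. The sketch below assumes that endpoint is the feed in use; the helper names are hypothetical, and error handling, caching, and request padding are omitted.

```python
import hashlib
import urllib.request
from typing import Tuple

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> Tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Find `suffix` in a 'SUFFIX:COUNT' response body; return 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range endpoint; only the hash prefix is sent over the network."""
    prefix, suffix = split_hash(password)
    with urllib.request.urlopen(RANGE_URL + prefix, timeout=10) as resp:
        return count_in_range(suffix, resp.read().decode("utf-8"))
```

A non-zero count is a reasonable trigger for a forced reset or a step-up challenge, depending on policy.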

3. Bot Management and Behavioral Detection

Dedicated bot management platforms (Akamai, PerimeterX/HUMAN, Cloudflare, DataDome) analyze behavioral biometrics, device fingerprints, and request patterns to distinguish automated stuffing traffic from real users. These are the systems that force attackers to invest in expensive solver infrastructure, so deployment alone dramatically raises the cost and complexity of attacking your platform.

4. Rate Limiting and IP Reputation

Applying aggressive rate limits on login endpoints — throttling by IP, device fingerprint, and user agent — disrupts the high-volume nature of credential stuffing. Combining this with IP reputation feeds that block known proxy ranges, Tor exit nodes, and datacenter ASNs significantly reduces attacker throughput, even when proxies are in use.
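As an illustration of per-key throttling, the sketch below implements a simple in-memory sliding-window limiter. The class name, the default thresholds, and the injectable `clock` are assumptions for this example; a real deployment would usually keep counters in a shared store and key on several signals at once (IP, device fingerprint, user agent).

```python
import time
from collections import defaultdict, deque
from typing import Callable, Hashable


class SlidingWindowLimiter:
    """Allow at most `max_attempts` login attempts per key within a time window."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self.clock = clock
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key: Hashable) -> bool:
        now = self.clock()
        attempts = self._attempts[key]
        # Drop timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= self.window:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # throttle: too many recent attempts for this key
        attempts.append(now)
        return True
```

Calling `limiter.allow(client_ip)` before processing each login request is enough to cap how quickly any single address can work through a list.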

5. CAPTCHA and Device Fingerprinting

Deploying CAPTCHA challenges on repeated failed login attempts adds friction that automated tools must pay to bypass (via CAPTCHA-solving services). Device fingerprinting complements this by flagging sessions where browser characteristics are inconsistent, absent, or reused across many accounts — a strong signal of automated activity.
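The "reused across many accounts" signal can be approximated with very little state. The sketch below tracks which accounts each device fingerprint has failed to log in to and signals when a CAPTCHA should be required. `FingerprintTracker` and its threshold are hypothetical, and how the fingerprint itself is computed is outside the scope of this example.

```python
from collections import defaultdict


class FingerprintTracker:
    """Flag device fingerprints that fail logins against many distinct accounts."""

    def __init__(self, max_accounts: int = 3):
        self.max_accounts = max_accounts
        self._accounts = defaultdict(set)  # fingerprint -> usernames attempted

    def record_failure(self, fingerprint: str, username: str) -> bool:
        """Record a failed login; return True if a CAPTCHA should be required."""
        self._accounts[fingerprint].add(username.lower())
        return len(self._accounts[fingerprint]) > self.max_accounts
```

A real user mistyping a password touches one account; a checker replaying a combolist from the same device profile touches many, which is what the threshold is meant to catch.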

Conclusion

Combolists represent one of the most persistent and scalable threats in the modern credential abuse landscape. Their continuous refresh cycle — fed by infostealers, breach dumps, and targeted database compromises — means that even organizations with strong perimeter defenses remain at risk if their users have been compromised elsewhere.
For CTI teams and security leaders, monitoring underground combolist activity, correlating infostealer telemetry, and deploying credential breach detection into authentication pipelines are now essential defensive capabilities — not optional enhancements.
The most dangerous combolists are not the oldest ones — they are the freshest. Prioritizing detection of recently exfiltrated credentials is the most effective way to reduce account takeover risk.
