
4.36 | advanced | 5 min read

DataDome, PerimeterX, Akamai, Kasada: Survey

A practical tour of the bot-management vendors you'll encounter besides Cloudflare: what each is known for and how scrapers approach them.

What you’ll learn

  • Identify the major anti-bot vendors and their tells.
  • Match strategy to vendor characteristics.
  • Recognize signals that tell you which vendor is in front of you.

Cloudflare dominates conversations, but the bot-management market is much bigger. Each vendor emphasizes different fingerprinting layers, leaves different telltale signs, and poses a different level of difficulty. Recognizing which one you're facing shapes your approach.

The big six (besides Cloudflare)

| Vendor | Owner / acquired by | Known for |
| --- | --- | --- |
| DataDome | Independent | Mid-tier; clear deny page with captcha widget |
| PerimeterX | HUMAN (2022) | Behavioral analysis; mobile-heavy |
| Akamai Bot Manager | Akamai | HTTP/2 fingerprinting; airlines, banking |
| Kasada | Independent | Active JS challenge; "DDoSes back" reputation |
| F5 Shape | F5 | Enterprise; legacy installs |
| Imperva (Distil → Imperva) | Imperva | WAF + bot |

Smaller / newer players: Castle, Arkose Labs (FunCaptcha), Cybersource (3DS-adjacent).

DataDome

Telltale: a specific "denied" page with logos/links that often include "ddengine.io" assets, or a popup CAPTCHA from geo.captcha-delivery.com.

Strategy:

  • Strong on JA3 (TLS fingerprinting) plus behavioral analysis. Plain Python requests almost always fails.
  • curl-cffi sometimes works for less-aggressive setups.
  • Browser + stealth + residential is the standard counter.
  • DataDome falls back to a CAPTCHA when it is unsure; have a solver wired up if you commit to bypassing.

DataDome is common on e-commerce and news sites. Mid-difficulty overall.

PerimeterX (now HUMAN)

Telltale: requests are blocked with a generic "Access Denied" page; cookies named _px* (_pxhd, _pxvid). The endpoint /init.js is a classic PerimeterX bootstrap.

Strategy:

  • Behavioral focus. They score based on mouse movement, scroll, timing.
  • Browser + stealth + behavioral simulation (slow human-like actions).
  • Strong on mobile-app endpoints. PerimeterX is often layered into native apps too.

PerimeterX is on many large consumer brands. High difficulty; behavioral mimicry matters more than fingerprinting.
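To make "slow human-like actions" concrete, here is a minimal sketch of generating a curved, eased pointer path with small pauses between points. The function name human_path, the jitter and pause ranges, and the Bezier control-point offset are all illustrative assumptions, not values taken from any vendor's scoring model.

```python
import math
import random


def human_path(start, end, steps=25, jitter=2.0, seed=None):
    """Return (x, y, pause_seconds) tuples along a curved path from start to end.

    Hypothetical helper: the ranges below are guesses chosen for illustration.
    """
    rng = random.Random(seed)
    (x0, y0), (x1, y1) = start, end
    # A control point offset from the midpoint gives the path a slight arc.
    cx = (x0 + x1) / 2 + rng.uniform(-80, 80)
    cy = (y0 + y1) / 2 + rng.uniform(-80, 80)
    points = []
    for i in range(steps + 1):
        t = i / steps
        # Cosine easing: movement starts and ends slowly.
        e = (1 - math.cos(math.pi * t)) / 2
        # Quadratic Bezier interpolation between start, control point, and end.
        x = (1 - e) ** 2 * x0 + 2 * (1 - e) * e * cx + e ** 2 * x1
        y = (1 - e) ** 2 * y0 + 2 * (1 - e) * e * cy + e ** 2 * y1
        # Jitter only the intermediate points so the endpoints stay exact.
        if 0 < i < steps:
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        pause = rng.uniform(0.008, 0.03)
        points.append((x, y, pause))
    return points
```

Each tuple could be fed to a browser automation call such as a mouse-move followed by a sleep of pause seconds. Whether this is enough depends on the target; treat it as a starting point rather than a proven bypass.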

Akamai Bot Manager

Telltale: requests are blocked with reference numbers like Reference #18.x.x.x. Often paired with Akamai's CDN (akamaihd.net).

Strategy:

  • Heavy on HTTP/2 fingerprinting (Akamai's published research is partly the reason this layer matters).
  • TLS fingerprinting matters too.
  • Without curl-cffi or a real browser, the success rate is near zero.
  • Airline and banking sites often require premium residential or mobile IPs.

Akamai protects airlines, banks, and government sites. Highest difficulty among the surveyed vendors.
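For intuition about what an HTTP/2 fingerprint is, the sketch below builds a fingerprint string in the layout commonly attributed to Akamai's passive-fingerprinting research: SETTINGS values, the WINDOW_UPDATE increment, PRIORITY frames, and pseudo-header order, joined with "|". The function name and every numeric value here are illustrative assumptions, not measurements of any real client.

```python
def h2_fingerprint(settings, window_update, priority_frames, pseudo_header_order):
    """Build an Akamai-style HTTP/2 fingerprint string: SETTINGS|WINDOW_UPDATE|PRIORITY|PSEUDO.

    settings: list of (setting_id, value) pairs in the order the client sent them.
    priority_frames: list of preformatted strings, or empty if none were sent.
    pseudo_header_order: first letters of :method, :authority, :scheme, :path.
    """
    s = ";".join(f"{key}:{value}" for key, value in settings)
    p = ",".join(priority_frames) if priority_frames else "0"
    ps = ",".join(pseudo_header_order)
    return f"{s}|{window_update}|{p}|{ps}"


# Illustrative values only: a browser-like client and a hypothetical HTTP library.
browser_like = h2_fingerprint(
    [(1, 65536), (2, 0), (4, 6291456), (6, 262144)], 15663105, [], ["m", "a", "s", "p"]
)
library_like = h2_fingerprint(
    [(2, 0), (4, 65535)], 16777216, [], ["m", "s", "a", "p"]
)
```

The two strings differ even though both clients could send identical headers, which is why header spoofing alone does not help at this layer.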

Kasada

Telltale: pages stall on a JS challenge for ~1 second, then resolve (or block). They use proof-of-work (POW): your CPU has to compute something before being allowed in.
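To show what "your CPU has to compute something" means in general, here is a generic hashcash-style proof-of-work sketch. This is not Kasada's actual scheme, which is not described in this lesson; the challenge string, the SHA-256 choice, and the difficulty parameter are assumptions picked for illustration.

```python
import hashlib
import itertools


def _hash_value(challenge: str, nonce: int) -> int:
    """Interpret SHA-256(challenge:nonce) as a 256-bit integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Brute-force the smallest nonce whose hash has difficulty_bits leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        if _hash_value(challenge, nonce) < target:
            return nonce


def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs one hash, while solving costs about 2**difficulty_bits."""
    return _hash_value(challenge, nonce) < (1 << (256 - difficulty_bits))
```

The asymmetry is the point: the server verifies cheaply, while each client request pays a CPU cost that adds up quickly at scraping volume.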

Strategy:

  • Real browser required: the POW must execute.
  • Even browsers can be flagged if behavior is bot-like.
  • Reputation: aggressive; said to be one of the harder vendors.

Kasada is on niche-but-high-value targets (some ticketing, some e-commerce). Less prevalent but tough.

F5 Shape

Telltale: legacy header X-D-Token or similar, often paired with mid-2010s tech stacks.

Strategy:

  • Aging installs; depends heavily on which version is deployed.
  • Browser + stealth usually works.
  • Less ML-heavy than newer vendors.

F5 Shape is on enterprise legacy installs. Variable difficulty.

Imperva

Telltale: a visid_incap_* cookie. The challenge page may show "Incapsula" in source.

Strategy:

  • WAF + bot. Often pairs with rate limiting.
  • Browser + stealth + good IPs usually sufficient.
  • Sophisticated installs approach Akamai-class difficulty.

Imperva runs on a wide range of sites; mid-to-hard.

How to identify the vendor

The cheapest heuristics:

  1. Response page contents. Distinctive blocked-page text or branding.
  2. Cookie names. _px* = PerimeterX/HUMAN. cf_clearance or __cf_bm = Cloudflare. visid_incap_* = Imperva. datadome = DataDome.
  3. JS endpoint names. Bootstrap scripts often reveal the vendor.
  4. Reference numbers. Akamai's "Reference #" is distinctive.
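These heuristics can be sketched as a small matcher over cookie names and response body text. The patterns below come from the telltales listed in this lesson; the function name guess_vendor and the vote-counting approach are illustrative assumptions, and real deployments may use other markers.

```python
import re
from collections import Counter

# Cookie-name patterns from the telltales in this lesson.
COOKIE_PATTERNS = [
    (re.compile(r"^_px"), "PerimeterX/HUMAN"),
    (re.compile(r"^(__)?cf_"), "Cloudflare"),
    (re.compile(r"^visid_incap_"), "Imperva"),
    (re.compile(r"^datadome$"), "DataDome"),
]

# Blocked-page markers from the telltales in this lesson.
BODY_PATTERNS = [
    (re.compile(r"Reference\s+#\d+\.[0-9a-f]+", re.I), "Akamai"),
    (re.compile(r"captcha-delivery\.com", re.I), "DataDome"),
    (re.compile(r"incapsula", re.I), "Imperva"),
]


def guess_vendor(cookie_names, body=""):
    """Return candidate vendors, most-matched first; an empty list means no match."""
    votes = Counter()
    for name in cookie_names:
        for pattern, vendor in COOKIE_PATTERNS:
            if pattern.search(name):
                votes[vendor] += 1
    for pattern, vendor in BODY_PATTERNS:
        if pattern.search(body):
            votes[vendor] += 1
    return [vendor for vendor, _ in votes.most_common()]
```

For example, guess_vendor(["_pxvid", "_pxhd"]) returns ["PerimeterX/HUMAN"]. A site can sit behind more than one product, so the function returns a ranked list rather than a single answer.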

A purpose-built tool: wafw00f (open source) sniffs for vendor signatures.

wafw00f https://target.com/

It reports the most likely WAF/bot vendor based on multiple probes.

Universal strategy

Regardless of vendor, the playbook structure is similar:

  1. Coherent headers. Always.
  2. Residential or mobile IPs. Per target tier.
  3. TLS fingerprint. curl-cffi or browser.
  4. JS execution. Browser + stealth where required.
  5. Behavioral mimicry. For high-value targets.
  6. CAPTCHA solver. Last line for explicit challenges.
  7. Commercial unblocker. Time/cost tradeoff for hardest tier.

What changes by vendor is the layer at which detection fires hardest. DataDome's JA3 emphasis means TLS spoofing pays off most. PerimeterX's behavioral focus means slow, natural movement matters. Akamai's HTTP/2 sensitivity means curl-cffi-level coverage is mandatory.
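The playbook amounts to an escalation ladder: try the cheapest approach first and move up only when blocked. A minimal sketch follows, assuming each rung is a callable that takes a URL and returns (status, body). The names escalate and looks_blocked, and the block heuristic itself, are hypothetical and deliberately crude.

```python
from typing import Callable, List, Optional, Tuple

Fetcher = Callable[[str], Tuple[int, str]]


def looks_blocked(status: int, body: str) -> bool:
    """Rough heuristic (an assumption): typical block statuses or a CAPTCHA in the page."""
    return status in {403, 429, 503} or "captcha" in body.lower()


def escalate(url: str, ladder: List[Tuple[str, Fetcher]]) -> Optional[Tuple[str, str]]:
    """Try each (name, fetcher) rung in order; return the first unblocked (name, body)."""
    for name, fetch in ladder:
        try:
            status, body = fetch(url)
        except Exception:
            # Broad catch keeps the sketch short; real code would catch specific errors.
            continue
        if not looks_blocked(status, body):
            return name, body
    return None  # every rung was blocked; time to consider a commercial unblocker
```

In practice the ladder might run plain requests, then curl-cffi, then a stealth browser, with each rung wrapped in a function of the same shape. Recording which rung succeeded tells you the cheapest approach that works for that target.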

Per-vendor difficulty estimate (your mileage varies)

| Vendor | "Reasonable bypass possible with off-the-shelf tools" |
| --- | --- |
| Cloudflare BFM | Yes |
| Cloudflare SBFM | Mostly |
| Cloudflare Bot Mgmt | Sometimes |
| DataDome | Mostly |
| Imperva | Mostly |
| F5 Shape | Sometimes |
| PerimeterX (HUMAN) | Sometimes |
| Akamai Bot Manager | Sometimes; sometimes requires premium |
| Kasada | Hard |

For all "sometimes" and "hard" cells, commercial unblockers are often more cost-effective than DIY engineering.

A reconnaissance routine

When approaching a new target:

  1. Run wafw00f https://target.com/ to get a vendor guess.
  2. Run curl -v https://target.com/ to see the status, headers, and cookies.
  3. Open the target in a fresh browser → DevTools → Network. Note JS bootstrap files, response headers.
  4. Try the simplest possible scrape (requests). If it works, you're done. If not, escalate.
  5. Document the vendor, the level, and what works in your project notes.

This reconnaissance routine saves weeks of "throw tools at it" frustration.
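The curl step of the routine can also be scripted with the standard library, so the findings land in your notes in a consistent shape. This is a minimal sketch: the probe and cookie_names helpers and the User-Agent string are hypothetical, and https://target.com/ is the same placeholder used above.

```python
import urllib.error
import urllib.request


def cookie_names(set_cookie_values):
    """Extract just the cookie names from raw Set-Cookie header values."""
    return [value.split("=", 1)[0].strip() for value in set_cookie_values if "=" in value]


def probe(url, timeout=10):
    """Fetch url once and summarize status, Server header, and cookie names."""
    request = urllib.request.Request(url, headers={"User-Agent": "recon-probe/0.1"})
    try:
        response = urllib.request.urlopen(request, timeout=timeout)
        status = response.status
    except urllib.error.HTTPError as err:
        # Blocks usually arrive as 403 or 429; the error object still carries headers.
        response, status = err, err.code
    try:
        return {
            "url": url,
            "status": status,
            "server": response.headers.get("Server"),
            "cookies": cookie_names(response.headers.get_all("Set-Cookie") or []),
        }
    finally:
        response.close()
```

Calling probe("https://target.com/") returns a dictionary you can paste into project notes. A bare standard-library client is itself an easy thing to block, which makes it a reasonable stand-in for the "simplest possible scrape" step.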

Hands-on lab

Pick three sites you've considered scraping. For each:

  1. Run wafw00f to guess the vendor.
  2. Hit with default Python requests. Note the response.
  3. Examine the cookies set. Match to the vendor table above.

Each target now has a strategy attached. The right tool for each is much clearer when you know which vendor you're up against.
