DataDome, PerimeterX, Akamai, Kasada: A Survey
A practical tour of the bot-management vendors you'll encounter besides Cloudflare: what each is known for and how scrapers approach them.
What you’ll learn
- Identify the major anti-bot vendors and their tells.
- Match strategy to vendor characteristics.
- Recognize signals that tell you which vendor is in front of you.
Cloudflare dominates conversations, but the bot-management vendor market is bigger. Each vendor has different fingerprinting emphasis, different telltale signs, and different difficulty levels. Recognizing which you're facing shapes your approach.
The big six (besides Cloudflare)
| Vendor | Owner | Known for |
|---|---|---|
| DataDome | Independent | Mid-tier; clear deny page with captcha widget |
| PerimeterX | HUMAN (2022) | Behavioral analysis; mobile-heavy |
| Akamai Bot Manager | Akamai | HTTP/2 fingerprinting; airlines, banking |
| Kasada | Independent | Active JS challenge; "DDoSes back" reputation |
| F5 Shape | F5 | Enterprise; legacy installs |
| Imperva (Distil → Imperva) | Imperva | WAF + bot |
Smaller / newer players: Castle, Arkose Labs (FunCaptcha), Cybersource (3DS-adjacent).
DataDome
Telltale: a specific "denied" page with logos/links that often include "ddengine.io" assets, or a popup CAPTCHA from geo.captcha-delivery.com.
Strategy:
- Strong on JA3 + behavioral analysis. Plain Python requests fail almost always.
- curl-cffi sometimes works for less-aggressive setups.
- Browser + stealth + residential is the standard counter.
- DataDome's own anti-bot includes CAPTCHA fallback; have a solver wired up if you commit to bypassing.
DataDome is on many e-commerce sites and news sites. Mid-difficulty overall.
PerimeterX (now HUMAN)
Telltale: requests are blocked with a generic "Access Denied" page; cookies named _px* (_pxhd, _pxvid). The endpoint /init.js is a classic PerimeterX bootstrap.
Strategy:
- Behavioral focus. They score based on mouse movement, scroll, timing.
- Browser + stealth + behavioral simulation (slow human-like actions).
- Strong on mobile-app endpoints: PerimeterX is often layered into native apps too.
PerimeterX is on many large consumer brands. High difficulty; behavioral mimicry matters more than fingerprint.
Akamai Bot Manager
Telltale: requests are blocked with reference numbers like Reference #18.x.x.x. Often paired with Akamai's CDN (akamaihd.net).
Strategy:
- Heavy on HTTP/2 fingerprinting (Akamai's published research is partly the reason this layer matters).
- TLS fingerprinting also.
- Without curl-cffi or a real browser, success rate is near-zero.
- For airline / banking sites, often requires premium residential or mobile.
Akamai is on airlines, banks, governments. Highest difficulty among the surveyed vendors.
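The reference-number telltale is easy to check for in a response body. The following is a minimal sketch, assuming the ID after "Reference #" is a run of dot-separated hex-like segments (the text above only shows the shape 18.x.x.x) and that some pages HTML-encode the space and the "#". The function name find_akamai_reference is illustrative.

```python
import html
import re
from typing import Optional

# Assumption: the ID is at least three dot-separated hex-like segments,
# matching the "Reference #18.x.x.x" shape. Real block pages may differ.
AKAMAI_REF = re.compile(r"Reference\s*#\s*((?:[0-9a-fA-F]+\.){2,}[0-9a-fA-F]+)")


def find_akamai_reference(body: str) -> Optional[str]:
    """Return the reference ID if the body looks like an Akamai block page."""
    # Decode entities first so "&#35;" is treated the same as a literal "#".
    match = AKAMAI_REF.search(html.unescape(body))
    return match.group(1) if match else None
```

A hit is only a hint that Akamai is in the path; pair it with the other signals before committing to a strategy.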
Kasada
Telltale: pages stall on a JS challenge for ~1 second, then resolve (or block). Kasada uses POW (proof-of-work): your CPU has to compute something before you are allowed in.
Strategy:
- Real browser required: the POW must execute.
- Even browsers can be flagged if behavior is bot-like.
- Reputation: aggressive; said to be one of the harder vendors.
Kasada is on niche-but-high-value targets (some ticketing, some e-commerce). Less prevalent but tough.
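To make the proof-of-work idea concrete, here is a generic hashcash-style sketch. It is not Kasada's actual challenge, which runs as JavaScript in the page and is not described here; the challenge string, the SHA-256 choice, and the difficulty value are all assumptions picked for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce whose SHA-256 digest has `difficulty` leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Checking costs one hash; solving costs roughly 2**difficulty hashes."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    nonce = solve("demo-challenge", 16)  # hypothetical challenge string
    print(nonce, verify("demo-challenge", nonce, 16))
```

The asymmetry is the point: the server verifies cheaply while each client request pays a CPU cost, which is why a client that cannot execute the challenge never gets past it.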
F5 Shape
Telltale: legacy header X-D-Token or similar, often paired with mid-2010s tech stacks.
Strategy:
- Aging installs; depends heavily on which version is deployed.
- Browser + stealth usually works.
- Less ML-heavy than newer vendors.
F5 Shape is on enterprise legacy installs. Variable difficulty.
Imperva
Telltale: a visid_incap_* cookie. The challenge page may show "Incapsula" in source.
Strategy:
- WAF + bot. Often pairs with rate limiting.
- Browser + stealth + good IPs usually sufficient.
- For sophisticated installs, similar to Akamai-class difficulty.
Imperva runs on a wide range of sites; mid-to-hard.
How to identify the vendor
The cheapest heuristics:
- Response page contents. Distinctive blocked-page text or branding.
- Cookie names. _px* = PerimeterX/HUMAN. cf_* = Cloudflare. visid_incap_* = Imperva. datadome = DataDome.
- JS endpoint names. Bootstrap scripts often reveal the vendor.
- Reference numbers. Akamai's "Reference #" is distinctive.
A purpose-built tool: wafw00f (open source) sniffs for vendor signatures.
wafw00f https://target.com/
Outputs the most likely WAF/bot vendor based on multiple probes.
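The cookie and page-content heuristics above can also be scripted. This is a rough sketch, assuming you already have the cookie names and the response body from a single request; the pattern tables contain only the signals named in this lesson, and guess_vendors is a hypothetical helper, not part of wafw00f.

```python
from fnmatch import fnmatchcase

# Cookie-name patterns taken from the list above.
COOKIE_PATTERNS = {
    "_px*": "PerimeterX/HUMAN",
    "cf_*": "Cloudflare",
    "visid_incap_*": "Imperva",
    "datadome": "DataDome",
}

# Substrings from the vendor sections; treat a match as a hint, not proof.
BODY_MARKERS = {
    "Incapsula": "Imperva",
    "captcha-delivery.com": "DataDome",
    "Reference #": "Akamai Bot Manager",
}


def guess_vendors(cookie_names, body=""):
    """Return a sorted list of vendors whose signals appear in the response."""
    hits = set()
    for name in cookie_names:
        for pattern, vendor in COOKIE_PATTERNS.items():
            if fnmatchcase(name, pattern):
                hits.add(vendor)
    for marker, vendor in BODY_MARKERS.items():
        if marker in body:
            hits.add(vendor)
    return sorted(hits)


if __name__ == "__main__":
    print(guess_vendors(["_pxhd", "_pxvid"]))
    print(guess_vendors(["visid_incap_12345"], "Incapsula incident ID"))
```

More than one vendor in the result is plausible, since sites sometimes stack a CDN-level product with a separate bot manager.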
Universal strategy
Regardless of vendor, the playbook structure is similar:
- Coherent headers. Always.
- Residential or mobile IPs. Per target tier.
- TLS fingerprint. curl-cffi or browser.
- JS execution. Browser + stealth where required.
- Behavioral mimicry. For high-value targets.
- CAPTCHA solver. Last line for explicit challenges.
- Commercial unblocker. Time/cost tradeoff for hardest tier.
What changes by vendor is the layer at which detection fires hardest. DataDome's JA3 emphasis means TLS spoofing pays off most. PerimeterX's behavioral focus means slow-and-natural movement matters. Akamai's HTTP/2 sensitivity means curl-cffi-level coverage is mandatory.
Per-vendor difficulty estimate (your mileage varies)
| Vendor | "Reasonable bypass possible with off-the-shelf tools" |
|---|---|
| Cloudflare BFM (Bot Fight Mode) | Yes |
| Cloudflare SBFM (Super Bot Fight Mode) | Mostly |
| Cloudflare Bot Mgmt | Sometimes |
| DataDome | Mostly |
| Imperva | Mostly |
| F5 Shape | Sometimes |
| PerimeterX (HUMAN) | Sometimes |
| Akamai Bot Manager | Sometimes; sometimes requires premium |
| Kasada | Hard |
For all "sometimes" and "hard" cells, commercial unblockers are often more cost-effective than DIY engineering.
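The rule of thumb above can be encoded as a small lookup. This is a sketch only: the cell values are copied from the table, while the function name and the idea of treating the table as a hard threshold are assumptions made for illustration.

```python
# Cells copied from the difficulty table above (BFM/SBFM are the Cloudflare tiers).
DIFFICULTY = {
    "Cloudflare BFM": "Yes",
    "Cloudflare SBFM": "Mostly",
    "Cloudflare Bot Mgmt": "Sometimes",
    "DataDome": "Mostly",
    "Imperva": "Mostly",
    "F5 Shape": "Sometimes",
    "PerimeterX (HUMAN)": "Sometimes",
    "Akamai Bot Manager": "Sometimes; sometimes requires premium",
    "Kasada": "Hard",
}


def consider_unblocker(vendor: str) -> bool:
    """True when the table cell starts with "Sometimes" or "Hard"."""
    cell = DIFFICULTY[vendor].lower()
    return cell.startswith(("sometimes", "hard"))


if __name__ == "__main__":
    for vendor in DIFFICULTY:
        print(vendor, "->", "unblocker" if consider_unblocker(vendor) else "DIY")
```

In practice the decision also depends on volume and budget, so treat the boolean as a prompt to price out both options rather than a verdict.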
A reconnaissance routine
When approaching a new target:
- wafw00f https://target.com/ for a vendor guess.
- curl -v https://target.com/ to see what status, what headers, what cookies.
- Open the target in a fresh browser → DevTools → Network. Note JS bootstrap files, response headers.
- Try the simplest possible scrape (requests). If it works, you're done. If not, escalate.
- Document the vendor, the level, and what works in your project notes.
This reconnaissance routine saves weeks of "throw tools at it" frustration.
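The "try the simplest thing, then escalate, then write it down" part of the routine can be sketched as below. Everything here is hypothetical scaffolding: each tier is a callable you supply that returns (status, body), the looks_blocked rule is a crude assumption, and ReconNote is just one possible shape for project notes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional, Tuple

Fetch = Callable[[], Tuple[int, str]]  # returns (HTTP status, response body)


@dataclass
class ReconNote:
    target: str
    vendor_guess: str
    working_tier: Optional[str]  # None means every tier was blocked


def looks_blocked(status: int, body: str) -> bool:
    # Crude assumption: 403/429 or a deny-page phrase means we were blocked.
    return status in (403, 429) or "Access Denied" in body


def first_working_tier(tiers: List[Tuple[str, Fetch]]) -> Optional[str]:
    """Try tiers cheapest-first and return the first one that gets through."""
    for name, fetch in tiers:
        status, body = fetch()
        if not looks_blocked(status, body):
            return name
    return None


if __name__ == "__main__":
    # Stub fetchers stand in for real clients (plain requests, curl-cffi, browser).
    tiers = [
        ("requests", lambda: (403, "Access Denied")),
        ("curl-cffi", lambda: (200, "<html>ok</html>")),
    ]
    note = ReconNote("https://target.com/", "Akamai Bot Manager", first_working_tier(tiers))
    print(json.dumps(asdict(note), indent=2))
```

Keeping the tiers as plain callables means the same loop works whether a tier is a one-line HTTP call or a full browser session.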
Hands-on lab
Pick three sites you've considered scraping. For each:
- Run wafw00f to guess the vendor.
- Hit with default Python requests. Note the response.
- Examine the cookies set. Match to the vendor table above.
Each target now has a strategy attached. The right tool for each is much clearer when you know which vendor you're up against.
Quiz: check your understanding
Pass mark is 70%. Pick the best answer; you’ll see the explanation right after.