Browser Rendering
/crawl Endpoint
Cloudflare's headless browser crawling API — discover and render entire websites with a single POST request, no infrastructure required.
The /crawl endpoint is part of Cloudflare's Browser Rendering service. It spins up a headless Chromium browser inside Cloudflare's network, discovers pages from a seed URL, renders each one (executing JavaScript), and returns the content — all asynchronously via a job ID.
- Great for AI pipelines — feed crawled Markdown directly into RAG / LLM training
- Full JS rendering — unlike simple fetches, SPAs and dynamic content work correctly
- Respects robots.txt — polite crawling by default
- Incremental crawling — `modifiedSince`/`maxAge` to re-crawl only changed pages
- Static mode — skip JS rendering for faster plain-HTML crawls
- Serverless — browser spins up on demand, terminates when done, no idle cost
Key fields in the job response:
- `status` — `pending` · `running` · `completed`
- `records[]` — one entry per crawled URL
- `records[].html` — rendered HTML of the page
- `records[].metadata` — HTTP status, title, lastModified, URL
- `records[].status` — `completed` or `skipped`
- `browserSecondsUsed` — actual billable time
Pages can be skipped for several reasons:
- External domains — only the seed domain is crawled by default
- robots.txt blocked — Cloudflare respects crawl rules
- Depth exceeded — past your `maxDepth` setting
- Page cap hit — past your `maxPages` setting
- Not modified — unchanged since the `modifiedSince` date
You need a Cloudflare API token with Account → Browser Rendering → Edit permission. Create one at dash.cloudflare.com/profile/api-tokens.
Step 1 — Start the crawl
```sh
curl -X POST "https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/browser-rendering/crawl" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'
```
Response — job ID returned immediately
```json
{
  "success": true,
  "result": "82a79af7-590f-434b-b10e-40197b802ef9" // ← job_id
}
```
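Note that `result` here is the job ID string itself, not a nested object. A minimal parsing sketch in Python, using the sample response above as a raw string:

```python
import json

# Sample response from the POST above; `result` is the job ID string itself.
raw = '{"success": true, "result": "82a79af7-590f-434b-b10e-40197b802ef9"}'

resp = json.loads(raw)
assert resp["success"], "crawl submission failed"
job_id = resp["result"]  # pass this to the GET endpoint in step 2
```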
Step 2 — Poll for results
```sh
JOB_ID="82a79af7-590f-434b-b10e-40197b802ef9"

curl "https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/browser-rendering/crawl/$JOB_ID" \
  -H "Authorization: Bearer {API_TOKEN}"
```
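Because the job runs asynchronously, clients typically poll this endpoint until `status` leaves the `pending`/`running` states. A polling-loop sketch in Python; `fetch_status` is a hypothetical callable you supply (for example, a wrapper around the GET request above that returns the parsed `result` object), injected so the loop itself needs no network access:

```python
import time

def poll_crawl_job(fetch_status, interval=5.0, max_attempts=120):
    """Poll until the crawl job is no longer pending/running.

    fetch_status: callable returning the parsed `result` object from
    GET .../browser-rendering/crawl/{job_id} (a hypothetical helper,
    injected so this loop is testable without network access).
    """
    for _ in range(max_attempts):
        result = fetch_status()
        if result["status"] not in ("pending", "running"):
            return result  # finished (e.g. completed)
        time.sleep(interval)
    raise TimeoutError("crawl job did not finish within the polling budget")
```

A backoff strategy (increasing `interval` between attempts) would be a natural refinement for large crawls.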
Optional — with extra parameters
```sh
curl -X POST "https://api.cloudflare.com/.../browser-rendering/crawl" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "maxDepth": 2,
    "maxPages": 50,
    "responseFormat": "markdown",
    "staticMode": false,
    "maxAge": 86400
  }'
```
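The optional parameters map naturally onto a small request-body builder. A sketch assuming the parameter names documented below; `build_crawl_payload` is a hypothetical helper, not part of any Cloudflare SDK:

```python
import json

def build_crawl_payload(url, max_depth=None, max_pages=None,
                        response_format=None, static_mode=None,
                        max_age=None, modified_since=None):
    # Start from the only required field and add optional keys only
    # when explicitly set, mirroring the JSON body above.
    payload = {"url": url}
    if max_depth is not None:
        payload["maxDepth"] = max_depth
    if max_pages is not None:
        payload["maxPages"] = max_pages
    if response_format is not None:
        payload["responseFormat"] = response_format
    if static_mode is not None:
        payload["staticMode"] = static_mode
    if max_age is not None:
        payload["maxAge"] = max_age
    if modified_since is not None:
        payload["modifiedSince"] = modified_since
    return payload

# Reproduces the request body of the curl example above.
body = json.dumps(build_crawl_payload(
    "https://example.com", max_depth=2, max_pages=50,
    response_format="markdown", static_mode=False, max_age=86400))
```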
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` (required) | string | — | Seed URL to start crawling from. The crawler stays within the same origin by default. E.g. `"https://example.com"`. |
| `maxDepth` | integer | ∞ | Maximum link-follow depth from the seed URL. Set `1` to crawl only the seed page plus directly linked pages. |
| `maxPages` | integer | ∞ | Hard cap on total pages crawled. The crawl stops once this limit is reached. Recommended for cost control on large sites. |
| `responseFormat` | string | `"html"` | Format of returned page content. Options: `"html"` · `"markdown"` · `"json"`. |
| `staticMode` | boolean | `false` | Skip JavaScript rendering for faster plain-HTML crawls. Use when the target site doesn't require JS (reduces browser seconds used). |
| `modifiedSince` | string | null | ISO 8601 datetime. Pages not modified since this date are skipped. Enables incremental crawls — only re-fetch changed content. |
| `maxAge` | integer | null | Max age of page content in seconds. Pages already crawled within this window may be skipped. E.g. `86400` skips pages crawled within the last 24 hours. |
The crawler respects `robots.txt` by default. Pages disallowed for Cloudflare's crawler user-agent will be returned with `status: "skipped"`.
REST API (including `/crawl`):
- Browser hours only — concurrent browser count is irrelevant
- Hours rounded up to the nearest whole hour
- Sessions under 1,800 s still count as 1 hr on paid
- Failed or timed-out calls are not charged
Workers Bindings:
- Browser hours + concurrent browsers — both metered
- Concurrent count = monthly average of daily peaks
- Hours pool is shared with REST API usage
- Failed or timed-out calls are not charged
Raw API response
```json
{
  "success": true,
  "result": {
    "id": "82a79af7-590f-434b-b10e-40197b802ef9",
    "status": "completed",
    "browserSecondsUsed": 0.415633056640625,
    "total": 1,
    "finished": 1,
    "skipped": 1,
    "records": [
      {
        "url": "https://example.com/",
        "status": "completed",
        "metadata": {
          "status": 200,
          "title": "Example Domain",
          "url": "https://example.com/",
          "lastModified": "Thu, 05 Mar 2026 11:54:13 GMT"
        },
        "html": "<!DOCTYPE html><html>...Example Domain...</html>"
      },
      {
        "url": "https://iana.org/domains/example",
        "status": "skipped" // external domain
      }
    ]
  }
}
```
The crawler found a link to https://iana.org/domains/example on the page, but it's an external domain. By default, /crawl stays within the origin of the seed URL (example.com). To follow external links you'd need to run separate crawl jobs per domain.
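When post-processing results, it's convenient to separate rendered pages from skipped entries such as that external link. A small sketch over the parsed `result` object; `split_records` is a hypothetical helper, not part of the API:

```python
def split_records(crawl_result):
    """Split a /crawl `result` object into completed and skipped records."""
    completed = [r for r in crawl_result["records"] if r["status"] == "completed"]
    skipped = [r for r in crawl_result["records"] if r["status"] == "skipped"]
    return completed, skipped

# Shape matches the raw API response above (HTML truncated for brevity).
sample = {
    "records": [
        {"url": "https://example.com/", "status": "completed",
         "html": "<!DOCTYPE html>..."},
        {"url": "https://iana.org/domains/example", "status": "skipped"},
    ]
}
done, skipped = split_records(sample)
```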
Free plan: this crawl used 0.42 s of the 600 s (10 min) daily budget, about 0.07% of the daily allowance.
Paid plan: rounded up to the 1-hour minimum → $0.09, but this falls within the 10 included hours/month, so $0.00 is billed.
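That arithmetic can be sketched as follows. The constants are taken from this example (600 s/day free budget, 10 included hours/month, and $0.09 per additional browser hour on paid); treat them as assumptions for illustration, not authoritative pricing:

```python
import math

FREE_DAILY_SECONDS = 600    # Free plan: 10 min of browser time per day (assumed)
PAID_INCLUDED_HOURS = 10    # Paid plan: included browser hours per month (assumed)
PAID_RATE_PER_HOUR = 0.09   # Paid plan: USD per additional browser hour (assumed)

def free_budget_used(seconds_used):
    """Percentage of the Free plan's daily budget consumed."""
    return 100.0 * seconds_used / FREE_DAILY_SECONDS

def paid_cost(seconds_used, included_hours_left=PAID_INCLUDED_HOURS):
    """Billable cost on paid: hours round up, with a 1-hour minimum."""
    hours = max(1, math.ceil(seconds_used / 3600))
    billable_hours = max(0, hours - included_hours_left)
    return billable_hours * PAID_RATE_PER_HOUR
```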