Providers

justcrawl.io routes scraping requests across multiple providers. You bring your own API keys.

Supported providers

Provider	Auth	Auto-route by URL	Async webhook	Vendor docs
Bright Data	API Key + Zone	No (single zone per account)	No	docs.brightdata.com
Oxylabs	Username + Password	Yes (33 verified source templates)	No (TODO 2)	developers.oxylabs.io
Nimble Way	Bearer token	No (driver-based stealth)	No	docs.nimbleway.com
Zyte	API Key	No (auto-extract is TODO 3)	No	docs.zyte.com
Decodo (formerly Smartproxy)	API Key	Yes (17 verified target templates)	No (TODO 2)	help.decodo.com

Connecting a provider

1. Go to Settings > Provider Accounts
2. Click Add Provider
3. Enter your credentials
4. Click Test Connection to verify
5. Save

Credentials are encrypted at rest with AES-256-GCM and never returned in API responses.
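For readers unfamiliar with AES-256-GCM, the following is a minimal sketch of authenticated encryption at rest using Node's built-in `node:crypto` module. The `SealedSecret` shape and the `seal` / `unseal` names are hypothetical and only illustrate the technique; they are not justcrawl's actual storage code, and key management is out of scope here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape: all fields are base64 strings.
interface SealedSecret {
  iv: string;
  tag: string;
  ciphertext: string;
}

// Encrypts a credential with AES-256-GCM. `key` must be exactly 32 bytes.
function seal(plaintext: string, key: Buffer): SealedSecret {
  const iv = randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"), // authentication tag, checked on decrypt
    ciphertext: ciphertext.toString("base64"),
  };
}

// Decrypts and authenticates. Throws if the ciphertext or tag was modified.
function unseal(sealed: SealedSecret, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(sealed.ciphertext, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

Because GCM is authenticated, a tampered record fails to decrypt instead of yielding garbage credentials.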

Provider pricing

In Settings > Vendor Pricing, enter what you pay per 1,000 requests for each provider. This enables:

Cost strategy: ranks providers by cheapest-first
Cost estimates: shows expected cost per workflow on the dashboard
Premium domains: optionally set higher rates for specific domains (e.g., social media sites)
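As a rough illustration of how per-1,000 rates could drive the cost strategy and the cost estimates, the sketch below ranks providers cheapest-first for one URL and applies a premium-domain override when one exists. The `VendorPricing` shape and all function names are hypothetical, not justcrawl's actual schema.

```typescript
// Hypothetical shape of one row from Settings > Vendor Pricing.
interface VendorPricing {
  provider: string;
  // What you pay per 1,000 requests.
  ratePer1k: number;
  // Optional higher rates keyed by hostname (premium domains).
  premiumRatesPer1k?: Record<string, number>;
}

// Rate that applies to a hostname: the premium override if set, else the base rate.
function rateFor(pricing: VendorPricing, hostname: string): number {
  return pricing.premiumRatesPer1k?.[hostname] ?? pricing.ratePer1k;
}

// Cheapest-first provider order for a single target URL.
function rankByCost(pricings: VendorPricing[], targetUrl: string): string[] {
  const host = new URL(targetUrl).hostname.replace(/^www\./, "");
  return [...pricings]
    .sort((a, b) => rateFor(a, host) - rateFor(b, host))
    .map((p) => p.provider);
}

// Expected cost of sending `requests` requests to `hostname` through one provider.
function estimateCost(pricing: VendorPricing, hostname: string, requests: number): number {
  return (rateFor(pricing, hostname) * requests) / 1000;
}
```

With a base rate of 3 and a premium rate of 10 for one domain, the same provider can rank first for ordinary sites and last for that domain.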

Multiple providers required

Smart optimization (benchmarking, strategy selection, auto-tuning) requires at least 2 connected providers. With a single provider, justcrawl acts as a simple proxy with scheduling and URL management.

Per-node settings

Every service node in a workflow accepts a shared set of settings (Country, JS rendering, Wait selector, Parse, Screenshot, Markdown, Custom Headers, Custom Cookies, Sessions, Device, Advanced JSON) plus vendor-specific extras. See Service node settings for the full reference.

The capability matrix below shows which providers support which features. Cells marked ✓ are supported and exposed in the workflow editor.

Feature	Bright Data	Oxylabs	Nimble Way	Zyte	Decodo
JS rendering	✓ (always)	✓	✓	✓	✓
Geo-targeting	✓	✓	✓	✓	✓
Wait for selector	✓	✓	✓	✓	✓
Parse (vendor-side structured)	—	✓	—	TODO 3	✓
Screenshot (PNG)	✓	✓	✓	✓	✓
Markdown output	✓	—	✓	—	✓
Custom headers	✓	✓	✓	✓	✓
Custom cookies	—	✓	✓	✓	✓
Sessions (sticky IP)	—	✓	—	✓	✓
Device emulation	—	✓	✓	✓	✓
Auto-route by URL	—	✓	—	—	✓
Advanced (JSON) passthrough	✓	✓	✓	✓	✓

Bright Data

Bright Data Web Unlocker is a synchronous proxy service that automatically renders JavaScript and handles anti-bot detection. The adapter targets the Web Unlocker product specifically. Async datasets (Web Scraper API) and SERP API are out of scope (TODO 1).

Required configuration

API Key — sent as Authorization: Bearer <key>.
Zone — the Web Unlocker zone name from your Bright Data dashboard (e.g., web_unlocker1). Must be Web Unlocker type, not Web Scraper, SERP, or Browser. Auto-detected from your account at connection time; pick from the dropdown.

Capabilities

Feature	Notes
JS rendering	Always on. Web Unlocker always renders JavaScript; it cannot be turned off.
Geo-targeting	Lowercase ISO-2 (`us`, `gb`, `de`). Passed as-is.
Wait for selector	Encoded in `x-unblock-expect` header. Requires “Manual expect elements” enabled on zone.
Screenshot	Set `screenshot: true` → `data_format: "screenshot"`. Body is base64 PNG, `contentType: image/png`.
Markdown	Set `markdown: true` → `data_format: "markdown"`. Body is markdown text. Screenshot wins if both set.
Custom headers	Merged into request body `headers` field (preserves `x-unblock-expect`). Requires “Custom Headers & Cookies” enabled on zone.
Sessions / cookies / device emulation	Not exposed by Web Unlocker.
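The rows above can be read as a mapping from node settings to a Web Unlocker request body. The sketch below shows one way that mapping might look. The function name, the settings interface, and in particular the JSON encoding of the `x-unblock-expect` value are assumptions for illustration; check the Bright Data docs for the exact header format.

```typescript
// Hypothetical subset of node settings relevant to Bright Data.
interface BrightDataNodeSettings {
  country?: string;
  waitForSelector?: string;
  screenshot?: boolean;
  markdown?: boolean;
  customHeaders?: Record<string, string>;
}

interface BrightDataBody {
  zone: string;
  url: string;
  format: "json" | "raw";
  country?: string;
  data_format?: "screenshot" | "markdown";
  headers?: Record<string, string>;
}

function buildBrightDataBody(zone: string, url: string, s: BrightDataNodeSettings): BrightDataBody {
  const body: BrightDataBody = { zone, url, format: "json" };
  if (s.country) body.country = s.country; // lowercase ISO-2, passed as-is

  // Screenshot wins if both screenshot and markdown are set.
  if (s.screenshot) body.data_format = "screenshot";
  else if (s.markdown) body.data_format = "markdown";

  // Custom headers are merged first so the expect header is preserved.
  const headers: Record<string, string> = { ...(s.customHeaders ?? {}) };
  if (s.waitForSelector) {
    // Assumed value encoding; not confirmed by this document.
    headers["x-unblock-expect"] = JSON.stringify({ element: s.waitForSelector });
  }
  if (Object.keys(headers).length > 0) body.headers = headers;
  return body;
}
```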

Response format

format: "json" (default): {status_code, body, headers} envelope. body contains the rendered HTML or base64 PNG / markdown if a data_format is set.
format: "raw": response is the body directly, no envelope.

Gotchas

Zone-type mismatch. If your zone is not Web Unlocker, the response shape is different and body is missing. The adapter detects this and returns errorType: 'validation' with the message Bright Data response missing 'body' field — likely zone-type mismatch (zone must be Web Unlocker, not Web Scraper / SERP / Browser).

Premium domains. Walmart, Target, Instacart, and similar require a separate premium zone in your Bright Data account. Standard zones fail.

Feature gates. customHeaders and waitForSelector only work if the corresponding feature is enabled on the zone in the Bright Data UI. Without the feature, the field is silently ignored.

Vendor docs

Zone routing

When you save Bright Data credentials, justcrawl calls GET /zone/get_active_zones and stores the zones (with their type) in org_provider_resources. A periodic job (sync_provider_resources, every 15 min) keeps that table in sync with the Bright Data dashboard. At scrape time, the resolver picks a zone in this order:

1. Per-node zone override (workflow editor → service node config). Non-empty value wins over everything below.
2. Host detector: known SERP hosts route through your SERP-typed zone: google.*, bing.com, duckduckgo.com, yandex.*, and yahoo.com/search. If your account doesn’t have a SERP zone, the resolver falls through.
3. credentials.zone: your account’s default Web Unlocker zone.

The workflow editor’s zone dropdown includes a Use account default option (empty value) that lets a node opt out of an explicit pick and rely on the detector + credentials default.
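The resolution order described above can be sketched as a small function. The `ProviderZone` shape, the `"serp"` type label, and the treatment of Yahoo subdomains are assumptions made for this example; the real resolver may differ.

```typescript
// Hypothetical view of a row in org_provider_resources.
interface ProviderZone {
  name: string;
  type: string; // assumed label: "serp" marks a SERP-typed zone
}

// True for google.*, bing.com, duckduckgo.com, yandex.*, and yahoo.com/search.
function isSerpHost(url: URL): boolean {
  const host = url.hostname.toLowerCase().replace(/^www\./, "");
  if (/^(google|yandex)\.[a-z.]+$/.test(host)) return true;
  if (host === "bing.com" || host === "duckduckgo.com") return true;
  // Matching Yahoo subdomains is an assumption, not stated in the docs.
  const isYahoo = host === "yahoo.com" || host.endsWith(".yahoo.com");
  return isYahoo && url.pathname.startsWith("/search");
}

function resolveZone(
  nodeZone: string | undefined,
  targetUrl: string,
  zones: ProviderZone[],
  credentialsZone: string,
): string {
  // 1. A non-empty per-node override wins over everything else.
  if (nodeZone && nodeZone.trim() !== "") return nodeZone;
  // 2. Known SERP hosts use a SERP-typed zone when the account has one.
  if (isSerpHost(new URL(targetUrl))) {
    const serp = zones.find((z) => z.type === "serp");
    if (serp) return serp.name;
  }
  // 3. Otherwise fall back to the account default.
  return credentialsZone;
}
```

Passing an empty string for `nodeZone` corresponds to the Use account default option in the editor.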

What’s not supported

Web Scraper API async (datasets/v3/trigger with snapshot polling) — TODO 1.

Oxylabs

Oxylabs Realtime Web Scraper API. The adapter exposes 33 verified source templates (amazon_product, google_search, walmart, ebay, etsy, kroger, target, bestbuy, …) — every entry verified against developers.oxylabs.io — plus a universal fallback for any URL.

Auto-routing by URL (headline feature)

Leave the Source field empty and the adapter auto-detects the right Oxylabs source and parameter shape from your URL:

URL	Routes to	With params
`amazon.com/dp/B08Y72CH1F`	`amazon_product`	`{query: "B08Y72CH1F", domain: "com"}`
`amazon.co.uk/dp/B0XXXX`	`amazon_product`	`{query: "B0XXXX", domain: "co.uk"}`
`amazon.com/s?k=laptop`	`amazon_search`	`{query: "laptop", domain: "com"}`
`walmart.com/ip/widget/123456`	`walmart_product`	`{product_id: "123456", domain: "com"}`
`walmart.com/search?q=tv`	`walmart_search`	`{query: "tv"}`
`google.com/search?q=test`	`google_search`	`{query: "test", domain: "com"}`
`ebay.com/itm/...`	`ebay`	`{url}`
any other host	`universal`	`{url}`

International TLDs are preserved (amazon.com.br → domain: "com.br"). Detector source: packages/providers/src/oxylabs-source-detector.ts.
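To make the routing table concrete, here is a simplified sketch of a URL detector covering only the rows shown above. It is not the code in `oxylabs-source-detector.ts`; the function names and regular expressions are illustrative, and sending unmatched paths on known hosts to `universal` is an assumption, since the table does not specify that case.

```typescript
interface RoutedRequest {
  source: string;
  params: Record<string, string>;
}

// Returns the TLD part after a brand label, e.g. ("www.amazon.co.uk", "amazon") -> "co.uk".
function tldAfter(host: string, brand: string): string | undefined {
  return host.match(new RegExp(`(?:^|\\.)${brand}\\.([a-z.]+)$`))?.[1];
}

function detectOxylabsSource(rawUrl: string): RoutedRequest {
  const url = new URL(rawUrl);
  const host = url.hostname.toLowerCase();

  const amazonTld = tldAfter(host, "amazon");
  if (amazonTld) {
    const asin = url.pathname.match(/\/dp\/([A-Za-z0-9]+)/)?.[1];
    if (asin) return { source: "amazon_product", params: { query: asin, domain: amazonTld } };
    const term = url.searchParams.get("k");
    if (url.pathname === "/s" && term) {
      return { source: "amazon_search", params: { query: term, domain: amazonTld } };
    }
  }

  const walmartTld = tldAfter(host, "walmart");
  if (walmartTld) {
    // Matches /ip/<slug>/<id> and /ip/<id>.
    const productId = url.pathname.match(/^\/ip\/(?:[^/]+\/)?(\d+)\/?$/)?.[1];
    if (productId) {
      return { source: "walmart_product", params: { product_id: productId, domain: walmartTld } };
    }
    const term = url.searchParams.get("q");
    if (url.pathname === "/search" && term) {
      return { source: "walmart_search", params: { query: term } };
    }
  }

  const googleTld = tldAfter(host, "google");
  const googleTerm = url.searchParams.get("q");
  if (googleTld && url.pathname === "/search" && googleTerm) {
    return { source: "google_search", params: { query: googleTerm, domain: googleTld } };
  }

  if (tldAfter(host, "ebay") && url.pathname.startsWith("/itm/")) {
    return { source: "ebay", params: { url: rawUrl } };
  }

  // Any other host, and (by assumption) any unmatched path on a known host.
  return { source: "universal", params: { url: rawUrl } };
}
```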

To override, set Source explicitly in the node config — that wins over auto-detection. See the legacy-row guard below if you do.

Required configuration

Username + Password — HTTP Basic auth.

Capabilities

Feature	Notes
JS rendering	`renderJs: true` → `render: "html"`.
Geo-targeting	ISO-2 lowercase converted to full country name (`us` → `"United States"`).
Wait for selector	`browser_instructions[0]` with CSS wait. Requires JS rendering. Default timeout 10s.
Parse	`parse: true` → Oxylabs returns structured JSON, `contentType: application/json`.
Screenshot	`screenshot: true` → `render: "png"`. Base64 PNG. Wins over `renderJs`.
Custom headers	Forwarded via `context: [{key: "headers", value: {...}}]`.
Custom cookies	Flat `Record<string,string>` → `context: [{key: "cookies", value: {...}}]`.
Sessions	`sessionId` → `context: [{key: "session_id", value: ...}]`.
Device	`desktop` / `mobile` / `tablet` → `user_agent_type` (tablet maps to `mobile`).
HTTP POST	`httpMethod: "POST"` + `httpBody` → `context: [{key: "http_method", value: "post"}, {key: "content", value: "<base64>"}]`.
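As an illustration of how several of these rows combine into one request, the sketch below assembles a payload with a `context` array. The settings interface and function name are hypothetical; geo-targeting and wait-for-selector are left out to keep the example short.

```typescript
// Hypothetical subset of node settings relevant to Oxylabs.
interface OxylabsNodeSettings {
  renderJs?: boolean;
  screenshot?: boolean;
  parse?: boolean;
  customHeaders?: Record<string, string>;
  customCookies?: Record<string, string>;
  sessionId?: string;
  device?: "desktop" | "mobile" | "tablet";
  httpMethod?: "GET" | "POST";
  httpBody?: string;
}

interface ContextEntry {
  key: string;
  value: unknown;
}

interface OxylabsPayload {
  source: string;
  render?: "html" | "png";
  parse?: boolean;
  user_agent_type?: string;
  context?: ContextEntry[];
  [param: string]: unknown; // url / query / product_id / domain
}

function buildOxylabsPayload(
  source: string,
  params: Record<string, string>,
  s: OxylabsNodeSettings,
): OxylabsPayload {
  const payload: OxylabsPayload = { ...params, source };

  // Screenshot wins over renderJs.
  if (s.screenshot) payload.render = "png";
  else if (s.renderJs) payload.render = "html";
  if (s.parse) payload.parse = true;
  // Tablet maps to mobile.
  if (s.device) payload.user_agent_type = s.device === "tablet" ? "mobile" : s.device;

  const context: ContextEntry[] = [];
  if (s.customHeaders) context.push({ key: "headers", value: s.customHeaders });
  if (s.customCookies) context.push({ key: "cookies", value: s.customCookies });
  if (s.sessionId) context.push({ key: "session_id", value: s.sessionId });
  if (s.httpMethod === "POST") {
    context.push({ key: "http_method", value: "post" });
    // The POST body travels base64-encoded in the "content" entry.
    context.push({ key: "content", value: Buffer.from(s.httpBody ?? "", "utf8").toString("base64") });
  }
  if (context.length > 0) payload.context = context;
  return payload;
}
```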

Legacy-row guard

Oxylabs’s identifier field varies by source. Sending the wrong field name returns HTTP 400 with a vendor message like [product_id]: This field is missing. [query]: This field was not expected.

Identifier field	Sources	Param shape
`url`	`universal`, `amazon`, `walmart`, `ebay`, `google`, …	`{url}`
`query` (+ optional `domain`)	`amazon_product`, `amazon_search`, `walmart_search`, `google_search`	`{query, domain?}`
`product_id` (+ optional `domain`)	`walmart_product`	`{product_id, domain?}`

If you set Source explicitly without supplying the matching field in sourceParams, the adapter returns:

errorType: 'validation'
error: "Oxylabs source 'walmart_product' requires 'product_id' in sourceParams. See https://developers.oxylabs.io/scraping-solutions/web-scraper-api/targets/walmart/product. Or clear the source field to use auto-detection from the URL."

Workaround: clear Source to use auto-detection, or populate sourceParams in Advanced (JSON) with the field that matches your chosen source ({query, domain} for amazon_product and search-style sources, {product_id, domain} for walmart_product).
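A guard of this kind can be sketched as a lookup from source to required field. The map below is a small hand-written excerpt for illustration, not the generated `SOURCE_REQUIRED_PARAM`, and the error text is shortened (the real message also links to the vendor docs). Letting unknown sources pass through is an assumption.

```typescript
type IdentifierField = "url" | "query" | "product_id";

// Illustrative excerpt only; the full mapping lives in the detector module.
const REQUIRED_PARAM: Record<string, IdentifierField> = {
  universal: "url",
  amazon_product: "query",
  amazon_search: "query",
  walmart_product: "product_id",
  walmart_search: "query",
};

type GuardResult =
  | { ok: true }
  | { ok: false; errorType: "validation"; error: string };

function checkSourceParams(source: string, sourceParams: Record<string, unknown>): GuardResult {
  const required = REQUIRED_PARAM[source];
  // Assumption: sources missing from the map are left for the vendor to validate.
  if (required === undefined) return { ok: true };

  const value = sourceParams[required];
  if (typeof value === "string" && value.length > 0) return { ok: true };

  return {
    ok: false,
    errorType: "validation",
    error:
      `Oxylabs source '${source}' requires '${required}' in sourceParams. ` +
      "Or clear the source field to use auto-detection from the URL.",
  };
}
```

Failing fast here avoids spending a vendor request on a call that would come back as HTTP 400.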

Gotchas

Render timeout cap. The adapter’s maxTimeoutMs is 30s; Oxylabs’s upstream cap is 180s. Use Advanced (JSON) to extend (e.g., {"timeout": 120000}) up to the upstream cap.

Parsed = JSON, not HTML. When parse: true, the body is JSON.stringify-ed parsed data with contentType: application/json. Downstream HTML validators skip this content type automatically.

Per-source identifier reference

Oxylabs’s per-source contract is heterogeneous: URL-based sources take url, search-style sources and most product sources (for example amazon_product) take query (+ optional domain), and walmart_product takes product_id (+ optional domain). Sending the wrong field name fails with HTTP 400 and a vendor message like [product_id]: This field is missing. [query]: This field was not expected. The detector picks the right field for you when you leave Source on Auto-detect from URL; the table below is the contract for explicit-source workflows.

Identifier-field legend:

url: pass the full URL in sourceParams.url (or leave the URL field unchanged for auto).
query: pass a search term in sourceParams.query. For amazon_product, the query is the ASIN.
product_id: pass the vendor’s product id (e.g., the Walmart item id) in sourceParams.product_id.

Generated from SOURCE_REQUIRED_PARAM in packages/providers/src/oxylabs-source-detector.ts. Run pnpm exec tsx scripts/generate-vendor-doc-tables.ts to refresh.

Source	Group	Identifier field	Vendor docs
`universal`	General	`url`	link
`amazon`	Amazon	`url`	link
`amazon_product`	Amazon	`query`	link
`amazon_search`	Amazon	`query`	link
`amazon_pricing`	Amazon	`query`	link
`amazon_bestsellers`	Amazon	`query`	link
`amazon_sellers`	Amazon	`query`	link
`google`	Google	`url`	link
`google_search`	Google	`query`	link
`google_ads`	Google	`query`	link
`google_travel_hotels`	Google	`query`	link
`google_lens`	Google	`query`	link
`google_shopping_search`	Google	`query`	link
`google_shopping_product`	Google	`query`	link
`google_trends_explore`	Google	`query`	link
`bing`	Bing	`url`	link
`bing_search`	Bing	`query`	link
`walmart`	Walmart	`url`	link
`walmart_product`	Walmart	`product_id`	link
`walmart_search`	Walmart	`query`	link
`ebay`	eBay	`url`	link
`ebay_search`	eBay	`query`	link
`etsy`	E-commerce	`url`	link
`kroger`	E-commerce	`url`	link
`target`	E-commerce	`url`	link
`bestbuy`	E-commerce	`url`	link
`lowes`	E-commerce	`url`	link
`costco`	E-commerce	`url`	link
`aliexpress`	E-commerce	`url`	link
`alibaba`	E-commerce	`url`	link
`rakuten`	E-commerce	`url`	link
`flipkart`	E-commerce	`url`	link
`mercadolibre`	E-commerce	`url`	link

Vendor docs

Web Scraper API overview
Targets index (full source list)
Custom Parser (use Advanced (JSON) with parsing_instructions)
JavaScript rendering
Integration methods (Realtime / Proxy / Push-Pull)
Forming requests

What’s not supported

Push-Pull async webhook — TODO 2. Realtime endpoint only today.
parsing_instructions UI — available via Advanced (JSON) only.

Nimble Way

Nimble Way Web API. URL-uniform — there is no auto-routing by URL. Site-aware behavior comes from explicit driver selection (Nimble’s stealth profiles).

Required configuration

API Token — sent as Authorization: Bearer <token>.

Driver-based stealth

Nimble doesn’t have site-specific templates. Instead, drivers tune scraping behavior:

Driver	Use case
`vx6`	Fast, no JavaScript — plain HTML only
`vx8`	Headless + JS rendering (default when `renderJs: true` or `waitForSelector` set)
`vx8-pro`	Enhanced headless variant
`vx10`	Full-page rendering variant
`vx10-pro`	Stealth headful — auto-selected from `device: 'mobile'` or `device: 'tablet'`
`Auto` (default)	Inferred per request: device hint → JS flag → vx6 fallback

When Driver is empty, the adapter infers:

device: 'mobile' | 'tablet' → vx10-pro (with render enabled)
renderJs: true or waitForSelector → vx8
otherwise → backend default
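The inference rules above can be sketched as follows. The interface and function name are hypothetical. The sketch follows the three rules as listed, so the last branch leaves the driver unset and lets the backend choose.

```typescript
type Device = "desktop" | "mobile" | "tablet";

// Hypothetical subset of node settings relevant to Nimble.
interface NimbleNodeSettings {
  driver?: string; // empty or undefined means Auto
  device?: Device;
  renderJs?: boolean;
  waitForSelector?: string;
}

interface NimbleRendering {
  driver?: string; // undefined: let the backend pick its default
  render: boolean;
}

function inferNimbleRendering(s: NimbleNodeSettings): NimbleRendering {
  const handheld = s.device === "mobile" || s.device === "tablet";
  const wantsJs = Boolean(s.renderJs || s.waitForSelector);
  const render = handheld || wantsJs;

  // An explicit, non-empty driver is used as given.
  if (s.driver && s.driver.trim() !== "") return { driver: s.driver, render };
  // Device hint comes first.
  if (handheld) return { driver: "vx10-pro", render };
  // Then the JS flag or a wait selector.
  if (wantsJs) return { driver: "vx8", render };
  // Otherwise no driver is sent.
  return { render };
}
```

Note that `device: 'desktop'` on its own reaches the last branch, which matches the no-op behavior described under Gotchas.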

Capabilities

Feature	Notes
JS rendering	`render: true` + driver. Auto-enabled with `renderJs`, `waitForSelector`, or `device: 'mobile'/'tablet'`.
Geo-targeting	Uppercase ISO-2 (`US`, `GB`, `FR`). Sub-country via Advanced (JSON).
Wait for selector	`browser_actions` array with timeout (default 10s). Forces vx8.
Screenshot	`formats: ['html', 'screenshot']`. Base64 PNG, `contentType: image/png`.
Markdown	`formats: ['html', 'markdown']`. `contentType: text/markdown`.
Custom headers	Pass-through object → `body.headers`.
Custom cookies	Flat map → array of `{key, value, domain}`. Each cookie’s `domain` defaults to the request URL hostname if not specified.
Device	`desktop` / `mobile` / `tablet` → driver mapping above.

Gotchas

device: 'desktop' alone is a no-op. Selecting Desktop without also enabling renderJs or waitForSelector does not trigger rendering — Nimble requires render: true to be paired with a driver. Combine with Render JS or a wait selector. Fixed in v0.3.4.

Auto driver fix in v0.3.4. Empty driver string previously short-circuited inference. Now device: 'mobile' still resolves to vx10-pro and renderJs: true still falls back to vx8 when Driver = Auto.

Cookies must include domain. The adapter auto-fills it from the request URL hostname; only override via Advanced (JSON) if you need a different domain (e.g., .example.com).
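A minimal sketch of the cookie conversion, assuming a hypothetical helper name and an optional override argument standing in for the Advanced (JSON) path:

```typescript
interface NimbleCookie {
  key: string;
  value: string;
  domain: string;
}

// Converts a flat cookie map into Nimble's array form.
// `domainOverride` is a stand-in for a domain supplied via Advanced (JSON).
function toNimbleCookies(
  cookies: Record<string, string>,
  requestUrl: string,
  domainOverride?: string,
): NimbleCookie[] {
  const fallbackDomain = new URL(requestUrl).hostname;
  return Object.entries(cookies).map(([key, value]) => ({
    key,
    value,
    domain: domainOverride ?? fallbackDomain,
  }));
}
```

For example, `toNimbleCookies({ sid: "abc" }, "https://shop.example.com/cart")` yields one cookie whose `domain` is `shop.example.com`.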

Vendor docs

What’s not supported

Endpoint migration — adapter currently uses sdk.nimbleway.com/v1/extract. The canonical endpoint per docs is api.webit.live/api/v1/realtime/web. Migration is on the roadmap; both currently route the same.
First-class sessions — no session ID; identity is per-request.
parsing schema UI — available via Advanced (JSON) with a CSS-selector schema.

Zyte

Zyte API at /v1/extract. Two execution modes — browser and HTTP — with a strict mutual-exclusion guard at the adapter boundary.

Required configuration

API Key — HTTP Basic with key as username, empty password.

Two execution modes

Browser mode — full headless Chrome. Triggered when you set renderJs, screenshot, or waitForSelector.

Sends browserHtml: true to Zyte; supports actions array (waitForSelector with timeout).
For headers: only Referer is honored in browser mode; other custom headers are silently dropped server-side.

HTTP mode — pure fetch. Default when no browser flag is set.

Sends httpResponseBody: true to Zyte. Adapter base64-decodes the body to UTF-8.
Supports POST body and full custom-headers passthrough.
Required for httpMethod: "POST" and device emulation.

Mutual exclusion. Combining browser flags (renderJs, screenshot, waitForSelector) with HTTP-mode options (httpMethod: "POST" or httpBody) returns errorType: 'validation' before reaching Zyte’s API. Pick one mode.
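The mode selection and the mutual-exclusion check can be sketched as a single function. The names and the error wording are hypothetical; only the rule itself comes from the description above.

```typescript
// Hypothetical subset of node settings relevant to Zyte mode selection.
interface ZyteNodeSettings {
  renderJs?: boolean;
  screenshot?: boolean;
  waitForSelector?: string;
  httpMethod?: "GET" | "POST";
  httpBody?: string;
}

type ZyteModeResult =
  | { ok: true; mode: "browser" | "http" }
  | { ok: false; errorType: "validation"; error: string };

function pickZyteMode(s: ZyteNodeSettings): ZyteModeResult {
  const wantsBrowser = Boolean(s.renderJs || s.screenshot || s.waitForSelector);
  const wantsHttpOnly = s.httpMethod === "POST" || s.httpBody !== undefined;

  // Reject the combination before any request is sent to Zyte.
  if (wantsBrowser && wantsHttpOnly) {
    return {
      ok: false,
      errorType: "validation",
      error: "Browser flags cannot be combined with httpMethod POST or httpBody. Pick one mode.",
    };
  }
  // HTTP mode is the default when no browser flag is set.
  return { ok: true, mode: wantsBrowser ? "browser" : "http" };
}
```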

Capabilities

Feature	Notes
JS rendering	Browser mode (`browserHtml: true`).
Geo-targeting	Uppercase ISO-2 → `geolocation`. 20 countries free; 50+ extended tier costs extra.
Wait for selector	`actions[0]` with timeout — adapter converts ms → seconds.
Screenshot	`screenshot: true` → base64 PNG. Adds extra cost. Browser mode only.
Custom headers	HTTP mode: `customHttpRequestHeaders` (full passthrough). Browser mode: `requestHeaders` (only `Referer` honored).
Custom cookies	`requestCookies` array. Each cookie’s `domain` defaults to the request URL hostname.
Sessions	`session.id` (RFC 4122 v4 UUID). Free-form strings are deterministically hashed to a stable v4 UUID.
Device	HTTP mode only. `tablet` maps to `mobile`. Silently omitted in browser mode.
HTTP POST	HTTP mode only. UTF-8 body via `httpRequestText`.
`ipType`	Vendor-specific node field: `datacenter` (default, free) or `residential` (KYC + per-request surcharge).

Gotchas

Selector timeout unit. justcrawl uses milliseconds; Zyte uses seconds. The adapter converts (Math.round(ms / 1000)). Default 10000 ms → timeout: 10 seconds.
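A small sketch of the unit conversion. The helper names are hypothetical, and the shape of the action object is an assumption about Zyte's `actions` format rather than something this page specifies.

```typescript
// Converts a justcrawl timeout in milliseconds to Zyte's seconds.
function msToZyteSeconds(ms: number = 10_000): number {
  return Math.round(ms / 1000);
}

// Builds a wait action for the `actions` array (object shape is assumed).
function buildWaitAction(selector: string, timeoutMs?: number) {
  return {
    action: "waitForSelector",
    selector: { type: "css", value: selector },
    timeout: msToZyteSeconds(timeoutMs),
  };
}
```

One consequence of rounding: with this formula, any timeout under 500 ms becomes 0 seconds, so sub-second values are best avoided.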

Screenshot is independent of renderJs. Setting screenshot: true triggers browser mode on its own without a redundant browserHtml: true. Set both if you want HTML and a screenshot in the same request.

Residential IPs require KYC. Zyte requires identity verification before allowing residential routing. Per-request surcharge applies. Defaults to datacenter.

Sessions UUID coercion. Zyte requires session.id to be a v4 UUID. The adapter SHA-1-hashes any free-form string (e.g., "user-123-cart") to a deterministic v4 UUID — same input always produces the same Zyte session.
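One way such a coercion could work is to take the first 16 bytes of the SHA-1 digest and overwrite the version and variant bits so the result has the v4 layout. This is a sketch of that idea with a hypothetical function name; the adapter's exact bit handling is not documented here, so the UUIDs it produces may differ from the ones below.

```typescript
import { createHash } from "node:crypto";

const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

// Maps any session string to a v4-shaped UUID. Same input, same output.
function toZyteSessionId(sessionId: string): string {
  const lower = sessionId.toLowerCase();
  if (UUID_V4.test(lower)) return lower; // already a valid v4 UUID

  // First 16 bytes of the 20-byte SHA-1 digest (copied so we can mutate them).
  const digest = createHash("sha1").update(sessionId, "utf8").digest();
  const bytes = Buffer.from(digest.subarray(0, 16));

  // Set the version nibble to 4 and the variant bits to 10xx (RFC 4122).
  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x40, 6);
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8);

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```

Because the mapping is deterministic, reusing a label such as "user-123-cart" across workflow runs keeps hitting the same Zyte session.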

Vendor docs

API reference (top-level params, mutual exclusion)
Browser mode (actions, viewport, device)
HTTP mode
Auto-extraction (product / article / jobPosting — TODO 3)
Features (geolocation, ipType, sessions, cookies)
Proxy mode

What’s not supported

Auto-extraction routing (product: true, article: true, jobPosting: true) — TODO 3. The flags can be set via Advanced (JSON), but the response routing for structured JSON is not wired into the validation pipeline yet. Use only if you control the downstream consumer.

Decodo

Decodo Web Scraping API real-time mode (formerly Smartproxy). 17 verified target templates (every entry checked against help.decodo.com’s llms.txt index) with auto-routing similar to Oxylabs. URLs that don’t match a verified target route through universal.

Auto-routing by URL

Leave Target empty for automatic routing. Same shape as Oxylabs:

URL	Routes to	With params
`amazon.com/dp/B08Y72CH1F`	`amazon_product`	`{query: "B08Y72CH1F", domain: "com"}`
`amazon.com/s?k=laptop`	`amazon_search`	`{query: "laptop", domain: "com"}`
amazon other paths	`amazon`	`{url}`
`google.com/search?q=test`	`google_search`	`{query: "test", domain: "com"}`
google other paths	`universal`	`{url}`
`walmart.com/ip/widget/12345`	`walmart_product`	`{product_id: "12345"}`
`walmart.com/search?q=tv`	`walmart_search`	`{query: "tv"}`
walmart other	`walmart`	`{url}`
`ebay.*`	`universal`	`{url}` (Decodo ships no per-target eBay doc page; URL passthrough)
any other host	`universal`	`{url}`

Detector source: packages/providers/src/decodo-target-detector.ts. Explicit Target wins over auto-detection.

Required configuration

API Key (token) — sent as HTTP Basic username with empty password.

Capabilities

Feature	Notes
JS rendering	`renderJs: true` → `headless: "html"`.
Geo-targeting	ISO-2 → Title Case country name (e.g., `us` → `"United States"`).
Wait for selector	`browser_actions[0]` with default 10s timeout.
Parse	`parse: true` → vendor-side parsed JSON (e.g., product details), `contentType: application/json`.
Screenshot	`screenshot: true` → `headless: "png"`. Base64 PNG. Wins over markdown / renderJs.
Markdown	`markdown: true` → markdown text + auto-enables headless rendering.
Custom headers	`body.headers` object passthrough.
Custom cookies	`body.cookies` object passthrough.
Sessions	`session_id` for sticky sessions.
Device	`device_type`: `desktop` / `mobile` / `tablet`.
HTTP POST	`http_method: "POST"` + `payload` (base64).

Decodo-specific node fields

Field	Notes
Target	Explicit override — 17 verified options grouped by site. Empty = auto-detect.
Proxy Pool	`standard` (default) or `premium`.
Page From / Page Count	Only meaningful for paginated targets like `google_search`, `amazon_search`.

Legacy-row guard

Decodo mirrors Oxylabs’s heterogeneous identifier-field contract. Sending the wrong field name returns HTTP 400.

Identifier field	Targets	Param shape
`url`	`universal`, `amazon`, `walmart`, `google`	`{url}`
`query` (+ optional `domain`)	`amazon_product`, `amazon_search`, `walmart_search`, `google_search`	`{query, domain?}`
`product_id`	`walmart_product`	`{product_id}`

If you set Target explicitly without supplying the matching field in targetParams:

errorType: 'validation'
error: "Decodo target 'walmart_product' requires 'product_id' in targetParams. See https://help.decodo.com/docs/web-scraping-api-walmart-product. Or clear the target field to use auto-detection from the URL."

Gotchas

150s server timeout. Decodo’s real-time hard cap is 150s. The adapter uses maxTimeoutMs: 30000 as a conservative client guard.

Screenshot wins over markdown. If you set both, you get the PNG.

Markdown forces headless. The adapter auto-enables headless: "html" when markdown: true.
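The precedence between screenshot, markdown, and renderJs can be sketched as below. The function name and the returned shape are hypothetical; in particular, how the markdown flag itself is named in the Decodo request is not covered on this page, so it is represented by a plain boolean.

```typescript
interface DecodoOutputSettings {
  renderJs?: boolean;
  screenshot?: boolean;
  markdown?: boolean;
}

interface DecodoOutputPlan {
  headless?: "html" | "png";
  markdown: boolean; // whether markdown output is requested
}

function planDecodoOutput(s: DecodoOutputSettings): DecodoOutputPlan {
  // Screenshot wins over markdown and renderJs.
  if (s.screenshot) return { headless: "png", markdown: false };
  // Markdown forces headless HTML rendering.
  if (s.markdown) return { headless: "html", markdown: true };
  if (s.renderJs) return { headless: "html", markdown: false };
  // No rendering requested.
  return { markdown: false };
}
```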

Per-target identifier reference

Decodo’s target contract mirrors Oxylabs’s: URL-based targets take url, search-style targets and the Amazon and Google product targets take query (+ optional domain), and walmart_product takes product_id. The detector picks the right field for you when you leave Target on Auto-detect from URL; the table below is the contract for explicit-target workflows. Decodo’s documented surface is narrower than Oxylabs’s: only targets with a verified vendor doc page are listed, and URLs that don’t match any of these go through universal.

Generated from TARGET_REQUIRED_PARAM in packages/providers/src/decodo-target-detector.ts. Run pnpm exec tsx scripts/generate-vendor-doc-tables.ts to refresh.

Target	Group	Identifier field	Vendor docs
`universal`	General	`url`	link
`amazon`	Amazon	`url`	link
`amazon_product`	Amazon	`query`	link
`amazon_search`	Amazon	`query`	link
`amazon_pricing`	Amazon	`query`	link
`amazon_bestsellers`	Amazon	`query`	link
`amazon_sellers`	Amazon	`query`	link
`google`	Google	`url`	link
`google_search`	Google	`query`	link
`google_ads`	Google	`query`	link
`google_travel_hotels`	Google	`query`	link
`google_shopping_search`	Google	`query`	link
`google_shopping_product`	Google	`query`	link
`bing_search`	Bing	`query`	link
`walmart`	Walmart	`url`	link
`walmart_product`	Walmart	`product_id`	link
`walmart_search`	Walmart	`query`	link

Vendor docs

Web Scraping API introduction
Quick start
Real-time requests
Asynchronous requests (v3/task) — not wired (TODO 2)
API parameters
Browser actions

What’s not supported

v3/task async API — TODO 2. Real-time only today.