Integrations
Integrations connect justcrawl to your existing infrastructure for real-time URL ingestion and result delivery.
Input sources
SQS queue
Push URLs to your own SQS queue. justcrawl polls it every few seconds and creates jobs automatically.
Message format:
    {"url": "https://example.com/product/123"}
Optionally include a workflowId to override domain-based routing:
    {"url": "https://example.com/product/123", "workflowId": "workflow-uuid"}
If no workflowId is provided, justcrawl resolves the workflow by domain (same as scheduled jobs).
Setup: Settings > Integrations > Add Input Source, or during onboarding. Provide your AWS Access Key ID, Secret Access Key, Region, and Queue URL.
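As a minimal sketch of the producer side, the following Python builds the message body in the format shown above and pushes it to the queue. The helper names `build_message` and `send_url` are hypothetical, and the sending step assumes boto3 is installed and AWS credentials are available in the environment.

```python
import json
from typing import Optional


def build_message(url: str, workflow_id: Optional[str] = None) -> str:
    """Serialize one URL into the JSON message body justcrawl expects."""
    message = {"url": url}
    if workflow_id is not None:
        # Include workflowId only when overriding domain-based routing.
        message["workflowId"] = workflow_id
    return json.dumps(message)


def send_url(queue_url: str, url: str, workflow_id: Optional[str] = None) -> None:
    """Push one message to your SQS queue (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_message works without boto3

    sqs = boto3.client("sqs")
    sqs.send_message(QueueUrl=queue_url, MessageBody=build_message(url, workflow_id))
```

Calling `send_url(queue_url, "https://example.com/product/123")` would enqueue a message that is routed by domain; passing a third argument sets `workflowId` explicitly.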
Webhook ingestion
justcrawl generates a unique webhook URL. POST an array of URLs to it:
    curl -X POST https://dashboard.justcrawl.io/api/v1/webhooks/YOUR_TOKEN/ingest \
      -H "Content-Type: application/json" \
      -d '["https://example.com/a", "https://example.com/b"]'
Or with explicit workflow assignment:
    [{"url": "https://example.com/a", "workflowId": "workflow-uuid"}]
A daily URL cap applies per input config (default: 10,000/day).
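The same request can be prepared from Python using only the standard library. This is a sketch rather than an official client: `build_ingest_request` is a hypothetical helper, and it assumes that a workflow assignment applies to every URL in the batch.

```python
import json
import urllib.request
from typing import Optional, Sequence


def build_ingest_request(
    token: str, urls: Sequence[str], workflow_id: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ingest webhook."""
    if workflow_id is None:
        # Plain form: a JSON array of URL strings.
        payload = list(urls)
    else:
        # Explicit form: one object per URL carrying the workflowId.
        payload = [{"url": u, "workflowId": workflow_id} for u in urls]
    endpoint = f"https://dashboard.justcrawl.io/api/v1/webhooks/{token}/ingest"
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen`. Keeping the build step separate makes it easy to inspect the payload before anything goes over the network.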
Output destinations
S3 bucket
Scraping results are uploaded to your S3 bucket as HTML files:
    s3://your-bucket/prefix/JOB_ID.html
Setup: Provide AWS credentials, bucket name, and optional prefix.
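If you need to locate a result from a known job ID, the path pattern above can be reproduced with a small helper. `result_uri` is a hypothetical name, and the way the prefix is joined (surrounding slashes trimmed, a single slash before the file name) is an assumption, since the exact normalization is not documented here.

```python
def result_uri(bucket: str, job_id: str, prefix: str = "") -> str:
    """Return the expected S3 URI for a job's HTML result."""
    key = f"{job_id}.html"
    cleaned = prefix.strip("/")  # assumption: surrounding slashes are not significant
    if cleaned:
        key = f"{cleaned}/{key}"
    return f"s3://{bucket}/{key}"
```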
Webhook delivery
Results are POSTed to your endpoint as JSON:
    {
      "jobId": "job-uuid",
      "url": "https://example.com",
      "statusCode": 200,
      "providerId": "brightdata",
      "latencyMs": 1250,
      "body": "<html>...</html>",
      "deliveredAt": "2026-04-11T00:00:00Z"
    }
HMAC signing: Every delivery includes an X-JustCrawl-Signature header carrying an HMAC-SHA256 signature computed with your secret. Verify it to authenticate the request.
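A receiver might verify the signature along these lines. The sketch assumes the header value is the hex-encoded HMAC-SHA256 of the raw request body with no prefix; the exact encoding and signed content are not specified above, so confirm them against a real delivery. The function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret: str, raw_body: bytes) -> str:
    """Compute a hex HMAC-SHA256 over the raw body (assumed signing scheme)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check the X-JustCrawl-Signature header value against the body."""
    expected = sign(secret, raw_body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), header_value.strip().encode("utf-8")
    )
```

Verify against the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.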
Retry: Failed deliveries are retried up to 3 times with exponential backoff.
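To estimate how long an endpoint outage can last before a delivery is dropped, it helps to sketch the retry schedule. The base delay and growth factor are not documented here, so the 1-second base and factor of 2 below are illustrative assumptions only; `backoff_delays` is a hypothetical helper.

```python
from typing import List


def backoff_delays(
    retries: int = 3, base_seconds: float = 1.0, factor: float = 2.0
) -> List[float]:
    """Delay before each retry, growing geometrically (assumed parameters)."""
    return [base_seconds * factor ** attempt for attempt in range(retries)]


# With the assumed defaults, the three retries wait 1 s, 2 s, and 4 s.
delays = backoff_delays()
total_window = sum(delays)  # 7.0 seconds between first failure and last retry
```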
Managing integrations
All integrations are managed in Settings > Integrations. Each config can be enabled or disabled independently. Status indicators show the last successful poll or delivery time and any errors.