
Jobs

A job is a single scraping request for one URL, executed through a workflow.

queued → running → completed
                 → failed
Status     Meaning
queued     In the SQS queue, waiting for a worker
running    Worker is executing the workflow DAG
completed  Scrape succeeded, result available
failed     All providers in the fallback chain failed
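
Because completed and failed are the only terminal states, a client can poll a job until it reaches one of them. The following is a minimal sketch, not an official client: it assumes the GET /api/v1/jobs/:id response is JSON with a top-level "status" field, and the helper names, polling interval, and poll limit are illustrative.

```python
import json
import time
import urllib.request

API_BASE = "https://dashboard.justcrawl.io/api/v1"
TERMINAL_STATES = {"completed", "failed"}


def is_terminal(status: str) -> bool:
    """Return True once a job can no longer change state."""
    return status in TERMINAL_STATES


def fetch_status(job_id: str, api_key: str) -> str:
    """Call GET /api/v1/jobs/:id and return the job status.

    Assumption: the response body is JSON with a top-level "status" field.
    """
    request = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["status"]


def wait_for_job(job_id, api_key, fetch=fetch_status,
                 interval=2.0, max_polls=60, sleep=time.sleep):
    """Poll until the job is completed or failed, then return that status."""
    for _ in range(max_polls):
        status = fetch(job_id, api_key)
        if is_terminal(status):
            return status
        sleep(interval)  # still queued or running; wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The fetch and sleep parameters exist only so the loop can be exercised without network access; in normal use, wait_for_job("JOB_ID", "YOUR_API_KEY") is enough.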

Via API:

curl -X POST https://dashboard.justcrawl.io/api/v1/jobs \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'

Via schedule: Jobs are created automatically when a schedule runs.

Via external queue: Push {"url": "https://example.com"} to your configured SQS queue or webhook.
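
For the SQS route, the message body is the same JSON object shown above. A minimal sketch, assuming you already have a boto3 SQS client and the URL of your configured queue (the function names here are illustrative and not part of the product):

```python
import json


def build_job_message(url: str) -> str:
    """Serialize the job payload in the documented shape: {"url": ...}."""
    return json.dumps({"url": url})


def push_job(sqs_client, queue_url: str, url: str) -> str:
    """Send one job message to the queue and return the SQS MessageId.

    sqs_client is expected to behave like boto3.client("sqs").
    """
    response = sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=build_job_message(url),
    )
    return response["MessageId"]
```

With boto3 installed, a call might look like push_job(boto3.client("sqs"), queue_url, "https://example.com"), where queue_url is the URL of the queue you configured.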

Each job records which nodes in the DAG were executed, in order. The trace shows:

  • Which provider was tried
  • Whether it succeeded or failed
  • Status code, latency, error type (if any)

View the trace in the job detail page or via GET /api/v1/jobs/:id.
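
Once a trace has been fetched, it can be condensed into a short summary. The sketch below is illustrative only: the response schema is not specified here, so the field names (provider, succeeded, status_code, latency_ms, error_type) are assumptions, and you should adjust them to match what the API actually returns.

```python
def summarize_trace(trace: list[dict]) -> dict:
    """Condense an execution trace into a few headline values.

    Assumption: each trace entry is a dict with the hypothetical keys
    "provider", "succeeded", "latency_ms" and, optionally, "error_type".
    """
    # The first entry that succeeded names the provider that served the job.
    winner = next((step["provider"] for step in trace if step["succeeded"]), None)
    return {
        "attempts": len(trace),
        "succeeded_with": winner,
        "total_latency_ms": sum(step["latency_ms"] for step in trace),
        "error_types": [step["error_type"] for step in trace if step.get("error_type")],
    }
```

A trace in which one provider fails and the next succeeds would yield two attempts, the second provider's name under succeeded_with, and a single entry in error_types.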

curl https://dashboard.justcrawl.io/api/v1/jobs/JOB_ID/result \
-H "Authorization: Bearer YOUR_API_KEY"

Returns the scraped HTML as text/html. Only available for completed jobs.
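
The same request can be made from a script to save the HTML to disk. This is a minimal sketch using only the Python standard library; the helper names are illustrative. How the endpoint responds for a job that is not yet completed is not documented here, so the sketch simply lets urllib raise HTTPError on any non-success status.

```python
import urllib.request

API_BASE = "https://dashboard.justcrawl.io/api/v1"


def build_result_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET /api/v1/jobs/JOB_ID/result request."""
    return urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def save_result(job_id: str, api_key: str, path: str,
                opener=urllib.request.urlopen) -> int:
    """Download the scraped HTML to path and return the number of bytes written."""
    with opener(build_result_request(job_id, api_key)) as response:
        body = response.read()
    # Write raw bytes so the page's original encoding is preserved.
    with open(path, "wb") as handle:
        handle.write(body)
    return len(body)
```

The opener parameter defaults to urllib.request.urlopen and is there so the function can be tested without a live job.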

When a service node in the DAG fails, the worker follows the fail edge to the next provider. This is automatic fallback, not manual retry.

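If every provider fails and the DAG is exhausted, the job is marked as failed. Transient failures are handled by SQS re-delivery: if a worker does not finish and delete the message before the visibility timeout expires, SQS makes the message visible again so another worker can pick it up.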
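
The fallback behaviour can be modelled as a walk along fail edges. The following is a simplified sketch of the idea, not the worker's actual code: the node names, the FAIL_EDGES mapping, and the attempt callback are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical DAG: each service node maps to the node reached via its fail edge.
FAIL_EDGES: dict[str, Optional[str]] = {
    "provider_a": "provider_b",
    "provider_b": "provider_c",
    "provider_c": None,  # last provider in the chain, no further fallback
}


def run_with_fallback(start: str,
                      fail_edges: dict[str, Optional[str]],
                      attempt: Callable[[str], bool]) -> tuple[str, list[str]]:
    """Try providers in order, following the fail edge after each failure.

    Returns the final job status and the list of nodes that were executed.
    """
    tried: list[str] = []
    node: Optional[str] = start
    while node is not None:
        tried.append(node)
        if attempt(node):
            return "completed", tried
        node = fail_edges.get(node)  # follow the fail edge to the next provider
    return "failed", tried  # DAG exhausted: every provider failed
```

For example, if only provider_b succeeds, the call returns "completed" with provider_a and provider_b in the executed list; if every attempt fails, it returns "failed" after trying all three.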

Each job costs 1 credit. Check your credit balance at Settings > Billing or via GET /api/v1/plans/status.

When credits are exhausted, job submission returns HTTP 402.
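
A client can treat the 402 response as a distinct "out of credits" condition rather than a generic failure. This is a minimal sketch under a few assumptions: the OutOfCreditsError class and function names are illustrative, and the success response of POST /api/v1/jobs is assumed to be JSON (its exact fields are not specified here).

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://dashboard.justcrawl.io/api/v1"


class OutOfCreditsError(Exception):
    """Raised when job submission is rejected with HTTP 402."""


def build_submit_request(url: str, api_key: str) -> urllib.request.Request:
    """Build the POST /api/v1/jobs request with the documented JSON body."""
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=json.dumps({"url": url}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(url: str, api_key: str, opener=urllib.request.urlopen) -> dict:
    """Submit a job and return the parsed response (assumed to be JSON)."""
    try:
        with opener(build_submit_request(url, api_key)) as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 402:
            # Credits exhausted: surface a specific error so callers can stop submitting.
            raise OutOfCreditsError("no credits left; check Settings > Billing") from error
        raise  # any other HTTP error is passed through unchanged
```

Callers that submit in bulk can catch OutOfCreditsError and stop the batch instead of retrying each URL.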