Jobs
A job is a single scraping request for one URL, executed through a workflow.
Job lifecycle
queued → running → completed or failed

| Status | Meaning |
|---|---|
| queued | In the SQS queue, waiting for a worker |
| running | Worker is executing the workflow DAG |
| completed | Scrape succeeded, result available |
| failed | All providers in the fallback chain failed |
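Client code often needs to know whether a job can still change state. The sketch below models the four statuses from the table and assumes that completed and failed are both terminal; the names `JobStatus`, `TERMINAL`, and `is_terminal` are illustrative and not part of the API.

```python
from enum import Enum


class JobStatus(str, Enum):
    """The four statuses listed in the lifecycle table."""
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: completed and failed end the lifecycle; queued and running do not.
TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED}


def is_terminal(status: str) -> bool:
    """Return True when a job in this status is not expected to change again."""
    return JobStatus(status) in TERMINAL
```

Passing a string that is not one of the four statuses raises `ValueError`, which surfaces unexpected values early instead of silently treating them as non-terminal.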
Submitting jobs
Via API:

curl -X POST https://dashboard.justcrawl.io/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Via schedule: Jobs are created automatically when a schedule runs.
Via external queue: Push {"url": "https://example.com"} to your configured SQS queue or webhook.
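For callers who prefer a script over curl, the API submission can be sketched with Python's standard library. The endpoint, headers, and body mirror the curl example; the helper names are illustrative, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

API_BASE = "https://dashboard.justcrawl.io/api/v1"


def build_submit_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /jobs request from the curl example."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit_job(target_url: str, api_key: str) -> dict:
    """Send the request and decode the reply (a JSON response is assumed)."""
    with urllib.request.urlopen(build_submit_request(target_url, api_key)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect or log the request without touching the network.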
Execution trace
Each job records which nodes in the DAG were executed, in order. The trace shows:
- Which provider was tried
- Whether it succeeded or failed
- Status code, latency, error type (if any)
View the trace in the job detail page or via GET /api/v1/jobs/:id.
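To make the trace contents concrete, here is a minimal sketch that renders one line per attempted provider. The response schema is not documented on this page, so every field name in `TraceEntry` is hypothetical and should be checked against the actual body returned by GET /api/v1/jobs/:id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceEntry:
    # All field names here are hypothetical; check the real response body.
    provider: str
    succeeded: bool
    status_code: Optional[int]
    latency_ms: float
    error_type: Optional[str] = None


def summarize_trace(entries: list[TraceEntry]) -> list[str]:
    """Render one line per attempted provider, in execution order."""
    lines = []
    for step, e in enumerate(entries, start=1):
        outcome = "ok" if e.succeeded else f"failed ({e.error_type or 'unknown'})"
        lines.append(
            f"{step}. {e.provider}: {outcome}, HTTP {e.status_code}, {e.latency_ms:.0f} ms"
        )
    return lines
```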
Getting results
curl https://dashboard.justcrawl.io/api/v1/jobs/JOB_ID/result \
  -H "Authorization: Bearer YOUR_API_KEY"

Returns the scraped HTML as text/html. Only available for completed jobs.
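Because the result exists only once a job has completed, a client typically polls the job until it reaches a final status. The following is a sketch of such a loop, not an official client: `wait_for_job` and its parameters are illustrative, and `get_status` is a caller-supplied function that is assumed to return the job's current status string (for example by calling GET /api/v1/jobs/:id and reading a status field, whose exact name is not documented here).

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches completed or failed, then return that status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)
```

Request the `/result` endpoint only when the returned status is completed; a failed job has no result to fetch.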
Retry behavior
When a service node in the DAG fails, the worker follows the fail edge to the next provider. This is automatic fallback, not manual retry.
If the entire DAG is exhausted (all providers failed), the job is marked as failed. The SQS visibility timeout handles re-delivery for transient failures.
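The fallback behavior can be illustrated with a simplified model. This sketch treats the workflow as a linear chain of providers, which is an assumption (a real DAG may branch), and it is not the worker's actual implementation; `run_fallback_chain` and the provider callables are hypothetical.

```python
from typing import Callable, Optional


def run_fallback_chain(
    providers: list[tuple[str, Callable[[str], Optional[str]]]],
    url: str,
) -> tuple[str, list[str], Optional[str]]:
    """Try providers in order; move to the next one whenever one fails.

    Returns (job_status, providers_tried, html_or_None).
    """
    tried = []
    for name, fetch in providers:
        tried.append(name)
        try:
            html = fetch(url)
        except Exception:
            html = None  # an exception counts as a failure: take the fail edge
        if html is not None:
            return "completed", tried, html
    # Every provider failed, so the chain is exhausted.
    return "failed", tried, None
```

The list of providers tried corresponds to what the execution trace records: each attempt in order, ending at the first success or at the end of the chain.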
Jobs consume credits: each job costs 1 credit. Check your credit balance at Settings > Billing or via GET /api/v1/plans/status.
When credits are exhausted, job submission returns HTTP 402.
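A client can treat the 402 response as a distinct, non-retryable condition rather than a generic error. The sketch below shows one way to do that; `OutOfCreditsError` and `check_submit_status` are illustrative names, and the handling of other non-2xx codes is an assumption, since this page documents only the 402 case.

```python
class OutOfCreditsError(Exception):
    """Raised when job submission is rejected with HTTP 402."""


def check_submit_status(status_code: int) -> None:
    """Raise a descriptive error for a failed submission; return None on 2xx."""
    if status_code == 402:
        raise OutOfCreditsError("credits exhausted; see Settings > Billing")
    if not 200 <= status_code < 300:
        # Assumption: any other non-2xx code is treated as a generic failure.
        raise RuntimeError(f"job submission failed with HTTP {status_code}")
```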