URLs

URLs are the targets you want to scrape. justcrawl manages them with tagging, domain auto-detection, and batch operations.

Single URL: in the dashboard, go to Dashboard > URLs > Add URL, or use the API:

curl -X POST https://dashboard.justcrawl.io/api/v1/urls \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/product/123"}'

Batch: submit a JSON array or upload a CSV file (up to 10,000 URLs per batch).

curl -X POST https://dashboard.justcrawl.io/api/v1/urls/batch \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"urls": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]}'
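
For lists longer than the 10,000-URL batch limit, the request bodies can be built in chunks on the client side. The following is a minimal sketch, not part of justcrawl itself: the helper name batch_payloads is hypothetical, and it only produces JSON bodies in the shape shown above, leaving the HTTP call to the tool of your choice.

```python
import json
from typing import Iterator

MAX_BATCH_SIZE = 10_000  # per-batch limit stated in the text


def batch_payloads(urls: list[str], size: int = MAX_BATCH_SIZE) -> Iterator[str]:
    """Yield JSON request bodies, each holding at most `size` URLs.

    Hypothetical helper: it mirrors the {"urls": [{"url": ...}]} body
    used in the batch example and does not send anything.
    """
    for start in range(0, len(urls), size):
        chunk = urls[start:start + size]
        yield json.dumps({"urls": [{"url": u} for u in chunk]})


# 25,000 URLs are split into bodies of 10,000, 10,000, and 5,000 entries.
urls = [f"https://example.com/item/{i}" for i in range(25_000)]
bodies = list(batch_payloads(urls))
```

Each body can then be sent as the -d payload of a separate request to the batch endpoint.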

CSV upload: Upload a CSV file with a url column. Optional columns: priority, tags.
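
A file in this layout could be generated as sketched below. Only the column names url, priority, and tags come from the text. Leaving the optional cells empty is an assumption, and each row carries at most one tag because the text does not say how multiple tags are encoded in a single cell.

```python
import csv
import io

# Example rows; the second one leaves both optional columns empty
# (assumed to be acceptable, not confirmed by the text).
rows = [
    {"url": "https://example.com/product/123", "priority": 50, "tags": "electronics"},
    {"url": "https://example.com/product/456", "priority": "", "tags": ""},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["url", "priority", "tags"])
writer.writeheader()      # header row: url,priority,tags
writer.writerows(rows)

csv_text = buffer.getvalue()  # write this string to a .csv file to upload
```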

Tags organize URLs for filtering. Two types:

  • User tags: You assign these (e.g., electronics, high-priority)
  • Domain tags: Auto-generated from the URL domain (e.g., domain:amazon.com)

Domain tags are used for workflow routing. When you add a URL for amazon.com, it automatically gets the tag domain:amazon.com, which matches the workflow route domain:amazon.com.
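
The routing idea can be illustrated with a small sketch. This is not justcrawl's actual implementation: the function domain_tag, the routes mapping, and the workflow name are hypothetical, and stripping a leading "www." is an assumption, since the text does not say how subdomains are handled.

```python
from typing import Optional
from urllib.parse import urlsplit


def domain_tag(url: str) -> str:
    """Derive a domain:<host> tag from a URL (illustrative only)."""
    host = urlsplit(url).hostname or ""  # hostname is already lowercased
    if host.startswith("www."):          # assumption: "www." is dropped
        host = host[len("www."):]
    return f"domain:{host}"


# Hypothetical route table: tag -> workflow name.
routes = {"domain:amazon.com": "amazon-product-workflow"}


def route_for(url: str) -> Optional[str]:
    """Return the workflow whose route matches the URL's domain tag."""
    return routes.get(domain_tag(url))
```

Under these assumptions, a URL on amazon.com resolves to the workflow registered for domain:amazon.com, and a URL on an unrouted domain resolves to nothing.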

Each URL has a priority from 0 to 100 (default 0). Higher-priority URLs are processed first within a schedule run.

URLs are unique per organization. Adding a duplicate URL is silently ignored (no error, no duplicate created).
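
The deduplication behaviour can be modelled as a set of URLs kept per organization. The sketch below is a toy in-memory model, not the service's code: the class UrlStore is hypothetical, and it assumes duplicates are detected by exact string match, which the text does not specify.

```python
from collections import defaultdict


class UrlStore:
    """Toy model of per-organization URL uniqueness."""

    def __init__(self) -> None:
        self._urls: dict[str, set[str]] = defaultdict(set)

    def add(self, org_id: str, url: str) -> bool:
        """Add a URL; return False (without raising) if it already exists."""
        if url in self._urls[org_id]:
            return False  # duplicate: ignored, no error, nothing created
        self._urls[org_id].add(url)
        return True

    def count(self, org_id: str) -> int:
        return len(self._urls[org_id])


store = UrlStore()
first = store.add("org-1", "https://example.com/a")      # stored
second = store.add("org-1", "https://example.com/a")     # ignored
other_org = store.add("org-2", "https://example.com/a")  # stored: different org
```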

Schedules filter URLs by tags. A schedule with tagFilters: ["electronics"] only processes URLs tagged electronics. A schedule with no tag filters processes all URLs.
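
How a schedule run might pick its URLs can be sketched as a filter followed by a priority sort. The names Url and select_for_run are hypothetical, and the sketch assumes a URL matches when it carries any one of the listed tags; the text does not say whether multiple tag filters combine as "any" or "all".

```python
from dataclasses import dataclass, field


@dataclass
class Url:
    url: str
    tags: set[str] = field(default_factory=set)
    priority: int = 0  # 0-100, default 0


def select_for_run(urls: list[Url], tag_filters: list[str]) -> list[Url]:
    """Return the URLs a schedule would process, highest priority first."""
    if tag_filters:
        wanted = set(tag_filters)
        # Assumption: a URL matches if it has at least one wanted tag.
        matched = [u for u in urls if u.tags & wanted]
    else:
        matched = list(urls)  # no tag filters: every URL is processed
    return sorted(matched, key=lambda u: u.priority, reverse=True)


urls = [
    Url("https://example.com/tv", {"electronics"}, priority=10),
    Url("https://example.com/sofa", {"furniture"}, priority=90),
    Url("https://example.com/phone", {"electronics"}, priority=80),
]
electronics_run = select_for_run(urls, ["electronics"])
full_run = select_for_run(urls, [])
```

Here electronics_run contains the phone and TV pages in that order, while full_run contains all three URLs with the sofa page first.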