Re-extract historical results with the current schema

POST

/api/v1/extraction/backfill

const url = 'https://dashboard.justcrawl.io/api/v1/extraction/backfill';
const options = {
  method: 'POST',
  headers: {Authorization: 'Bearer <token>', 'Content-Type': 'application/json'},
  body: '{"domain":"example","pageType":"example","resultIds":["2489E9AD-2EE2-8E00-8EC9-32D5F69181C0"],"limit":20}'
};

try {
  const response = await fetch(url, options);
  const data = await response.json();
  console.log(data);
} catch (error) {
  console.error(error);
}

curl --request POST \
  --url https://dashboard.justcrawl.io/api/v1/extraction/backfill \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{ "domain": "example", "pageType": "example", "resultIds": [ "2489E9AD-2EE2-8E00-8EC9-32D5F69181C0" ], "limit": 20 }'

Synchronous backfill capped at 20 results per call. Use resultIds to target specific rows, or omit to process the most recent results for the domain + page type.

Authorizations

bearerAuth

Request Body^required

application/json

object

domain

required

string

pageType

required

string

resultIds

Array<string>

limit

integer

default: 20 <= 50

Responses

200

Backfill summary

application/json

object

processed

integer

skipped

integer

failed

integer

results

Array<object>

object

string

status

string

newFields

integer

Example ^generated

{
  "processed": 1,
  "skipped": 1,
  "failed": 1,
  "results": [
    {
      "id": "example",
      "status": "example",
      "newFields": 1
    }
  ]
}

400

Validation error or batch too large

404

No schema for domain + pageType

Re-extract historical results with the current schema

Authorizations

Request Body required

Responses

200

Example generated

400

404

Request Body^required

Example ^generated