Skip to content

Get latest schema for a domain + page type

GET
/api/v1/extraction/schemas/{domain}/{pageType}
curl --request GET \
--url https://dashboard.justcrawl.io/api/v1/extraction/schemas/example/product \
--header 'Authorization: Bearer <token>'
domain
required
string
pageType
required
string
Allowed values: product product_list serp article job_posting

Latest extraction schema

Media type application/json
object
id
string
domain
string
pageType
string
Allowed values: product product_list serp article job_posting
version
integer
attributes
Array<object>
object
name
required
string
xpaths
required
Array<string>
type
required
string
Allowed values: text number url boolean text[] url[]
description
string
nullable
antiPatterns
Array<string>
nullable
semanticType
string
nullable
Allowed values: brand_name review_count product_id product_image star_rating
validation
object
minLength
integer
maxLength
integer
min
number
max
number
pattern
string
confidence
number
discoveredBy
string
Allowed values: llm llm_multi_sample manual reverse_engineered
sampleCount
integer
lastValidatedAt
string format: date-time
expiresAt
string format: date-time
createdAt
string format: date-time
Example
{
"pageType": "product",
"attributes": [
{
"type": "text",
"semanticType": "brand_name"
}
],
"discoveredBy": "llm"
}

Resource not found

Media type application/json
object
error
string
Example generated
{
"error": "example"
}