List extraction schemas

GET /api/v1/extraction/schemas
curl --request GET \
--url 'https://dashboard.justcrawl.io/api/v1/extraction/schemas?pageType=product&limit=50' \
--header 'Authorization: Bearer <token>'

Returns a read-only list of the XPath extraction schemas discovered by the platform. All query parameters are optional.

domain
string
pageType
string
Allowed values: product, product_list, serp, article, job_posting
limit
integer
default: 50, maximum: 200
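
As a rough illustration of the query parameters above, the following Python sketch builds the same request as the curl example without sending it. The helper name `build_request` is hypothetical and not part of any official client; the client-side checks simply mirror the allowed `pageType` values and the documented `limit` maximum, and the server may validate differently.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://dashboard.justcrawl.io/api/v1/extraction/schemas"
PAGE_TYPES = {"product", "product_list", "serp", "article", "job_posting"}
MAX_LIMIT = 200  # documented upper bound for `limit`


def build_request(token, domain=None, page_type=None, limit=50):
    """Build (but do not send) a GET request for the schema list.

    Hypothetical helper: the parameter names follow the reference above,
    everything else is an assumption made for illustration.
    """
    if page_type is not None and page_type not in PAGE_TYPES:
        raise ValueError(f"unsupported pageType: {page_type!r}")
    if limit > MAX_LIMIT:
        raise ValueError(f"limit must be <= {MAX_LIMIT}")

    # Only include the filters the caller actually set.
    params = {}
    if domain is not None:
        params["domain"] = domain
    if page_type is not None:
        params["pageType"] = page_type
    params["limit"] = limit

    url = f"{BASE_URL}?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_request("<token>", page_type="product", limit=50)
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch runs without network access or a real token.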

List of extraction schemas

Media type application/json
object
items
Array<object>
object
id
string
domain
string
pageType
string
Allowed values: product, product_list, serp, article, job_posting
version
integer
attributes
Array<object>
object
name
required
string
xpaths
required
Array<string>
type
required
string
Allowed values: text, number, url, boolean, text[], url[]
description
string
nullable
antiPatterns
Array<string>
nullable
semanticType
string
nullable
Allowed values: brand_name, review_count, product_id, product_image, star_rating
validation
object
minLength
integer
maxLength
integer
min
number
max
number
pattern
string
confidence
number
discoveredBy
string
Allowed values: llm, llm_multi_sample, manual, reverse_engineered
sampleCount
integer
lastValidatedAt
string, format: date-time
expiresAt
string, format: date-time
createdAt
string, format: date-time
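
The reference lists the keys of the `validation` object (`minLength`, `maxLength`, `min`, `max`, `pattern`) but not how they are applied. The Python sketch below shows one plausible client-side reading: length bounds and `pattern` apply to strings, `min` and `max` apply to numbers, and `pattern` is treated as a Python-compatible regular expression matched anywhere in the value. All of these semantics, and the function name `check_value`, are assumptions rather than documented platform behavior.

```python
import re


def check_value(value, validation):
    """Return the names of the validation rules that `value` violates.

    Assumed semantics (not confirmed by the reference):
    - minLength / maxLength bound the length of string values
    - min / max bound numeric values
    - pattern is a regular expression searched within string values
    """
    failures = []
    if isinstance(value, str):
        min_len = validation.get("minLength")
        max_len = validation.get("maxLength")
        pattern = validation.get("pattern")
        if min_len is not None and len(value) < min_len:
            failures.append("minLength")
        if max_len is not None and len(value) > max_len:
            failures.append("maxLength")
        if pattern is not None and re.search(pattern, value) is None:
            failures.append("pattern")
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        lower = validation.get("min")
        upper = validation.get("max")
        if lower is not None and value < lower:
            failures.append("min")
        if upper is not None and value > upper:
            failures.append("max")
    return failures


print(check_value("Acme", {"minLength": 2, "maxLength": 10}))
print(check_value(6.2, {"min": 0, "max": 5}))
```

An empty list means the value passed every rule that was present; missing or null rules are skipped.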
Example
{
  "items": [
    {
      "pageType": "product",
      "attributes": [
        {
          "type": "text",
          "semanticType": "brand_name"
        }
      ],
      "discoveredBy": "llm"
    }
  ]
}
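
To show how a client might consume this response, here is a minimal Python sketch that walks `items` and collects the attributes carrying a given `semanticType`. The function name `attributes_by_semantic_type` is hypothetical, and the sketch assumes nothing beyond the field names listed in the schema above; fields that are absent from a response are read defensively.

```python
import json


def attributes_by_semantic_type(payload, semantic_type):
    """Collect (domain, attribute name, xpaths) for matching attributes.

    `payload` is the decoded JSON body of the schema list response.
    Missing fields are tolerated because the example response is partial.
    """
    matches = []
    for schema in payload.get("items", []):
        for attr in schema.get("attributes") or []:
            if attr.get("semanticType") == semantic_type:
                matches.append(
                    (schema.get("domain"), attr.get("name"), attr.get("xpaths", []))
                )
    return matches


# The abbreviated example response from this page.
example = json.loads("""
{
  "items": [
    {
      "pageType": "product",
      "attributes": [
        {"type": "text", "semanticType": "brand_name"}
      ],
      "discoveredBy": "llm"
    }
  ]
}
""")

print(attributes_by_semantic_type(example, "brand_name"))
```

Because the example response omits `domain`, `name`, and `xpaths`, the single match comes back with `None` for the first two and an empty list for the third; a full response should carry the required `name` and `xpaths` fields.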