Test XPath attributes against stored HTML

POST /api/v1/extraction/test-xpath
curl --request POST \
--url https://dashboard.justcrawl.io/api/v1/extraction/test-xpath \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '{ "domain": "example", "pageType": "example", "attributes": [ { "name": "example", "xpaths": [ "example" ], "type": "text", "description": "example", "antiPatterns": [ "example" ], "semanticType": "brand_name", "validation": { "minLength": 1, "maxLength": 1, "min": 1, "max": 1, "pattern": "example" } } ], "jobId": "2489E9AD-2EE2-8E00-8EC9-32D5F69181C0" }'

Re-runs extraction using the supplied attributes against either the specified job’s cached HTML, or the most recent job for the domain.
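
The same call can be sketched in Python using only the standard library. The URL and headers are taken from the curl example above; the function name `build_test_xpath_request` and its parameters are illustrative assumptions, not part of any published client. The sketch only builds the request, so it can be inspected before anything is sent.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example above.
API_URL = "https://dashboard.justcrawl.io/api/v1/extraction/test-xpath"


def build_test_xpath_request(token, domain, page_type, attributes, job_id=None):
    """Build (but do not send) a POST request for the test-xpath endpoint.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    payload = {"domain": domain, "pageType": page_type, "attributes": attributes}
    # jobId is optional; leave it out so the server picks the most recent job.
    if job_id is not None:
        payload["jobId"] = job_id
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req)` and decode the body with `json.load`. Keeping construction separate from sending makes the payload easy to log or unit-test.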

Request body

Media type: application/json

object
  domain (string, required)
  pageType (string, required)
  attributes (Array<object>, required, >= 1 items)
    name (string, required)
    xpaths (Array<string>, required)
    type (string, required)
      Allowed values: text, number, url, boolean, text[], url[]
    description (string, nullable)
    antiPatterns (Array<string>, nullable)
    semanticType (string, nullable)
      Allowed values: brand_name, review_count, product_id, product_image, star_rating
    validation (object)
      minLength (integer)
      maxLength (integer)
      min (number)
      max (number)
      pattern (string)
  jobId (string, format: uuid)
    Optional. If omitted, uses the most recent job with stored HTML for the domain.
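
A request body can be checked against the schema above before it is sent. The following is a minimal client-side sketch, not the server's actual validation: `check_payload` is a hypothetical helper, it covers only the required fields, the allowed values, and the jobId format listed above, and it does not inspect the nested `validation` object.

```python
import uuid

# Allowed values copied from the request schema above.
ALLOWED_TYPES = {"text", "number", "url", "boolean", "text[]", "url[]"}
ALLOWED_SEMANTIC_TYPES = {
    "brand_name", "review_count", "product_id", "product_image", "star_rating",
}


def check_payload(payload):
    """Return a list of schema problems found in a request body (empty if none)."""
    problems = []
    for field in ("domain", "pageType"):
        if not isinstance(payload.get(field), str):
            problems.append(f"{field}: required string")

    attributes = payload.get("attributes")
    if not isinstance(attributes, list) or not attributes:
        problems.append("attributes: required array with at least 1 item")
        attributes = []

    for i, attr in enumerate(attributes):
        where = f"attributes[{i}]"
        if not isinstance(attr, dict):
            problems.append(f"{where}: must be an object")
            continue
        if not isinstance(attr.get("name"), str):
            problems.append(f"{where}.name: required string")
        xpaths = attr.get("xpaths")
        if not isinstance(xpaths, list) or not all(isinstance(x, str) for x in xpaths):
            problems.append(f"{where}.xpaths: required array of strings")
        attr_type = attr.get("type")
        if not isinstance(attr_type, str) or attr_type not in ALLOWED_TYPES:
            problems.append(f"{where}.type: must be one of {sorted(ALLOWED_TYPES)}")
        semantic = attr.get("semanticType")  # nullable, so None is accepted
        if semantic is not None and (
            not isinstance(semantic, str) or semantic not in ALLOWED_SEMANTIC_TYPES
        ):
            problems.append(f"{where}.semanticType: unknown value {semantic!r}")

    job_id = payload.get("jobId")
    if job_id is not None:
        try:
            uuid.UUID(job_id)  # lenient parse; raises on malformed input
        except (ValueError, TypeError, AttributeError):
            problems.append("jobId: must be a UUID string")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass. Note that `uuid.UUID` also accepts some variant spellings (for example a `urn:uuid:` prefix), so this check may be looser than the server's.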

Extracted values and quality score

Media type: application/json

object
  values (object)
    <key> (any, additional properties)
  qualityScore (object)
    completeness (number)
    validation (number)
    composite (number)
  jobId (string)
  url (string)
Example (generated)
{
  "values": {},
  "qualityScore": {
    "completeness": 1,
    "validation": 1,
    "composite": 1
  },
  "jobId": "example",
  "url": "example"
}
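
A response shaped like the example above could be loaded into typed objects as follows. This is a sketch under the assumption that all four top-level fields are always present; the class and function names (`QualityScore`, `XPathTestResult`, `parse_result`) are illustrative and not part of the API.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class QualityScore:
    completeness: float
    validation: float
    composite: float


@dataclass
class XPathTestResult:
    values: Dict[str, Any]  # extracted values, keyed by attribute name
    quality: QualityScore
    job_id: str
    url: str


def parse_result(body: str) -> XPathTestResult:
    """Parse the JSON response body into an XPathTestResult."""
    data = json.loads(body)
    score = data["qualityScore"]
    return XPathTestResult(
        values=data["values"],
        quality=QualityScore(
            completeness=score["completeness"],
            validation=score["validation"],
            composite=score["composite"],
        ),
        job_id=data["jobId"],
        url=data["url"],
    )
```

A missing field raises `KeyError`, which surfaces a schema mismatch immediately instead of passing partial data along.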

Validation error

No HTML available for domain or job

HTML blob exceeds 5MB limit