POST

api

platform

scrapers

web

crawl

Crawl website

curl --request POST \
  --url https://developer.thehog.ai/api/v1/platform/scrapers/web/crawl \
  --header 'Content-Type: application/json' \
  --header 'X-Access-Key: <api-key>' \
  --header 'X-Secret-Key: <api-key>' \
  --data '
{
  "url": "https://example.com",
  "limit": 5,
  "maxPages": 5,
  "maxDepth": 1,
  "sameDomainOnly": true,
  "instructions": "<string>"
}
'

{
  "data": {
    "id": "<string>",
    "operationId": "<string>",
    "pollUrl": "<string>"
  },
  "meta": {
    "requestId": "<string>"
  }
}

POST /api/v1/platform/scrapers/web/crawl

Crawl a website and return discovered page content.

Crawl a website from a starting URL with an optional page limit.

Example

curl -X POST https://developer.thehog.ai/api/v1/platform/scrapers/web/crawl \
  -H "X-Access-Key: ak_xxxxxxxxxxxxxxxx" \
  -H "X-Secret-Key: sk_xxxxxxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "limit": 5}'

Authorizations

X-Access-Key

string

header

required

The public API key from the Credentials page.

X-Secret-Key

string

header

required

The API secret shown when the credential is created.

Headers

Idempotency-Key

string

Optional. Reusing the same key for the same organization returns the existing queued crawl operation.

Body

application/json

url

string

required

Domain or URL to crawl

Example:

"https://example.com"

limit

number

default:5

Required range: 1 <= x <= 50

maxPages

number

default:5

Maximum number of pages to crawl. Overrides limit when set.

Required range: 1 <= x <= 50

maxDepth

number

default:1

Maximum link depth from the starting URL.

Required range: 0 <= x <= 5

sameDomainOnly

boolean

default:true

Restrict discovered links to the starting hostname.

instructions

string

Instructions for content extraction

Response

Crawl accepted. Poll the returned operation URL for results.

data

object

required

Show child attributes

meta

object

required

Show child attributes

Search web

Scrape web page

⌘I

Crawl website

curl --request POST \
  --url https://developer.thehog.ai/api/v1/platform/scrapers/web/crawl \
  --header 'Content-Type: application/json' \
  --header 'X-Access-Key: <api-key>' \
  --header 'X-Secret-Key: <api-key>' \
  --data '
{
  "url": "https://example.com",
  "limit": 5,
  "maxPages": 5,
  "maxDepth": 1,
  "sameDomainOnly": true,
  "instructions": "<string>"
}
'

{
  "data": {
    "id": "<string>",
    "operationId": "<string>",
    "pollUrl": "<string>"
  },
  "meta": {
    "requestId": "<string>"
  }
}

​POST /api/v1/platform/scrapers/web/crawl

​Example

Authorizations

Headers

Body

Response

POST /api/v1/platform/scrapers/web/crawl

Example