API Documentation

Welcome to the Radia Web Scraping API documentation. This API provides powerful endpoints for retrieving and parsing content from any website.

Getting Started

Overview of Radia's Comprehensive Web Scraping Platform

Platform Overview

Radia provides a comprehensive suite of tools to extract both clean Markdown content and structured JSON data from websites. Our platform is designed to handle everything from simple single-page scraping to complex multi-site data extraction workflows.

Key Features

Dynamic Scraping

Use the /scrape and /extract endpoints for on-the-fly scraping with custom parameters. Perfect for sites with dynamic content or one-off scraping needs.

Predefined Tools

Create and manage reusable scrapers through our Dashboard >> Scrapers. Each scraper is defined by three key components:

A target URL or URL pattern
A prompt that guides the extraction process
A structured output schema using OpenAI's format

Our visual schema editor helps you create and validate your output schemas without writing JSON manually. Try it out in our Playground.

Automated Scheduling

Schedule recurring scraping jobs using cron expressions directly from the Dashboard >> Tasks. Access job results through our API at:

/tasks/{task_id}/runs - View all runs for a task
/tasks/{task_id}/runs/latest - Get the most recent results

Authentication

Important: All API endpoints require authentication via an API key. Generate your key in the Dashboard >> API Keys and include it in every request header as:

Authorization: Bearer YOUR_API_KEY

Authorization

All API endpoints require authentication

Include an Authorization header with your API key in all requests.

You can obtain your API key from the Dashboard >> API Keys page.

Keep your API key secure and do not share it publicly. Treat it like a password.

Example

javascript

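const apiKey = 'YOUR_API_KEY';

// Percent-encode the target URL so it travels as a single query parameter.
const targetUrl = encodeURIComponent('https://www.example.com');

const response = await fetch(
  `https://api.radia.io/api/v2/scrape?url=${targetUrl}&format=json`,
  {
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    }
  }
);

// With format=json the response body is a JSON object.
const data = await response.json();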

Scrape API

Retrieve cleaned HTML content from a specified URL

The scrape endpoint allows you to retrieve cleaned HTML content from a specified URL. This is useful when you need the raw content for further processing.

Scrape URL Content

Retrieve cleaned content from a URL. The response is an HTML string by default; set format to "json" to receive a JSON object containing the Markdown.

GET /v2/scrape

Parameters

Name	Type	Required	Description
url	string	Yes	The URL of the website to scrape.
format	string	No	Output format. Use "json" to receive a JSON object containing the Markdown. Default: `html`
scroll	boolean	No	Try scrolling to reveal content. If false, the page is loaded as is, which gives a quicker response. Default: `true`

Response

Cleaned content as an HTML string (default) or a JSON object containing the Markdown. Type: string | object

Example

GET /v2/scrape?url=https://example.com&format=json&scroll=true
Authorization: Bearer YOUR_API_KEY
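
The request above can also be assembled in code. The following is a minimal sketch, not an official client: the helper names `buildScrapeUrl` and `scrapeAsJson` are hypothetical, and the base URL is copied from the authentication example earlier in this document. It relies on the standard `URL` and `fetch` globals (Node 18+ or a browser).

```javascript
// Base URL copied from the authentication example; adjust if yours differs.
const API_ROOT = 'https://api.radia.io/api';

// Build the GET /v2/scrape URL. URLSearchParams percent-encodes the target
// URL so its own "?" and "&" characters cannot leak into the query string.
function buildScrapeUrl(targetUrl, { format, scroll } = {}) {
  const endpoint = new URL(`${API_ROOT}/v2/scrape`);
  endpoint.searchParams.set('url', targetUrl);
  if (format !== undefined) endpoint.searchParams.set('format', format);
  if (scroll !== undefined) endpoint.searchParams.set('scroll', String(scroll));
  return endpoint.toString();
}

// Hypothetical wrapper: the shape of the JSON body is not documented in this
// section, so the parsed object is returned unchanged.
async function scrapeAsJson(targetUrl, apiKey) {
  const response = await fetch(buildScrapeUrl(targetUrl, { format: 'json' }), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Scrape failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Optional parameters that are left undefined are omitted from the query string, so the documented defaults apply.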

Extract API

Scrape and parse content from a specified URL into structured data

The extract endpoint allows you to scrape and parse content from a specified URL based on a given prompt and response format. This is ideal when you need structured data rather than raw HTML.

Extract Structured Data

Scrape content and extract structured data based on a prompt and JSON schema.

POST /v2/extract

Parameters

Name	Type	Required	Description
url	string	Yes	The URL to scrape and parse.
prompt	string	Yes	Prompt to guide data extraction (e.g., 'Extract product details').
response_format	object	Yes	JSON object describing the desired output structure (OpenAI function calling format).
scroll	boolean	No	Try scrolling to reveal content. If false, the page is loaded as is, which gives a quicker response. Default: `true`
include_markdown	boolean	No	Include the cleaned Markdown content in the response. Default: `false`

Response

Extracted structured data, token count, and optionally the Markdown content. Type: object

Example

POST /v2/extract
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "url": "https://example-product-page.com",
  "prompt": "Extract the product name, price, and description.",
  "response_format": {
    "type": "object",
    "properties": {
      "product_name": { "type": "string", "description": "Name of the product" },
      "price": { "type": "number", "description": "Price of the product" },
      "description": { "type": "string", "description": "Product description" }
    },
    "required": ["product_name", "price"]
  },
  "scroll": false,
  "include_markdown": true
}
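
As a rough illustration of sending this request from JavaScript, the sketch below builds the POST body from the documented parameters and checks the three required ones before anything is sent. The function names `buildExtractRequest` and `extract` are hypothetical, and the base URL is assumed from the authentication example.

```javascript
// Base URL copied from the authentication example; adjust if yours differs.
const API_ROOT = 'https://api.radia.io/api';

// Assemble the fetch arguments for POST /v2/extract.
// url, prompt, and responseFormat are required; the rest are optional.
function buildExtractRequest(apiKey, { url, prompt, responseFormat, scroll, includeMarkdown }) {
  if (!url || !prompt || !responseFormat) {
    throw new Error('url, prompt, and responseFormat are required');
  }
  const body = { url, prompt, response_format: responseFormat };
  if (scroll !== undefined) body.scroll = scroll;
  if (includeMarkdown !== undefined) body.include_markdown = includeMarkdown;
  return {
    endpoint: `${API_ROOT}/v2/extract`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical wrapper: returns the parsed response object unchanged.
async function extract(apiKey, params) {
  const { endpoint, init } = buildExtractRequest(apiKey, params);
  const response = await fetch(endpoint, init);
  if (!response.ok) {
    throw new Error(`Extract failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect or log before it is sent.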

Scrapers API

Manage scraper configurations

The scrapers endpoints allow you to programmatically create and manage scraper configurations that can be used for both one-off and scheduled scraping jobs.

Available Endpoints

GET /v2/scrapers

Get all scraper configurations for the authenticated user.

POST /v2/scrapers

Create a new scraper configuration.

GET /v2/scrapers/{scraper_id}

Get details of a specific scraper configuration.

POST /v2/scrapers/{scraper_id}/run

Manually trigger a run of a specific scraper configuration.

Get Scrapers

Get all scraper configurations for the authenticated user.

GET /v2/scrapers

Response

A list of scraper configuration objects. Type: array

Example

GET /v2/scrapers
Authorization: Bearer YOUR_API_KEY

Create Scraper

Create a new scraper configuration.

POST /v2/scrapers

Parameters

Name	Type	Required	Description
scraper_name	string	Yes	A name for this scraper configuration.
schema_id	string	Yes	ID of the schema defining the desired output structure.
scraped_url	string	Yes	The target URL or URL pattern for the scraper.
prompt	string	Yes	The prompt guiding the data extraction process.
should_click_elements	boolean	No	Whether the scraper should click elements. Default: `false`
headless	boolean	No	Whether the scraper runs headlessly. Default: `false`

Response

The newly created scraper configuration object. Type: object

Example

POST /v2/scrapers
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "scraper_name": "News Headline Scraper",
  "schema_id": "sch-def-456",
  "scraped_url": "https://news.example.com",
  "prompt": "Extract the main headlines from the homepage.",
  "should_click_elements": false,
  "headless": true
}

Get Scraper by ID

Get details of a specific scraper configuration.

GET /v2/scrapers/{scraper_id}

Parameters

Name	Type	Required	Description
scraper_id	string	Yes	The ID of the scraper to retrieve (path parameter).

Response

The requested scraper configuration object. Type: object

Example

GET /v2/scrapers/s1c2r3p4-e5f6-7890-1234-abcdef123456
Authorization: Bearer YOUR_API_KEY

Run Scraper

Manually trigger a run of a specific scraper configuration.

POST /v2/scrapers/{scraper_id}/run

Parameters

Name	Type	Required	Description
scraper_id	string	Yes	The ID of the scraper to run (path parameter).
headless	boolean	No	Overrides scraper's default headless setting for this run (query parameter).

Response

The result of the scraper run. Type: object

Example

POST /v2/scrapers/s1c2r3p4-e5f6-7890-1234-abcdef123456/run?headless=true
Authorization: Bearer YOUR_API_KEY
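
Because `scraper_id` is a path parameter and `headless` is a query parameter, the two are placed differently in the request. A small sketch of that split follows; `buildRunScraperRequest` is a hypothetical helper name and the base URL is assumed from the authentication example.

```javascript
// Base URL copied from the authentication example; adjust if yours differs.
const API_ROOT = 'https://api.radia.io/api';

// Build the request for POST /v2/scrapers/{scraper_id}/run.
// Leaving `headless` undefined omits the query parameter, so the scraper's
// stored headless setting is used for the run.
function buildRunScraperRequest(scraperId, apiKey, headless) {
  const endpoint = new URL(
    `${API_ROOT}/v2/scrapers/${encodeURIComponent(scraperId)}/run`
  );
  if (headless !== undefined) {
    endpoint.searchParams.set('headless', String(headless));
  }
  return {
    url: endpoint.toString(),
    init: { method: 'POST', headers: { Authorization: `Bearer ${apiKey}` } },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`.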

Schemas API

Manage scraping output schemas

The schemas endpoints allow you to create and manage scraping schemas that define how content should be extracted and structured.

Available Endpoints

GET /v2/schemas

Get all extraction schemas for the authenticated user.

POST /v2/schemas

Create a new extraction schema.

GET /v2/schemas/{schema_id}

Get details of a specific extraction schema.

Get Schemas

Get all extraction schemas for the authenticated user.

GET /v2/schemas

Response

A list of schema objects. Type: array

Example

GET /v2/schemas
Authorization: Bearer YOUR_API_KEY

Create Schema

Create a new extraction schema.

POST /v2/schemas

Parameters

Name	Type	Required	Description
schema_name	string	Yes	A descriptive name for the schema.
schema_json	object	Yes	JSON object defining the extraction structure (OpenAI function call format).

Response

The newly created schema object. Type: object

Example

POST /v2/schemas
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "schema_name": "Article Details",
  "schema_json": {
    "type": "object",
    "properties": {
      "title": { "type": "string", "description": "Article title" },
      "author": { "type": "string", "description": "Author name" },
      "publish_date": { "type": "string", "description": "Publication date" }
    },
    "required": ["title", "author"]
  }
}
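
The `schema_json` object in this example follows a regular pattern: one entry per field under `properties`, plus a `required` list. One way to generate it from a compact field list is sketched below. The helper names and the `{ type, description, required }` input shape are assumptions made for illustration and are not part of the API.

```javascript
// Turn a compact field list into the schema_json structure used in the
// example request. `fields` maps each property name to
// { type, description, required }.
function buildSchemaJson(fields) {
  const properties = {};
  const required = [];
  for (const [name, spec] of Object.entries(fields)) {
    properties[name] = { type: spec.type, description: spec.description };
    if (spec.required) required.push(name);
  }
  return { type: 'object', properties, required };
}

// Body for POST /v2/schemas.
function buildCreateSchemaBody(schemaName, fields) {
  return { schema_name: schemaName, schema_json: buildSchemaJson(fields) };
}

// Reproduces the "Article Details" body from the example request.
const articleSchema = buildCreateSchemaBody('Article Details', {
  title: { type: 'string', description: 'Article title', required: true },
  author: { type: 'string', description: 'Author name', required: true },
  publish_date: { type: 'string', description: 'Publication date' },
});
```

Serializing `articleSchema` with `JSON.stringify` gives a request body with the same fields as the example.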

Get Schema by ID

Get details of a specific extraction schema.

GET /v2/schemas/{schema_id}

Parameters

Name	Type	Required	Description
schema_id	string	Yes	The ID of the schema to retrieve (path parameter).

Response

The requested schema object. Type: object

Example

GET /v2/schemas/sch-abc-123
Authorization: Bearer YOUR_API_KEY

Tasks API

Manage scheduled scraping jobs

The tasks endpoints allow you to create, manage, and monitor scheduled scraping jobs. This is perfect for recurring data collection needs.

Available Endpoints

GET /v2/tasks

Get all scheduled tasks for the authenticated user.

POST /v2/tasks

Create a new scheduled task to run a scraper.

GET /v2/tasks/{task_id}

Get details of a specific task by its ID.

DELETE /v2/tasks/{task_id}

Delete a specific scheduled task.

POST /v2/tasks/{task_id}/run

Manually trigger a run of the specified task.

GET /v2/tasks/{task_id}/runs

Get a list of all historical runs for a specific task.

GET /v2/tasks/{task_id}/runs/latest

Get the result and details of the most recent run for a specific task.

Get Tasks

Get all scheduled tasks for the authenticated user.

GET /v2/tasks

Response

A list of task objects. Type: array

Example

GET /v2/tasks
Authorization: Bearer YOUR_API_KEY

Create Task

Create a new scheduled task to run a scraper.

POST /v2/tasks

Parameters

Name	Type	Required	Description
scraper_id	string	Yes	ID of the scraper to run.
task_name	string	Yes	A descriptive name for the task.
cron_minute	string	Yes	Cron expression: minute (0-59 or *).
cron_hour	string	Yes	Cron expression: hour (0-23 or *).
cron_day_of_month	string	Yes	Cron expression: day of month (1-31 or *).
cron_month	string	Yes	Cron expression: month (1-12 or *).
cron_day_of_week	string	Yes	Cron expression: day of week (0-6 or *, Sunday=0).
cron_timezone	string	No	Timezone for the schedule (e.g., 'America/New_York'). Default: `UTC`

Response

The newly created task object. Type: object

Example

POST /v2/tasks
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "scraper_id": "s1c2r3p4-e5f6-7890-1234-abcdef123456",
  "task_name": "Hourly Price Check",
  "cron_minute": "0",
  "cron_hour": "*",
  "cron_day_of_month": "*",
  "cron_month": "*",
  "cron_day_of_week": "*",
  "cron_timezone": "America/Los_Angeles"
}
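
The five `cron_*` parameters correspond, in order, to the five fields of a conventional cron expression. If you already have schedules written as a single string such as `0 * * * *`, a sketch like the following can split them into the request fields. The helper names are hypothetical, and the code only checks the field count: the table above documents plain numbers and `*`, so whether the API accepts ranges or step values is not confirmed here.

```javascript
// Split a five-field cron expression into the cron_* request parameters.
// Only the number of fields is validated, not the values themselves.
function cronToTaskFields(expression, timezone = 'UTC') {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return {
    cron_minute: minute,
    cron_hour: hour,
    cron_day_of_month: dayOfMonth,
    cron_month: month,
    cron_day_of_week: dayOfWeek,
    cron_timezone: timezone,
  };
}

// Body for POST /v2/tasks.
function buildTaskPayload(scraperId, taskName, expression, timezone) {
  return {
    scraper_id: scraperId,
    task_name: taskName,
    ...cronToTaskFields(expression, timezone),
  };
}
```

For example, `buildTaskPayload(id, 'Hourly Price Check', '0 * * * *', 'America/Los_Angeles')` yields the same cron fields as the example request.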

Get Task By ID

Get details of a specific task by its ID.

GET /v2/tasks/{task_id}

Parameters

Name	Type	Required	Description
task_id	string	Yes	The ID of the task to retrieve (path parameter).

Response

The requested task object. Type: object

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef
Authorization: Bearer YOUR_API_KEY

Delete Task

Delete a specific scheduled task.

DELETE /v2/tasks/{task_id}

Parameters

Name	Type	Required	Description
task_id	string	Yes	The ID of the task to delete (path parameter).

Response

Confirmation message. Type: object

Example

DELETE /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef
Authorization: Bearer YOUR_API_KEY

Run Task

Manually trigger a run of the specified task.

POST /v2/tasks/{task_id}/run

Parameters

Name	Type	Required	Description
task_id	string	Yes	The ID of the task to run (path parameter).

Response

The result of the scraper run initiated by the task. Type: object

Example

POST /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/run
Authorization: Bearer YOUR_API_KEY

Get Task Runs

Get a list of all historical runs for a specific task.

GET /v2/tasks/{task_id}/runs

Parameters

Name	Type	Required	Description
task_id	string	Yes	The ID of the task (path parameter).

Response

A list of task run preview objects, ordered by start time descending. Type: array

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/runs
Authorization: Bearer YOUR_API_KEY

Get Latest Task Run

Get the result and details of the most recent run for a specific task.

GET /v2/tasks/{task_id}/runs/latest

Parameters

Name	Type	Required	Description
task_id	string	Yes	The ID of the task (path parameter).

Response

The latest task run object including the result. Type: object

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/runs/latest
Authorization: Bearer YOUR_API_KEY
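
The task-level endpoints that take a `task_id` differ only in HTTP method and path suffix, so they can be described by a small lookup table. The sketch below is one possible arrangement. The action names (`get`, `delete`, `run`, `listRuns`, `latestRun`) and function names are hypothetical, the base URL is assumed from the authentication example, and the response is returned as parsed JSON without assuming any field names.

```javascript
// Base URL copied from the authentication example; adjust if yours differs.
const API_ROOT = 'https://api.radia.io/api';

// HTTP method and path suffix for each endpoint under /v2/tasks/{task_id}.
const TASK_ACTIONS = new Map([
  ['get', { method: 'GET', suffix: '' }],
  ['delete', { method: 'DELETE', suffix: '' }],
  ['run', { method: 'POST', suffix: '/run' }],
  ['listRuns', { method: 'GET', suffix: '/runs' }],
  ['latestRun', { method: 'GET', suffix: '/runs/latest' }],
]);

// Build the fetch arguments for one task action.
function buildTaskRequest(action, taskId, apiKey) {
  const route = TASK_ACTIONS.get(action);
  if (!route) {
    throw new Error(`Unknown task action: ${action}`);
  }
  return {
    url: `${API_ROOT}/v2/tasks/${encodeURIComponent(taskId)}${route.suffix}`,
    init: { method: route.method, headers: { Authorization: `Bearer ${apiKey}` } },
  };
}

// Fetch the most recent run of a task and return the parsed JSON body.
async function fetchLatestRun(taskId, apiKey) {
  const { url, init } = buildTaskRequest('latestRun', taskId, apiKey);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```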

Authentication Required

Remember to include your API key in the request headers for authentication with all endpoints. Keep your API keys secure and never expose them in client-side code.
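
One common way to keep the key out of client-side code is to read it from an environment variable in a server-side process. The sketch below assumes Node.js. The variable name `RADIA_API_KEY` and both helper names are hypothetical and chosen only for this example.

```javascript
// Read the API key from the environment instead of hard-coding it.
// RADIA_API_KEY is a hypothetical variable name used for this sketch.
function getApiKey(env = process.env) {
  const key = env.RADIA_API_KEY;
  if (!key) {
    throw new Error('RADIA_API_KEY is not set');
  }
  return key;
}

// Build the Authorization header required by every endpoint.
function authHeaders(apiKey) {
  return { Authorization: `Bearer ${apiKey}` };
}
```

Failing fast when the variable is missing avoids sending requests with a malformed `Authorization` header.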