API Documentation

Welcome to the Radia Web Scraping API documentation. This API provides powerful endpoints for retrieving and parsing content from any website.

Getting Started

Overview of Radia's Comprehensive Web Scraping Platform

Platform Overview

Radia provides a comprehensive suite of tools to extract both clean Markdown content and structured JSON data from websites. Our platform is designed to handle everything from simple single-page scraping to complex multi-site data extraction workflows.

Key Features

Dynamic Scraping

Use the /scrape and /extract endpoints for on-the-fly scraping with custom parameters. Perfect for sites with dynamic content or one-off scraping needs.

Predefined Tools

Create and manage reusable scrapers through our Dashboard >> Scrapers. Each scraper is defined by three key components:

  • A target URL or URL pattern
  • A prompt that guides the extraction process
  • A structured output schema in OpenAI's function calling format

Our visual schema editor helps you create and validate your output schemas without writing JSON manually. Try it out in our Playground.

Automated Scheduling

Schedule recurring scraping jobs using cron expressions directly from the Dashboard >> Tasks. Access job results through our API at:

  • /tasks/{task_id}/runs - View all runs for a task
  • /tasks/{task_id}/runs/latest - Get the most recent results

Authentication

Important: All API endpoints require authentication via an API key. Generate your key in the Dashboard >> API Keys and include it in every request header as:

Authorization: Bearer YOUR_API_KEY

Authorization

All API endpoints require authentication

Include an Authorization header with your API key in all requests.

You can obtain your API key from the Dashboard >> API Keys page.

Example

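javascript
const apiKey = 'YOUR_API_KEY';

// URL-encode the target so its own ":" and "/" characters survive as a query value.
const target = encodeURIComponent('https://www.example.com');

const response = await fetch(
  `https://api.radia.io/api/v2/scrape?url=${target}&format=json`,
  {
    headers: {
      'Authorization': `Bearer ${apiKey}`
    }
  }
);

// With format=json the body is a JSON object containing the Markdown.
const data = await response.json();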

Scrape API

Retrieve cleaned content from a specified URL

Scrape URL Content

Retrieve cleaned content from a URL. The response is an HTML string by default; pass format=json to receive a JSON object containing the Markdown.

GET /v2/scrape

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | The URL of the website to scrape. |
| format | string | No | html | Output format. Use "json" to receive a JSON object containing the Markdown. |
| scroll | boolean | No | true | Scroll the page to reveal more content. If false, the page is loaded as is, which gives a quicker response. |

Response

The cleaned content as an HTML string (default), or as a JSON object containing the Markdown when format=json. Type: string | object

Example

GET /v2/scrape?url=https://example.com&format=json&scroll=true
Authorization: Bearer YOUR_API_KEY
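The same request can be assembled in code. The following is a minimal sketch rather than an official client: it assumes the base URL used in the JavaScript example in the Authorization section, and `buildScrapeUrl` and `scrape` are hypothetical helper names introduced here for illustration.

```javascript
// Base URL taken from the JavaScript example in the Authorization section.
const BASE_URL = 'https://api.radia.io/api/v2';

// Build the request URL for GET /v2/scrape.
// `format` and `scroll` are optional and left out of the query when undefined.
function buildScrapeUrl(targetUrl, { format, scroll } = {}) {
  const params = new URLSearchParams({ url: targetUrl });
  if (format !== undefined) params.set('format', format);
  if (scroll !== undefined) params.set('scroll', String(scroll));
  return `${BASE_URL}/scrape?${params.toString()}`;
}

// Hypothetical helper: sends the request and returns a parsed object
// for format=json, otherwise the raw response text.
async function scrape(targetUrl, apiKey, options = {}) {
  const response = await fetch(buildScrapeUrl(targetUrl, options), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Scrape failed with HTTP ${response.status}`);
  }
  return options.format === 'json' ? response.json() : response.text();
}
```

Using URLSearchParams percent-encodes the target URL, so a target that carries its own query string does not get mixed up with the API's parameters.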

Extract API

Scrape and parse content from a specified URL into structured data

Extract Structured Data

Scrape content and extract structured data based on a prompt and JSON schema.

POST /v2/extract

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | The URL to scrape and parse. |
| prompt | string | Yes | - | Prompt to guide data extraction (e.g., 'Extract product details'). |
| response_format | object | Yes | - | JSON object describing the desired output structure (OpenAI function calling format). |
| scroll | boolean | No | true | Scroll the page to reveal more content. If false, the page is loaded as is, which gives a quicker response. |
| include_markdown | boolean | No | false | Include the cleaned Markdown content in the response. |

Response

Extracted structured data, token count, and optionally the Markdown content. Type: object

Example

POST /v2/extract
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "url": "https://example-product-page.com",
  "prompt": "Extract the product name, price, and description.",
  "response_format": {
    "type": "object",
    "properties": {
      "product_name": { "type": "string", "description": "Name of the product" },
      "price": { "type": "number", "description": "Price of the product" },
      "description": { "type": "string", "description": "Product description" }
    },
    "required": ["product_name", "price"]
  },
  "scroll": false,
  "include_markdown": true
}
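As a rough illustration of how a client might prepare this call, the sketch below builds the arguments for `fetch` without sending anything. `buildExtractRequest` and its camelCase option names are assumptions of this example, not part of the API; the JSON body uses the parameter names from the table above, and the base URL is the one from the JavaScript example in the Authorization section.

```javascript
// Base URL taken from the JavaScript example in the Authorization section.
const BASE_URL = 'https://api.radia.io/api/v2';

// Assemble the fetch arguments for POST /v2/extract.
// Required fields are checked up front; optional ones are only sent when set.
function buildExtractRequest(apiKey, options) {
  const { url, prompt, responseFormat, scroll, includeMarkdown } = options;
  if (!url || !prompt || !responseFormat) {
    throw new Error('url, prompt and responseFormat are required');
  }
  const body = { url, prompt, response_format: responseFormat };
  if (scroll !== undefined) body.scroll = scroll;
  if (includeMarkdown !== undefined) body.include_markdown = includeMarkdown;
  return {
    endpoint: `${BASE_URL}/extract`,
    init: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Same schema as the HTTP example above.
const productSchema = {
  type: 'object',
  properties: {
    product_name: { type: 'string', description: 'Name of the product' },
    price: { type: 'number', description: 'Price of the product' },
    description: { type: 'string', description: 'Product description' },
  },
  required: ['product_name', 'price'],
};

const extractRequest = buildExtractRequest('YOUR_API_KEY', {
  url: 'https://example-product-page.com',
  prompt: 'Extract the product name, price, and description.',
  responseFormat: productSchema,
  includeMarkdown: true,
});
// To send it: fetch(extractRequest.endpoint, extractRequest.init)
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before it is sent.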

Scrapers API

Manage scraper configurations

Available Endpoints

  • GET /v2/scrapers - Get all scraper configurations for the authenticated user.
  • POST /v2/scrapers - Create a new scraper configuration.
  • GET /v2/scrapers/{scraper_id} - Get details of a specific scraper configuration.
  • POST /v2/scrapers/{scraper_id}/run - Manually trigger a run of a specific scraper configuration.

Get Scrapers

Get all scraper configurations for the authenticated user.

GET /v2/scrapers

Response

A list of scraper configuration objects. Type: array

Example

GET /v2/scrapers
Authorization: Bearer YOUR_API_KEY

Create Scraper

Create a new scraper configuration.

POST /v2/scrapers

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| scraper_name | string | Yes | - | A name for this scraper configuration. |
| schema_id | string | Yes | - | ID of the schema defining the desired output structure. |
| scraped_url | string | Yes | - | The target URL or URL pattern for the scraper. |
| prompt | string | Yes | - | The prompt guiding the data extraction process. |
| should_click_elements | boolean | No | false | Whether the scraper should click elements on the page. |
| headless | boolean | No | false | Whether the scraper runs headlessly. |

Response

The newly created scraper configuration object. Type: object

Example

POST /v2/scrapers
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "scraper_name": "News Headline Scraper",
  "schema_id": "sch-def-456",
  "scraped_url": "https://news.example.com",
  "prompt": "Extract the main headlines from the homepage.",
  "should_click_elements": false,
  "headless": true
}

Get Scraper by ID

Get details of a specific scraper configuration.

GET /v2/scrapers/{scraper_id}

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | The ID of the scraper to retrieve (path parameter). |

Response

The requested scraper configuration object. Type: object

Example

GET /v2/scrapers/s1c2r3p4-e5f6-7890-1234-abcdef123456
Authorization: Bearer YOUR_API_KEY

Run Scraper

Manually trigger a run of a specific scraper configuration.

POST /v2/scrapers/{scraper_id}/run

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| scraper_id | string | Yes | The ID of the scraper to run (path parameter). |
| headless | boolean | No | Overrides the scraper's default headless setting for this run (query parameter). |

Response

The result of the scraper run. Type: object

Example

POST /v2/scrapers/s1c2r3p4-e5f6-7890-1234-abcdef123456/run?headless=true
Authorization: Bearer YOUR_API_KEY
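A manual run combines a path parameter with an optional query parameter. The sketch below shows one way to build that URL and trigger the run; `buildRunScraperUrl` and `runScraper` are hypothetical names, the base URL is the one from the JavaScript example in the Authorization section, and parsing the response as JSON is an assumption based on the documented object type.

```javascript
// Base URL taken from the JavaScript example in the Authorization section.
const BASE_URL = 'https://api.radia.io/api/v2';

// Build the URL for POST /v2/scrapers/{scraper_id}/run.
// `headless` is only added to the query string when explicitly provided,
// so the scraper's stored default applies otherwise.
function buildRunScraperUrl(scraperId, headless) {
  const path = `${BASE_URL}/scrapers/${encodeURIComponent(scraperId)}/run`;
  return headless === undefined ? path : `${path}?headless=${headless}`;
}

// Hypothetical helper: triggers the run and returns the parsed result.
async function runScraper(scraperId, apiKey, headless) {
  const response = await fetch(buildRunScraperUrl(scraperId, headless), {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Run failed with HTTP ${response.status}`);
  }
  return response.json();
}
```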

Schemas API

Manage scraping output schemas

Available Endpoints

  • GET /v2/schemas - Get all extraction schemas for the authenticated user.
  • POST /v2/schemas - Create a new extraction schema.
  • GET /v2/schemas/{schema_id} - Get details of a specific extraction schema.

Get Schemas

Get all extraction schemas for the authenticated user.

GET /v2/schemas

Response

A list of schema objects. Type: array

Example

GET /v2/schemas
Authorization: Bearer YOUR_API_KEY

Create Schema

Create a new extraction schema.

POST /v2/schemas

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| schema_name | string | Yes | A descriptive name for the schema. |
| schema_json | object | Yes | JSON object defining the extraction structure (OpenAI function calling format). |

Response

The newly created schema object. Type: object

Example

POST /v2/schemas
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "schema_name": "Article Details",
  "schema_json": {
    "type": "object",
    "properties": {
      "title": { "type": "string", "description": "Article title" },
      "author": { "type": "string", "description": "Author name" },
      "publish_date": { "type": "string", "description": "Publication date" }
    },
    "required": ["title", "author"]
  }
}
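Before submitting a schema, it can help to check locally that every name listed in `required` is also declared under `properties`. The sketch below is an optional client-side check, not something the API is documented to perform; `findUndeclaredRequired` is a hypothetical helper name.

```javascript
// Return the names listed in `required` that have no entry in `properties`.
// An empty array means the two parts of the schema agree.
function findUndeclaredRequired(schemaJson) {
  const declared = Object.keys(schemaJson.properties || {});
  return (schemaJson.required || []).filter((name) => !declared.includes(name));
}

// Same schema_json as the HTTP example above.
const articleSchema = {
  type: 'object',
  properties: {
    title: { type: 'string', description: 'Article title' },
    author: { type: 'string', description: 'Author name' },
    publish_date: { type: 'string', description: 'Publication date' },
  },
  required: ['title', 'author'],
};

const undeclared = findUndeclaredRequired(articleSchema);
if (undeclared.length > 0) {
  throw new Error(`required fields missing from properties: ${undeclared.join(', ')}`);
}
```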

Get Schema by ID

Get details of a specific extraction schema.

GET /v2/schemas/{schema_id}

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| schema_id | string | Yes | The ID of the schema to retrieve (path parameter). |

Response

The requested schema object. Type: object

Example

GET /v2/schemas/sch-abc-123
Authorization: Bearer YOUR_API_KEY

Tasks API

Manage scheduled scraping jobs

Available Endpoints

  • GET /v2/tasks - Get all scheduled tasks for the authenticated user.
  • POST /v2/tasks - Create a new scheduled task to run a scraper.
  • GET /v2/tasks/{task_id} - Get details of a specific task by its ID.
  • DELETE /v2/tasks/{task_id} - Delete a specific scheduled task.
  • POST /v2/tasks/{task_id}/run - Manually trigger a run of the specified task.
  • GET /v2/tasks/{task_id}/runs - Get a list of all historical runs for a specific task.
  • GET /v2/tasks/{task_id}/runs/latest - Get the result and details of the most recent run for a specific task.

Get Tasks

Get all scheduled tasks for the authenticated user.

GET /v2/tasks

Response

A list of task objects. Type: array

Example

GET /v2/tasks
Authorization: Bearer YOUR_API_KEY

Create Task

Create a new scheduled task to run a scraper.

POST /v2/tasks

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| scraper_id | string | Yes | - | ID of the scraper to run. |
| task_name | string | Yes | - | A descriptive name for the task. |
| cron_minute | string | Yes | - | Cron expression: minute (0-59 or *). |
| cron_hour | string | Yes | - | Cron expression: hour (0-23 or *). |
| cron_day_of_month | string | Yes | - | Cron expression: day of month (1-31 or *). |
| cron_month | string | Yes | - | Cron expression: month (1-12 or *). |
| cron_day_of_week | string | Yes | - | Cron expression: day of week (0-6 or *, Sunday=0). |
| cron_timezone | string | No | UTC | Timezone for the schedule (e.g., 'America/New_York'). |

Response

The newly created task object. Type: object

Example

POST /v2/tasks
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "scraper_id": "s1c2r3p4-e5f6-7890-1234-abcdef123456",
  "task_name": "Hourly Price Check",
  "cron_minute": "0",
  "cron_hour": "*",
  "cron_day_of_month": "*",
  "cron_month": "*",
  "cron_day_of_week": "*",
  "cron_timezone": "America/Los_Angeles"
}
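Because the schedule is sent as five separate fields rather than one cron string, a small conversion step can be convenient when a schedule is already written in the usual five-field form. The sketch below is one possible approach; `cronToTaskFields` is a hypothetical helper, and it passes each field through unchanged without checking it against the value ranges in the table above.

```javascript
// Field names in standard cron order, as listed in the parameter table.
const CRON_FIELDS = [
  'cron_minute',
  'cron_hour',
  'cron_day_of_month',
  'cron_month',
  'cron_day_of_week',
];

// Split a five-field cron expression into the separate cron_* fields.
function cronToTaskFields(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}`);
  }
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]));
}

// Rebuilds the request body from the HTTP example above ("0 * * * *" = hourly).
const taskBody = {
  scraper_id: 's1c2r3p4-e5f6-7890-1234-abcdef123456',
  task_name: 'Hourly Price Check',
  ...cronToTaskFields('0 * * * *'),
  cron_timezone: 'America/Los_Angeles',
};
```

The table only documents single values and `*` for each field, so whether lists, ranges, or step values are accepted is not something this sketch can confirm.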

Get Task By ID

Get details of a specific task by its ID.

GET /v2/tasks/{task_id}

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | The ID of the task to retrieve (path parameter). |

Response

The requested task object. Type: object

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef
Authorization: Bearer YOUR_API_KEY

Delete Task

Delete a specific scheduled task.

DELETE /v2/tasks/{task_id}

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | The ID of the task to delete (path parameter). |

Response

Confirmation message. Type: object

Example

DELETE /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef
Authorization: Bearer YOUR_API_KEY

Run Task

Manually trigger a run of the specified task.

POST /v2/tasks/{task_id}/run

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | The ID of the task to run (path parameter). |

Response

The result of the scraper run initiated by the task. Type: object

Example

POST /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/run
Authorization: Bearer YOUR_API_KEY

Get Task Runs

Get a list of all historical runs for a specific task.

GET /v2/tasks/{task_id}/runs

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | The ID of the task (path parameter). |

Response

A list of task run preview objects, ordered by start time descending. Type: array

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/runs
Authorization: Bearer YOUR_API_KEY

Get Latest Task Run

Get the result and details of the most recent run for a specific task.

GET /v2/tasks/{task_id}/runs/latest

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | The ID of the task (path parameter). |

Response

The latest task run object including the result. Type: object

Example

GET /v2/tasks/a1b2c3d4-e5f6-7890-1234-567890abcdef/runs/latest
Authorization: Bearer YOUR_API_KEY
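The two run endpoints share a common prefix, which a client can derive from the task ID. The sketch below illustrates this under a few assumptions: `taskRunUrls` and `getLatestRun` are hypothetical names, the base URL is the one from the JavaScript example in the Authorization section, and the response is parsed as JSON because it is documented as an object.

```javascript
// Base URL taken from the JavaScript example in the Authorization section.
const BASE_URL = 'https://api.radia.io/api/v2';

// Derive both run-history URLs for a task from its ID.
function taskRunUrls(taskId) {
  const all = `${BASE_URL}/tasks/${encodeURIComponent(taskId)}/runs`;
  return { all, latest: `${all}/latest` };
}

// Hypothetical helper: fetch the most recent run of a task.
async function getLatestRun(taskId, apiKey) {
  const response = await fetch(taskRunUrls(taskId).latest, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Could not load latest run: HTTP ${response.status}`);
  }
  return response.json();
}
```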

Authentication Required

Remember to include your API key in the request headers for authentication with all endpoints. Keep your API keys secure and never expose them in client-side code.
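One common way to keep a key out of source code and client bundles is to read it from an environment variable on the server. The sketch below assumes a Node.js runtime; `RADIA_API_KEY` is a hypothetical variable name chosen for this example, and `authHeaders` is a hypothetical helper.

```javascript
// Build the Authorization header from an environment variable.
// RADIA_API_KEY is a hypothetical name; use whatever your deployment defines.
function authHeaders(env = process.env) {
  const apiKey = env.RADIA_API_KEY;
  if (!apiKey) {
    throw new Error('RADIA_API_KEY is not set');
  }
  return { Authorization: `Bearer ${apiKey}` };
}
```

Failing early when the variable is missing avoids sending unauthenticated requests that would only be rejected by the API.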