Enrich API Reference

POST /public/enrich

Enrich items with AI-powered web research. Every extracted field includes source URLs.

import requests

response = requests.post(
    "https://catalogapi.rastro.ai/api/public/enrich",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "items": [{"part_number": "6205-2RS", "name": "Deep Groove Ball Bearing"}],
        "output_schema": [
            {"name": "bore_diameter", "type": "string", "description": "Inner diameter in mm"},
            {"name": "outer_diameter", "type": "string", "description": "Outer diameter in mm"}
        ]
    }
)

Request Parameters

Core Parameters

Parameter	Type	Required	Default	Description
`items`	array	Yes	-	JSON objects to enrich
`output_schema`	array	Yes	-	Fields to extract (see Output Schema)
`prompt`	string	No	`""`	Search prompt guiding web research

Search Options

Parameter	Type	Default	Description
`speed`	string	`"medium"`	`"fast"` (~1 min), `"medium"` (~2 min), `"slow"` (10-15 min, most thorough), `"cheap"` (fastest, uses lighter model)
`allowed_domains`	array	any	Restrict sources to these domains only
`web_search`	boolean	`true`	Set `false` to skip web search and use input data directly

Taxonomy & Quality

Parameter	Type	Default	Description
`taxonomy`	object	-	Taxonomy definition for category prediction
`predict_taxonomy`	boolean	`false`	Predict category for each item
`quality_prompt`	string	-	Prompt for quality scoring (adds 1-5 score)
`validate_semantics`	boolean	`false`	AI validates values match field descriptions

Job Control

Parameter	Type	Default	Description
`catalog_id`	string	-	Catalog ID from the dashboard — automatically applies its schema, taxonomy, and settings
`async_mode`	boolean	`false`	Return immediately with `job_id`
`max_rows`	integer	-	Process only first N items (dry run)
`source_activity_id`	string	-	Resume from previous job to process remaining items
`include_source_explanations`	boolean	`false`	Add explanation for each field value
`webhook_url`	string	-	HTTPS URL to receive a POST when the job completes (see Webhooks)
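
The job-control options above can be combined in a single request body. The following is a minimal sketch, not part of the API: `build_enrich_payload` is a hypothetical helper name, and the client-side HTTPS check simply mirrors the documented requirement that `webhook_url` use HTTPS.

```python
from urllib.parse import urlparse


def build_enrich_payload(items, output_schema, *, async_mode=False,
                         max_rows=None, webhook_url=None):
    """Assemble a request body for POST /public/enrich (hypothetical helper)."""
    payload = {"items": items, "output_schema": output_schema}
    if async_mode:
        payload["async_mode"] = True
    if max_rows is not None:
        # Dry run: only the first N items are processed.
        payload["max_rows"] = max_rows
    if webhook_url is not None:
        # The docs require HTTPS; checking locally is an assumption-free
        # convenience that fails before any request is sent.
        if urlparse(webhook_url).scheme != "https":
            raise ValueError("webhook_url must use HTTPS")
        payload["webhook_url"] = webhook_url
    return payload
```

The returned dictionary can be passed as the `json=` argument of the `requests.post` call shown at the top of this page.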

Output Schema

Define fields to extract. Each field needs a name, type, and description.

{
  "output_schema": [
    {"name": "material", "type": "string", "description": "Material composition"},
    {"name": "weight_kg", "type": "number", "description": "Weight in kilograms", "unit": "kg"},
    {"name": "certifications", "type": "array", "description": "Safety certifications"}
  ]
}

Field Options

Option	Type	Description
`name`	string	Field name (required)
`type`	string	`string`, `number`, `integer`, `boolean`, `array`
`description`	string	What to extract (required)
`unit`	string	Units for numbers (e.g., `"kg"`, `"mm"`)
`enum`	array	Constrain to specific values
`sample_values`	array	Example values to guide extraction
`array_element_type`	string	For arrays: `string`, `number`, `image_url`
`items_enum`	array	For array-type fields, constrains array items to specific values
`required`	boolean	Marks whether this field is required in the output
`merge`	boolean	Merge enriched values with existing values instead of overwriting
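
Before sending a request, it can help to lint a schema locally against the options listed above. The sketch below is an illustration only: `check_output_schema` is a hypothetical helper, and whether the API itself rejects the cases it reports is an assumption that this page does not confirm.

```python
ALLOWED_TYPES = {"string", "number", "integer", "boolean", "array"}


def check_output_schema(schema):
    """Return a list of problems found in an output_schema (empty if none)."""
    problems = []
    for index, field in enumerate(schema):
        label = field.get("name", f"#{index}")
        # name and description are marked as required in the table above.
        for key in ("name", "description"):
            if not field.get(key):
                problems.append(f"{label}: missing required option '{key}'")
        field_type = field.get("type")
        if field_type not in ALLOWED_TYPES:
            problems.append(f"{label}: unsupported type {field_type!r}")
        # array_element_type and items_enum are documented for arrays only.
        if field_type != "array":
            for key in ("array_element_type", "items_enum"):
                if key in field:
                    problems.append(f"{label}: '{key}' only applies to array fields")
    return problems
```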

Response

{
  "job_id": "abc123-def456-...",
  "results": [{
    "original_data": {"part_number": "6205-2RS", "name": "Deep Groove Ball Bearing"},
    "after_data": {
      "part_number": "6205-2RS",
      "name": "Deep Groove Ball Bearing",
      "bore_diameter": "25 mm",
      "outer_diameter": "52 mm",
      "sources": {
        "bore_diameter": ["https://skf.com/products/bearings/6205-2RS"],
        "outer_diameter": ["https://skf.com/products/bearings/6205-2RS"]
      },
      "source_explanations": {
        "bore_diameter": "Found on manufacturer product page",
        "outer_diameter": "Found on manufacturer product page"
      }
    },
    "all_sources": ["https://skf.com/products/bearings/6205-2RS"]
  }],
  "total_items": 1,
  "successful": 1,
  "credits_used": 1,
  "status": "completed"
}

Response Fields

Field	Type	Description
`job_id`	string	Job ID for tracking
`results`	array	Enriched items (empty if `async_mode=true`)
`total_items`	integer	Total items submitted
`total_rows`	integer	Total items before `max_rows` limit
`successful`	integer	Successfully enriched items
`credits_used`	integer	Credits used (1 per item)
`status`	string	`"running"`, `"completed"`, or `"failed"`

Result Item Fields

Field	Type	Description
`original_data`	object	Input data
`after_data`	object	Complete enriched record. Contains original fields, enriched values, `sources` (per-field URL citations), and `source_explanations` (per-field derivation explanations)
`all_sources`	array	Deduplicated list of all URLs used across all fields
`error`	string	Error message if failed
`category_id`	string	Predicted category (if taxonomy enabled)
`category_path`	string	Full path like `"Bearings > Ball Bearings"`
`taxonomy_attributes`	object	Category-specific attributes
`quality_score`	integer	1-5 score (if `quality_prompt` set)
`quality_result`	object	`{score, explanation, issues, suggestions}`
`taxonomy_attribute_explanations`	object	Per-attribute explanations of how taxonomy values were derived
`review_info`	object	AI reasoning and flags (see below)
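
To make the relationship between `after_data` and its per-field `sources` concrete, here is a small sketch that pairs each newly added field with its cited URLs. `enriched_fields` is a hypothetical helper, and it assumes that a key already present in `original_data` is an input field rather than an enriched one, which may not hold when the `merge` option is used.

```python
META_KEYS = {"sources", "source_explanations"}


def enriched_fields(result):
    """Map each field added by enrichment to its value and cited URLs."""
    original = result.get("original_data", {})
    after = result.get("after_data", {})
    sources = after.get("sources", {})
    fields = {}
    for key, value in after.items():
        # Skip citation metadata and fields that came from the input.
        if key in META_KEYS or key in original:
            continue
        fields[key] = {"value": value, "sources": sources.get(key, [])}
    return fields
```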

GET /public/enrich/{job_id}

Poll enrichment job status and retrieve results. Results are returned progressively — partial results are available while the job is still running, as each batch of items completes processing.

Query Parameters

Parameter	Type	Default	Description
`page`	integer	`1`	Page number (1-indexed)
`page_size`	integer	`1000`	Results per page

# Basic poll
response = requests.get(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)

# With pagination
response = requests.get(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"page": 1, "page_size": 50}
)
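
The paginated request above can be repeated until the last page is reached. The sketch below assumes the response carries `results` and `total_pages` as described in the poll response fields; `iter_results` and the `fetch_page` callable are hypothetical names, with `fetch_page(page, page_size)` expected to return the decoded JSON body of one poll request.

```python
def iter_results(fetch_page, page_size=50):
    """Yield every result item of a job, one page at a time."""
    page = 1
    while True:
        body = fetch_page(page, page_size)
        yield from body.get("results", [])
        # total_pages is re-read on every request because it is based on
        # completed_items, which can grow while the job is running.
        if page >= body.get("total_pages", 1):
            break
        page += 1
```

In practice `fetch_page` would wrap the `requests.get` call shown above and return `response.json()`. Running this against a finished job avoids missing items that complete after the last page was read.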

Poll Response

{
  "job_id": "abc123-def456-...",
  "status": "running",
  "results": [
    {
      "original_data": {"part_number": "6205-2RS"},
      "after_data": {
        "part_number": "6205-2RS",
        "bore_diameter": "25 mm",
        "sources": {
          "bore_diameter": ["https://skf.com/products/bearings/6205-2RS"]
        }
      },
      "all_sources": ["https://skf.com/products/bearings/6205-2RS"]
    }
  ],
  "total_items": 500,
  "completed_items": 200,
  "successful": 0,
  "credits_used": 0,
  "page": 1,
  "page_size": 50,
  "total_pages": 4
}

Poll Response Fields

Field	Type	Description
`job_id`	string	Job ID
`status`	string	Current status (see below)
`results`	array	Enriched items for the current page
`total_items`	integer	Total items submitted
`completed_items`	integer	Items with results available so far (grows while `running`)
`successful`	integer	Finalized success count (set when job finishes)
`credits_used`	integer	Credits charged (`0` while `running`)
`page`	integer	Current page number
`page_size`	integer	Results per page
`total_pages`	integer	Total pages based on `completed_items`
`service_usage`	object	Cost details (only present when job finishes)

While `status` is `running`, `completed_items` reflects how many items have finished so far. Use `completed_items` and `total_items` to show progress. `credits_used` remains `0` until the job finishes.
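
As an illustration of the progress calculation, the following sketch turns a poll response into a one-line status string. `progress_line` is a hypothetical helper name; the field names are the ones documented above.

```python
def progress_line(poll):
    """Format a poll response as 'status: done/total items (pct%)'."""
    total = poll.get("total_items", 0)
    done = poll.get("completed_items", 0)
    # Guard against division by zero for an empty submission.
    pct = 100 * done / total if total else 0.0
    return f"{poll.get('status', 'unknown')}: {done}/{total} items ({pct:.0f}%)"
```

For the sample poll response on this page (`completed_items` 200 of `total_items` 500), this produces `running: 200/500 items (40%)`.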

Status Values

Status	Description	Results available?
`running`	Job is still processing	Yes — partial results from completed batches
`completed`	Job finished, all results ready	Yes — all results
`failed`	Job failed or was cancelled	Yes — partial results from completed batches

Cancelled jobs are reported as failed. Partial results and credits for completed items are preserved.
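
A polling loop only needs to distinguish `running` from the two terminal statuses. The sketch below is one way to write it; `wait_for_job`, the `fetch_status` callable, and the interval and timeout defaults are all assumptions chosen for illustration, not values recommended by the API.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, interval=5.0, timeout=1800.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status and return that response.

    fetch_status is a zero-argument callable returning the decoded JSON
    body of GET /public/enrich/{job_id}.
    """
    waited = 0.0
    while True:
        poll = fetch_status()
        if poll["status"] in TERMINAL_STATUSES:
            return poll
        if waited >= timeout:
            raise TimeoutError(f"job still running after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Because cancelled jobs are reported as `failed`, a caller that needs partial results should still read `results` from the returned response rather than treating `failed` as empty.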

Source Citations

Each result includes all_sources — the URLs used during enrichment.

{
  "all_sources": [
    "https://skf.com/products/bearings/6205-2RS",
    "https://mcmaster.com/6205-2RS"
  ]
}

When web_search is false, the AI uses only the input data (INPUT_DATA) to derive field values.

AI Transparency

Every result includes review_info with AI reasoning:

{
  "review_info": {
    "reasoning": "Found specs on manufacturer website, cross-referenced with distributor",
    "flags": ["verify_tolerance_class"],
    "field_issues": [
      {"field": "tolerance", "severity": "warning", "message": "Value not found in sources"}
    ],
    "flag_record": true
  }
}

Field	Type	Description
`reasoning`	string	AI explanation of how values were extracted
`flags`	array	General flags or warnings about the extraction
`field_issues`	array	Per-field issues with `field`, `severity` (error/warning/info), `message`, and `validator`
`flag_record`	boolean	Whether this record needs human attention
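
The `review_info` fields lend themselves to a simple triage pass over a result list. The following sketch is illustrative only: `records_needing_review` is a hypothetical helper, and ranking the documented severities as info < warning < error is an assumption about their intended order.

```python
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}


def records_needing_review(results, min_severity="warning"):
    """Return (result, issues) pairs for records that deserve a human look."""
    threshold = SEVERITY_RANK[min_severity]
    flagged = []
    for result in results:
        info = result.get("review_info") or {}
        # Keep only issues at or above the requested severity.
        issues = [
            issue for issue in info.get("field_issues", [])
            if SEVERITY_RANK.get(issue.get("severity"), 0) >= threshold
        ]
        if info.get("flag_record") or issues:
            flagged.append((result, issues))
    return flagged
```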

Webhooks

Instead of polling `GET /public/enrich/{job_id}`, you can provide a `webhook_url` to receive a POST when the job finishes. The URL must use HTTPS.

{
  "items": [...],
  "output_schema": [...],
  "async_mode": true,
  "webhook_url": "https://yourserver.com/hooks/enrich"
}

Webhook Payload

When the job completes, we POST a JSON body to your URL:

{
  "job_id": "abc123-def456-...",
  "status": "completed",
  "successful": 150,
  "failed": 5,
  "total": 155,
  "credits_used": 150
}

Field	Type	Description
`job_id`	string	Use with `GET /public/enrich/{job_id}` to fetch full results
`status`	string	Always `"completed"`
`successful`	integer	Items enriched successfully
`failed`	integer	Items that failed
`total`	integer	`successful + failed`
`credits_used`	integer	Credits charged

Webhook delivery is best-effort, with a single attempt and a 30-second timeout. If delivery fails, the job is unaffected; use polling as a fallback.
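
A receiver for this payload can be quite small. The sketch below uses only the Python standard library; `summarize_webhook`, `EnrichWebhookHandler`, `serve`, and the port are hypothetical names and values. It serves plain HTTP, so it assumes an HTTPS-terminating proxy sits in front of it to satisfy the HTTPS requirement.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_webhook(raw_body):
    """Parse the webhook JSON body and decide whether follow-up is needed."""
    payload = json.loads(raw_body)
    return {
        "job_id": payload["job_id"],
        "status": payload["status"],
        # Any failed item means the full results are worth fetching and checking.
        "needs_followup": payload.get("failed", 0) > 0,
    }


class EnrichWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        summary = summarize_webhook(self.rfile.read(length))
        print(summary)
        # Reply quickly: delivery is a single attempt with a 30-second timeout.
        self.send_response(204)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (not called automatically)."""
    HTTPServer(("", port), EnrichWebhookHandler).serve_forever()
```

Any slow work, such as fetching full results with `GET /public/enrich/{job_id}`, is better done after the response is sent so the delivery does not time out.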

Job Management

GET /public/enrich/jobs

List enrichment jobs for your organization, with optional status filtering and pagination.

Query Parameters

Parameter	Type	Default	Description
`status`	string	(all)	Filter by status: `running`, `completed`, `failed`, `cancelled`
`limit`	integer	`20`	Results per page (1–100)
`offset`	integer	`0`	Pagination offset

import requests

# List running jobs
response = requests.get(
    "https://catalogapi.rastro.ai/api/public/enrich/jobs",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"status": "running"}
)
jobs = response.json()

Response

{
  "jobs": [
    {
      "job_id": "abc123-def456-...",
      "status": "running",
      "total_items": 500,
      "completed_items": 200,
      "failed_items": 3,
      "created_at": "2026-02-11T10:00:00+00:00",
      "started_at": "2026-02-11T10:00:01+00:00",
      "finished_at": null
    }
  ],
  "total": 12
}

Field	Type	Description
`jobs`	array	List of job summaries
`total`	integer	Total jobs matching the filter

Each job object:

Field	Type	Description
`job_id`	string	Job ID (use with other endpoints)
`status`	string	`running`, `completed`, `failed`, or `cancelled`
`total_items`	integer	Items submitted
`completed_items`	integer	Items processed so far
`failed_items`	integer	Items that failed
`created_at`	string	ISO 8601 timestamp
`started_at`	string	ISO 8601 timestamp (null if not yet started)
`finished_at`	string	ISO 8601 timestamp (null if still running)
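
Listing more jobs than fit in one page means stepping `offset` until `total` is reached. The sketch below shows that loop; `iter_jobs` and the `fetch_jobs` callable are hypothetical, with `fetch_jobs(status=..., limit=..., offset=...)` expected to return the decoded JSON body of `GET /public/enrich/jobs`.

```python
def iter_jobs(fetch_jobs, status=None, limit=20):
    """Yield every job summary matching the filter, page by page."""
    offset = 0
    while True:
        body = fetch_jobs(status=status, limit=limit, offset=offset)
        jobs = body.get("jobs", [])
        yield from jobs
        offset += len(jobs)
        # Stop on an empty page or once every matching job has been seen.
        if not jobs or offset >= body.get("total", 0):
            break
```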

POST /public/enrich/{job_id}/cancel

Cancel a running enrichment job. Items already processed are kept and credits are charged only for completed items.

import requests

response = requests.post(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}/cancel",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)
result = response.json()
print(f"Cancelled — {result['completed_items']} items kept, {result['credits_used']} credits charged")

Response

{
  "success": true,
  "message": "Activity abc123-def456 cancelled",
  "completed_items": 200,
  "credits_used": 200
}

Field	Type	Description
`success`	boolean	Whether cancellation succeeded
`message`	string	Human-readable status
`completed_items`	integer	Items that finished before cancellation
`credits_used`	integer	Credits charged for completed items

You can only cancel jobs with status running. Completed or already-cancelled jobs return an error.

Concurrency Limits

Each organization can run up to 10 concurrent enrichment jobs. Submitting an 11th job returns 429 Too Many Requests:

{
  "detail": "Too many concurrent enrichment jobs (10/10). List running jobs: GET /api/public/enrich/jobs?status=running — Cancel a job: POST /api/public/enrich/{job_id}/cancel"
}

To unblock yourself:

1. List running jobs: GET /api/public/enrich/jobs?status=running
2. Cancel any jobs you no longer need: POST /api/public/enrich/{job_id}/cancel
3. Retry your request
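
The list, cancel, and retry sequence can be sketched as a small wrapper. Everything here is an assumption made for illustration: the name `submit_after_freeing_slot`, the three injected callables, and especially the policy of cancelling the oldest running job, which is only appropriate if that job is known to be disposable.

```python
def submit_after_freeing_slot(submit, list_running, cancel, max_retries=1):
    """Submit a job; on 429, cancel the oldest running job and retry.

    submit() returns (status_code, body); list_running() returns job
    summaries; cancel(job_id) cancels one job.
    """
    for attempt in range(max_retries + 1):
        status_code, body = submit()
        if status_code != 429:
            return body
        if attempt == max_retries:
            break
        running = list_running()
        if running:
            # ISO 8601 timestamps with the same UTC offset sort
            # chronologically as plain strings.
            oldest = min(running, key=lambda job: job["created_at"])
            cancel(oldest["job_id"])
    raise RuntimeError("still at the concurrency limit after retrying")
```

In a real client the callables would wrap the `POST /public/enrich`, `GET /public/enrich/jobs?status=running`, and `POST /public/enrich/{job_id}/cancel` requests shown on this page.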
