POST /public/enrich
Enrich items with AI-powered web research. Every extracted field includes source URLs.
import requests

# Enrich one item and extract two fields from web research
response = requests.post(
    "https://catalogapi.rastro.ai/api/public/enrich",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "items": [{"part_number": "6205-2RS", "name": "Deep Groove Ball Bearing"}],
        "output_schema": [
            {"name": "bore_diameter", "type": "string", "description": "Inner diameter in mm"},
            {"name": "outer_diameter", "type": "string", "description": "Outer diameter in mm"}
        ]
    }
)
Request Parameters
Core Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| items | array | Yes | - | JSON objects to enrich |
| output_schema | array | Yes | - | Fields to extract (see Output Schema) |
| prompt | string | No | "" | Search prompt guiding web research |
Search Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| speed | string | "medium" | "fast" (~1 min), "medium" (~2 min), "slow" (10-15 min, most thorough), "cheap" (fastest, uses lighter model) |
| allowed_domains | array | any | Restrict sources to these domains only |
| web_search | boolean | true | Set false to skip web search and use input data directly |
Taxonomy & Quality
| Parameter | Type | Default | Description |
|---|---|---|---|
| taxonomy | object | - | Taxonomy definition for category prediction |
| predict_taxonomy | boolean | false | Predict category for each item |
| quality_prompt | string | - | Prompt for quality scoring (adds 1-5 score) |
| validate_semantics | boolean | false | AI validates values match field descriptions |
Job Control
| Parameter | Type | Default | Description |
|---|---|---|---|
| catalog_id | string | - | Catalog ID from the dashboard; automatically applies that catalog's schema, taxonomy, and settings |
| async_mode | boolean | false | Return immediately with job_id |
| max_rows | integer | - | Process only first N items (dry run) |
| source_activity_id | string | - | Resume from previous job to process remaining items |
| include_source_explanations | boolean | false | Add explanation for each field value |
| webhook_url | string | - | HTTPS URL to receive a POST when the job completes (see Webhooks) |
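The job-control parameters combine naturally for a cheap trial run before a full submission. The sketch below assembles a request body that runs asynchronously and processes only the first item; `build_enrich_payload` is a hypothetical helper for illustration, not part of the API, and the second part number is sample data.

```python
from typing import Any, Optional


def build_enrich_payload(
    items: list[dict[str, Any]],
    output_schema: list[dict[str, Any]],
    *,
    async_mode: bool = False,
    max_rows: Optional[int] = None,
    webhook_url: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a JSON body for POST /public/enrich (hypothetical helper)."""
    # The docs state that webhook_url must use HTTPS, so reject anything else early.
    if webhook_url is not None and not webhook_url.startswith("https://"):
        raise ValueError("webhook_url must use HTTPS")

    payload: dict[str, Any] = {"items": items, "output_schema": output_schema}
    # Only send optional parameters that differ from their documented defaults.
    if async_mode:
        payload["async_mode"] = True
    if max_rows is not None:
        payload["max_rows"] = max_rows
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return payload


# Dry run: submit two items but ask the API to process only the first.
payload = build_enrich_payload(
    [{"part_number": "6205-2RS"}, {"part_number": "6206-2RS"}],
    [{"name": "bore_diameter", "type": "string", "description": "Inner diameter in mm"}],
    async_mode=True,
    max_rows=1,
)
```

The resulting `payload` can be passed as the `json=` argument of the `requests.post` call shown at the top of this page.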
Output Schema
Define fields to extract. Each field needs a name, type, and description.
{
  "output_schema": [
    {"name": "material", "type": "string", "description": "Material composition"},
    {"name": "weight_kg", "type": "number", "description": "Weight in kilograms", "unit": "kg"},
    {"name": "certifications", "type": "array", "description": "Safety certifications"}
  ]
}
Field Options
| Option | Type | Description |
|---|---|---|
| name | string | Field name (required) |
| type | string | string, number, integer, boolean, array |
| description | string | What to extract (required) |
| unit | string | Units for numbers (e.g., "kg", "mm") |
| enum | array | Constrain to specific values |
| sample_values | array | Example values to guide extraction |
| array_element_type | string | For arrays: string, number, image_url |
| items_enum | array | For array-type fields, constrains array items to specific values |
| required | boolean | Marks whether this field is required in the output |
| merge | boolean | Merge enriched values with existing values instead of overwriting |
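Checking a schema locally before submitting can catch typos without spending credits. The following is a minimal sketch based only on the options table above; `schema_problems` is a hypothetical helper, and the server may apply further rules that are not documented here.

```python
# Field types listed in the Field Options table.
ALLOWED_TYPES = {"string", "number", "integer", "boolean", "array"}


def schema_problems(output_schema):
    """Return a list of human-readable problems found in an output_schema."""
    problems = []
    for index, field in enumerate(output_schema):
        label = field.get("name") or f"field #{index}"
        # name and description are marked as required in the table.
        for key in ("name", "description"):
            if not field.get(key):
                problems.append(f"{label}: missing required option '{key}'")
        if field.get("type") not in ALLOWED_TYPES:
            problems.append(f"{label}: unsupported type {field.get('type')!r}")
        # items_enum and array_element_type are documented for array fields only.
        for array_only in ("items_enum", "array_element_type"):
            if array_only in field and field.get("type") != "array":
                problems.append(f"{label}: {array_only} only applies to array fields")
    return problems


good = [{"name": "weight_kg", "type": "number", "description": "Weight in kilograms", "unit": "kg"}]
bad = [{"name": "finish", "type": "float"}]

print(schema_problems(good))  # no problems reported for a well-formed field
print(schema_problems(bad))   # reports the missing description and the unknown type
```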
Response
{
  "job_id": "abc123-def456-...",
  "results": [{
    "original_data": {"part_number": "6205-2RS", "name": "Deep Groove Ball Bearing"},
    "after_data": {
      "part_number": "6205-2RS",
      "name": "Deep Groove Ball Bearing",
      "bore_diameter": "25 mm",
      "outer_diameter": "52 mm",
      "sources": {
        "bore_diameter": ["https://skf.com/products/bearings/6205-2RS"],
        "outer_diameter": ["https://skf.com/products/bearings/6205-2RS"]
      },
      "source_explanations": {
        "bore_diameter": "Found on manufacturer product page",
        "outer_diameter": "Found on manufacturer product page"
      }
    },
    "all_sources": ["https://skf.com/products/bearings/6205-2RS"]
  }],
  "total_items": 1,
  "successful": 1,
  "credits_used": 1,
  "status": "completed"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| job_id | string | Job ID for tracking |
| results | array | Enriched items (empty if async_mode=true) |
| total_items | integer | Total items submitted |
| total_rows | integer | Total items before max_rows limit |
| successful | integer | Successfully enriched items |
| credits_used | integer | Credits used (1 per item) |
| status | string | "running", "completed", or "failed" |
Result Item Fields
| Field | Type | Description |
|---|---|---|
| original_data | object | Input data |
| after_data | object | Complete enriched record. Contains original fields, enriched values, sources (per-field URL citations), and source_explanations (per-field derivation explanations) |
| all_sources | array | Deduplicated list of all URLs used across all fields |
| error | string | Error message if failed |
| category_id | string | Predicted category (if taxonomy enabled) |
| category_path | string | Full path like "Bearings > Ball Bearings" |
| taxonomy_attributes | object | Category-specific attributes |
| quality_score | integer | 1-5 score (if quality_prompt set) |
| quality_result | object | {score, explanation, issues, suggestions} |
| taxonomy_attribute_explanations | object | Per-attribute explanations of how taxonomy values were derived |
| review_info | object | AI reasoning and flags (see below) |
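Because after_data mixes the original fields, the enriched values, and citation metadata, client code usually needs to pull these apart. The sketch below shows one way to do that; the helper names are hypothetical, the error message in the sample is invented, and the code assumes that a failed item carries a non-empty `error` while a successful one omits it.

```python
# Keys inside after_data that hold citation metadata rather than field values.
METADATA_KEYS = {"sources", "source_explanations"}


def split_results(body):
    """Partition result items into (succeeded, failed) using the error field."""
    succeeded, failed = [], []
    for item in body.get("results", []):
        (failed if item.get("error") else succeeded).append(item)
    return succeeded, failed


def enriched_fields(item):
    """Return only the values that enrichment added to the input record."""
    original = item.get("original_data", {})
    after = item.get("after_data", {})
    return {
        key: value
        for key, value in after.items()
        if key not in original and key not in METADATA_KEYS
    }


sample = {
    "results": [
        {
            "original_data": {"part_number": "6205-2RS"},
            "after_data": {
                "part_number": "6205-2RS",
                "bore_diameter": "25 mm",
                "sources": {"bore_diameter": ["https://skf.com/products/bearings/6205-2RS"]},
            },
        },
        # Invented failure record, for illustration only.
        {"original_data": {"part_number": "UNKNOWN"}, "error": "No sources found"},
    ]
}

ok, failed = split_results(sample)
for item in ok:
    print(item["original_data"]["part_number"], enriched_fields(item))
```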
GET /public/enrich/{job_id}
Poll enrichment job status and retrieve results. Results are returned progressively — partial results are available while the job is still running, as each batch of items completes processing.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| page | integer | 1 | Page number (1-indexed) |
| page_size | integer | 1000 | Results per page |
import requests

job_id = "YOUR_JOB_ID"  # job_id returned by POST /public/enrich

# Basic poll
response = requests.get(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)

# With pagination
response = requests.get(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"page": 1, "page_size": 50}
)
Poll Response
{
  "job_id": "abc123-def456-...",
  "status": "running",
  "results": [
    {
      "original_data": {"part_number": "6205-2RS"},
      "after_data": {
        "part_number": "6205-2RS",
        "bore_diameter": "25 mm",
        "sources": {
          "bore_diameter": ["https://skf.com/products/bearings/6205-2RS"]
        }
      },
      "all_sources": ["https://skf.com/products/bearings/6205-2RS"]
    }
  ],
  "total_items": 500,
  "completed_items": 200,
  "successful": 0,
  "credits_used": 0,
  "page": 1,
  "page_size": 50,
  "total_pages": 4
}
Poll Response Fields
| Field | Type | Description |
|---|---|---|
| job_id | string | Job ID |
| status | string | Current status (see below) |
| results | array | Enriched items for the current page |
| total_items | integer | Total items submitted |
| completed_items | integer | Items with results available so far (grows while running) |
| successful | integer | Finalized success count (set when job finishes) |
| credits_used | integer | Credits charged (0 while running) |
| page | integer | Current page number |
| page_size | integer | Results per page |
| total_pages | integer | Total pages based on completed_items |
| service_usage | object | Cost details (only present when job finishes) |
While status is running, completed_items reflects how many items have finished so far. Use completed_items and total_items to show progress. credits_used remains 0 until the job finishes.
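As a worked example of the progress arithmetic, the sketch below derives a completion fraction and the number of result pages currently available from a poll response. It assumes total_pages is the ceiling of completed_items divided by page_size, which matches the sample response (200 completed items at 50 per page gives 4 pages) but is not stated explicitly; `job_progress` is a hypothetical helper.

```python
import math


def job_progress(poll_body):
    """Return (fraction_done, pages_available) for a poll response body."""
    total = poll_body.get("total_items", 0)
    done = poll_body.get("completed_items", 0)
    page_size = poll_body.get("page_size", 1000)  # documented default page size

    fraction = done / total if total else 0.0
    # Assumption: pages cover only the items completed so far.
    pages = math.ceil(done / page_size) if page_size else 0
    return fraction, pages


poll_body = {"status": "running", "total_items": 500, "completed_items": 200, "page_size": 50}
fraction, pages = job_progress(poll_body)
print(f"{fraction:.0%} done, {pages} page(s) of results available")
```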
Status Values
| Status | Description | Results available? |
|---|---|---|
| running | Job is still processing | Yes: partial results from completed batches |
| completed | Job finished, all results ready | Yes: all results |
| failed | Job failed or was cancelled | Yes: partial results from completed batches |
This endpoint reports cancelled jobs as failed. Partial results and credits for completed items are preserved.
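A polling loop only needs to stop on the two terminal statuses in the table above. The sketch below takes the fetch function as a parameter so the loop can be exercised without network access; the 5-second interval and the cap of 120 polls are arbitrary assumptions, and both function names are hypothetical.

```python
import time

# Statuses after which the poll endpoint will not report further progress.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job reaches a terminal status and return the last body."""
    for _ in range(max_polls):
        body = fetch_status(job_id)
        if body["status"] in TERMINAL_STATUSES:
            return body
        sleep(interval_seconds)
    raise TimeoutError(f"Job {job_id} still running after {max_polls} polls")


def fetch_status_http(job_id, api_key="YOUR_API_KEY"):
    """Fetch one poll response from GET /public/enrich/{job_id}."""
    import requests  # imported lazily so the sketch loads without the dependency

    response = requests.get(
        f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

In real use, `wait_for_job(fetch_status_http, job_id)` blocks until the job finishes; a failed status still returns a body, so callers can read the partial results it contains.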
Source Citations
Each result includes all_sources — the URLs used during enrichment.
{
  "all_sources": [
    "https://skf.com/products/bearings/6205-2RS",
    "https://mcmaster.com/6205-2RS"
  ]
}
When web_search is false, the AI uses only the input data (INPUT_DATA) to derive field values.
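One use of all_sources is auditing which sites a result drew on, for example to confirm that an allowed_domains restriction held. The sketch below is a client-side check only: treating subdomains as matching their parent domain is an assumption, not documented API behavior, and `sources_outside` is a hypothetical helper.

```python
from urllib.parse import urlparse


def sources_outside(all_sources, allowed_domains):
    """Return the source URLs whose host is not within any allowed domain."""
    allowed = [domain.lower() for domain in allowed_domains]
    outside = []
    for url in all_sources:
        host = (urlparse(url).hostname or "").lower()
        # Assumption: "www.skf.com" counts as inside "skf.com".
        if not any(host == domain or host.endswith("." + domain) for domain in allowed):
            outside.append(url)
    return outside


all_sources = [
    "https://skf.com/products/bearings/6205-2RS",
    "https://mcmaster.com/6205-2RS",
]
print(sources_outside(all_sources, ["skf.com"]))
```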
AI Transparency
Every result includes review_info with AI reasoning:
{
  "review_info": {
    "reasoning": "Found specs on manufacturer website, cross-referenced with distributor",
    "flags": ["verify_tolerance_class"],
    "field_issues": [
      {"field": "tolerance", "severity": "warning", "message": "Value not found in sources"}
    ],
    "flag_record": true
  }
}
| Field | Type | Description |
|---|---|---|
| reasoning | string | AI explanation of how values were extracted |
| flags | array | General flags or warnings about the extraction |
| field_issues | array | Per-field issues with field, severity (error/warning/info), message, and validator |
| flag_record | boolean | Whether this record needs human attention |
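The review_info fields lend themselves to building a human review queue. The sketch below routes a record to review when flag_record is true or when any field issue has severity error; that routing policy and the `needs_review` name are illustrative choices, not API behavior.

```python
def needs_review(item):
    """Decide whether a result item should be queued for human review."""
    info = item.get("review_info") or {}
    if info.get("flag_record"):
        return True
    # Also escalate records with at least one error-level field issue.
    return any(issue.get("severity") == "error" for issue in info.get("field_issues", []))


flagged = {
    "review_info": {
        "flags": ["verify_tolerance_class"],
        "field_issues": [
            {"field": "tolerance", "severity": "warning", "message": "Value not found in sources"}
        ],
        "flag_record": True,
    }
}
clean = {"review_info": {"flags": [], "field_issues": [], "flag_record": False}}

review_queue = [item for item in (flagged, clean) if needs_review(item)]
print(len(review_queue))
```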
Webhooks
Instead of polling GET /public/enrich/{job_id}, you can provide a webhook_url to receive a POST when the job finishes. The URL must use HTTPS.
{
  "items": [...],
  "output_schema": [...],
  "async_mode": true,
  "webhook_url": "https://yourserver.com/hooks/enrich"
}
Webhook Payload
When the job completes, we POST a JSON body to your URL:
{
  "job_id": "abc123-def456-...",
  "status": "completed",
  "successful": 150,
  "failed": 5,
  "total": 155,
  "credits_used": 150
}
| Field | Type | Description |
|---|---|---|
| job_id | string | Use with GET /public/enrich/{job_id} to fetch full results |
| status | string | Always "completed" |
| successful | integer | Items enriched successfully |
| failed | integer | Items that failed |
| total | integer | successful + failed |
| credits_used | integer | Credits charged |
Webhook delivery is best-effort with a single attempt and a 30-second timeout. If delivery fails the job is unaffected — use polling as a fallback.
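On the receiving side, a handler should acknowledge quickly and defer fetching full results, since delivery is attempted once with a 30-second timeout. The sketch below uses only the Python standard library; it serves plain HTTP, so it assumes a TLS-terminating proxy in front to satisfy the HTTPS requirement, and the names `parse_webhook`, `EnrichWebhookHandler`, and `serve` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(raw_body):
    """Extract the fields a receiver typically needs from the webhook body."""
    payload = json.loads(raw_body)
    return {
        "job_id": payload["job_id"],
        "all_succeeded": payload.get("failed", 0) == 0,
        "summary": f"{payload.get('successful', 0)}/{payload.get('total', 0)} items enriched",
    }


class EnrichWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_webhook(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or a body without job_id.
            self.send_response(400)
            self.end_headers()
            return
        # Acknowledge first; fetch results later via GET /public/enrich/{job_id}.
        self.send_response(204)
        self.end_headers()
        print(f"Job {event['job_id']} finished: {event['summary']}")


def serve(port=8080):
    """Run the receiver until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("", port), EnrichWebhookHandler).serve_forever()
```

Keeping a polling fallback alongside this receiver covers the case where the single delivery attempt is lost.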
Job Management
GET /public/enrich/jobs
List enrichment jobs for your organization, with optional status filtering and pagination.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| status | string | (all) | Filter by status: running, completed, failed, cancelled |
| limit | integer | 20 | Results per page (1–100) |
| offset | integer | 0 | Pagination offset |
import requests

# List running jobs
response = requests.get(
    "https://catalogapi.rastro.ai/api/public/enrich/jobs",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    params={"status": "running"}
)
jobs = response.json()
Response
{
  "jobs": [
    {
      "job_id": "abc123-def456-...",
      "status": "running",
      "total_items": 500,
      "completed_items": 200,
      "failed_items": 3,
      "created_at": "2026-02-11T10:00:00+00:00",
      "started_at": "2026-02-11T10:00:01+00:00",
      "finished_at": null
    }
  ],
  "total": 12
}
| Field | Type | Description |
|---|---|---|
| jobs | array | List of job summaries |
| total | integer | Total jobs matching the filter |
Each job object:
| Field | Type | Description |
|---|---|---|
| job_id | string | Job ID (use with other endpoints) |
| status | string | running, completed, failed, or cancelled |
| total_items | integer | Items submitted |
| completed_items | integer | Items processed so far |
| failed_items | integer | Items that failed |
| created_at | string | ISO 8601 timestamp |
| started_at | string | ISO 8601 timestamp (null if not yet started) |
| finished_at | string | ISO 8601 timestamp (null if still running) |
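Walking the full job list means advancing offset until total is reached. The sketch below does this with a caller-supplied page fetcher so it can run without network access; `iter_jobs` and the `fetch_page` signature are hypothetical, and in practice `fetch_page` would wrap the GET request shown above with `params={"status": ..., "limit": ..., "offset": ...}`.

```python
def iter_jobs(fetch_page, status=None, limit=20):
    """Yield every job summary by paging with limit and offset."""
    offset = 0
    while True:
        page = fetch_page(status=status, limit=limit, offset=offset)
        jobs = page.get("jobs", [])
        yield from jobs
        offset += len(jobs)
        # Stop on an empty page or once every matching job has been seen.
        if not jobs or offset >= page.get("total", 0):
            break


# Stand-in for the real endpoint: five fake jobs served two at a time.
fake_jobs = [{"job_id": f"job-{n}", "status": "completed"} for n in range(5)]


def fake_fetch_page(status, limit, offset):
    return {"jobs": fake_jobs[offset:offset + limit], "total": len(fake_jobs)}


job_ids = [job["job_id"] for job in iter_jobs(fake_fetch_page, limit=2)]
print(job_ids)
```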
POST /public/enrich/{job_id}/cancel
Cancel a running enrichment job. Items already processed are kept and credits are charged only for completed items.
import requests

job_id = "YOUR_JOB_ID"  # ID of a job whose status is running

response = requests.post(
    f"https://catalogapi.rastro.ai/api/public/enrich/{job_id}/cancel",
    headers={"Authorization": "Bearer YOUR_API_KEY"}
)
result = response.json()
print(f"Cancelled — {result['completed_items']} items kept, {result['credits_used']} credits charged")
Response
{
  "success": true,
  "message": "Activity abc123-def456 cancelled",
  "completed_items": 200,
  "credits_used": 200
}
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether cancellation succeeded |
| message | string | Human-readable status |
| completed_items | integer | Items that finished before cancellation |
| credits_used | integer | Credits charged for completed items |
You can only cancel jobs with status running. Completed or already-cancelled jobs return an error.
Concurrency Limits
Each organization can run up to 10 concurrent enrichment jobs. Submitting an 11th job returns 429 Too Many Requests:
{
"detail": "Too many concurrent enrichment jobs (10/10). List running jobs: GET /api/public/enrich/jobs?status=running — Cancel a job: POST /api/public/enrich/{job_id}/cancel"
}
To unblock yourself:
- List running jobs: GET /api/public/enrich/jobs?status=running
- Cancel any jobs you no longer need: POST /api/public/enrich/{job_id}/cancel
- Retry your request
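The three steps above can be sketched as a small retry routine. The HTTP calls are passed in as callables so that the policy for which jobs count as no longer needed stays with the caller; every name here is hypothetical, and `submit` is assumed to return a `(status_code, body)` pair.

```python
def submit_with_unblock(submit, list_running, cancel, should_cancel, max_attempts=3):
    """Submit a job; on 429, cancel unneeded running jobs and retry."""
    for attempt in range(max_attempts):
        status_code, body = submit()
        if status_code != 429:
            return status_code, body
        if attempt == max_attempts - 1:
            break  # no retry left, so do not cancel anything further
        # Step 1: list running jobs. Step 2: cancel the ones the caller marks.
        unneeded = [job for job in list_running() if should_cancel(job)]
        if not unneeded:
            break
        for job in unneeded:
            cancel(job["job_id"])
        # Step 3: the next loop iteration retries the request.
    raise RuntimeError("Concurrency limit still reached and nothing left to cancel")


# In-memory stand-ins for the three endpoints, for illustration only.
running = [{"job_id": f"job-{n}"} for n in range(10)]


def fake_submit():
    return (429, {}) if len(running) >= 10 else (200, {"job_id": "new-job"})


def fake_cancel(job_id):
    running[:] = [job for job in running if job["job_id"] != job_id]


status_code, body = submit_with_unblock(
    fake_submit, lambda: list(running), fake_cancel, lambda job: job["job_id"] == "job-0"
)
print(status_code, body)
```

Cancelling is destructive for the jobs affected, so `should_cancel` deserves a conservative rule; waiting for running jobs to finish is a safer alternative when nothing is clearly disposable.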