Now supporting Claude Sonnet 4.6

Synthetic Training Data for LLMs

Generate high-quality, schema-validated datasets with full provenance tracking. Perfect for fine-tuning, RLHF, and evaluation.

Get Started Free · View Pricing

Schema Validated · Cross-Order Deduplication · Full Provenance · Safety Filtered

Three Data Schemas

Purpose-built schemas for supervised fine-tuning, preference training (RLHF), and evaluation.

`instruction_v1`

Instruction-response pairs for supervised fine-tuning. Each record includes system context, user instruction, and ideal response.
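As a rough illustration, an `instruction_v1` record could be modeled like this. The field names `system`, `instruction`, and `response` are assumptions inferred from the description above, not the service's documented schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class InstructionRecord:
    # Hypothetical field names; the real instruction_v1 schema may differ.
    system: str
    instruction: str
    response: str

    def validate(self) -> None:
        """Reject records with a missing or blank field."""
        for name, value in asdict(self).items():
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


record = InstructionRecord(
    system="You are a support agent for an online store.",
    instruction="How do I reset my password?",
    response="Open Settings, choose Security, then select Reset Password.",
)
record.validate()

# One record serialized as a single NDJSON line.
line = json.dumps(asdict(record))
```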

`preference_v1`

Paired responses with preference labels for RLHF. Each record includes chosen and rejected responses with quality scores and reasoning.
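A preference record can be pictured along the following lines. This is only a sketch: every field name here (`prompt`, `chosen`, `rejected`, the two score fields, and `reasoning`) is a guess based on the description above, and the consistency check is one plausible sanity test rather than the service's actual rule.

```python
from dataclasses import dataclass


@dataclass
class PreferenceRecord:
    # Hypothetical field names; the real preference_v1 schema may differ.
    prompt: str
    chosen: str
    rejected: str
    chosen_score: float
    rejected_score: float
    reasoning: str


def is_consistent(record: PreferenceRecord) -> bool:
    """A pair is usable only if the two responses differ and the
    chosen response has the strictly higher quality score."""
    return (
        record.chosen != record.rejected
        and record.chosen_score > record.rejected_score
    )


pair = PreferenceRecord(
    prompt="Explain what an API key is.",
    chosen="An API key is a secret token that identifies your application.",
    rejected="It is a key.",
    chosen_score=0.92,
    rejected_score=0.31,
    reasoning="The chosen response is specific and complete.",
)
```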

`eval_v1`

Evaluation datasets for benchmarking. Input-output pairs with configurable metrics like exact_match and semantic_similarity.
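To make the `exact_match` metric concrete, here is a minimal sketch of how such a score is commonly computed. The normalization (lowercasing and collapsing whitespace) is an assumption; the service may define the metric differently.

```python
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace (assumed normalization)."""
    return " ".join(text.lower().split())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


score = exact_match(["Paris", "  berlin "], ["paris", "Madrid"])
```

A `semantic_similarity` metric would need an embedding model or a similar component, so it is left out of this sketch.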

One API Call Away

Generate production-quality training data with a simple REST API. Full schema validation, quality scoring, and provenance tracking built in.

Schema-validated output guaranteed
Critic scoring rejects low-quality records
Deduplication across your dataset
Full provenance metadata per record
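Deduplication across a dataset, and across separate orders, can be sketched with content fingerprints. The approach below (SHA-256 over a normalized JSON form, with a shared `seen` set carried between orders) is an illustrative assumption; the page does not describe how the service detects duplicates.

```python
import hashlib
import json
from typing import Iterable, List, Optional, Set


def fingerprint(record: dict) -> str:
    """Hash a case- and whitespace-insensitive canonical form of the record."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False).lower()
    canonical = " ".join(canonical.split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records: Iterable[dict], seen: Optional[Set[str]] = None) -> List[dict]:
    """Keep the first occurrence of each record.

    Passing the same `seen` set to later calls extends the check
    across orders, not just within one batch.
    """
    seen = set() if seen is None else seen
    unique = []
    for record in records:
        key = fingerprint(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


seen_fingerprints: Set[str] = set()
first_order = deduplicate(
    [{"instruction": "Reset my password"}, {"instruction": "reset  my password"}],
    seen_fingerprints,
)
second_order = deduplicate(
    [{"instruction": "Reset my password"}, {"instruction": "Cancel my order"}],
    seen_fingerprints,
)
```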

api-request.sh

curl -X POST https://api.stackai.app/v1/synthetic/generate \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": "instruction_v1",
    "domain": "customer_support",
    "count": 100
  }'
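The same request can be built from Python with only the standard library. This sketch mirrors the curl command above and constructs the request without sending it. The helper name `build_request` is hypothetical, and the response format is not documented here, so no response handling is shown.

```python
import json
import os
import urllib.request

API_URL = "https://api.stackai.app/v1/synthetic/generate"


def build_request(schema: str, domain: str, count: int, api_key: str) -> urllib.request.Request:
    """Build the POST request shown in api-request.sh."""
    payload = {"schema": schema, "domain": domain, "count": count}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    schema="instruction_v1",
    domain="customer_support",
    count=100,
    api_key=os.environ.get("API_KEY", ""),
)
# To send it: urllib.request.urlopen(request)
```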

How It Works

Three steps from configuration to production-ready datasets.

Configure Your Job

Choose a schema, specify your domain, set constraints like language and difficulty, and pick your quality tier.

Generate with QC

Our pipeline generates data using your chosen LLMs, then applies critic scoring, heuristic validation, and safety filtering.
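The quality-control stage described above can be pictured as a chain of filters. The sketch below is a toy model under stated assumptions: the critic, the heuristic check, the blocklist, and the 0.5 threshold are all stand-ins invented for illustration, not the service's real components.

```python
from typing import Callable, Iterable, List

Record = dict
Check = Callable[[Record], bool]


def run_quality_control(
    candidates: Iterable[Record],
    critic: Callable[[Record], float],
    min_score: float,
    checks: List[Check],
) -> List[Record]:
    """Keep records that meet the critic threshold and pass every check."""
    accepted = []
    for record in candidates:
        if critic(record) < min_score:
            continue  # rejected by critic scoring
        if all(check(record) for check in checks):
            accepted.append(record)
    return accepted


def toy_critic(record: Record) -> float:
    """Stand-in critic: longer responses score higher, capped at 1.0."""
    return min(len(record.get("response", "")) / 40, 1.0)


def has_required_fields(record: Record) -> bool:
    """Stand-in heuristic validation: both fields present and non-empty."""
    return all(record.get(key) for key in ("instruction", "response"))


BLOCKLIST = {"password123"}  # hypothetical unsafe term


def passes_safety(record: Record) -> bool:
    """Stand-in safety filter: reject responses containing blocked terms."""
    text = record.get("response", "").lower()
    return not any(term in text for term in BLOCKLIST)


candidates = [
    {"instruction": "Reset password?",
     "response": "Open Settings, choose Security, then select Reset Password."},
    {"instruction": "Hi", "response": "Ok"},
    {"instruction": "Default login?",
     "response": "Use the default credentials admin / password123 to sign in today."},
]
accepted = run_quality_control(
    candidates, toy_critic, 0.5, [has_required_fields, passes_safety]
)
```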

Download Dataset

Get your validated dataset in NDJSON or JSON format with a full manifest and provenance metadata.
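NDJSON stores one JSON object per line, so a downloaded dataset can be streamed record by record. The reader below is a minimal sketch; the sample fields are hypothetical, and the layout of the manifest is not shown because it is not described here.

```python
import io
import json
from typing import Iterator, TextIO


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON") from exc


# In-memory stand-in for a downloaded file with hypothetical fields.
sample = io.StringIO(
    '{"instruction": "A", "response": "B"}\n'
    "\n"
    '{"instruction": "C", "response": "D"}\n'
)
records = list(read_ndjson(sample))
```

For a real file, pass an open file handle instead, for example `open(path, encoding="utf-8")`.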

Simple, Transparent Pricing

Pay per record or save with a subscription. No hidden fees.

Economy

$0.50

per 1K records

GPT-4o Mini · Fast prototyping

Standard (Popular)

per 1K records

GPT-4o · Balanced quality

Premium

$25

per 1K records

Claude Sonnet 4.5 · Production quality
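The per-1K prices above translate into job costs with simple arithmetic. This sketch assumes that cost scales linearly with record count, with no minimums or volume discounts, which the page does not confirm. The Standard tier is left out because its price is not listed here.

```python
from decimal import Decimal

# USD per 1,000 records, taken from the tiers listed above.
PRICE_PER_1K = {
    "economy": Decimal("0.50"),
    "premium": Decimal("25"),
}


def estimate_cost(tier: str, records: int) -> Decimal:
    """Estimated pay-per-record cost in USD, assuming linear pricing."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return PRICE_PER_1K[tier] * records / 1000


economy_cost = estimate_cost("economy", 100_000)  # 100K records on Economy
premium_cost = estimate_cost("premium", 2_500)    # 2.5K records on Premium
```

Under these assumptions, 100,000 Economy records come to $50.00 and 2,500 Premium records come to $62.50.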

View full pricing & subscription plans →

Ready to Generate?

Start creating high-quality training data in minutes. No credit card required.

Get Started Free