Documentation
Quick Start
Generate synthetic training data with a single API call:
terminal
curl -X POST https://api.stackai.app/v1/synthetic/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": "instruction_v1",
    "domain": "customer_support",
    "count": 100,
    "models": {
      "generator": {"provider": "openai", "model": "gpt-4o"}
    }
  }'

Data Schemas
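The example records in this section suggest a small set of top-level fields for each schema. A minimal client-side check might look like the sketch below. It assumes every field shown in those examples is required, which the documentation does not state. `EXPECTED_FIELDS` and `missing_fields` are hypothetical names and not part of the API.

```python
# Hypothetical client-side check. The field lists are copied from the example
# records in this section and are ASSUMED to be required; the documentation
# does not say which fields are optional.
EXPECTED_FIELDS = {
    "instruction_v1": {"system_prompt", "instruction", "response", "metadata"},
    "preference_v1": {
        "prompt", "chosen", "rejected",
        "chosen_score", "rejected_score", "reasoning",
    },
    "eval_v1": {"input", "ideal_output", "metrics"},
}


def missing_fields(schema: str, record: dict) -> list[str]:
    """Return the sorted names of expected top-level fields absent from record."""
    if schema not in EXPECTED_FIELDS:
        raise ValueError(f"unknown schema: {schema!r}")
    return sorted(EXPECTED_FIELDS[schema] - set(record))


record = {"input": "What is the capital of France?", "ideal_output": "Paris"}
print(missing_fields("eval_v1", record))  # → ['metrics']
```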
instruction_v1
Instruction-response pairs for supervised fine-tuning.
instruction_v1.json
{
  "system_prompt": "You are a helpful assistant...",
  "instruction": "How do I reset my password?",
  "response": "To reset your password, follow these steps...",
  "metadata": {
    "domain": "customer_support",
    "difficulty": "easy"
  }
}

preference_v1
Paired responses with preference labels for RLHF training.
preference_v1.json
{
  "prompt": "Explain quantum computing",
  "chosen": "Quantum computing uses quantum mechanics...",
  "rejected": "Quantum computing is very complicated...",
  "chosen_score": 8.5,
  "rejected_score": 4.2,
  "reasoning": "The chosen response provides clear explanation..."
}

eval_v1
Evaluation datasets for benchmarking model performance.
eval_v1.json
{
  "input": "What is the capital of France?",
  "ideal_output": "Paris",
  "metrics": ["exact_match", "factual_accuracy"]
}

Models & Quality Tiers
Choose a quality tier based on your needs. Each tier maps to a specific model:
| Tier | Provider | Model | Schemas | Price/1K |
|---|---|---|---|---|
| Economy | OpenAI | gpt-4o-mini | instruction_v1, eval_v1 | $0.50 |
| Standard | OpenAI | gpt-4o | instruction_v1, eval_v1 | $5.00 |
| Premium | Anthropic | claude-sonnet-4-6 | instruction_v1, preference_v1, eval_v1 | $25.00 |
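As a rough illustration, the tier table can be turned into a lookup for budgeting and for checking which tiers support a schema. This sketch assumes that "Price/1K" means US dollars per 1,000 generated records (the table does not state the unit) and that the short schema names in the table refer to the `_v1` schemas. `TIERS`, `estimate_cost`, and `tiers_for_schema` are hypothetical names.

```python
from decimal import Decimal

# Values copied from the tier table.
# ASSUMPTION: "Price/1K" is USD per 1,000 generated records.
# ASSUMPTION: "instruction", "preference", "eval" mean the *_v1 schemas.
TIERS = {
    "economy": {
        "provider": "openai", "model": "gpt-4o-mini",
        "schemas": {"instruction_v1", "eval_v1"},
        "price_per_1k": Decimal("0.50"),
    },
    "standard": {
        "provider": "openai", "model": "gpt-4o",
        "schemas": {"instruction_v1", "eval_v1"},
        "price_per_1k": Decimal("5.00"),
    },
    "premium": {
        "provider": "anthropic", "model": "claude-sonnet-4-6",
        "schemas": {"instruction_v1", "preference_v1", "eval_v1"},
        "price_per_1k": Decimal("25.00"),
    },
}


def estimate_cost(tier: str, count: int) -> Decimal:
    """Estimate the cost in USD of generating `count` records on a tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    if count < 0:
        raise ValueError("count must be non-negative")
    return TIERS[tier]["price_per_1k"] * count / 1000


def tiers_for_schema(schema: str) -> list[str]:
    """List the tiers whose table row includes the given schema."""
    return [name for name, tier in TIERS.items() if schema in tier["schemas"]]


print(estimate_cost("standard", 100))
print(tiers_for_schema("preference_v1"))  # → ['premium']
```

Under these assumptions, `preference_v1` jobs are only available on the Premium tier, so a budget estimate for preference data should use the Premium price.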
Generator vs Critic
All schemas require a generator model. The preference_v1 schema also uses a critic model to score and compare responses. Other schemas do not use a critic.
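The rule above can be sketched as a small helper that assembles the `models` object of a request. It assumes the critic must be supplied explicitly for `preference_v1`; the documentation does not say whether the API falls back to a default. `build_models` is a hypothetical helper, not part of the API.

```python
from typing import Optional


def build_models(schema: str, generator: dict, critic: Optional[dict] = None) -> dict:
    """Build the "models" object for a generation request.

    `generator` and `critic` are {"provider": ..., "model": ...} dicts,
    matching the shape used in the request examples.
    """
    models = {"generator": generator}
    if schema == "preference_v1":
        # ASSUMPTION: the critic has to be given explicitly for this schema.
        if critic is None:
            raise ValueError("preference_v1 needs a critic model")
        models["critic"] = critic
    # Other schemas do not use a critic, so any critic passed in is dropped.
    return models


sonnet = {"provider": "anthropic", "model": "claude-sonnet-4-6"}
print(build_models("instruction_v1", sonnet, critic=sonnet))
# → {'generator': {'provider': 'anthropic', 'model': 'claude-sonnet-4-6'}}
```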
preference_v1 with critic
curl -X POST https://api.stackai.app/v1/synthetic/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": "preference_v1",
    "domain": "code_review",
    "count": 50,
    "models": {
      "generator": {"provider": "anthropic", "model": "claude-sonnet-4-6"},
      "critic": {"provider": "anthropic", "model": "claude-sonnet-4-6"}
    }
  }'

API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/synthetic/generate | Create a new generation job |
| GET | /v1/synthetic/jobs/:jobId | Get job status and statistics |
| GET | /v1/synthetic/jobs/:jobId/results | Stream job results |
| GET | /v1/synthetic/jobs/:jobId/results/url | Get presigned download URL |
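For readers calling these endpoints from Python rather than curl, the sketch below builds the corresponding requests with the standard library. It only constructs `urllib.request.Request` objects and does not send them. It assumes the GET endpoints accept the same Bearer token as the POST example, which is not stated here, and it makes no assumption about response fields. `generate_request` and `job_request` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.stackai.app"  # host taken from the curl examples

# Path suffixes after /v1/synthetic/jobs/:jobId, from the endpoint list.
JOB_SUFFIXES = {"", "/results", "/results/url"}


def generate_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request that creates a generation job."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/synthetic/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def job_request(api_key: str, job_id: str, suffix: str = "") -> urllib.request.Request:
    """Build a GET request for job status, results, or the download URL."""
    if suffix not in JOB_SUFFIXES:
        raise ValueError(f"unsupported suffix: {suffix!r}")
    # Percent-encode the job id so it stays a single path segment.
    path = f"/v1/synthetic/jobs/{urllib.parse.quote(job_id, safe='')}{suffix}"
    return urllib.request.Request(
        BASE_URL + path,
        # ASSUMPTION: GET endpoints use the same Bearer authentication.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = job_request("YOUR_API_KEY", "job_123", "/results")
print(req.get_method(), req.full_url)
# → GET https://api.stackai.app/v1/synthetic/jobs/job_123/results
```

A request built this way can be sent with `urllib.request.urlopen(req)`. How the job ID is returned by the create call is not documented here, so `job_123` is only a placeholder.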
Constraints
Control generation with optional constraints:
| Parameter | Type | Description |
|---|---|---|
| language | string | ISO language code (e.g., "en", "es") |
| tone | string | Writing tone (e.g., "formal", "casual") |
| difficulty | string | Content difficulty (e.g., "easy", "hard") |
| max_tokens | number | Maximum tokens per response (50-4000) |
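A client can check constraints against this table before submitting a job. The sketch below assumes the 50-4000 range for `max_tokens` is inclusive and treats any parameter not listed in the table as an error. Where the constraints sit inside the request body is not specified here, so the helper only validates a plain dictionary. `validate_constraints` is a hypothetical name.

```python
STRING_CONSTRAINTS = ("language", "tone", "difficulty")
KNOWN_CONSTRAINTS = set(STRING_CONSTRAINTS) | {"max_tokens"}


def validate_constraints(constraints: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    for name in STRING_CONSTRAINTS:
        if name in constraints and not isinstance(constraints[name], str):
            errors.append(f"{name} must be a string")
    if "max_tokens" in constraints:
        value = constraints["max_tokens"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append("max_tokens must be a number")
        # ASSUMPTION: both ends of the documented 50-4000 range are allowed.
        elif not 50 <= value <= 4000:
            errors.append("max_tokens must be between 50 and 4000")
    for name in sorted(set(constraints) - KNOWN_CONSTRAINTS):
        errors.append(f"unknown constraint: {name}")
    return errors


print(validate_constraints({"language": "en", "max_tokens": 8000}))
# → ['max_tokens must be between 50 and 4000']
```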