nebius-batch-synthetic

Generate synthetic training data using Nebius Token Factory batch inference (50% cheaper than real-time, async, no rate-limit impact). Use this skill whenever the user wants to run batch inference on Nebius, generate synthetic datasets at scale, create instruction-tuning data, run async LLM jobs on large prompt sets, or export batch results as fine-tuning JSONL. Trigger for phrases like "generate synthetic data with Nebius", "run batch inference", "create training data at scale", "async LLM generation", "batch job on Token Factory", "generate QA pairs", or any question about bulk/offline inference or synthetic data pipelines on Nebius.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install

Install the skill "nebius-batch-synthetic" with:

npx skills add arindam200/nebius-skills/arindam200-nebius-skills-nebius-batch-synthetic

Nebius Batch Inference — Synthetic Data Generation

Run large-scale async LLM jobs at 50% of the real-time price, with no impact on your rate limits. Ideal for generating synthetic training datasets, annotation, evaluation sets, or any offline bulk inference.

Prerequisites

pip install openai
export NEBIUS_API_KEY="your-key"

API base: https://api.tokenfactory.nebius.com/v1/

Limits & pricing

Constraint              Value
Max requests per file   5,000,000
Max file size           10 GB
Completion window       24 hours
Cost vs real-time       50% cheaper
Rate limits             Not consumed
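Before uploading, it is worth validating a batch file against these limits locally. A minimal sketch (the `preflight` helper is hypothetical, not part of the Nebius SDK):

```python
import os

MAX_REQUESTS = 5_000_000          # max requests per file
MAX_BYTES = 10 * 1024**3          # 10 GB max file size

def preflight(path: str) -> None:
    """Raise if a batch JSONL file exceeds the documented limits."""
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes, over the 10 GB limit")
    with open(path) as f:
        n_requests = sum(1 for _ in f)   # one JSONL line = one request
    if n_requests > MAX_REQUESTS:
        raise ValueError(f"{n_requests} requests, over the 5M per-file limit")
```

Run it on `batch_requests.jsonl` before step 2 to fail fast instead of wasting an upload.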

Complete pipeline

1. Build JSONL batch file

Each line = one inference request. All requests must use the same model.

import json, uuid

prompts = [
    "Explain vector databases for beginners.",
    "What is the difference between RAG and fine-tuning?",
    # ... up to 5M prompts
]

with open("batch_requests.jsonl", "w") as f:
    for prompt in prompts:
        f.write(json.dumps({
            "custom_id": str(uuid.uuid4()),   # unique ID to match results
            "url": "/v1/chat/completions",
            "body": {
                "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
                "messages": [
                    {"role": "system", "content": "You are a helpful expert."},
                    {"role": "user",   "content": prompt},
                ],
                "max_tokens": 1024,
                "temperature": 0.7,
            },
        }) + "\n")

2. Upload + create batch job

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenfactory.nebius.com/v1/",
    api_key=os.environ["NEBIUS_API_KEY"],
)

with open("batch_requests.jsonl", "rb") as f:
    file_obj = client.files.create(file=f, purpose="batch")

batch = client.batches.create(
    input_file_id=file_obj.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={"description": "synthetic-data-gen"},
)
print(f"Batch: {batch.id}  status={batch.status}")

3. Poll until complete

import time

while True:
    batch = client.batches.retrieve(batch.id)
    counts = batch.request_counts
    print(f"status={batch.status}  done={counts.completed}/{counts.total}")
    if batch.status in ("completed", "failed", "cancelled", "expired"):
        break
    time.sleep(30)

4. Download outputs

if batch.status != "completed":
    raise RuntimeError(f"batch ended with status {batch.status}")

content = client.files.content(batch.output_file_id)
results = [json.loads(line) for line in content.text.strip().splitlines()]

Each result record:

{
  "custom_id": "...",
  "response": {
    "body": {
      "choices": [{"message": {"content": "The model's response..."}}]
    }
  }
}
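Individual requests can fail even when the batch as a whole completes, so it helps to partition results before step 5. A small sketch, assuming OpenAI-style records with a per-request `response.status_code` and an optional top-level `error` object:

```python
def split_results(records):
    """Partition parsed batch result records into successes and failures.

    Assumes OpenAI-style batch output: each record carries
    `response.status_code`, and failed requests may carry `error`.
    """
    ok, failed = [], []
    for rec in records:
        status = (rec.get("response") or {}).get("status_code", 200)
        if status == 200 and not rec.get("error"):
            ok.append(rec)
        else:
            failed.append(rec)
    return ok, failed
```

Feed only the successful records into the export step; log the failed `custom_id`s for a retry batch.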

5. Export as fine-tuning JSONL

# Build custom_id → original prompt lookup
id_to_prompt = {}
with open("batch_requests.jsonl") as f:
    for line in f:
        req = json.loads(line)
        user_msg = next(m["content"] for m in req["body"]["messages"] if m["role"] == "user")
        id_to_prompt[req["custom_id"]] = user_msg

with open("training.jsonl", "w") as out:
    for rec in results:
        reply  = rec["response"]["body"]["choices"][0]["message"]["content"].strip()
        prompt = id_to_prompt.get(rec["custom_id"], "")
        if len(reply) < 50:      # quality filter
            continue
        out.write(json.dumps({
            "messages": [
                {"role": "user",      "content": prompt},
                {"role": "assistant", "content": reply},
            ]
        }) + "\n")

Tips for synthetic data quality

  • Use a large teacher model (70B+) to generate, then fine-tune a smaller model — teacher distillation
  • Set temperature: 0.6–0.8 for diverse yet coherent outputs
  • Add a quality filter (min length, keyword checks) before using as training data
  • Deduplicate examples (identical prompts or near-identical responses) before uploading the training file
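The deduplication tip above can be sketched as a hash-based pass over the exported JSONL (the `dedupe_training` helper is illustrative, not part of any SDK):

```python
import hashlib

def dedupe_training(in_path: str, out_path: str) -> int:
    """Copy in_path to out_path, dropping exact-duplicate lines.

    Hashing each line keeps memory bounded even for very large
    JSONL files. Returns the number of examples kept.
    """
    seen, kept = set(), 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            key = hashlib.sha256(line.strip().encode()).hexdigest()
            if key in seen:
                continue
            seen.add(key)
            dst.write(line)
            kept += 1
    return kept
```

Exact-match hashing only removes verbatim duplicates; for near-duplicates, consider a fuzzy pass (e.g. MinHash) as a follow-up.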

Clean up batch files

You can have up to 500 batch files. Delete old ones:

client.files.delete("file_123")
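To clean up in bulk rather than one ID at a time, a sketch assuming the OpenAI-style `client.files.list(purpose=...)` endpoint (the `delete_batch_files` helper and its `keep_ids` parameter are illustrative):

```python
def delete_batch_files(client, keep_ids=()):
    """Delete all files uploaded with purpose='batch', except keep_ids.

    Assumes an OpenAI-compatible files API on the client.
    Returns the list of deleted file IDs.
    """
    deleted = []
    for f in client.files.list(purpose="batch").data:
        if f.id in keep_ids:
            continue
        client.files.delete(f.id)
        deleted.append(f.id)
    return deleted
```

Pass the IDs of any in-flight batch inputs via `keep_ids` so an active job's file is not deleted from under it.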

Bundled reference

Read references/batch-format.md when the user asks about JSONL structure, file limits, or output format.

Reference script

Full working script: scripts/05_batch_inference_synthetic.py

Docs: https://docs.tokenfactory.nebius.com/ai-models-inference/batch-inference


Archived SourceRecently Updated