tinker

Fine-tune LLMs using the Tinker API. Covers supervised fine-tuning, reinforcement learning, LoRA training, vision-language models, and both high-level Cookbook patterns and low-level API usage.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the following and send it to your AI assistant to install the skill:

Install skill "tinker" with this command: npx skills add sundial-org/skills/sundial-org-skills-tinker

Tinker API - LLM Fine-Tuning

Overview

Tinker is a training API for large language models from Thinking Machines Lab. It provides:

  • Supervised Fine-Tuning (SFT): train models on instruction/completion pairs
  • Reinforcement Learning (RL): PPO and policy-gradient losses; Cookbook patterns include GRPO-style group rollouts with advantage centering
  • Vision-Language Models: VLM support via Qwen3-VL
  • LoRA Training: parameter-efficient fine-tuning via low-rank adapters

Two abstraction levels:

  • Tinker Cookbook: High-level patterns with automatic training loops
  • Low-Level API: Manual control for custom training logic
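To make the split concrete, here is the control flow each level gives you, sketched with placeholder functions (these are stand-ins, not real Tinker calls): the Cookbook runs a loop like this for you, while the low-level API lets you write it yourself.

```python
# Placeholder functions standing in for Tinker's forward_backward /
# optim_step calls -- illustrative only, not the real API.

def forward_backward(batch):
    """Stand-in for gradient accumulation; returns a fake 'loss'."""
    return sum(batch)

def optim_step(lr):
    """Stand-in for applying one optimizer update."""
    return lr

def low_level_loop(dataset, epochs=2, lr=1e-4):
    """The shape of a manual training loop: per batch, compute
    gradients, then apply the optimizer update."""
    losses = []
    for _ in range(epochs):
        for batch in dataset:
            losses.append(forward_backward(batch))
            optim_step(lr)
    return losses

print(low_level_loop([[1, 2], [3]]))  # → [3, 3, 3, 3]
```

With the low-level API you own every step of this loop (batching, loss choice, checkpointing); the Cookbook replaces it with configuration.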

Quick Reference

Topic                    Reference
-----------------------  -----------------------
Setup & Core Concepts    Getting Started
API Classes & Types      API Reference
Supervised Learning      Supervised Learning
RL Training              Reinforcement Learning
Loss Functions           Loss Functions
Chat Templates           Rendering
Models & LoRA            Models & LoRA
Example Scripts          Recipes

Installation

pip install tinker tinker-cookbook
export TINKER_API_KEY=your_api_key_here

Minimal Example

import numpy as np
import tinker
from tinker import types

# Create clients
service_client = tinker.ServiceClient()
training_client = service_client.create_lora_training_client(
    base_model="Qwen/Qwen3-30B-A3B", rank=32
)
tokenizer = training_client.get_tokenizer()

# Prepare data
prompt = "English: hello\nPig Latin:"
completion = " ello-hay\n"
prompt_tokens = tokenizer.encode(prompt, add_special_tokens=True)
completion_tokens = tokenizer.encode(completion, add_special_tokens=False)
tokens = prompt_tokens + completion_tokens
weights = np.array(([0] * len(prompt_tokens)) + ([1] * len(completion_tokens)), dtype=np.float32)
target_tokens = np.array(tokens[1:], dtype=np.int64)

datum = types.Datum(
    model_input=types.ModelInput.from_ints(tokens=tokens[:-1]),
    loss_fn_inputs={
        "target_tokens": target_tokens,
        "weights": weights[1:]
    }
)

# Train
fwdbwd = training_client.forward_backward([datum], "cross_entropy")
optim = training_client.optim_step(types.AdamParams(learning_rate=1e-4))
fwdbwd.result()  # wait for the gradient computation to finish
optim.result()   # wait for the optimizer update to apply

# Sample
sampling_client = training_client.save_weights_and_get_sampling_client(name="v1")
result = sampling_client.sample(
    prompt=types.ModelInput.from_ints(tokens=tokenizer.encode("English: world\nPig Latin:", add_special_tokens=True)),
    sampling_params=types.SamplingParams(max_tokens=20),
    num_samples=1
).result()
print(tokenizer.decode(result.sequences[0].tokens))
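The per-token weights in the example mask the loss to completion tokens only, and everything is shifted by one position for next-token prediction: the model reads tokens[:-1] and is scored against tokens[1:], so the weights must be shifted the same way as the targets. A pure-Python check of that alignment (hypothetical token ids, no Tinker calls):

```python
# Hypothetical token ids standing in for tokenizer output.
prompt_tokens = [101, 7, 8, 9]
completion_tokens = [20, 21, 22]
tokens = prompt_tokens + completion_tokens

# Weight 0 on prompt positions, 1 on completion positions.
weights = [0] * len(prompt_tokens) + [1] * len(completion_tokens)

inputs = tokens[:-1]          # what the model sees
targets = tokens[1:]          # what it must predict
target_weights = weights[1:]  # shifted with the targets

for inp, tgt, w in zip(inputs, targets, target_weights):
    print(inp, "->", tgt, "weight", w)

# Exactly the positions whose *target* is a completion token get
# weight 1 -- including the boundary step where the last prompt token
# predicts the first completion token.
assert [t for t, w in zip(targets, target_weights) if w == 1] == completion_tokens
```

This is why the Datum above uses weights[1:] rather than weights[:-1]: the weight belongs to the predicted (target) token, not to the input token.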

Common Imports

# Low-level API
import tinker
from tinker import types
from tinker.types import Datum, ModelInput, TensorData, AdamParams, SamplingParams

# Cookbook (high-level)
import chz
import asyncio
from tinker_cookbook.supervised import train
from tinker_cookbook.supervised.types import ChatDatasetBuilder, ChatDatasetBuilderCommonConfig
from tinker_cookbook.supervised.data import (
    SupervisedDatasetFromHFDataset,
    StreamingSupervisedDatasetFromHFDataset,
    FromConversationFileBuilder,
    conversation_to_datum,
)
from tinker_cookbook.renderers import get_renderer, TrainOnWhat
from tinker_cookbook.model_info import get_recommended_renderer_name
from tinker_cookbook.tokenizer_utils import get_tokenizer

When to Use What

Scenario                          Approach
--------------------------------  ----------------------------------------------------------------------
Standard SFT with HF/JSONL data   Cookbook ChatDatasetBuilder + tinker_cookbook.supervised.train.main()
Custom preprocessing              Custom SupervisedDataset class
Large datasets (>1M examples)     StreamingSupervisedDatasetFromHFDataset
RL / GRPO                         Cookbook RL patterns
Research / custom loops           Low-level forward_backward() + optim_step()
Vision-language                   Qwen3-VL + ImageChunk
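For chat-format SFT, the Cookbook's renderers and TrainOnWhat decide which conversation tokens carry loss (typically only the assistant's replies). The idea can be sketched in plain Python with a toy character-level "tokenizer" -- this is an illustration of the masking concept, not the Cookbook's actual renderer:

```python
# A conversation in the usual role/content form.
conversation = [
    {"role": "user", "content": "English: hello"},
    {"role": "assistant", "content": "Pig Latin: ello-hay"},
    {"role": "user", "content": "English: world"},
    {"role": "assistant", "content": "Pig Latin: orld-way"},
]

def render(conv):
    """Toy renderer: one 'token' per character, weight 1 only on
    assistant messages (train-on-assistant behavior)."""
    tokens, weights = [], []
    for msg in conv:
        text = msg["content"] + "\n"
        tokens += list(text)
        weights += [1 if msg["role"] == "assistant" else 0] * len(text)
    return tokens, weights

tokens, weights = render(conversation)

# Only assistant text contributes to the loss.
trained = "".join(t for t, w in zip(tokens, weights) if w == 1)
print(trained)
```

A real renderer additionally inserts the model's chat-template control tokens and works on tokenizer ids, but the weight-masking principle is the same as in the minimal example above.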
