local-ai-models

iOS On-Device AI Models

Production-ready guide for implementing on-device AI models in iOS apps using Apple's Foundation Models framework and MLX Swift.

When to Use This Skill

Implementing local LLM inference in iOS apps
Building chat interfaces with Foundation Models
Integrating Vision Language Models (VLMs)
Adding text embeddings or image generation
Implementing tool/function calling with LLMs
Managing multi-turn conversations
Optimizing memory usage for on-device models
Supporting internationalization in AI features

Core Principles

Availability First - Always check model availability before initialization
Stream Responses - Provide progressive UI updates for better UX
Session Persistence - Reuse LanguageModelSession for multi-turn conversations (Foundation Models)
Memory Awareness - Use quantized models and monitor memory usage
Async Everything - Load models asynchronously, never block the main thread
Locale Support - Use supportsLocale(_:) and locale instructions for Foundation Models
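The "Availability First" and "Locale Support" principles can be combined into a single pre-flight check. The sketch below is one possible shape, not the guide's own implementation: `ModelReadiness` and `checkSystemModelReadiness` are hypothetical names, and the user-facing strings are placeholders. It assumes the iOS 26 SDK for the Foundation Models calls and falls back to "unavailable" wherever the framework cannot be imported.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical result type for this sketch; not part of Foundation Models.
enum ModelReadiness: Equatable {
    case ready
    case unavailable(reason: String)
}

/// Checks system model availability first, then locale support.
func checkSystemModelReadiness(for locale: Locale = .current) -> ModelReadiness {
    #if canImport(FoundationModels)
    if #available(iOS 26.0, macOS 26.0, visionOS 26.0, *) {
        let model = SystemLanguageModel.default
        switch model.availability {
        case .available:
            // Only report ready when the model also supports the requested locale.
            return model.supportsLocale(locale)
                ? .ready
                : .unavailable(reason: "Locale \(locale.identifier) is not supported.")
        case .unavailable(.deviceNotEligible):
            return .unavailable(reason: "This device is not eligible for Apple Intelligence.")
        case .unavailable(.appleIntelligenceNotEnabled):
            return .unavailable(reason: "Apple Intelligence is turned off in Settings.")
        case .unavailable(.modelNotReady):
            return .unavailable(reason: "The model is not ready yet, for example still downloading.")
        case .unavailable:
            // Catch-all for reasons added in later SDKs.
            return .unavailable(reason: "The on-device model is unavailable.")
        }
    }
    #endif
    return .unavailable(reason: "Foundation Models is not available on this platform or OS version.")
}
```

Calling this before creating a session lets the UI show a specific fallback message instead of failing at the first prompt.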

Quick Reference

Framework Comparison

Topic                               Guide
Framework comparison and selection  framework-selection.md

Foundation Models (Apple's Framework)

Topic                            Guide
Setup and configuration          foundation-models/setup.md
Chat patterns and conversations  foundation-models/chat-patterns.md

MLX Swift (Advanced Features)

Topic                                    Guide
Setup and configuration                  mlx-swift/setup.md
Chat patterns with custom models         mlx-swift/chat-patterns.md
Vision Language Models (VLMs)            mlx-swift/vision-patterns.md
Tool calling, embeddings, structured gen mlx-swift/advanced-patterns.md
Model quantization with MLX-LM           mlx-swift/quantization.md

Shared (Both Frameworks)

Topic                            Guide
Best practices and optimization  shared/best-practices.md
Error handling and recovery      shared/error-handling.md
Testing strategies               shared/testing.md

Quick Decision Trees

Which framework should I use?

Do you need advanced features like:

Vision Language Models (VLMs)
Image generation
Custom models beyond the system model

├── Yes → MLX Swift (references/mlx-swift/)
└── No → Is this a standard chat interface?
    ├── Yes → Foundation Models (simpler, recommended)
    └── No → Check framework-selection.md for guidance
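The same decision tree can be written as a small pure function, which is handy for documenting the choice in code review or unit tests. This is only an illustrative encoding: `FeatureRequirements`, `FrameworkRecommendation`, and `recommendFramework` are hypothetical names that do not exist in either framework.

```swift
/// Hypothetical description of what a feature needs from its model.
struct FeatureRequirements {
    var needsVisionLanguageModel = false
    var needsImageGeneration = false
    var needsCustomModel = false
    var isStandardChat = false
}

/// Hypothetical outcome type mirroring the leaves of the decision tree.
enum FrameworkRecommendation: Equatable {
    case mlxSwift
    case foundationModels
    case consultSelectionGuide
}

func recommendFramework(for requirements: FeatureRequirements) -> FrameworkRecommendation {
    // Any advanced feature routes to MLX Swift.
    if requirements.needsVisionLanguageModel
        || requirements.needsImageGeneration
        || requirements.needsCustomModel {
        return .mlxSwift
    }
    // Otherwise a standard chat interface fits Foundation Models;
    // anything else is deferred to framework-selection.md.
    return requirements.isStandardChat ? .foundationModels : .consultSelectionGuide
}
```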

Where should I start?

New to on-device AI?
└── Start with Foundation Models:
    1. Read framework-selection.md
    2. Follow foundation-models/setup.md
    3. Implement foundation-models/chat-patterns.md

Need advanced features?
└── Use MLX Swift:
    1. Read framework-selection.md
    2. Follow mlx-swift/setup.md
    3. Choose pattern:
       - Chat: mlx-swift/chat-patterns.md
       - Vision: mlx-swift/vision-patterns.md
       - Advanced: mlx-swift/advanced-patterns.md

Where should my model loading code live?

Is this model shared across features?
├── Yes → Create @Observable service in app/services/
└── No → Is it feature-specific?
    ├── Yes → Create @Observable class in feature/
    └── No → Load inline with @State (simple cases only)
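For the shared-service branch, a minimal sketch of an `@Observable` service might look like the following. `ModelService` and `LoadState` are hypothetical names, and the injected `loader` closure is an assumption standing in for whatever actually loads the model (creating a Foundation Models session or loading MLX weights). Injecting it keeps the service testable without a real model.

```swift
import Foundation
import Observation

/// Hypothetical load state that views can observe.
enum LoadState: Equatable {
    case idle
    case loading
    case ready
    case failed(String)
}

/// Hypothetical shared service; would live in app/services/ per the tree above.
@MainActor
@Observable
final class ModelService {
    private(set) var state: LoadState = .idle

    // Placeholder for the real model-loading work (an assumption of this sketch).
    private let loader: @Sendable () async throws -> Void

    init(loader: @escaping @Sendable () async throws -> Void) {
        self.loader = loader
    }

    /// Loads the model once; suspends while loading instead of blocking the main thread.
    func loadIfNeeded() async {
        switch state {
        case .loading, .ready:
            return              // Already loading or loaded.
        case .idle, .failed:
            break               // First load, or retry after a failure.
        }
        state = .loading
        do {
            try await loader()
            state = .ready
        } catch {
            state = .failed(error.localizedDescription)
        }
    }
}
```

A view can call `await service.loadIfNeeded()` from a `.task` modifier and switch on `service.state` to show progress, content, or an error.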

How should I handle conversations?

Foundation Models:
└── Reuse LanguageModelSession for context (references/foundation-models/chat-patterns.md #multi-turn)

MLX Swift:
└── Implement custom context management (references/mlx-swift/chat-patterns.md)
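The two branches can be sketched side by side. For Foundation Models, the sketch keeps a single `LanguageModelSession` alive so each prompt sees earlier turns. For MLX Swift, it shows one possible form of custom context management: a sliding window that keeps the system prompt and drops the oldest turns. `ChatController`, `ChatMessage`, and `ConversationHistory` are hypothetical names, and the character budget is a crude stand-in for a real token count.

```swift
#if canImport(FoundationModels)
import FoundationModels

/// Hypothetical wrapper: one long-lived session carries the conversation context.
@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
final class ChatController {
    private let session = LanguageModelSession(instructions: "You are a concise assistant.")

    func send(_ prompt: String) async throws -> String {
        // Reusing the same session means earlier turns remain in context.
        try await session.respond(to: prompt).content
    }
}
#endif

/// Hypothetical message type for a custom (e.g. MLX-backed) chat history.
struct ChatMessage: Equatable {
    enum Role: String { case system, user, assistant }
    let role: Role
    let content: String
}

/// Sliding-window history: keeps the system prompt, trims the oldest turns.
struct ConversationHistory {
    private(set) var messages: [ChatMessage]
    let maxCharacters: Int   // Assumption: characters approximate the token budget.

    init(systemPrompt: String, maxCharacters: Int) {
        self.messages = [ChatMessage(role: .system, content: systemPrompt)]
        self.maxCharacters = maxCharacters
    }

    private var totalCharacters: Int {
        messages.reduce(0) { $0 + $1.content.count }
    }

    mutating func append(_ message: ChatMessage) {
        messages.append(message)
        // Index 0 is the system prompt; always keep it and the newest message.
        while totalCharacters > maxCharacters && messages.count > 2 {
            messages.remove(at: 1)
        }
    }
}
```

In a real MLX Swift integration the budget would more likely be measured with the model's tokenizer; the trimming logic stays the same.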

What generation parameters should I use?

What's the use case?

Factual answers (summaries, facts)
└── temperature: 0.1-0.3

Balanced (chat, Q&A)
└── temperature: 0.6-0.8

Creative (storytelling, ideas)
└── temperature: 0.9-1.2

See references/shared/best-practices.md for details
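One way to keep these ranges in a single place is a small enum that maps each use case to its temperature range. `GenerationUseCase` and `generationOptions(for:)` are hypothetical names, and choosing the midpoint as the default is an assumption of this sketch rather than a recommendation from the guides. The Foundation Models part compiles only where the framework is available.

```swift
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical use-case presets mirroring the ranges listed above.
enum GenerationUseCase {
    case factual    // summaries, facts
    case balanced   // chat, Q&A
    case creative   // storytelling, ideas

    var temperatureRange: ClosedRange<Double> {
        switch self {
        case .factual:  return 0.1...0.3
        case .balanced: return 0.6...0.8
        case .creative: return 0.9...1.2
        }
    }

    /// Midpoint of the range, used as a starting value (an assumption).
    var defaultTemperature: Double {
        (temperatureRange.lowerBound + temperatureRange.upperBound) / 2
    }
}

#if canImport(FoundationModels)
/// Builds Foundation Models generation options for a use case.
@available(iOS 26.0, macOS 26.0, visionOS 26.0, *)
func generationOptions(for useCase: GenerationUseCase) -> GenerationOptions {
    GenerationOptions(temperature: useCase.defaultTemperature)
}
#endif
```

With Foundation Models, the resulting options can be passed to `session.respond(to:options:)`; for MLX Swift, the same temperature value would feed whichever generation parameters the chosen model pipeline exposes.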

Resources

MLX Swift Examples
Foundation Models Docs
Hugging Face Model Hub
MLX-LM Quantization
MLX Community Models
