AI RAG Pipeline

Build RAG (Retrieval-Augmented Generation) pipelines via the inference.sh CLI.

Quick Start

Requires the inference.sh CLI (infsh). Get installation instructions: npx skills add inference-sh/skills@agent-tools

infsh login

Simple RAG: Search + LLM

SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "latest AI developments 2024"}')
infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Based on this research, summarize the key trends: $SEARCH\" }"

What is RAG?

RAG combines:

Retrieval: Fetch relevant information from external sources
Augmentation: Add retrieved context to the prompt
Generation: LLM generates response using the context

This produces more accurate, up-to-date, and verifiable AI responses.

RAG Pipeline Patterns

Pattern 1: Simple Search + Answer

[User Query] -> [Web Search] -> [LLM with Context] -> [Answer]

Pattern 2: Multi-Source Research

[Query] -> [Multiple Searches] -> [Aggregate] -> [LLM Analysis] -> [Report]

Pattern 3: Extract + Process

[URLs] -> [Content Extraction] -> [Chunking] -> [LLM Summary] -> [Output]
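
Pattern 3 lists a chunking step that the examples in this guide do not show. The following is a minimal sketch of one way to do it in bash, assuming fixed-size character chunks are acceptable (real pipelines often split on paragraphs or tokens instead). The chunk_text function and the CHUNKS array are hypothetical names, not part of the infsh CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split text into fixed-size character chunks.
# Results are stored in the global array CHUNKS.

chunk_text() {
  local text=$1 size=$2 i
  CHUNKS=()
  (( size > 0 )) || return 1   # guard against an endless loop
  for (( i = 0; i < ${#text}; i += size )); do
    CHUNKS+=("${text:i:size}")
  done
}

# Example: a 43-character string split into chunks of at most 16 characters.
chunk_text "The quick brown fox jumps over the lazy dog" 16
printf '%s chunks\n' "${#CHUNKS[@]}"
```

Each element of CHUNKS could then be summarized separately with a fast model and the partial summaries combined in a final LLM call.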

Available Tools

Search Tools

| Tool | App ID | Best For |
|------|--------|----------|
| Tavily Search | tavily/search-assistant | AI-powered search with answers |
| Exa Search | exa/search | Neural search, semantic matching |
| Exa Answer | exa/answer | Direct factual answers |

Extraction Tools

| Tool | App ID | Best For |
|------|--------|----------|
| Tavily Extract | tavily/extract | Clean content from URLs |
| Exa Extract | exa/extract | Analyze web content |

LLM Tools

| Model | App ID | Best For |
|-------|--------|----------|
| Claude Sonnet 4.5 | openrouter/claude-sonnet-45 | Complex analysis |
| Claude Haiku 4.5 | openrouter/claude-haiku-45 | Fast processing |
| GPT-4o | openrouter/gpt-4o | General purpose |
| Gemini 2.5 Pro | openrouter/gemini-25-pro | Long context |

Pipeline Examples

Basic RAG Pipeline

# 1. Search for information

SEARCH_RESULT=$(infsh app run tavily/search-assistant --input '{ "query": "What are the latest breakthroughs in quantum computing 2024?" }')

# 2. Generate grounded response

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"You are a research assistant. Based on the following search results, provide a comprehensive summary with citations.

Search Results: $SEARCH_RESULT

Provide a well-structured summary with source citations.\" }"
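
Search output often contains double quotes, backslashes, and newlines, any of which can produce invalid JSON when interpolated directly into the --input string. The sketch below shows one way to guard against that in plain bash. The names json_escape and build_prompt_input are hypothetical helpers, not part of the infsh CLI, and the escaping covers only the most common characters (other control characters are left untouched).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of infsh): escape text for a JSON string.

json_escape() {
  local s=$1
  s=${s//\\/\\\\}     # backslash -> \\ (must run first)
  s=${s//\"/\\\"}     # double quote -> \"
  s=${s//$'\n'/\\n}   # newline -> \n
  s=${s//$'\r'/\\r}   # carriage return -> \r
  s=${s//$'\t'/\\t}   # tab -> \t
  printf '%s' "$s"
}

# Wrap a prompt in the {"prompt": ...} shape used by the LLM apps in this guide.
build_prompt_input() {
  printf '{"prompt": "%s"}' "$(json_escape "$1")"
}

# Usage (assumes SEARCH_RESULT was set by the search step above):
# infsh app run openrouter/claude-sonnet-45 \
#   --input "$(build_prompt_input "Summarize with citations: $SEARCH_RESULT")"
```

If jq is installed, building the payload with it is likely a more robust choice than hand-written escaping.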

Multi-Source Research

# Search multiple sources

TAVILY=$(infsh app run tavily/search-assistant --input '{"query": "electric vehicle market trends 2024"}')
EXA=$(infsh app run exa/search --input '{"query": "EV market analysis latest reports"}')

# Combine and analyze

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Analyze these research results and identify common themes and contradictions.

Source 1 (Tavily): $TAVILY

Source 2 (Exa): $EXA

Provide a balanced analysis with sources.\" }"
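
When more than two sources are involved, the aggregate step can be factored into a small function instead of being written out by hand in the prompt. This is a minimal sketch under that assumption; aggregate_sources is a hypothetical helper name, and the "Source N (label):" layout simply mirrors the prompt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: label and join any number of search results.
# Arguments alternate between a source label and that source's content.

aggregate_sources() {
  local out="" n=1
  while [ "$#" -ge 2 ]; do
    out+="Source $n ($1): $2"$'\n\n'
    shift 2
    n=$((n + 1))
  done
  printf '%s' "$out"
}

# Usage (assumes TAVILY and EXA were set by the searches above):
# COMBINED=$(aggregate_sources "Tavily" "$TAVILY" "Exa" "$EXA")
```

The combined text can then be placed in a single prompt, which keeps the source labels consistent as sources are added or removed.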

URL Content Analysis

# 1. Extract content from specific URLs

CONTENT=$(infsh app run tavily/extract --input '{ "urls": [ "https://example.com/research-paper", "https://example.com/industry-report" ] }')

# 2. Analyze extracted content

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Analyze these documents and extract key insights:

$CONTENT

Provide:

Key findings
Data points
Recommendations\" }"

Fact-Checking Pipeline

# Claim to verify

CLAIM="AI will replace 50% of jobs by 2030"

# 1. Search for evidence

EVIDENCE=$(infsh app run tavily/search-assistant --input "{ \"query\": \"$CLAIM evidence studies research\" }")

# 2. Verify claim

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Fact-check this claim: '$CLAIM'

Based on the following evidence: $EVIDENCE

Provide:

Verdict (True/False/Partially True/Unverified)
Supporting evidence
Contradicting evidence
Sources\" }"

Research Report Generator

TOPIC="Impact of generative AI on creative industries"

# 1. Initial research

OVERVIEW=$(infsh app run tavily/search-assistant --input "{\"query\": \"$TOPIC overview\"}")
STATISTICS=$(infsh app run exa/search --input "{\"query\": \"$TOPIC statistics data\"}")
OPINIONS=$(infsh app run tavily/search-assistant --input "{\"query\": \"$TOPIC expert opinions\"}")

# 2. Generate comprehensive report

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Generate a comprehensive research report on: $TOPIC

Research Data:
== Overview ==
$OVERVIEW

== Statistics ==
$STATISTICS

== Expert Opinions ==
$OPINIONS

Format as a professional report with:

Executive Summary
Key Findings
Data Analysis
Expert Perspectives
Conclusion
Sources\" }"

Quick Answer with Sources

Use Exa Answer for direct factual questions

infsh app run exa/answer --input '{ "question": "What is the current market cap of NVIDIA?" }'

Best Practices

Query Optimization

Bad: Too vague

"AI news"

Good: Specific and contextual

"latest developments in large language models January 2024"

Context Management

# Summarize long search results before sending to LLM

SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "..."}')

# If too long, summarize first

SUMMARY=$(infsh app run openrouter/claude-haiku-45 --input "{ \"prompt\": \"Summarize these search results in bullet points: $SEARCH\" }")

# Then use summary for analysis

infsh app run openrouter/claude-sonnet-45 --input "{ \"prompt\": \"Based on this research summary, provide insights: $SUMMARY\" }"
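
The "if too long" decision can be made explicit with a length check. The sketch below assumes character count is a good enough proxy for prompt size; the 8000-character budget and the needs_summary name are illustrative assumptions, not values from the inference.sh documentation.

```shell
#!/usr/bin/env bash
# Assumed character budget; tune it for the model's context window.
MAX_CHARS=8000

# Succeeds (exit status 0) when the text is longer than the limit.
# The optional second argument overrides MAX_CHARS.
needs_summary() {
  local text=$1 limit=${2:-$MAX_CHARS}
  [ "${#text}" -gt "$limit" ]
}

# Usage (assumes SEARCH was set by the search step above):
# if needs_summary "$SEARCH"; then
#   echo "summarize with the faster model first"
# else
#   echo "pass the results to the analysis model directly"
# fi
```

Skipping the summarization call for short results saves one model invocation and avoids losing detail unnecessarily.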

Source Attribution

Always ask the LLM to cite sources:

infsh app run openrouter/claude-sonnet-45 --input '{ "prompt": "... Always cite sources in [Source Name](URL) format." }'

Iterative Research

First pass: broad search

INITIAL=$(infsh app run tavily/search-assistant --input '{"query": "topic overview"}')

Second pass: dive deeper based on findings

DEEP=$(infsh app run tavily/search-assistant --input '{"query": "specific aspect from initial search"}')

Pipeline Templates

Agent Research Tool

#!/bin/bash
# research.sh - Reusable research function

research() {
  local query="$1"

  # Search
  local results
  results=$(infsh app run tavily/search-assistant --input "{\"query\": \"$query\"}")

  # Analyze
  infsh app run openrouter/claude-haiku-45 --input "{ \"prompt\": \"Summarize: $results\" }"
}

research "your query here"

Related Skills

Web search tools

npx skills add inference-sh/skills@web-search

LLM models

npx skills add inference-sh/skills@llm-models

Content pipelines

npx skills add inference-sh/skills@ai-content-pipeline

Full platform skill

npx skills add inference-sh/skills@agent-tools

Browse all apps: infsh app list

Documentation

Adding Tools to Agents - Agent tool integration
Building a Research Agent - Full guide
