LangChain CI Integration
Overview
Integrate LangChain chain and agent testing into CI/CD pipelines. Covers chain unit tests with mocked LLMs, RAG pipeline validation, agent tool testing, and integration tests with real LLM calls gated behind environment flags.
Prerequisites
-
LangChain installed (langchain , langchain-openai )
-
GitHub Actions configured
-
pytest for Python or Vitest for TypeScript
-
LLM API keys stored as GitHub secrets
Instructions
Step 1: CI Workflow with Mocked and Live Tests
.github/workflows/langchain-tests.yml
name: LangChain Tests
on: pull_request: paths: - 'src/chains/' - 'src/agents/' - 'src/tools/' - 'tests/'
jobs: unit-tests: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: { python-version: '3.11' } - run: pip install -r requirements.txt -r requirements-dev.txt
- name: Run unit tests (no API calls)
run: pytest tests/unit/ -v --tb=short
integration-tests: runs-on: ubuntu-latest if: github.event.pull_request.draft == false steps: - uses: actions/checkout@v4 - uses: actions/setup-python@v5 with: { python-version: '3.11' } - run: pip install -r requirements.txt -r requirements-dev.txt
- name: Run integration tests (with LLM calls)
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
LANGCHAIN_TRACING_V2: "true"
LANGCHAIN_API_KEY: ${{ secrets.LANGCHAIN_API_KEY }}
run: pytest tests/integration/ -v --tb=short -m "not slow"
Step 2: Chain Unit Tests with Mocked LLM
tests/unit/test_chains.py
import pytest from unittest.mock import AsyncMock, patch from langchain_core.messages import AIMessage
from src.chains.summarize import create_summarize_chain
@pytest.fixture def mock_llm(): mock = AsyncMock() mock.ainvoke.return_value = AIMessage(content="This is a summary of the document.") return mock
def test_summarize_chain_output_format(mock_llm): """Test chain produces expected output structure.""" chain = create_summarize_chain(llm=mock_llm) result = chain.invoke({"document": "Long document text here..."})
assert "summary" in result
assert len(result["summary"]) > 0
def test_summarize_chain_handles_empty_input(mock_llm): """Test chain handles edge cases.""" chain = create_summarize_chain(llm=mock_llm)
with pytest.raises(ValueError, match="Document cannot be empty"):
chain.invoke({"document": ""})
@patch("src.chains.summarize.ChatOpenAI") def test_chain_uses_correct_model(mock_chat): """Test chain configures model correctly.""" mock_chat.return_value = AsyncMock() chain = create_summarize_chain(model_name="gpt-4o-mini")
mock_chat.assert_called_once_with(model="gpt-4o-mini", temperature=0)
Step 3: RAG Pipeline Validation
tests/integration/test_rag_pipeline.py
import pytest from langchain_openai import OpenAIEmbeddings from langchain_community.vectorstores import FAISS
from src.chains.rag import create_rag_chain
@pytest.fixture(scope="session") def vector_store(): """Create test vector store with known documents.""" embeddings = OpenAIEmbeddings(model="text-embedding-3-small") texts = [ "Python was created by Guido van Rossum in 1991.", "TypeScript was developed by Microsoft and released in 2012.", "Rust was first released in 2010 by Mozilla.", # 2010 = configured value ] return FAISS.from_texts(texts, embeddings)
@pytest.mark.integration def test_rag_retrieval_relevance(vector_store): """Test RAG pipeline retrieves relevant documents.""" chain = create_rag_chain(vector_store) result = chain.invoke({"question": "Who created Python?"})
assert "guido" in result["answer"].lower()
assert len(result["source_documents"]) > 0
@pytest.mark.integration def test_rag_no_hallucination(vector_store): """Test RAG doesn't hallucinate when answer not in context.""" chain = create_rag_chain(vector_store) result = chain.invoke({"question": "What is the capital of France?"})
# Should indicate it doesn't know from the context
assert any(phrase in result["answer"].lower()
for phrase in ["don't have", "not in", "cannot find", "no information"])
Step 4: Agent Tool Testing
tests/unit/test_tools.py
import pytest from src.tools.calculator import calculator_tool from src.tools.search import search_tool
def test_calculator_tool(): """Test calculator tool produces correct results.""" result = calculator_tool.invoke({"expression": "2 + 2"}) assert result == "4"
def test_calculator_tool_handles_invalid_input(): """Test tool handles bad input gracefully.""" result = calculator_tool.invoke({"expression": "not math"}) assert "error" in result.lower()
@pytest.mark.integration def test_search_tool_returns_results(): """Test search tool (requires API key).""" result = search_tool.invoke({"query": "LangChain framework"}) assert len(result) > 0
Error Handling
Issue Cause Solution
Unit tests call real API Mock not applied Use @patch or dependency injection
Integration test fails Missing API key Gate behind if condition and secrets
Flaky RAG tests Embedding variability Use deterministic test data, pin embeddings
Agent test timeout Tool execution slow Set timeout on agent, mock external tools
Examples
Quick Chain Smoke Test
Minimal test that chain constructs correctly
def test_chain_builds(): from src.chains.summarize import create_summarize_chain chain = create_summarize_chain() assert chain is not None assert hasattr(chain, 'invoke')
Resources
-
LangChain Testing Guide
-
LangSmith Tracing
-
LangChain CI Examples
Output
-
Configuration files or code changes applied to the project
-
Validation report confirming correct implementation
-
Summary of changes made and their rationale