cost-aware-llm-pipeline
Optimize Gemini API costs WITHOUT changing the extraction model. Use when building or modifying AI extraction pipelines or batch-processing jobs, or when API costs are rising. Covers caching, prompt optimization, batching, retry logic, and cost tracking. The current model (gemini-3-flash-preview) is proven for PDF invoice extraction and should NOT be downgraded.
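
As a rough illustration of how caching, retry logic, and cost tracking can fit together without touching the model choice, here is a minimal Python sketch. Everything in it is an assumption for illustration: `CostAwareClient`, `call_model`, and the two price constants are hypothetical names and placeholder values, not part of the Gemini SDK or of this pipeline. `call_model` stands in for whatever function actually sends the request to gemini-3-flash-preview and reports token counts.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Placeholder prices in USD per 1M tokens (assumed values, NOT real Gemini rates).
PRICE_PER_M_INPUT = 1.0
PRICE_PER_M_OUTPUT = 2.0


@dataclass
class CostAwareClient:
    # Hypothetical stand-in for the real model call. It receives
    # (prompt, document_bytes) and returns (text, input_tokens, output_tokens).
    call_model: Callable[[str, bytes], Tuple[str, int, int]]
    max_retries: int = 3
    base_delay: float = 0.5
    # Only errors assumed to be transient are retried; adjust to the real client.
    retryable: Tuple[type, ...] = (TimeoutError, ConnectionError)
    sleep: Callable[[float], None] = time.sleep
    cache: Dict[str, str] = field(default_factory=dict)
    total_cost: float = 0.0
    api_calls: int = 0

    def _key(self, prompt: str, document: bytes) -> str:
        # Identical prompt + document bytes map to the same cache entry.
        digest = hashlib.sha256()
        digest.update(prompt.encode("utf-8"))
        digest.update(b"\x00")
        digest.update(document)
        return digest.hexdigest()

    def extract(self, prompt: str, document: bytes) -> str:
        key = self._key(prompt, document)
        if key in self.cache:
            # Cache hit: no API call, no cost.
            return self.cache[key]

        for attempt in range(self.max_retries):
            try:
                text, tokens_in, tokens_out = self.call_model(prompt, document)
                break
            except self.retryable:
                if attempt == self.max_retries - 1:
                    raise
                # Exponential backoff: base_delay, 2x, 4x, ...
                self.sleep(self.base_delay * 2 ** attempt)

        # Track spend for successful calls only.
        self.api_calls += 1
        self.total_cost += (
            tokens_in * PRICE_PER_M_INPUT + tokens_out * PRICE_PER_M_OUTPUT
        ) / 1_000_000
        self.cache[key] = text
        return text
```

The cache here is an in-memory dict keyed by a hash of the prompt and document, so reprocessing the same invoice with the same prompt costs nothing. A persistent store would be needed for the cache to survive restarts, and the retry filter should match whatever exceptions the real client raises for rate limits and timeouts.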