Groq Multi-Environment Setup
Overview
Configure Groq across environments with the right balance of cost, speed, and capability per tier. Groq's key differentiator is inference speed (100-300 tokens/second), but rate limits differ dramatically by plan: free tier is 30 RPM / 14,400 RPD for llama-3.1-70b, while paid tier removes most limits.
Prerequisites
- Groq API key(s) per environment from console.groq.com
- Environment variable management (.env.local, GitHub Secrets, or cloud secret manager)
- Understanding of Groq's model tiers and rate limits
Environment Strategy
| Environment | Model | Rate Limit Risk | Config Source |
| --- | --- | --- | --- |
| Development | llama-3.1-8b-instant | Low (small model) | .env.local |
| Staging | llama-3.1-70b-versatile | Medium | CI/CD secrets |
| Production | llama-3.1-70b-versatile or llama-3.3-70b-specdec | Managed with retry | Secret manager |
Instructions
Step 1: Configuration Structure
```
config/
  groq/
    base.ts          # Shared Groq client setup
    development.ts   # Dev: fast small models, verbose logging
    staging.ts       # Staging: production models, test rate limits
    production.ts    # Prod: hardened retry, error handling
    index.ts         # Environment resolver
```
Step 2: Base Configuration with Groq SDK
```typescript
// config/groq/base.ts
export const BASE_GROQ_CONFIG = {
  maxRetries: 3,
  timeout: 30000, // 30 seconds in ms
};
```
Step 3: Environment-Specific Configs
```typescript
// config/groq/development.ts
import { BASE_GROQ_CONFIG } from "./base";

export const devConfig = {
  ...BASE_GROQ_CONFIG,
  apiKey: process.env.GROQ_API_KEY,
  model: "llama-3.1-8b-instant", // fastest, cheapest for dev iteration
  maxTokens: 1024,
  temperature: 0.7,
  logRequests: true, // verbose logging in dev
};
```

```typescript
// config/groq/staging.ts
import { BASE_GROQ_CONFIG } from "./base";

export const stagingConfig = {
  ...BASE_GROQ_CONFIG,
  apiKey: process.env.GROQ_API_KEY_STAGING,
  model: "llama-3.1-70b-versatile", // match production model
  maxTokens: 4096,
  temperature: 0.3,
  logRequests: false,
};
```

```typescript
// config/groq/production.ts
import { BASE_GROQ_CONFIG } from "./base";

export const productionConfig = {
  ...BASE_GROQ_CONFIG,
  apiKey: process.env.GROQ_API_KEY_PROD,
  model: "llama-3.1-70b-versatile", // or llama-3.3-70b-specdec for speed
  maxTokens: 4096,
  temperature: 0.3,
  maxRetries: 5, // more retries for production reliability
  logRequests: false,
};
```
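The three configs above share the same shape. A small interface makes that explicit so the resolver in the next step can index them uniformly, and lets you fail fast on bad values at startup. This is a sketch: the `GroqEnvConfig` and `validateConfig` names are this guide's own, not part of the Groq SDK.

```typescript
// Hypothetical shared shape for the per-environment configs above.
// Adjust field names to match your project; none come from groq-sdk.
export interface GroqEnvConfig {
  apiKey: string | undefined; // undefined until the env var is set
  model: string;
  maxTokens: number;
  temperature: number;
  maxRetries: number;
  timeout: number;
  logRequests: boolean;
}

// Fail fast at boot instead of on the first API call.
export function validateConfig(cfg: GroqEnvConfig, env: string): void {
  if (!cfg.apiKey) {
    throw new Error(`GROQ_API_KEY not configured for ${env} environment`);
  }
  if (cfg.maxTokens <= 0 || cfg.temperature < 0 || cfg.temperature > 2) {
    throw new Error(`Invalid model parameters for ${env} environment`);
  }
}
```

Calling `validateConfig` once at process start surfaces a missing secret immediately, which is easier to debug than a 401 deep inside request handling.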
Step 4: Environment Resolver with Groq Client
```typescript
// config/groq/index.ts
import Groq from "groq-sdk";
import { devConfig } from "./development";
import { stagingConfig } from "./staging";
import { productionConfig } from "./production";

type Env = "development" | "staging" | "production";

function detectEnvironment(): Env {
  const env = process.env.NODE_ENV || "development";
  if (env === "production") return "production";
  if (env === "staging") return "staging";
  return "development";
}

let _client: Groq | null = null;

export function getGroqClient(): Groq {
  if (_client) return _client;

  const env = detectEnvironment();
  const configs = {
    development: devConfig,
    staging: stagingConfig,
    production: productionConfig,
  };
  const config = configs[env];

  if (!config.apiKey) {
    throw new Error(`GROQ_API_KEY not configured for ${env} environment`);
  }

  _client = new Groq({
    apiKey: config.apiKey,
    maxRetries: config.maxRetries,
    timeout: config.timeout,
  });
  return _client;
}

export function getModelConfig() {
  const env = detectEnvironment();
  const configs = {
    development: devConfig,
    staging: stagingConfig,
    production: productionConfig,
  };
  return configs[env];
}
```
Step 5: Usage with Rate Limit Handling
```typescript
// lib/groq-service.ts
import { getGroqClient, getModelConfig } from "../config/groq";

export async function complete(prompt: string): Promise<string> {
  const groq = getGroqClient();
  const { model, maxTokens, temperature } = getModelConfig();

  try {
    const completion = await groq.chat.completions.create({
      model,
      messages: [{ role: "user", content: prompt }],
      max_tokens: maxTokens,
      temperature,
    });
    return completion.choices[0].message.content || "";
  } catch (err: any) {
    if (err.status === 429) { // HTTP 429 Too Many Requests
      const retryAfter = parseInt(err.headers?.["retry-after"] || "10", 10);
      console.warn(`Groq rate limited. Retry after ${retryAfter}s`);
      throw new Error(`Rate limited on model ${model}. Retry after ${retryAfter}s`);
    }
    throw err;
  }
}
```
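The service above surfaces 429s to the caller. In production you usually want to retry with backoff instead of failing outright. A minimal sketch of a generic wrapper follows; the `withRetry` name is this guide's own, and it assumes the same `err.status` error shape checked above.

```typescript
// Hypothetical retry helper: retries an async call on HTTP 429 with
// exponential backoff (500ms, 1000ms, 2000ms, ...). Other errors rethrow.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastErr = err;
      // Only retry rate limits, and only while attempts remain.
      if (err?.status !== 429 || attempt === maxAttempts - 1) throw err;
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastErr;
}
```

Usage would look like `withRetry(() => complete(prompt))`. A refinement worth considering: prefer the server's `retry-after` header over the computed delay when it is present.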
Error Handling
| Issue | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid API key for environment | Verify GROQ_API_KEY in secret manager |
| 429 rate_limit_exceeded | Free tier limit hit | Switch to a paid plan or implement request queuing |
| Model not found | Deprecated model ID | Check console.groq.com/docs/models for current list |
| Slow responses in dev | Using 70b model for iteration | Switch dev config to llama-3.1-8b-instant |
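For the free-tier 429 case, "request queuing" can be as simple as serializing calls with a minimum gap: 30 RPM allows roughly one request every two seconds. A hypothetical sketch (the `RequestQueue` class is this guide's own, not a Groq SDK feature):

```typescript
// Hypothetical serial queue: at most one request in flight, with a
// minimum gap between requests to stay under a per-minute limit.
class RequestQueue {
  private tail: Promise<unknown> = Promise.resolve();

  constructor(private minGapMs = 2000) {} // 30 RPM => ~1 request per 2s

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Each task starts only after the previous one (plus its gap) settles.
    const result = this.tail.then(task);
    this.tail = result
      .catch(() => undefined) // a failed task must not stall the queue
      .then(() => new Promise((resolve) => setTimeout(resolve, this.minGapMs)));
    return result;
  }
}
```

Usage: create one queue per API key and route every `complete()` call through `queue.enqueue(() => complete(prompt))`. A token-bucket limiter is a better fit if you also need to cap tokens per minute.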
Examples
Check Which Config Is Active
```typescript
import { getModelConfig } from "./config/groq";

const cfg = getModelConfig();
console.log(`Model: ${cfg.model}, max_tokens: ${cfg.maxTokens}`);
```
Test Rate Limits Per Environment
```bash
set -euo pipefail

# Quick sanity check: does this key authenticate, and which model IDs can it see?
curl -s "https://api.groq.com/openai/v1/models" \
  -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data[].id'
```
Resources
- Groq API Documentation
- Groq Models Reference
- Groq Rate Limits by Tier
Next Steps
For deployment configuration, see groq-deploy-integration.
Output
- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale