Groq Enterprise RBAC
Overview
Manage access to Groq's LPU inference API through API key scoping and organization-level controls. Groq's per-token pricing is low compared with many GPU-based providers, but inference is fast enough that an unthrottled client can burn through a large token volume, and the budget with it, in very little time.
Prerequisites
- Groq Cloud account at console.groq.com
- Organization created with billing configured
- At least one API key with organization admin scope
Instructions
Step 1: Create Rate-Limited API Keys per Team
set -euo pipefail

# Key for the chatbot team (high RPM, small model)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "chatbot-prod",
    "allowed_models": ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"],
    "requests_per_minute": 500,
    "tokens_per_minute": 100000
  }'
# Key for batch processing (lower RPM but higher token limit)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "batch-processor",
    "allowed_models": ["llama-3.1-8b-instant"],
    "requests_per_minute": 60,
    "tokens_per_minute": 500000
  }'
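The two key definitions above can also be kept in one typed table and checked before they are sent. The following is a minimal sketch, assuming the request body accepts exactly the four fields used in the curl commands; `KeySpec`, `validateKeySpec`, and `buildKeyRequestBody` are hypothetical helper names, not part of any Groq SDK.

```typescript
// Field names mirror the JSON bodies in the curl commands above (assumed schema).
interface KeySpec {
  name: string;
  allowed_models: string[];
  requests_per_minute: number;
  tokens_per_minute: number;
}

// Returns a list of problems; an empty list means the spec passed every check.
function validateKeySpec(spec: KeySpec): string[] {
  const problems: string[] = [];
  if (spec.name.trim() === '') {
    problems.push('name must not be empty');
  }
  if (spec.allowed_models.length === 0) {
    problems.push('allowed_models must list at least one model');
  }
  if (!Number.isInteger(spec.requests_per_minute) || spec.requests_per_minute <= 0) {
    problems.push('requests_per_minute must be a positive integer');
  }
  if (!Number.isInteger(spec.tokens_per_minute) || spec.tokens_per_minute <= 0) {
    problems.push('tokens_per_minute must be a positive integer');
  }
  return problems;
}

// Serializes a spec into a JSON request body and refuses invalid specs.
function buildKeyRequestBody(spec: KeySpec): string {
  const problems = validateKeySpec(spec);
  if (problems.length > 0) {
    throw new Error(`invalid key spec "${spec.name}": ${problems.join('; ')}`);
  }
  return JSON.stringify(spec);
}

const chatbotProd: KeySpec = {
  name: 'chatbot-prod',
  allowed_models: ['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'],
  requests_per_minute: 500,
  tokens_per_minute: 100000,
};

const batchProcessor: KeySpec = {
  name: 'batch-processor',
  allowed_models: ['llama-3.1-8b-instant'],
  requests_per_minute: 60,
  tokens_per_minute: 500000,
};
```

Passing the result of `buildKeyRequestBody(chatbotProd)` as the `-d` payload keeps the limits reviewable in version control and catches a zero or fractional limit before the request is made.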
Step 2: Implement Model-Level Access Control
// groq-gateway.ts - Enforce model restrictions before forwarding
const TEAM_MODEL_ACCESS: Record<string, string[]> = {
  chatbot: ['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'],
  analytics: ['llama-3.1-8b-instant'], // Cheapest model only
  research: ['llama-3.3-70b-versatile', 'mixtral-8x7b-32768', 'gemma2-9b-it'],
};

// Returns true only when the team exists and the model is on its allow-list.
function validateModelAccess(team: string, model: string): boolean {
  return TEAM_MODEL_ACCESS[team]?.includes(model) ?? false;
}
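A gateway usually needs more than a boolean: it has to tell the caller why a request was rejected. The sketch below is one way to wrap the same allow-list lookup in a decision object. `GatewayDecision`, `authorizeRequest`, and the choice of HTTP 403 for rejections are assumptions made for illustration, not something prescribed by Groq.

```typescript
type ModelAccess = Record<string, string[]>;

interface GatewayDecision {
  allowed: boolean;
  httpStatus: number; // status the gateway would return to the caller
  reason: string;
}

// Hypothetical helper: decide whether a team's request may be forwarded upstream.
function authorizeRequest(access: ModelAccess, team: string, model: string): GatewayDecision {
  // Own-property check so names like "constructor" are not treated as teams.
  const known = Object.prototype.hasOwnProperty.call(access, team);
  if (!known) {
    return { allowed: false, httpStatus: 403, reason: `unknown team: ${team}` };
  }
  if (!access[team].includes(model)) {
    return {
      allowed: false,
      httpStatus: 403,
      reason: `model ${model} is not permitted for team ${team}`,
    };
  }
  return { allowed: true, httpStatus: 200, reason: 'ok' };
}

// Example allow-list, using a subset of the teams from this step.
const exampleAccess: ModelAccess = {
  chatbot: ['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'],
  analytics: ['llama-3.1-8b-instant'],
};
```

The own-property check matters when the team name comes from an untrusted header: a plain object lookup for a name such as `constructor` returns an inherited function rather than `undefined`, so calling `.includes` on it would throw instead of denying the request.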
Step 3: Set Organization Spending Limits
In the Groq Console > Organization > Billing:
- Set monthly spending cap (e.g., $500/month)
- Configure alerts at $100, $250, $400 thresholds
- Enable auto-pause when cap is reached (prevents surprise bills)
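The same policy can be mirrored in a gateway or a scheduled job so that teams see budget status without opening the console. This is a small sketch using the example figures above ($500 cap, alerts at $100, $250, and $400); `BudgetPolicy` and `evaluateBudget` are hypothetical names, and the current spend is assumed to come from your own billing or usage data.

```typescript
interface BudgetPolicy {
  monthlyCapUsd: number;
  alertThresholdsUsd: number[];
}

interface BudgetStatus {
  crossedAlertsUsd: number[]; // thresholds already reached, ascending
  paused: boolean;            // true once spend reaches the cap
  remainingUsd: number;       // never negative
}

// Pure function: maps month-to-date spend onto the policy.
function evaluateBudget(policy: BudgetPolicy, spendUsd: number): BudgetStatus {
  const crossed = policy.alertThresholdsUsd
    .filter((threshold) => spendUsd >= threshold)
    .sort((a, b) => a - b);
  return {
    crossedAlertsUsd: crossed,
    paused: spendUsd >= policy.monthlyCapUsd,
    remainingUsd: Math.max(0, policy.monthlyCapUsd - spendUsd),
  };
}

const examplePolicy: BudgetPolicy = {
  monthlyCapUsd: 500,
  alertThresholdsUsd: [100, 250, 400],
};

// With $260 spent: alerts at 100 and 250 have fired, $240 remains, not paused.
const midMonth = evaluateBudget(examplePolicy, 260);
```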
Step 4: Monitor Key Usage
set -euo pipefail

# Check usage across all API keys
curl -s https://api.groq.com/openai/v1/usage \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" |
  jq '.usage_by_key[] | {key_name, requests_today, tokens_today, estimated_cost_usd}'
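For anything beyond a quick look, the same data can be post-processed in code. The sketch below assumes the response has the shape implied by the jq filter above (a `usage_by_key` array with `key_name`, `requests_today`, `tokens_today`, and `estimated_cost_usd`); the helper names and the sample figures are made up for illustration.

```typescript
// Shape inferred from the jq filter above; treat it as an assumption.
interface KeyUsage {
  key_name: string;
  requests_today: number;
  tokens_today: number;
  estimated_cost_usd: number;
}

interface UsageResponse {
  usage_by_key: KeyUsage[];
}

// Sum of today's estimated cost across every key.
function totalCostUsd(usage: UsageResponse): number {
  return usage.usage_by_key.reduce((sum, key) => sum + key.estimated_cost_usd, 0);
}

// Keys whose estimated cost today exceeds the limit, most expensive first.
function keysOverDailyCost(usage: UsageResponse, limitUsd: number): KeyUsage[] {
  return usage.usage_by_key
    .filter((key) => key.estimated_cost_usd > limitUsd)
    .sort((a, b) => b.estimated_cost_usd - a.estimated_cost_usd);
}

// Illustrative sample data, not real usage figures.
const sampleUsage: UsageResponse = {
  usage_by_key: [
    { key_name: 'chatbot-prod', requests_today: 12000, tokens_today: 3000000, estimated_cost_usd: 1.5 },
    { key_name: 'batch-processor', requests_today: 800, tokens_today: 9000000, estimated_cost_usd: 4 },
    { key_name: 'analytics-dev', requests_today: 50, tokens_today: 40000, estimated_cost_usd: 0.25 },
  ],
};

const flaggedKeys = keysOverDailyCost(sampleUsage, 1).map((key) => key.key_name);
```

Running a check like this on a schedule gives an early signal for a single runaway key well before the organization-level cap is reached.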
Step 5: Rotate Keys with Zero Downtime
set -euo pipefail

# 1. Create replacement key with same permissions
# 2. Deploy new key to services
# 3. Monitor for 24h to confirm no traffic on old key
# 4. Delete old key
curl -X DELETE "https://api.groq.com/openai/v1/api-keys/OLD_KEY_ID" \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY"
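The 24-hour check in step 3 is easy to get wrong by hand, so it can help to encode it. The sketch below treats the old key as safe to delete only when a full quiet period has passed since both the new key's deployment and the old key's last recorded request. `RotationState` and `canDeleteOldKey` are hypothetical names, and the last-used timestamp is assumed to come from your own request logs or usage data.

```typescript
const QUIET_PERIOD_MS = 24 * 60 * 60 * 1000; // 24 hours, as in step 3

interface RotationState {
  newKeyDeployedAt: Date;
  oldKeyLastUsedAt: Date | null; // null if no request was ever recorded
}

// True when the old key has been idle for a full quiet period after the rollout.
function canDeleteOldKey(
  state: RotationState,
  now: Date,
  quietPeriodMs: number = QUIET_PERIOD_MS,
): boolean {
  const deployedMs = state.newKeyDeployedAt.getTime();
  const lastUsedMs = state.oldKeyLastUsedAt === null ? deployedMs : state.oldKeyLastUsedAt.getTime();
  // The quiet period starts at whichever event happened later.
  const quietSinceMs = Math.max(deployedMs, lastUsedMs);
  return now.getTime() - quietSinceMs >= quietPeriodMs;
}

const rollout: RotationState = {
  newKeyDeployedAt: new Date('2024-05-01T09:00:00Z'),
  oldKeyLastUsedAt: new Date('2024-05-01T15:00:00Z'), // a straggler service
};
```

With these example timestamps, the old key becomes deletable at 15:00 UTC on the following day rather than at 09:00, because one service kept using it for six hours after the rollout.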
Error Handling
| Issue | Cause | Solution |
|---|---|---|
| 429 rate_limit_exceeded | RPM or TPM cap hit | Groq rate limits are strict; add exponential backoff |
| 401 invalid_api_key | Key deleted or expired | Generate new key in Groq Console |
| model_not_available | Model not in key's allowed list | Create key with broader model access |
| Spending cap paused API | Monthly budget reached | Increase cap or wait for billing cycle |
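For the 429 case, exponential backoff can be sketched as follows. This version uses "full jitter" (a random delay between zero and an exponentially growing ceiling); the base delay, cap, attempt count, and the helper names `backoffDelayMs` and `withBackoff` are illustrative assumptions rather than values recommended by Groq.

```typescript
// Delay before retry number `attempt` (0-based): random in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Runs `operation`, retrying with backoff while `isRetryable` says the error is transient.
async function withBackoff<T>(
  operation: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(() => resolve(), ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on the last attempt or on errors that will not fix themselves.
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

A caller would pass an `isRetryable` function that returns true only for rate-limit errors, for example by checking that the error carries an HTTP status of 429. Retrying a 401 or a model-access rejection only adds load, since those need a configuration change rather than a wait.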
Examples
Basic usage: Apply Groq Enterprise RBAC to a standard project setup with the default configuration options.
Advanced scenario: Customize Groq Enterprise RBAC for production environments with multiple constraints and team-specific requirements.
Output
- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale
Resources
- Official Groq documentation
- Community best practices and patterns
- Related skills in this plugin pack