The AI Agent FinOps Playbook: Budget Enforcement, Cost Allocation & Spend Analytics for Multi-Agent Systems
Notice: This is an educational guide with illustrative code examples. It does not execute code or install dependencies. All examples use the GreenHelix sandbox (https://sandbox.greenhelix.net) which provides 500 free credits — no API key required to get started.
Referenced credentials (you supply these in your own environment):
GREENHELIX_API_KEY: API authentication for GreenHelix gateway (read/write access to purchased API tools only)AGENT_SIGNING_KEY: Cryptographic signing key for agent identity (Ed25519 key pair for request signing)
When a CrewAI orchestrator spawns five researcher agents, each calling GPT-4o at $0.005 per request, and one agent enters a retry loop that fires 12,000 requests in an hour, who gets the bill? When a LangGraph workflow fans out to eight parallel tool-calling agents across three cloud providers, how does the platform team attribute $2,400 in daily compute back to the workflow that caused it? When a production AutoGen group chat quietly doubles its tool-call volume over three weeks because a prompt change removed a termination condition, who notices before the monthly invoice arrives? These are not hypothetical problems. A March 2026 postmortem from an AI infrastructure startup revealed that a single misconfigured agent loop consumed $47,000 in API credits over a weekend -- with no alert, no budget cap, and no way to attribute the cost to a specific agent. The FinOps Foundation's "FinOps for AI" initiative, launched in January 2026, identified cost governance for autonomous agents as one of the top three unsolved problems in AI infrastructure economics. This guide solves it. Using the GreenHelix A2A Commerce Gateway's 23 billing and webhook tools, you will build a complete FinOps control plane: per-agent wallets with hard budget caps, real-time spend alerts, cost attribution by agent and workflow, volume discount optimization, fleet-wide dashboards, and API key isolation as a cost containment boundary. Every pattern is production-ready, backed by working Python code, and deployable today. Before diving into the full guide, here is the minimum viable FinOps setup: one agent, one wallet, one budget cap, one alert. This takes under two minutes.
## What You'll Learn
- Quick Start
- Chapter 1: The Multi-Agent Cost Problem
- Chapter 2: Per-Agent Wallet Provisioning
- Chapter 3: Budget Caps and Hard Stops
- Chapter 4: Real-Time Spend Alerts with Webhooks
- Chapter 5: Cost Attribution and Category Spend
- Chapter 6: Volume Discounts and Cost Optimization
- Chapter 7: Fleet Dashboards and Leaderboards
- Chapter 8: API Key Isolation and Security
- Chapter 9: Putting It All Together: The FinOps Control Plane
## Full Guide
# The AI Agent FinOps Playbook: Budget Enforcement, Cost Allocation & Spend Analytics for Multi-Agent Systems
When a CrewAI orchestrator spawns five researcher agents, each calling GPT-4o at $0.005 per request, and one agent enters a retry loop that fires 12,000 requests in an hour, who gets the bill? When a LangGraph workflow fans out to eight parallel tool-calling agents across three cloud providers, how does the platform team attribute $2,400 in daily compute back to the workflow that caused it? When a production AutoGen group chat quietly doubles its tool-call volume over three weeks because a prompt change removed a termination condition, who notices before the monthly invoice arrives? These are not hypothetical problems. A March 2026 postmortem from an AI infrastructure startup revealed that a single misconfigured agent loop consumed $47,000 in API credits over a weekend -- with no alert, no budget cap, and no way to attribute the cost to a specific agent. The FinOps Foundation's "FinOps for AI" initiative, launched in January 2026, identified cost governance for autonomous agents as one of the top three unsolved problems in AI infrastructure economics. This guide solves it. Using the GreenHelix A2A Commerce Gateway's 23 billing and webhook tools, you will build a complete FinOps control plane: per-agent wallets with hard budget caps, real-time spend alerts, cost attribution by agent and workflow, volume discount optimization, fleet-wide dashboards, and API key isolation as a cost containment boundary. Every pattern is production-ready, backed by working Python code, and deployable today.
---
## Quick Start
Before diving into the full guide, here is the minimum viable FinOps setup: one agent, one wallet, one budget cap, one alert. This takes under two minutes.
```python
from greenhelix_trading import AgentFinOps
import os
# Initialize the FinOps client for a single agent
finops = AgentFinOps(
api_key=os.environ["GREENHELIX_API_KEY"],
agent_id="research-agent-01",
base_url="https://api.greenhelix.net/v1",
)
# 1. Create an isolated wallet
finops.create_wallet()
# 2. Fund it
finops.deposit(amount=100.00)
# 3. Set a hard daily budget cap
finops.set_budget_cap(daily_limit=25.00, monthly_limit=500.00)
# 4. Register an alert at 75% spend
finops.register_webhook(
url="https://your-app.example.com/alerts/budget",
events=["budget.threshold"],
config={"threshold_pct": 75},
)
# 5. Check balance before any operation
balance = finops.get_balance()
print(f"Agent research-agent-01 ready. Balance: ${balance['balance']}")
That is five API calls. Your agent now has an isolated wallet, a $25/day hard stop, and an alert that fires when 75% of the daily budget is consumed. The rest of this guide builds on this foundation.
Table of Contents
- The Multi-Agent Cost Problem
- Per-Agent Wallet Provisioning
- Budget Caps and Hard Stops
- Real-Time Spend Alerts with Webhooks
- Cost Attribution and Category Spend
- Volume Discounts and Cost Optimization
- Fleet Dashboards and Leaderboards
- API Key Isolation and Security
- Putting It All Together: The FinOps Control Plane
- What to Do Next
Chapter 1: The Multi-Agent Cost Problem
Why Autonomous Agents Create Unbounded Cost
Human engineers have an implicit cost governor: they get tired, they context-switch, they go home at 6 PM. Agents do not. A LangChain agent with access to a tool catalog will call tools as long as its reasoning loop tells it to. A CrewAI crew that delegates subtasks to specialist agents will spawn as many subtasks as the orchestrator deems necessary. An AutoGen group chat will continue conversing until a termination condition is met -- and if that condition has a bug, the conversation runs forever. Each tool call, each LLM inference, each API request costs money. Without explicit cost governance, the spend trajectory of a multi-agent system is bounded only by the rate limit of the underlying API provider.
Three Failure Modes
Every multi-agent cost incident traces back to one of three root causes.
Failure Mode 1: Runaway Loops. An agent enters a retry or reasoning loop that never terminates. The most common trigger is a tool that returns an error, causing the agent to retry indefinitely. A single agent in a retry loop against GPT-4o can consume $50-200/hour depending on context window size. Without a per-agent budget cap, there is no circuit breaker.
Failure Mode 2: Tool-Call Amplification. An orchestrator agent delegates a task to N worker agents, each of which calls M tools, each of which triggers K downstream API calls. The total cost is O(N * M * K), and small changes to any multiplier produce large cost swings. A prompt change that causes each worker to call one additional tool per task can double fleet-wide spend overnight.
Failure Mode 3: No Attribution. All agents share a single API key and a single billing account. When the monthly bill spikes, there is no way to determine which agent, which workflow, or which tool category caused the increase. Without attribution, there is no accountability, and without accountability, there is no optimization.
The Wallet-Per-Agent Architecture
The GreenHelix approach solves all three failure modes with a single architectural pattern: every agent gets its own wallet with its own budget cap. This is the same principle as giving each microservice its own database -- isolation prevents blast radius propagation.
+-------------------+ +-------------------+ +-------------------+
| Agent: researcher | | Agent: summarizer | | Agent: classifier |
| Wallet: $100 | | Wallet: $50 | | Wallet: $75 |
| Cap: $25/day | | Cap: $15/day | | Cap: $20/day |
| Key: ak_res_xxxxx | | Key: ak_sum_xxxxx | | Key: ak_cls_xxxxx |
+--------+----------+ +--------+----------+ +--------+----------+
| | |
+------------+------------+------------+------------+
| |
+-------v-------------------------v-------+
| GreenHelix A2A Commerce Gateway |
| Budget enforcement layer |
| Cost attribution engine |
| Webhook alert system |
+------------------------------------------+
When the researcher agent hits its $25/day cap, it gets a 402 Payment Required response. The summarizer and classifier agents are unaffected. The blast radius of a runaway researcher is exactly $25 -- not the entire fleet budget.
The AgentFinOps Class
This class wraps every GreenHelix billing and webhook tool used in this guide. Define it once; reference it in every subsequent chapter.
import requests
import json
import time
from typing import Optional
class AgentFinOps:
"""FinOps client for the GreenHelix A2A Commerce Gateway.
Wraps the 23 billing and webhook tools into a clean interface
for budget enforcement, cost allocation, and spend analytics.
"""
def __init__(
self,
api_key: str,
agent_id: str,
base_url: str = "https://api.greenhelix.net/v1",
):
self.base_url = base_url
self.agent_id = agent_id
self.session = requests.Session()
self.session.headers.update({
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}",
})
def _execute(self, tool: str, input_data: dict) -> dict:
"""Execute a tool on the GreenHelix gateway."""
resp = self.session.post(
f"{self.base_url}/v1",
json={"tool": tool, "input": input_data},
)
if resp.status_code == 402:
raise BudgetExhaustedError(
f"Agent {self.agent_id} has exceeded its budget cap. "
f"Response: {resp.text}"
)
resp.raise_for_status()
return resp.json()
# -- Wallet Management -------------------------------------------
def create_wallet(self) -> dict:
return self._execute("create_wallet", {})
def get_balance(self) -> dict:
return self._execute("get_balance", {})
def deposit(self, amount: float) -> dict:
return self._execute("deposit", {"amount": str(amount)})
# -- Budget Enforcement ------------------------------------------
def set_budget_cap(
self,
daily_limit: float,
monthly_limit: Optional[float] = None,
) -> dict:
payload = {
"agent_id": self.agent_id,
"daily_limit": str(daily_limit),
}
if monthly_limit is not None:
payload["monthly_limit"] = str(monthly_limit)
return self._execute("set_budget_cap", payload)
def get_budget_status(self) -> dict:
return self._execute("get_budget_status", {
"agent_id": self.agent_id,
})
# -- Cost Analytics ----------------------------------------------
def get_usage_analytics(
self,
start_date: str,
end_date: str,
) -> dict:
return self._execute("get_usage_analytics", {
"agent_id": self.agent_id,
"start_date": start_date,
"end_date": end_date,
})
def get_billing_summary(self, period: str = "monthly") -> dict:
return self._execute("get_billing_summary", {
"agent_id": self.agent_id,
"period": period,
})
def get_spending_by_category(
self,
start_date: Optional[str] = None,
end_date: Optional[str] = None,
) -> dict:
payload = {"agent_id": self.agent_id}
if start_date:
payload["start_date"] = start_date
if end_date:
payload["end_date"] = end_date
return self._execute("get_spending_by_category", payload)
# -- Cost Optimization -------------------------------------------
def get_volume_discount(self) -> dict:
return self._execute("get_volume_discount", {
"agent_id": self.agent_id,
})
def estimate_cost(self, tool_name: str, parameters: dict) -> dict:
return self._execute("estimate_cost", {
"tool": tool_name,
"parameters": parameters,
})
def convert_currency(
self,
amount: float,
from_currency: str,
to_currency: str,
) -> dict:
return self._execute("convert_currency", {
"amount": str(amount),
"from_currency": from_currency,
"to_currency": to_currency,
})
# -- Fleet Analytics ---------------------------------------------
def get_agent_leaderboard(self, metric: str = "cost_efficiency") -> dict:
return self._execute("get_agent_leaderboard", {
"metric": metric,
})
def get_platform_stats(self) -> dict:
return self._execute("get_platform_stats", {})
# -- API Key Management ------------------------------------------
def create_api_key(self, label: str, permissions: list = None) -> dict:
payload = {
"agent_id": self.agent_id,
"label": label,
}
if permissions:
payload["permissions"] = permissions
return self._execute("create_api_key", payload)
def rotate_api_key(self, key_id: str) -> dict:
return self._execute("rotate_api_key", {
"agent_id": self.agent_id,
"key_id": key_id,
})
# -- Webhooks ----------------------------------------------------
def register_webhook(
self,
url: str,
events: list,
config: Optional[dict] = None,
) -> dict:
payload = {
"agent_id": self.agent_id,
"url": url,
"events": events,
}
if config:
payload["config"] = config
return self._execute("register_webhook", payload)
def list_webhooks(self) -> dict:
return self._execute("list_webhooks", {
"agent_id": self.agent_id,
})
def update_webhook(
self,
webhook_id: str,
url: Optional[str] = None,
events: Optional[list] = None,
config: Optional[dict] = None,
) -> dict:
payload = {"webhook_id": webhook_id}
if url:
payload["url"] = url
if events:
payload["events"] = events
if config:
payload["config"] = config
return self._execute("update_webhook", payload)
def delete_webhook(self, webhook_id: str) -> dict:
return self._execute("delete_webhook", {
"webhook_id": webhook_id,
})
def get_webhook_logs(
self,
webhook_id: str,
limit: int = 50,
) -> dict:
return self._execute("get_webhook_logs", {
"webhook_id": webhook_id,
"limit": limit,
})
class BudgetExhaustedError(Exception):
"""Raised when an agent exceeds its budget cap (HTTP 402)."""
pass
Chapter 2: Per-Agent Wallet Provisioning
Why One Wallet Per Agent
Shared wallets are the multi-agent equivalent of running all your microservices as root. When five agents share a wallet, a runaway agent drains funds that other agents need. There is no spend attribution, no blast radius containment, and no way to set different budget policies for agents with different risk profiles.
The rule is simple: every agent that can incur cost gets its own wallet. The orchestrator agent gets a wallet. Each worker agent gets a wallet. If the orchestrator funds worker wallets as part of task delegation, the funding amount becomes an explicit cost parameter of the task -- visible, auditable, and capped.
Creating Wallets for a Fleet
import os
API_KEY = os.environ["GREENHELIX_API_KEY"]
# Define your agent fleet
fleet = [
{"agent_id": "orchestrator-01", "initial_deposit": 500.00},
{"agent_id": "researcher-01", "initial_deposit": 100.00},
{"agent_id": "researcher-02", "initial_deposit": 100.00},
{"agent_id": "summarizer-01", "initial_deposit": 75.00},
{"agent_id": "classifier-01", "initial_deposit": 50.00},
]
for agent_cfg in fleet:
finops = AgentFinOps(
api_key=API_KEY,
agent_id=agent_cfg["agent_id"],
)
# Create wallet
wallet = finops.create_wallet()
print(f"Wallet created for {agent_cfg['agent_id']}: {wallet['wallet_id']}")
# Fund it
deposit = finops.deposit(amount=agent_cfg["initial_deposit"])
print(f" Deposited ${agent_cfg['initial_deposit']:.2f}")
# Verify balance
balance = finops.get_balance()
print(f" Balance: ${balance['balance']}")
Pre-Flight Cost Estimation
Before an agent executes an expensive tool call, estimate the cost and compare it against the remaining budget. This prevents the agent from starting work it cannot afford to finish.
def preflight_check(finops: AgentFinOps, tool_name: str, params: dict) -> bool:
"""Check if the agent can afford a tool call before executing it."""
# Estimate the cost of the upcoming operation
estimate = finops.estimate_cost(tool_name=tool_name, parameters=params)
estimated_cost = float(estimate.get("estimated_cost", 0))
# Check remaining budget
budget = finops.get_budget_status()
remaining_daily = (
float(budget.get("daily_limit", 0))
- float(budget.get("spent_today", 0))
)
# Check wallet balance
balance = finops.get_balance()
wallet_balance = float(balance.get("balance", 0))
# Both budget and balance must be sufficient
can_afford = (
estimated_cost <= remaining_daily
and estimated_cost <= wallet_balance
)
if not can_afford:
print(
f"Pre-flight FAIL: {tool_name} costs ~${estimated_cost:.4f}, "
f"daily remaining=${remaining_daily:.2f}, "
f"wallet=${wallet_balance:.2f}"
)
return can_afford
# Usage in an agent loop
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
tasks = [
("search_services", {"query": "document summarization"}),
("create_escrow", {"amount": "25.00", "payee": "summarizer-01"}),
("submit_metrics", {"metrics": {"accuracy": 0.95}}),
]
for tool_name, params in tasks:
if preflight_check(finops, tool_name, params):
result = finops._execute(tool_name, params)
print(f" Executed {tool_name}: OK")
else:
print(f" Skipped {tool_name}: insufficient budget")
break # Stop the workflow, do not accumulate partial state
Funding Workflows: Orchestrator-to-Worker Deposits
In many architectures, the orchestrator agent funds worker agents as part of task assignment. This creates an explicit cost boundary around each task.
def fund_worker_for_task(
orchestrator: AgentFinOps,
worker_agent_id: str,
task_budget: float,
api_key: str,
) -> AgentFinOps:
"""Fund a worker agent's wallet for a specific task."""
worker = AgentFinOps(
api_key=api_key,
agent_id=worker_agent_id,
)
# Estimate worker's current balance
balance = worker.get_balance()
current = float(balance.get("balance", 0))
# Top up only what is needed
if current < task_budget:
shortfall = task_budget - current
worker.deposit(amount=shortfall)
print(
f"Funded {worker_agent_id}: +${shortfall:.2f} "
f"(new balance: ${task_budget:.2f})"
)
# Set a tight budget cap for this task
worker.set_budget_cap(daily_limit=task_budget)
return worker
# Orchestrator delegates a research task with a $15 budget
orchestrator = AgentFinOps(api_key=API_KEY, agent_id="orchestrator-01")
worker = fund_worker_for_task(
orchestrator=orchestrator,
worker_agent_id="researcher-01",
task_budget=15.00,
api_key=API_KEY,
)
Chapter 3: Budget Caps and Hard Stops
Per-Agent Budget Caps
Budget caps are the most important cost governance primitive. When an agent exceeds its cap, the gateway returns 402 Payment Required on the next tool call. The agent's code catches this as a BudgetExhaustedError and can implement any fail-fast behavior: log the event, notify the orchestrator, gracefully terminate, or queue the remaining work for the next budget cycle.
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# Set daily and monthly caps
finops.set_budget_cap(daily_limit=25.00, monthly_limit=500.00)
# Verify the cap is active
status = finops.get_budget_status()
print(f"Daily limit: ${status['daily_limit']}")
print(f"Monthly limit: ${status['monthly_limit']}")
print(f"Spent today: ${status['spent_today']}")
print(f"Spent this month: ${status['spent_this_month']}")
curl -- set budget cap:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "set_budget_cap",
"input": {
"agent_id": "researcher-01",
"daily_limit": "25.00",
"monthly_limit": "500.00"
}
}'
Per-Workflow Budget Caps
Sometimes you need a budget cap on a workflow, not just an individual agent. The pattern: create a dedicated agent ID per workflow run, give it a wallet with exactly the workflow budget, and route all tool calls through it.
import uuid
from datetime import datetime, timezone
def create_workflow_budget(
api_key: str,
workflow_name: str,
budget: float,
) -> AgentFinOps:
"""Create a budget-capped agent for a single workflow run."""
run_id = uuid.uuid4().hex[:8]
workflow_agent_id = f"wf-{workflow_name}-{run_id}"
finops = AgentFinOps(
api_key=api_key,
agent_id=workflow_agent_id,
)
finops.create_wallet()
finops.deposit(amount=budget)
finops.set_budget_cap(daily_limit=budget) # Entire budget as one-shot
print(
f"Workflow {workflow_name} ({run_id}): "
f"${budget:.2f} budget allocated"
)
return finops
# Create a $50 budget for a research workflow
wf_finops = create_workflow_budget(
api_key=API_KEY,
workflow_name="quarterly-report",
budget=50.00,
)
Handling 402 Payment Required
When the budget cap is hit, the gateway returns HTTP 402. Your agent code must handle this gracefully -- not with a retry, but with a controlled shutdown.
def safe_execute(finops: AgentFinOps, tool: str, input_data: dict) -> dict:
"""Execute a tool with budget-aware error handling."""
try:
return finops._execute(tool, input_data)
except BudgetExhaustedError:
# Log the budget exhaustion event
print(
f"BUDGET EXHAUSTED: Agent {finops.agent_id} hit cap "
f"while calling {tool}. Stopping execution."
)
# Return a sentinel that the calling code can check
return {"error": "budget_exhausted", "tool": tool}
except requests.exceptions.HTTPError as e:
if e.response.status_code == 429:
# Rate limited -- back off and retry
time.sleep(2)
return finops._execute(tool, input_data)
raise
# Agent loop with budget-aware execution
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
for i in range(100): # Agent might try 100 tool calls
result = safe_execute(finops, "search_services", {"query": f"topic-{i}"})
if result.get("error") == "budget_exhausted":
print(f"Stopped at iteration {i}. Budget exhausted.")
break
# Process result normally
Setting Budget Caps Across a Fleet
# Define tiered budget policies
BUDGET_TIERS = {
"orchestrator": {"daily": 100.00, "monthly": 2000.00},
"researcher": {"daily": 25.00, "monthly": 500.00},
"summarizer": {"daily": 15.00, "monthly": 300.00},
"classifier": {"daily": 10.00, "monthly": 200.00},
}
def apply_fleet_budget_policy(api_key: str, agents: list[dict]):
"""Apply budget caps to all agents based on their role tier."""
for agent in agents:
role = agent["role"]
tier = BUDGET_TIERS.get(role)
if not tier:
print(f"WARNING: No budget tier for role '{role}'. Skipping.")
continue
finops = AgentFinOps(api_key=api_key, agent_id=agent["agent_id"])
finops.set_budget_cap(
daily_limit=tier["daily"],
monthly_limit=tier["monthly"],
)
print(
f" {agent['agent_id']} ({role}): "
f"${tier['daily']}/day, ${tier['monthly']}/month"
)
agents = [
{"agent_id": "orchestrator-01", "role": "orchestrator"},
{"agent_id": "researcher-01", "role": "researcher"},
{"agent_id": "researcher-02", "role": "researcher"},
{"agent_id": "summarizer-01", "role": "summarizer"},
{"agent_id": "classifier-01", "role": "classifier"},
]
apply_fleet_budget_policy(API_KEY, agents)
Chapter 4: Real-Time Spend Alerts with Webhooks
Why Polling Is Not Enough
Checking get_budget_status inside your agent loop works for pre-flight checks, but it misses spend that happens between checks. If your agent makes ten tool calls in rapid succession and the budget cap is hit on call seven, calls eight through ten fail with 402. Webhooks push alerts to your system the moment a threshold is crossed, enabling real-time escalation -- warn the team at 75%, pause the agent at 90%, kill it at 100%.
Registering Budget Threshold Alerts
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# Register three escalation tiers
alert_tiers = [
{"threshold_pct": 75, "label": "warning"},
{"threshold_pct": 90, "label": "critical"},
{"threshold_pct": 100, "label": "exhausted"},
]
webhook_ids = []
for tier in alert_tiers:
webhook = finops.register_webhook(
url=f"https://your-app.example.com/alerts/{tier['label']}",
events=["budget.threshold"],
config={"threshold_pct": tier["threshold_pct"]},
)
webhook_ids.append(webhook["webhook_id"])
print(
f"Alert registered: {tier['label']} at {tier['threshold_pct']}% "
f"(webhook_id: {webhook['webhook_id']})"
)
curl -- register a webhook:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "register_webhook",
"input": {
"agent_id": "researcher-01",
"url": "https://your-app.example.com/alerts/warning",
"events": ["budget.threshold"],
"config": {"threshold_pct": 75}
}
}'
Building the Alert Receiver
Your alert receiver is a simple HTTP endpoint that receives webhook payloads and dispatches escalation actions.
from flask import Flask, request, jsonify
import logging
app = Flask(__name__)
logger = logging.getLogger("finops_alerts")
@app.route("/alerts/warning", methods=["POST"])
def handle_warning():
"""75% budget consumed -- log and notify Slack."""
payload = request.json
agent_id = payload.get("agent_id")
spent_pct = payload.get("spent_pct")
remaining = payload.get("remaining_budget")
logger.warning(
f"BUDGET WARNING: {agent_id} at {spent_pct}% "
f"(${remaining} remaining)"
)
# Send Slack/PagerDuty notification
notify_slack(
channel="#agent-finops",
message=f"Agent {agent_id} has consumed {spent_pct}% of its daily budget. "
f"${remaining} remaining.",
)
return jsonify({"status": "acknowledged"})
@app.route("/alerts/critical", methods=["POST"])
def handle_critical():
"""90% budget consumed -- pause the agent."""
payload = request.json
agent_id = payload.get("agent_id")
logger.critical(
f"BUDGET CRITICAL: {agent_id} at {payload.get('spent_pct')}%. "
f"Pausing agent."
)
# Signal the agent to pause (via shared state, message queue, etc.)
pause_agent(agent_id)
notify_slack(
channel="#agent-finops",
message=f"CRITICAL: Agent {agent_id} paused at 90% budget.",
)
return jsonify({"status": "agent_paused"})
@app.route("/alerts/exhausted", methods=["POST"])
def handle_exhausted():
"""100% budget consumed -- kill the agent process."""
payload = request.json
agent_id = payload.get("agent_id")
logger.error(
f"BUDGET EXHAUSTED: {agent_id}. Terminating."
)
kill_agent(agent_id)
notify_slack(
channel="#agent-finops",
message=f"EXHAUSTED: Agent {agent_id} terminated. Budget fully consumed.",
)
return jsonify({"status": "agent_terminated"})
def notify_slack(channel: str, message: str):
"""Send a Slack notification. Replace with your Slack webhook."""
import requests as req
req.post(os.environ.get("SLACK_WEBHOOK_URL", ""), json={
"channel": channel,
"text": message,
})
def pause_agent(agent_id: str):
"""Signal an agent to pause. Implementation depends on your runtime."""
# Example: write a pause flag to Redis
# redis_client.set(f"agent:{agent_id}:paused", "1", ex=3600)
pass
def kill_agent(agent_id: str):
"""Terminate an agent process. Implementation depends on your runtime."""
# Example: send SIGTERM to the agent's PID
# os.kill(agent_pids[agent_id], signal.SIGTERM)
pass
Auditing Webhook Configuration
Before relying on webhooks in production, verify they are correctly configured.
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# List all registered webhooks
webhooks = finops.list_webhooks()
for wh in webhooks.get("webhooks", []):
print(
f" ID: {wh['webhook_id']} | URL: {wh['url']} | "
f"Events: {wh['events']} | Config: {wh.get('config', {})}"
)
curl -- list webhooks:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "list_webhooks",
"input": {"agent_id": "researcher-01"}
}'
Adjusting Thresholds Dynamically
As your fleet evolves, you may need to tighten or relax alert thresholds. Use update_webhook to adjust without deleting and re-creating.
# Tighten the warning threshold from 75% to 60%
finops.update_webhook(
webhook_id=webhook_ids[0],
config={"threshold_pct": 60},
)
print("Warning threshold tightened to 60%")
Debugging Missed Alerts
When an alert should have fired but did not, check the webhook logs.
# Inspect delivery history for a specific webhook
logs = finops.get_webhook_logs(webhook_id=webhook_ids[0], limit=20)
for entry in logs.get("logs", []):
print(
f" {entry['timestamp']} | Status: {entry['delivery_status']} | "
f"HTTP {entry.get('response_code', 'N/A')}"
)
curl -- get webhook logs:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "get_webhook_logs",
"input": {"webhook_id": "wh_abc123", "limit": 20}
}'
Cleaning Up Stale Webhooks
When decommissioning an agent, remove its webhooks to prevent orphaned alert traffic.
def cleanup_agent_webhooks(finops: AgentFinOps):
"""Remove all webhooks for an agent being decommissioned."""
webhooks = finops.list_webhooks()
for wh in webhooks.get("webhooks", []):
finops.delete_webhook(wh["webhook_id"])
print(f" Deleted webhook {wh['webhook_id']}")
Chapter 5: Cost Attribution and Category Spend
Why Attribution Matters
Budget caps prevent overspend. Attribution answers a different question: where is the money going? Without category-level spend data, you cannot optimize. You cannot answer "which tool category consumes 60% of our budget?" or "which agent has the worst cost-per-task ratio?" or "did last week's prompt change increase or decrease total spend?"
Tagging Spend by Agent, Workflow, and Tool
GreenHelix tracks spend at the agent level natively. Every tool call executed by an agent is attributed to that agent's wallet. To get workflow-level attribution, use the dedicated workflow agent pattern from Chapter 3. To get tool-level attribution, use get_spending_by_category.
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# Get spend breakdown by tool category
category_spend = finops.get_spending_by_category(
start_date="2026-04-01",
end_date="2026-04-06",
)
print("Spend by category (April 1-6):")
for category in category_spend.get("categories", []):
print(
f" {category['category']:.<30} "
f"${category['total_cost']:>8.2f} "
f"({category['call_count']} calls)"
)
curl -- get spending by category:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "get_spending_by_category",
"input": {
"agent_id": "researcher-01",
"start_date": "2026-04-01",
"end_date": "2026-04-06"
}
}'
Building Chargeback Reports
In organizations where different teams own different agents, chargebacks attribute platform costs to the team that caused them. This is standard FinOps practice for cloud infrastructure, adapted for multi-agent systems.
def generate_chargeback_report(
api_key: str,
team_agents: dict[str, list[str]],
period: str = "monthly",
) -> dict:
"""Generate a chargeback report by team.
team_agents: {"data-team": ["researcher-01", "researcher-02"],
"ml-team": ["classifier-01", "summarizer-01"]}
"""
report = {}
for team, agent_ids in team_agents.items():
team_total = 0.0
agent_details = []
for agent_id in agent_ids:
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
summary = finops.get_billing_summary(period=period)
agent_cost = float(summary.get("total_cost", 0))
team_total += agent_cost
agent_details.append({
"agent_id": agent_id,
"cost": agent_cost,
"tool_calls": summary.get("total_calls", 0),
})
report[team] = {
"total_cost": round(team_total, 2),
"agents": agent_details,
}
return report
# Generate monthly chargeback
chargeback = generate_chargeback_report(
api_key=API_KEY,
team_agents={
"data-team": ["researcher-01", "researcher-02"],
"ml-team": ["classifier-01", "summarizer-01"],
"ops-team": ["orchestrator-01"],
},
)
for team, data in chargeback.items():
print(f"\n{team}: ${data['total_cost']:.2f}")
for agent in data["agents"]:
print(
f" {agent['agent_id']}: ${agent['cost']:.2f} "
f"({agent['tool_calls']} calls)"
)
Per-Agent P&L Statements
For agents that both spend and earn (e.g., a service agent that charges for its work), compute a per-agent profit and loss.
def agent_pnl(api_key: str, agent_id: str, period: str = "monthly") -> dict:
"""Compute profit and loss for an agent."""
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
# Cost side: what the agent spent
summary = finops.get_billing_summary(period=period)
total_cost = float(summary.get("total_cost", 0))
# Revenue side: what the agent earned (from usage analytics)
analytics = finops.get_usage_analytics(
start_date=summary.get("period_start", "2026-04-01"),
end_date=summary.get("period_end", "2026-04-30"),
)
total_revenue = float(analytics.get("revenue_earned", 0))
return {
"agent_id": agent_id,
"revenue": round(total_revenue, 2),
"cost": round(total_cost, 2),
"profit": round(total_revenue - total_cost, 2),
"margin_pct": (
round((total_revenue - total_cost) / total_revenue * 100, 1)
if total_revenue > 0 else 0.0
),
}
# Compute P&L for a service agent
pnl = agent_pnl(API_KEY, "summarizer-01")
print(
f"Agent {pnl['agent_id']}: "
f"Revenue=${pnl['revenue']}, Cost=${pnl['cost']}, "
f"Profit=${pnl['profit']} ({pnl['margin_pct']}% margin)"
)
Historical Spend Analysis
Use get_usage_analytics for time-series spend data to identify trends, anomalies, and the impact of configuration changes.
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# Pull weekly analytics for the last 4 weeks
weeks = [
("2026-03-09", "2026-03-15"),
("2026-03-16", "2026-03-22"),
("2026-03-23", "2026-03-29"),
("2026-03-30", "2026-04-05"),
]
print("Weekly spend trend for researcher-01:")
for start, end in weeks:
analytics = finops.get_usage_analytics(start_date=start, end_date=end)
cost = float(analytics.get("total_cost", 0))
calls = analytics.get("total_calls", 0)
print(f" {start} to {end}: ${cost:.2f} ({calls} calls)")
Chapter 6: Volume Discounts and Cost Optimization
Checking Volume Discount Eligibility
High-volume agents may qualify for automatic discounts based on cumulative usage. Check your current tier and what it takes to reach the next one.
finops = AgentFinOps(api_key=API_KEY, agent_id="orchestrator-01")
discount = finops.get_volume_discount()
print(f"Current tier: {discount.get('current_tier')}")
print(f"Discount rate: {discount.get('discount_pct')}%")
print(f"Monthly volume: ${discount.get('monthly_volume')}")
print(f"Next tier at: ${discount.get('next_tier_threshold')}")
print(f"Next tier discount: {discount.get('next_tier_discount_pct')}%")
curl -- check volume discount:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "get_volume_discount",
"input": {"agent_id": "orchestrator-01"}
}'
Pre-Execution Cost Estimation
Before choosing which tool to use for a task, estimate the cost of each option. This enables cost-aware routing: the orchestrator picks the cheapest tool that meets quality requirements.
def cheapest_tool(
finops: AgentFinOps,
candidates: list[dict],
) -> dict:
"""Pick the cheapest tool from a list of candidates.
Each candidate: {"tool": "tool_name", "params": {...}, "quality": float}
Quality must be >= min_quality to be considered.
"""
min_quality = 0.8
estimated = []
for candidate in candidates:
if candidate["quality"] < min_quality:
continue
estimate = finops.estimate_cost(
tool_name=candidate["tool"],
parameters=candidate["params"],
)
candidate["estimated_cost"] = float(
estimate.get("estimated_cost", float("inf"))
)
estimated.append(candidate)
if not estimated:
raise ValueError("No candidate meets minimum quality threshold")
# Sort by cost, pick cheapest
estimated.sort(key=lambda c: c["estimated_cost"])
winner = estimated[0]
print(
f"Selected {winner['tool']} at ${winner['estimated_cost']:.4f} "
f"(quality={winner['quality']})"
)
return winner
# Orchestrator choosing between three summarization tools
finops = AgentFinOps(api_key=API_KEY, agent_id="orchestrator-01")
best = cheapest_tool(finops, [
{"tool": "summarize_gpt4o", "params": {"tokens": 2000}, "quality": 0.95},
{"tool": "summarize_claude", "params": {"tokens": 2000}, "quality": 0.93},
{"tool": "summarize_llama", "params": {"tokens": 2000}, "quality": 0.82},
])
Multi-Currency Cost Normalization
When your fleet operates across regions or platforms with different billing currencies, normalize everything to a single currency for consistent reporting.
def normalize_fleet_costs(
api_key: str,
agent_costs: list[dict],
target_currency: str = "USD",
) -> list[dict]:
"""Convert all agent costs to a single currency.
agent_costs: [{"agent_id": "x", "cost": 100, "currency": "EUR"}, ...]
"""
finops = AgentFinOps(api_key=api_key, agent_id="system")
normalized = []
for entry in agent_costs:
if entry["currency"] == target_currency:
entry["normalized_cost"] = entry["cost"]
else:
conversion = finops.convert_currency(
amount=entry["cost"],
from_currency=entry["currency"],
to_currency=target_currency,
)
entry["normalized_cost"] = float(
conversion.get("converted_amount", entry["cost"])
)
normalized.append(entry)
return normalized
# Normalize costs from a multi-region fleet
costs = normalize_fleet_costs(API_KEY, [
{"agent_id": "eu-researcher", "cost": 85.50, "currency": "EUR"},
{"agent_id": "us-researcher", "cost": 92.00, "currency": "USD"},
{"agent_id": "jp-researcher", "cost": 12000, "currency": "JPY"},
])
for c in costs:
print(
f" {c['agent_id']}: {c['cost']} {c['currency']} "
f"= ${c['normalized_cost']:.2f} USD"
)
Cost-Aware Model Routing
Combine cost estimation with quality requirements to route agent tasks to the most cost-effective model at runtime.
def cost_aware_route(
finops: AgentFinOps,
task_type: str,
quality_floor: float = 0.85,
) -> str:
"""Route a task to the cheapest model that meets quality requirements."""
MODEL_CATALOG = {
"gpt-4o": {"quality": 0.95, "tool": "inference_gpt4o"},
"claude-3.5": {"quality": 0.93, "tool": "inference_claude"},
"llama-3-70b": {"quality": 0.85, "tool": "inference_llama"},
"mistral-8x7b":{"quality": 0.80, "tool": "inference_mistral"},
}
candidates = []
for model, spec in MODEL_CATALOG.items():
if spec["quality"] < quality_floor:
continue
est = finops.estimate_cost(
tool_name=spec["tool"],
parameters={"task_type": task_type},
)
candidates.append({
"model": model,
"cost": float(est.get("estimated_cost", float("inf"))),
"quality": spec["quality"],
})
candidates.sort(key=lambda c: c["cost"])
if candidates:
pick = candidates[0]
print(
f"Routing {task_type} to {pick['model']}: "
f"${pick['cost']:.4f}/call, quality={pick['quality']}"
)
return pick["model"]
raise ValueError(f"No model meets quality floor {quality_floor}")
Chapter 7: Fleet Dashboards and Leaderboards
Cost-Efficiency Leaderboard
The agent leaderboard ranks all agents in your fleet by cost efficiency -- the ratio of useful output to money spent. This is the FinOps equivalent of cloud cost-per-transaction dashboards, applied to agents.
finops = AgentFinOps(api_key=API_KEY, agent_id="orchestrator-01")
# Pull the cost-efficiency leaderboard
leaderboard = finops.get_agent_leaderboard(metric="cost_efficiency")
print("Agent Cost-Efficiency Leaderboard")
print("-" * 65)
print(f"{'Rank':<6}{'Agent ID':<25}{'Efficiency':<15}{'Total Spend':<15}")
print("-" * 65)
for i, entry in enumerate(leaderboard.get("agents", []), 1):
print(
f"{i:<6}{entry['agent_id']:<25}"
f"{entry.get('efficiency_score', 'N/A'):<15}"
f"${float(entry.get('total_spend', 0)):<14.2f}"
)
curl -- get leaderboard:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "get_agent_leaderboard",
"input": {"metric": "cost_efficiency"}
}'
Platform-Wide Statistics
Get a bird's-eye view of your entire agent platform: total spend, active agents, average cost per call, and utilization rates.
finops = AgentFinOps(api_key=API_KEY, agent_id="orchestrator-01")
stats = finops.get_platform_stats()
print("Platform-Wide FinOps Summary")
print("=" * 45)
print(f"Active agents: {stats.get('active_agents')}")
print(f"Total spend (MTD): ${stats.get('total_spend_mtd')}")
print(f"Avg cost/call: ${stats.get('avg_cost_per_call')}")
print(f"Total tool calls: {stats.get('total_calls_mtd')}")
print(f"Budget utilization: {stats.get('budget_utilization_pct')}%")
print(f"Agents over budget: {stats.get('agents_over_budget', 0)}")
curl -- get platform stats:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "get_platform_stats",
"input": {}
}'
Trend Analysis: Week-over-Week Spend Comparison
def weekly_trend_report(
api_key: str,
agent_ids: list[str],
weeks: list[tuple[str, str]],
):
"""Print week-over-week spend trends for a fleet of agents."""
print(f"\n{'Agent':<22}", end="")
for start, end in weeks:
print(f"{start:<16}", end="")
print(" WoW Change")
print("-" * (22 + 16 * len(weeks) + 12))
for agent_id in agent_ids:
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
costs = []
print(f"{agent_id:<22}", end="")
for start, end in weeks:
analytics = finops.get_usage_analytics(
start_date=start,
end_date=end,
)
cost = float(analytics.get("total_cost", 0))
costs.append(cost)
print(f"${cost:<15.2f}", end="")
# Compute week-over-week change for last two weeks
if len(costs) >= 2 and costs[-2] > 0:
change_pct = (costs[-1] - costs[-2]) / costs[-2] * 100
direction = "+" if change_pct > 0 else ""
print(f" {direction}{change_pct:.1f}%")
else:
print(" N/A")
weekly_trend_report(
api_key=API_KEY,
agent_ids=[
"orchestrator-01", "researcher-01", "researcher-02",
"summarizer-01", "classifier-01",
],
weeks=[
("2026-03-16", "2026-03-22"),
("2026-03-23", "2026-03-29"),
("2026-03-30", "2026-04-05"),
],
)
Building a Daily FinOps Dashboard
Combine all the analytics tools into a single dashboard function that your team runs every morning.
def daily_finops_dashboard(api_key: str, agent_ids: list[str]):
"""Print a comprehensive daily FinOps dashboard."""
print("=" * 70)
print(" DAILY FINOPS DASHBOARD | " + datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC"))
print("=" * 70)
# Platform overview
system = AgentFinOps(api_key=api_key, agent_id=agent_ids[0])
stats = system.get_platform_stats()
print(f"\nPlatform: {stats.get('active_agents')} agents | "
f"${stats.get('total_spend_mtd')} MTD | "
f"{stats.get('budget_utilization_pct')}% budget used")
# Per-agent status
print(f"\n{'Agent':<22}{'Balance':>10}{'Spent Today':>13}"
f"{'Daily Cap':>11}{'% Used':>9}")
print("-" * 65)
alerts = []
for agent_id in agent_ids:
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
balance = finops.get_balance()
budget = finops.get_budget_status()
bal = float(balance.get("balance", 0))
spent = float(budget.get("spent_today", 0))
cap = float(budget.get("daily_limit", 0))
pct = (spent / cap * 100) if cap > 0 else 0
flag = " !!!" if pct > 80 else ""
print(
f"{agent_id:<22}${bal:>9.2f}${spent:>12.2f}"
f"${cap:>10.2f}{pct:>8.1f}%{flag}"
)
if pct > 80:
alerts.append(f" {agent_id}: {pct:.1f}% of daily budget used")
if alerts:
print(f"\nALERTS ({len(alerts)}):")
for alert in alerts:
print(alert)
# Top spenders by category (fleet-wide, using first agent for API access)
print("\nTop spending categories (fleet-wide, last 24h):")
for agent_id in agent_ids:
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
cats = finops.get_spending_by_category(
start_date=datetime.now(timezone.utc).strftime("%Y-%m-%d"),
)
for cat in cats.get("categories", [])[:3]:
print(
f" {agent_id} > {cat['category']}: "
f"${cat['total_cost']:.2f} ({cat['call_count']} calls)"
)
print("\n" + "=" * 70)
Chapter 8: API Key Isolation and Security
Why Per-Agent API Keys Matter for Cost Control
API keys are not just a security boundary -- they are a cost boundary. When five agents share one API key, revoking the key to stop a runaway agent kills all five. Per-agent keys let you revoke one agent's access without disrupting the rest of the fleet. This makes key revocation a viable cost control mechanism, not a fleet-wide outage.
Creating Per-Agent Keys
finops = AgentFinOps(api_key=API_KEY, agent_id="researcher-01")
# Create a dedicated key with restricted permissions
key_result = finops.create_api_key(
label="researcher-01-production",
permissions=["search_services", "get_balance", "get_budget_status"],
)
new_api_key = key_result.get("api_key")
key_id = key_result.get("key_id")
print(f"API key created: {key_id}")
print(f"Key value: {new_api_key[:8]}...{new_api_key[-4:]}")
print(f"Permissions: {key_result.get('permissions')}")
curl -- create API key:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "create_api_key",
"input": {
"agent_id": "researcher-01",
"label": "researcher-01-production",
"permissions": ["search_services", "get_balance", "get_budget_status"]
}
}'
Automated Key Rotation
Rotate keys on a schedule. The pattern: create a new key, update the agent's configuration to use it, verify it works, then delete the old key. Never delete the old key before the new one is in use.
def rotate_agent_key(
admin_finops: AgentFinOps,
agent_id: str,
old_key_id: str,
) -> dict:
"""Rotate an agent's API key with zero downtime."""
finops = AgentFinOps(
api_key=admin_finops.session.headers["Authorization"].split(" ")[1],
agent_id=agent_id,
)
# Step 1: Create new key
new_key = finops.create_api_key(
label=f"{agent_id}-rotated-{int(time.time())}",
)
print(f"New key created: {new_key['key_id']}")
# Step 2: Rotate -- this invalidates the old key after a grace period
rotate_result = finops.rotate_api_key(key_id=old_key_id)
print(f"Old key {old_key_id} scheduled for invalidation")
return {
"new_key_id": new_key["key_id"],
"new_api_key": new_key["api_key"],
"old_key_id": old_key_id,
"rotation_status": rotate_result.get("status"),
}
# Rotate the researcher's key
result = rotate_agent_key(
admin_finops=AgentFinOps(api_key=API_KEY, agent_id="admin"),
agent_id="researcher-01",
old_key_id="key_abc123",
)
print(f"Rotation complete. New key: {result['new_key_id']}")
curl -- rotate API key:
curl -s -X POST https://sandbox.greenhelix.net/v1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $GREENHELIX_API_KEY" \
-d '{
"tool": "rotate_api_key",
"input": {
"agent_id": "researcher-01",
"key_id": "key_abc123"
}
}'
Credential Revocation as Cost Control
When you detect a runaway agent through a webhook alert or dashboard anomaly, the fastest way to stop the bleed is to revoke its API key. This is faster than killing the process (which requires infrastructure access) and more targeted than disabling the entire fleet.
def emergency_cost_stop(api_key: str, agent_id: str, key_id: str):
"""Emergency procedure: revoke an agent's API key to stop spend."""
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
# Rotate the key -- the old key becomes invalid
finops.rotate_api_key(key_id=key_id)
print(f"EMERGENCY: API key {key_id} revoked for {agent_id}")
# Check final spend
budget = finops.get_budget_status()
print(f"Final spend today: ${budget.get('spent_today')}")
# Log the incident
return {
"agent_id": agent_id,
"action": "key_revoked",
"final_spend": budget.get("spent_today"),
"timestamp": datetime.now(timezone.utc).isoformat(),
}
Fleet Key Rotation Schedule
Automate key rotation across the entire fleet as a weekly cron job.
def fleet_key_rotation(api_key: str, agents: list[dict]):
"""Rotate API keys for all agents in the fleet.
Run as a weekly cron job.
agents: [{"agent_id": "...", "key_id": "..."}, ...]
"""
admin = AgentFinOps(api_key=api_key, agent_id="admin")
results = []
for agent in agents:
try:
result = rotate_agent_key(
admin_finops=admin,
agent_id=agent["agent_id"],
old_key_id=agent["key_id"],
)
results.append({
"agent_id": agent["agent_id"],
"status": "rotated",
"new_key_id": result["new_key_id"],
})
except Exception as e:
results.append({
"agent_id": agent["agent_id"],
"status": "failed",
"error": str(e),
})
# Summary
rotated = sum(1 for r in results if r["status"] == "rotated")
failed = sum(1 for r in results if r["status"] == "failed")
print(f"Key rotation complete: {rotated} rotated, {failed} failed")
return results
Chapter 9: Putting It All Together: The FinOps Control Plane
End-to-End Scenario: Five-Agent Research Fleet
This section ties together every concept from the previous chapters into a single, production-ready control plane. The scenario: you are running a research fleet with five agents -- one orchestrator, two researchers, one summarizer, and one classifier. The fleet processes incoming research requests, delegates work, and must operate within a $200/day fleet budget.
import os
from datetime import datetime, timezone
API_KEY = os.environ["GREENHELIX_API_KEY"]
ALERT_BASE_URL = "https://your-app.example.com/alerts"
# ── Step 1: Provision the fleet ─────────────────────────────────
FLEET = [
{"agent_id": "orch-01", "role": "orchestrator", "deposit": 200.00,
"daily_cap": 80.00, "monthly_cap": 1600.00},
{"agent_id": "res-01", "role": "researcher", "deposit": 75.00,
"daily_cap": 30.00, "monthly_cap": 600.00},
{"agent_id": "res-02", "role": "researcher", "deposit": 75.00,
"daily_cap": 30.00, "monthly_cap": 600.00},
{"agent_id": "sum-01", "role": "summarizer", "deposit": 50.00,
"daily_cap": 20.00, "monthly_cap": 400.00},
{"agent_id": "cls-01", "role": "classifier", "deposit": 40.00,
"daily_cap": 15.00, "monthly_cap": 300.00},
]
fleet_clients = {}
for agent_cfg in FLEET:
finops = AgentFinOps(api_key=API_KEY, agent_id=agent_cfg["agent_id"])
# Create wallet and fund
finops.create_wallet()
finops.deposit(amount=agent_cfg["deposit"])
# Set budget caps
finops.set_budget_cap(
daily_limit=agent_cfg["daily_cap"],
monthly_limit=agent_cfg["monthly_cap"],
)
# Create dedicated API key
key = finops.create_api_key(
label=f"{agent_cfg['agent_id']}-prod-key",
)
# Register budget alerts at 75%, 90%, and 100%
for pct in [75, 90, 100]:
finops.register_webhook(
url=f"{ALERT_BASE_URL}/{agent_cfg['agent_id']}/{pct}",
events=["budget.threshold"],
config={"threshold_pct": pct},
)
fleet_clients[agent_cfg["agent_id"]] = finops
print(
f"Provisioned {agent_cfg['agent_id']} "
f"(${agent_cfg['deposit']:.2f} balance, "
f"${agent_cfg['daily_cap']:.2f}/day cap)"
)
# ── Step 2: Pre-flight check before task delegation ─────────────
def delegate_task(
orchestrator: AgentFinOps,
worker_id: str,
tool: str,
params: dict,
) -> dict:
"""Orchestrator delegates a task to a worker with full FinOps checks."""
worker = fleet_clients[worker_id]
# Estimate cost
estimate = worker.estimate_cost(tool_name=tool, parameters=params)
cost = float(estimate.get("estimated_cost", 0))
# Check worker's remaining budget
budget = worker.get_budget_status()
remaining = (
float(budget.get("daily_limit", 0))
- float(budget.get("spent_today", 0))
)
if cost > remaining:
print(
f"SKIP: {worker_id} cannot afford {tool} "
f"(${cost:.4f} > ${remaining:.2f} remaining)"
)
return {"status": "skipped", "reason": "insufficient_budget"}
# Execute
try:
result = worker._execute(tool, params)
return {"status": "success", "result": result}
except BudgetExhaustedError:
return {"status": "budget_exhausted"}
# ── Step 3: Run a workflow ──────────────────────────────────────
orch = fleet_clients["orch-01"]
# The orchestrator delegates research tasks to the two researchers
tasks = [
("res-01", "search_services", {"query": "climate data analysis"}),
("res-02", "search_services", {"query": "economic forecasting models"}),
]
results = []
for worker_id, tool, params in tasks:
result = delegate_task(orch, worker_id, tool, params)
results.append(result)
print(f" {worker_id} > {tool}: {result['status']}")
# Summarizer processes the combined results
delegate_task(
orch, "sum-01",
"search_services",
{"query": "summarize findings"},
)
# Classifier categorizes the output
delegate_task(
orch, "cls-01",
"search_services",
{"query": "classify report type"},
)
# ── Step 4: Generate end-of-day report ──────────────────────────
def end_of_day_report(api_key: str, fleet: list[dict]):
"""Generate a fleet-wide end-of-day FinOps report."""
print("\n" + "=" * 70)
print(" END-OF-DAY FINOPS REPORT | " +
datetime.now(timezone.utc).strftime("%Y-%m-%d"))
print("=" * 70)
total_fleet_spend = 0.0
print(f"\n{'Agent':<14}{'Spent Today':>13}{'Daily Cap':>11}"
f"{'Utilization':>13}{'Balance':>10}")
print("-" * 61)
for agent_cfg in fleet:
finops = AgentFinOps(api_key=api_key, agent_id=agent_cfg["agent_id"])
budget = finops.get_budget_status()
balance = finops.get_balance()
spent = float(budget.get("spent_today", 0))
cap = float(budget.get("daily_limit", 0))
bal = float(balance.get("balance", 0))
util = (spent / cap * 100) if cap > 0 else 0
total_fleet_spend += spent
print(
f"{agent_cfg['agent_id']:<14}${spent:>12.2f}${cap:>10.2f}"
f"{util:>12.1f}%${bal:>9.2f}"
)
print("-" * 61)
print(f"{'Fleet Total':<14}${total_fleet_spend:>12.2f}")
# Leaderboard
system = AgentFinOps(api_key=api_key, agent_id=fleet[0]["agent_id"])
lb = system.get_agent_leaderboard(metric="cost_efficiency")
print("\nEfficiency Leaderboard (top 3):")
for i, entry in enumerate(lb.get("agents", [])[:3], 1):
print(
f" #{i} {entry['agent_id']}: "
f"efficiency={entry.get('efficiency_score', 'N/A')}"
)
print("\n" + "=" * 70)
end_of_day_report(API_KEY, FLEET)
The Complete FinOps Control Plane Class
For production use, wrap the orchestration logic into a reusable control plane class.
class FinOpsControlPlane:
"""Centralized FinOps control plane for multi-agent fleets."""
def __init__(self, api_key: str, alert_base_url: str):
self.api_key = api_key
self.alert_base_url = alert_base_url
self.agents: dict[str, AgentFinOps] = {}
def provision_agent(
self,
agent_id: str,
deposit: float,
daily_cap: float,
monthly_cap: float,
alert_thresholds: list[int] = None,
) -> AgentFinOps:
"""Provision an agent with wallet, budget, key, and alerts."""
if alert_thresholds is None:
alert_thresholds = [75, 90, 100]
finops = AgentFinOps(
api_key=self.api_key,
agent_id=agent_id,
)
finops.create_wallet()
finops.deposit(amount=deposit)
finops.set_budget_cap(
daily_limit=daily_cap,
monthly_limit=monthly_cap,
)
finops.create_api_key(label=f"{agent_id}-managed")
for pct in alert_thresholds:
finops.register_webhook(
url=f"{self.alert_base_url}/{agent_id}/{pct}",
events=["budget.threshold"],
config={"threshold_pct": pct},
)
self.agents[agent_id] = finops
return finops
def fleet_budget_status(self) -> dict:
"""Get budget status for all managed agents."""
statuses = {}
for agent_id, finops in self.agents.items():
budget = finops.get_budget_status()
balance = finops.get_balance()
statuses[agent_id] = {
"spent_today": budget.get("spent_today"),
"daily_limit": budget.get("daily_limit"),
"balance": balance.get("balance"),
}
return statuses
def emergency_stop(self, agent_id: str, key_id: str):
"""Emergency: revoke an agent's API key to halt spending."""
finops = self.agents.get(agent_id)
if finops:
finops.rotate_api_key(key_id=key_id)
print(f"EMERGENCY STOP: {agent_id} key revoked")
def fleet_chargeback(
self,
team_map: dict[str, list[str]],
) -> dict:
"""Generate chargeback by team."""
report = {}
for team, ids in team_map.items():
total = 0.0
for agent_id in ids:
if agent_id in self.agents:
summary = self.agents[agent_id].get_billing_summary()
total += float(summary.get("total_cost", 0))
report[team] = round(total, 2)
return report
def rotate_all_keys(self, key_map: dict[str, str]):
"""Rotate API keys for all agents. key_map: {agent_id: key_id}."""
for agent_id, key_id in key_map.items():
if agent_id in self.agents:
self.agents[agent_id].rotate_api_key(key_id=key_id)
print(f"Rotated key for {agent_id}")
# ── Usage ────────────────────────────────────────────────────────
cp = FinOpsControlPlane(
api_key=API_KEY,
alert_base_url="https://your-app.example.com/alerts",
)
# Provision the fleet
cp.provision_agent("orch-01", deposit=200, daily_cap=80, monthly_cap=1600)
cp.provision_agent("res-01", deposit=75, daily_cap=30, monthly_cap=600)
cp.provision_agent("res-02", deposit=75, daily_cap=30, monthly_cap=600)
cp.provision_agent("sum-01", deposit=50, daily_cap=20, monthly_cap=400)
cp.provision_agent("cls-01", deposit=40, daily_cap=15, monthly_cap=300)
# Daily status check
status = cp.fleet_budget_status()
for agent_id, s in status.items():
print(f" {agent_id}: spent=${s['spent_today']}, balance=${s['balance']}")
# Monthly chargeback
chargeback = cp.fleet_chargeback({
"research-team": ["res-01", "res-02"],
"processing-team": ["sum-01", "cls-01"],
"platform-team": ["orch-01"],
})
for team, cost in chargeback.items():
print(f" {team}: ${cost:.2f}")
Production Checklist
Before deploying your FinOps control plane to production, verify every item on this list.
Wallet Isolation
- Every agent has its own wallet. No shared wallets.
- Orchestrator agents that fund workers have a documented funding budget per task type.
- Initial deposit amounts are based on expected 7-day spend plus 30% buffer.
Budget Caps
- Every agent has a
set_budget_capwith both daily and monthly limits. - Daily limits are set to the maximum reasonable daily spend for that agent's role, not the wallet balance.
- Your code handles
402 Payment Requiredas aBudgetExhaustedErrorwith a graceful shutdown path, not a retry. - Workflow-level budget caps use dedicated per-workflow agent IDs.
Spend Alerts
- Every agent has webhook alerts at 75%, 90%, and 100% of daily budget.
- The alert receiver endpoint is monitored for uptime (if it is down, you get no alerts).
- Critical alerts (90%+) trigger PagerDuty or equivalent on-call notification.
- The 100% alert triggers an automatic agent pause or key revocation.
-
get_webhook_logsis checked weekly to verify alert delivery.
Cost Attribution
-
get_spending_by_categoryis pulled daily for each agent. - Monthly chargeback reports are generated and sent to team leads.
- Historical spend data from
get_usage_analyticsis stored in your data warehouse for trend analysis.
Cost Optimization
-
estimate_costis called before expensive tool executions. -
get_volume_discountis checked monthly; agents are consolidated if it improves tier eligibility. - Multi-currency costs are normalized to a single reporting currency.
API Key Security
- Every agent has its own API key with scoped permissions.
- Key rotation runs weekly as an automated job.
- The emergency key revocation procedure is documented and tested quarterly.
- API keys are stored in a secrets manager, never in code or environment files.
Fleet Monitoring
- The daily FinOps dashboard runs automatically each morning.
- The cost-efficiency leaderboard is reviewed weekly to identify underperforming agents.
-
get_platform_statsis tracked over time to detect fleet-wide cost anomalies. - Week-over-week spend comparison is reviewed for every agent with >$10/day average spend.
Incident Response
- A runbook exists for "runaway agent spend" incidents: detect via webhook, diagnose via
get_spending_by_category, stop via key revocation, postmortem viaget_usage_analytics. - The emergency stop procedure (revoke API key) is executable by any on-call engineer.
- Post-incident, the budget cap for the affected agent is reviewed and tightened if necessary.
Chapter 10: What to Do Next
Advanced Pattern: Dynamic Budget Reallocation
In a fleet with a fixed total budget, idle agents waste allocated capacity while busy agents hit their caps. Dynamic reallocation moves budget from idle agents to active ones in real time.
def rebalance_fleet_budgets(
control_plane: FinOpsControlPlane,
fleet_daily_budget: float,
):
"""Dynamically reallocate daily budgets based on current utilization."""
statuses = control_plane.fleet_budget_status()
# Calculate utilization for each agent
utilizations = {}
for agent_id, s in statuses.items():
cap = float(s.get("daily_limit", 1))
spent = float(s.get("spent_today", 0))
utilizations[agent_id] = spent / cap if cap > 0 else 0
# Agents under 30% utilization donate budget to agents over 80%
donors = {k: v for k, v in utilizations.items() if v < 0.30}
receivers = {k: v for k, v in utilizations.items() if v > 0.80}
if not donors or not receivers:
print("No rebalancing needed.")
return
# Calculate redistributable budget
redistributable = 0.0
for agent_id in donors:
cap = float(statuses[agent_id]["daily_limit"])
spent = float(statuses[agent_id]["spent_today"])
donatable = (cap - spent) * 0.5 # Donate half of unused budget
redistributable += donatable
# Distribute evenly to receivers
per_receiver = redistributable / len(receivers)
for agent_id in receivers:
finops = control_plane.agents[agent_id]
current_cap = float(statuses[agent_id]["daily_limit"])
new_cap = current_cap + per_receiver
finops.set_budget_cap(daily_limit=new_cap)
print(
f" {agent_id}: cap raised from ${current_cap:.2f} "
f"to ${new_cap:.2f} (+${per_receiver:.2f})"
)
# Reduce donor caps
per_donor_reduction = redistributable / len(donors)
for agent_id in donors:
finops = control_plane.agents[agent_id]
current_cap = float(statuses[agent_id]["daily_limit"])
new_cap = max(current_cap - per_donor_reduction, 5.00) # Minimum $5
finops.set_budget_cap(daily_limit=new_cap)
print(
f" {agent_id}: cap lowered from ${current_cap:.2f} "
f"to ${new_cap:.2f}"
)
Advanced Pattern: Cost-Aware Workflow Routing
Instead of always routing to the same set of agents, pick the workflow path that minimizes cost while meeting quality constraints.
def cost_aware_workflow_route(
control_plane: FinOpsControlPlane,
task: dict,
) -> list[str]:
"""Pick the cheapest agent path for a workflow task."""
capable_agents = task["capable_agents"] # List of agent IDs
quality_floor = task.get("quality_floor", 0.8)
# Score each candidate by remaining budget and efficiency
candidates = []
for agent_id in capable_agents:
if agent_id not in control_plane.agents:
continue
finops = control_plane.agents[agent_id]
budget = finops.get_budget_status()
remaining = (
float(budget.get("daily_limit", 0))
- float(budget.get("spent_today", 0))
)
estimate = finops.estimate_cost(
tool_name=task["tool"],
parameters=task["params"],
)
cost = float(estimate.get("estimated_cost", float("inf")))
if cost <= remaining:
candidates.append({
"agent_id": agent_id,
"cost": cost,
"remaining_budget": remaining,
})
# Sort by cost, then by remaining budget (prefer agents with more headroom)
candidates.sort(key=lambda c: (c["cost"], -c["remaining_budget"]))
if candidates:
chosen = candidates[0]["agent_id"]
print(
f"Routing to {chosen} "
f"(cost=${candidates[0]['cost']:.4f}, "
f"${candidates[0]['remaining_budget']:.2f} remaining)"
)
return [chosen]
raise RuntimeError("No agent can afford this task within budget")
Building a SaaS Billing Layer
If you are building a platform where customers deploy their own agent fleets, you can build a multi-tenant billing layer on top of GreenHelix. The pattern: each customer gets a top-level orchestrator agent that funds and manages their fleet. Your platform charges the customer based on their fleet's total spend via get_billing_summary, applying your own markup.
def customer_invoice(
api_key: str,
customer_id: str,
customer_agents: list[str],
markup_pct: float = 20.0,
) -> dict:
"""Generate an invoice for a customer based on their fleet spend."""
total_platform_cost = 0.0
line_items = []
for agent_id in customer_agents:
finops = AgentFinOps(api_key=api_key, agent_id=agent_id)
summary = finops.get_billing_summary(period="monthly")
cost = float(summary.get("total_cost", 0))
total_platform_cost += cost
line_items.append({
"agent_id": agent_id,
"platform_cost": round(cost, 2),
"calls": summary.get("total_calls", 0),
})
markup = total_platform_cost * (markup_pct / 100)
total_invoice = total_platform_cost + markup
return {
"customer_id": customer_id,
"period": "April 2026",
"line_items": line_items,
"platform_cost": round(total_platform_cost, 2),
"markup_pct": markup_pct,
"markup_amount": round(markup, 2),
"total_invoice": round(total_invoice, 2),
}
Where This Guide Fits in the GreenHelix Ecosystem
This guide covers budget enforcement, cost allocation, and spend analytics -- the FinOps layer. For the other layers of agent commerce infrastructure, see the companion guides:
- Agent-to-Agent Commerce: Escrow, Payments, and Trust -- the foundational guide covering wallet setup, escrow patterns (standard, performance, split), marketplace discovery, subscriptions, dispute resolution, and framework integration with CrewAI, LangChain, and AutoGen.
- The Agent Strategy Marketplace Playbook -- how to sell verified trading strategies with performance escrow, including metric submission pipelines, claim chains, and pricing tiers.
- Verified Trading Bot Reputation -- how to build cryptographic PnL proof using Ed25519 signatures and Merkle claim chains, covering the trust score algorithm and leaderboard mechanics.
- Tamper-Proof Audit Trails for Trading Bots -- EU AI Act, MiFID II, and SEC compliance using GreenHelix's event bus and Merkle audit chains.
For the full API reference and tool catalog (all 128 tools, including the 23 billing and webhook tools covered in this guide), visit the GreenHelix developer documentation at https://api.greenhelix.net/docs.
For updates on the FinOps Foundation's "FinOps for AI" initiative and how GreenHelix aligns with their emerging standards for AI cost governance, follow the FinOps Foundation at finops.org.
Price: $29 | Format: Digital Guide | Updates: Lifetime access