What This Module Covers
Production Hardening

Agents fail in ways chains never do. They can loop forever, call tools that don't exist, spend your entire monthly API budget in 10 minutes, or get stuck, unable to make progress. This module gives you the tools to detect, contain, and recover from every major agent failure mode.
- Failure taxonomy — the 5 agent failure modes and how to recognise each
- Loop detection — detecting infinite loops, repeated tool calls, lack of progress
- Guardrails — output validation, tool call validation, scope enforcement
- Cost circuit breakers — hard spending limits that stop runaway agents
- Structured agent logging — capturing every decision for debugging and audit
- Recovery patterns — graceful degradation, fallback to human, partial result return
The 5 Agent Failure Modes
Know These

Infinite Loop
Agent calls the same tool repeatedly with same args, never making progress. max_turns doesn't help if the loop is subtle.
Stuck State
Agent keeps trying a failing approach, can't recover. Tool returns error, agent retries with same args, same error.
Hallucinated Tool Calls
Agent invents tool names that don't exist, or calls real tools with nonsensical arguments.
Runaway Cost
Agent spawns subagents, each calling expensive tools in loops. $0.01 task becomes $100 task.
Silent Partial Failure
Agent completes but with incorrect results. It said it succeeded but actually failed midway. No error raised.
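The loop-detection idea used later in this module hashes each call's tool name plus sorted arguments, so identical retries collapse onto a single fingerprint regardless of argument order. A standalone sketch (the tool name `search` is hypothetical):

```python
import hashlib, json
from collections import Counter

def fingerprint(tool: str, args: dict) -> str:
    # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} hash identically
    key = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.md5(key.encode()).hexdigest()[:8]

counts = Counter()
for _ in range(3):
    counts[fingerprint("search", {"q": "q3 sales"})] += 1

# Three identical calls collapse into one fingerprint with count 3
assert len(counts) == 1 and counts.most_common(1)[0][1] == 3
```

Once counts for any fingerprint cross a threshold, you can stop the agent with a concrete "called Nx with same args" reason instead of a vague timeout.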
Loop Detection and Progress Tracking
Critical

```python
import hashlib, json
from collections import Counter
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentGuardian:
    """Monitors agent execution for failure patterns."""
    max_turns: int = 20
    max_repeated_calls: int = 3   # same tool+args N times = loop
    max_errors: int = 5           # 5 consecutive errors = stuck
    max_cost_usd: float = 1.0     # hard spending limit

    turn_count: int = 0
    error_count: int = 0
    total_cost_usd: float = 0.0
    tool_call_log: list = field(default_factory=list)
    call_counts: Counter = field(default_factory=Counter)

    def _call_fingerprint(self, tool_name: str, args: dict) -> str:
        """Hash of tool name + sorted args — detects repeated identical calls."""
        key = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
        return hashlib.md5(key.encode()).hexdigest()[:8]

    def record_tool_call(self, tool_name: str, args: dict,
                         result: Any, tokens_used: int = 0) -> None:
        fp = self._call_fingerprint(tool_name, args)
        self.call_counts[fp] += 1
        self.tool_call_log.append({
            "turn": self.turn_count,
            "tool": tool_name,
            "args": args,
            "fp": fp,
            "success": "error" not in str(result).lower(),
        })
        if isinstance(result, dict) and not result.get("ok", True):
            self.error_count += 1
        else:
            self.error_count = 0  # reset on success
        cost = tokens_used * (3.00 / 1_000_000)  # Sonnet-class output-token price
        self.total_cost_usd += cost

    def check(self) -> tuple[bool, str]:
        """Returns (should_stop, reason). Call before each turn."""
        self.turn_count += 1
        if self.turn_count > self.max_turns:
            return True, f"Max turns exceeded ({self.max_turns})"
        if self.total_cost_usd > self.max_cost_usd:
            return True, f"Cost limit exceeded: ${self.total_cost_usd:.4f} > ${self.max_cost_usd}"
        if self.error_count >= self.max_errors:
            return True, f"Stuck: {self.error_count} consecutive errors"
        for fp, count in self.call_counts.items():
            if count >= self.max_repeated_calls:
                recent = [c for c in self.tool_call_log if c["fp"] == fp][-1]
                return True, f"Loop detected: {recent['tool']} called {count}x with same args"
        return False, ""

# Usage inside agent loop
def guarded_agent(user_message: str) -> dict:
    guardian = AgentGuardian(max_turns=15, max_cost_usd=0.50)
    messages = [{"role": "user", "content": user_message}]
    while True:
        should_stop, reason = guardian.check()
        if should_stop:
            return {"status": "stopped", "reason": reason,
                    "partial_result": extract_partial_result(messages),
                    "turns_used": guardian.turn_count,
                    "cost_usd": guardian.total_cost_usd}
        response = client.messages.create(model="claude-3-5-sonnet-20241022",
                                          max_tokens=4096, tools=TOOLS,
                                          messages=messages)
        if response.stop_reason == "end_turn":
            return {"status": "completed",
                    "answer": response.content[0].text,
                    "turns_used": guardian.turn_count,
                    "cost_usd": guardian.total_cost_usd}
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                guardian.record_tool_call(block.name, block.input, result,
                                          tokens_used=response.usage.output_tokens)
                tool_results.append({"type": "tool_result",
                                     "tool_use_id": block.id,
                                     "content": str(result)})
        messages.append({"role": "user", "content": tool_results})
```

Input and Output Guardrails
Validation Layer

```python
# ── Tool call validation ──────────────────────────────
def validate_tool_call(tool_name: str, args: dict) -> tuple[bool, str]:
    """Validate before executing. Returns (is_valid, error_message)."""
    if tool_name not in TOOL_REGISTRY:
        return False, f"Tool {tool_name!r} does not exist. Available: {list(TOOL_REGISTRY)}"
    tool_schema = next(t for t in TOOLS if t["name"] == tool_name)
    required = tool_schema["input_schema"].get("required", [])
    properties = tool_schema["input_schema"].get("properties", {})
    for req_field in required:
        if req_field not in args:
            return False, f"Missing required field: {req_field!r}"
    for field_name, field_val in args.items():
        if field_name not in properties:
            return False, f"Unknown field: {field_name!r}"
        expected_type = properties[field_name].get("type")
        if expected_type == "string" and not isinstance(field_val, str):
            return False, f"{field_name} must be a string, got {type(field_val).__name__}"
        if expected_type == "integer" and not isinstance(field_val, int):
            return False, f"{field_name} must be an integer"
    return True, ""

def execute_tool_safe(tool_name: str, args: dict) -> dict:
    is_valid, error = validate_tool_call(tool_name, args)
    if not is_valid:
        return {"ok": False, "error": "INVALID_TOOL_CALL", "message": error,
                "suggestion": "Check the tool name and argument types before calling again."}
    try:
        result = TOOL_REGISTRY[tool_name](**args)
        return result if isinstance(result, dict) else {"ok": True, "result": result}
    except Exception as e:
        return {"ok": False, "error": "TOOL_EXECUTION_ERROR", "message": str(e)}

# ── Output guardrail ──────────────────────────────────
# Validate the agent's final answer before returning to user
from pydantic import BaseModel

class AgentOutputGuardrail(BaseModel):
    is_complete: bool
    has_answer: bool
    is_on_topic: bool
    issues: list[str] = []

def validate_agent_output(original_goal: str, output: str) -> AgentOutputGuardrail:
    # instructor_client: an instructor-patched Anthropic client
    return instructor_client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        messages=[{"role": "user", "content": f"""Validate this agent output against the original goal.
Goal: {original_goal}
Output: {output}
Check: Is the goal addressed? Is there a clear answer? Is it on topic?"""}],
        response_model=AgentOutputGuardrail,
    )
```
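To make the schema-validation flow concrete, here is a self-contained toy version with a hypothetical one-tool registry (`get_weather` and its schema are invented for illustration; the trimmed `validate` follows the same shape as the full validator above):

```python
# Hypothetical one-tool registry and schema, just to exercise the flow
TOOLS = [{
    "name": "get_weather",
    "input_schema": {
        "required": ["city"],
        "properties": {"city": {"type": "string"}},
    },
}]
TOOL_REGISTRY = {"get_weather": lambda city: {"ok": True, "temp_c": 21}}

def validate(tool_name: str, args: dict) -> tuple[bool, str]:
    # Trimmed-down schema check: unknown tool, missing args, wrong types
    if tool_name not in TOOL_REGISTRY:
        return False, "unknown tool"
    schema = next(t for t in TOOLS if t["name"] == tool_name)["input_schema"]
    missing = [r for r in schema.get("required", []) if r not in args]
    if missing:
        return False, f"missing: {missing}"
    for name, val in args.items():
        expected = schema["properties"].get(name, {}).get("type")
        if expected == "string" and not isinstance(val, str):
            return False, f"{name} must be a string"
    return True, ""

assert validate("get_wether", {})[0] is False          # hallucinated name caught
assert validate("get_weather", {})[0] is False         # missing required arg caught
assert validate("get_weather", {"city": "Oslo"}) == (True, "")
```

The key design choice: validation failures return structured error dicts to the agent rather than raising, so the model can read the error and self-correct on the next turn.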
Cost Circuit Breakers
Financial Safety

```python
import sqlite3
from datetime import datetime

MODEL_COSTS = {
    "claude-3-5-sonnet-20241022": {"input": 3.0/1e6, "output": 15.0/1e6},
    "claude-3-haiku-20240307": {"input": 0.25/1e6, "output": 1.25/1e6},
}

class AgentCostCircuitBreaker:
    """Hard spending limits for agent sessions."""
    def __init__(self, session_limit_usd: float = 1.0,
                 daily_limit_usd: float = 10.0,
                 per_tool_call_limit_usd: float = 0.10):
        self.session_limit = session_limit_usd
        self.daily_limit = daily_limit_usd
        self.per_tool_call_limit = per_tool_call_limit_usd
        self.session_spend = 0.0
        self.session_id = datetime.utcnow().isoformat()

    def _compute_cost(self, model: str, input_tok: int, output_tok: int) -> float:
        prices = MODEL_COSTS.get(model, MODEL_COSTS["claude-3-5-sonnet-20241022"])
        return input_tok * prices["input"] + output_tok * prices["output"]

    def _get_daily_spend(self) -> float:
        today = datetime.utcnow().strftime("%Y-%m-%d")
        with sqlite3.connect("agent_costs.db") as conn:
            conn.execute("CREATE TABLE IF NOT EXISTS costs (ts TEXT, session TEXT, cost REAL)")
            row = conn.execute(
                "SELECT SUM(cost) FROM costs WHERE ts LIKE ?", (f"{today}%",)).fetchone()
            return row[0] or 0.0

    def record_and_check(self, model: str, input_tok: int,
                         output_tok: int) -> tuple[float, bool, str]:
        cost = self._compute_cost(model, input_tok, output_tok)
        self.session_spend += cost
        with sqlite3.connect("agent_costs.db") as conn:
            # Ensure the table exists before the first insert on a fresh DB
            conn.execute("CREATE TABLE IF NOT EXISTS costs (ts TEXT, session TEXT, cost REAL)")
            conn.execute("INSERT INTO costs VALUES (?,?,?)",
                         (datetime.utcnow().isoformat(), self.session_id, cost))
        daily = self._get_daily_spend()
        if cost > self.per_tool_call_limit:
            return cost, True, f"Single call cost ${cost:.4f} exceeds per-call limit"
        if self.session_spend > self.session_limit:
            return cost, True, f"Session spend ${self.session_spend:.4f} exceeds session limit"
        if daily > self.daily_limit:
            return cost, True, f"Daily spend ${daily:.4f} exceeds daily limit"
        return cost, False, ""
```

⚠️ Always set a session cost limit for any agent that can spawn subagents or loop. A misconfigured agent that recursively calls expensive tools can exhaust a $100 budget in minutes. The circuit breaker pattern is not optional — it is the difference between a manageable incident and a billing nightmare.
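The essential logic is how the three limits compose: the most specific (per-call) is checked first, then cumulative session spend, then the daily total. A minimal in-memory sketch, without the SQLite persistence, using illustrative (assumed) token prices:

```python
# Assumed Sonnet-class prices, USD per token; illustrative only
PRICE_IN, PRICE_OUT = 3.0 / 1e6, 15.0 / 1e6

class SessionBreaker:
    def __init__(self, session_limit: float = 1.0, per_call_limit: float = 0.10):
        self.session_limit = session_limit
        self.per_call_limit = per_call_limit
        self.spend = 0.0

    def record_and_check(self, input_tok: int, output_tok: int) -> tuple[bool, str]:
        cost = input_tok * PRICE_IN + output_tok * PRICE_OUT
        self.spend += cost
        if cost > self.per_call_limit:       # most specific limit first
            return True, "per-call limit"
        if self.spend > self.session_limit:  # cumulative session spend
            return True, "session limit"
        return False, ""

breaker = SessionBreaker(session_limit=0.01, per_call_limit=0.005)
tripped, reason = breaker.record_and_check(1000, 500)  # ~$0.0105 for this call
assert (tripped, reason) == (True, "per-call limit")
```

Note that spend is recorded before the check, so a tripped breaker still leaves an accurate cost record for the call that tripped it.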
Structured Agent Logging
Audit & Debug

pip install structlog

```python
import structlog, time

# Configure structlog for JSON output
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)
logger = structlog.get_logger()

class AgentLogger:
    """Structured logging for agent execution."""
    def __init__(self, session_id: str, goal: str):
        self.session_id = session_id
        self.goal = goal
        self.turn = 0
        self.start_time = time.time()
        logger.info("agent_started", session_id=session_id, goal=goal)

    def log_turn(self, stop_reason: str, tools_called: list):
        self.turn += 1
        logger.info("agent_turn", session_id=self.session_id, turn=self.turn,
                    stop_reason=stop_reason, tools_called=tools_called)

    def log_tool_call(self, tool_name: str, args: dict, result: dict,
                      latency_ms: float, cost_usd: float):
        success = result.get("ok", True)
        logger.info("tool_call", session_id=self.session_id, turn=self.turn,
                    tool=tool_name, success=success,
                    latency_ms=round(latency_ms, 1), cost_usd=round(cost_usd, 6),
                    error=result.get("error") if not success else None)

    def log_completion(self, status: str, total_cost_usd: float, answer: str = ""):
        elapsed = round(time.time() - self.start_time, 2)
        logger.info("agent_completed", session_id=self.session_id, status=status,
                    total_turns=self.turn, elapsed_sec=elapsed,
                    total_cost_usd=round(total_cost_usd, 6), answer_length=len(answer))

    def log_failure(self, reason: str, last_tool: str = ""):
        logger.error("agent_failed", session_id=self.session_id, turn=self.turn,
                     reason=reason, last_tool=last_tool)

# Example output (one JSON line per event):
# {"event":"agent_started","session_id":"abc123","goal":"Analyse Q3 sales","level":"info","timestamp":"2024-..."}
# {"event":"tool_call","tool":"search_sales_db","success":true,"latency_ms":124.3,"cost_usd":0.000045,...}
# {"event":"agent_failed","reason":"Loop detected: search_sales_db called 3x with same args",...}
```
💡 Structured logs are queryable. When you have 10,000 agent runs in production and one fails, you need to find: which session, which turn, which tool, what the exact args were. JSON logs let you grep, jq-filter, and aggregate across millions of events. Unstructured print() statements do not.
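Because each log line is a standalone JSON object, post-hoc analysis is a few lines of stdlib Python. A sketch over three hand-written example events in the same shape as the structlog output above:

```python
import json
from collections import Counter

# Hand-written example events, one JSON object per line (JSONL)
log_lines = [
    '{"event": "tool_call", "tool": "search_sales_db", "success": true}',
    '{"event": "tool_call", "tool": "search_sales_db", "success": false}',
    '{"event": "tool_call", "tool": "fetch_url", "success": true}',
]
calls = [json.loads(line) for line in log_lines]

# Success rate and call distribution across all tool_call events
success_rate = sum(e["success"] for e in calls) / len(calls)
by_tool = Counter(e["tool"] for e in calls)
print(f"success rate: {success_rate:.0%}, top tool: {by_tool.most_common(1)[0]}")
```

In production you would read the lines from a log file instead of a list; the aggregation is identical, and the same approach computes failure distribution or per-session cost.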
Recovery and Graceful Degradation
Resilience

```python
# ── Pattern 1: Alternative strategy prompt ────────────
# When a tool fails N times, inject a prompt asking the agent to try differently
STUCK_RECOVERY_MSG = """You have encountered repeated errors with {tool_name}.
The error was: {error_message}

Please try a different approach:
- Use a different tool if available
- Simplify your query or arguments
- If you cannot complete this subtask, explain what you found so far and skip it

Do NOT call {tool_name} again with the same arguments."""

def inject_recovery_hint(messages: list, tool_name: str, error: str) -> list:
    recovery = STUCK_RECOVERY_MSG.format(tool_name=tool_name, error_message=error)
    messages.append({
        "role": "user",
        "content": [{"type": "text", "text": recovery}]
    })
    return messages

# ── Pattern 2: Partial result extraction ─────────────
# When agent hits a limit, extract what it learned before stopping
def extract_partial_result(messages: list) -> str:
    if len(messages) < 2:
        return "No results gathered before timeout."
    response = client.messages.create(
        model="claude-3-haiku-20240307", max_tokens=512,
        messages=[
            *messages,
            {"role": "user", "content": "Summarise what you have found so far, "
                                        "even if incomplete. Be honest about what's missing."}
        ]
    )
    return response.content[0].text

# ── Pattern 3: Fallback to human ─────────────────────
# When agent cannot proceed, escalate with full context
def escalate_to_human(session_id: str, goal: str, messages: list,
                      failure_reason: str) -> dict:
    partial = extract_partial_result(messages)
    ticket = {
        "session_id": session_id,
        "original_goal": goal,
        "failure_reason": failure_reason,
        "partial_result": partial,
        "turns_completed": len([m for m in messages if m["role"] == "assistant"]),
        "escalated_at": datetime.utcnow().isoformat(),
        "priority": "high" if "cost" in failure_reason.lower() else "normal",
    }
    create_human_task(ticket)  # your ticketing system
    return {"status": "escalated", "ticket_id": ticket["session_id"],
            "message": "A human agent will continue this task."}

# ── Pattern 4: Checkpoint and resume ─────────────────
# Save progress periodically — resume if agent crashes
import pickle, pathlib

def save_checkpoint(session_id: str, messages: list, state: dict):
    path = pathlib.Path(f".checkpoints/{session_id}.pkl")
    path.parent.mkdir(exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump({"messages": messages, "state": state}, f)

def load_checkpoint(session_id: str) -> dict | None:
    path = pathlib.Path(f".checkpoints/{session_id}.pkl")
    if not path.exists():
        return None
    with open(path, "rb") as f:
        return pickle.load(f)
```
FREE LEARNING RESOURCES
| Type | Resource | Best For |
|---|---|---|
| Article | Anthropic: Building Effective Agents — anthropic.com/research | Covers agent failure modes and the importance of minimal footprint and human oversight. |
| Library | structlog — structlog.org — structured logging for Python | The standard library for structured JSON logging in Python. Read the Getting Started guide. |
| Docs | LangGraph: Checkpointing — langchain-ai.github.io/langgraph | LangGraph's built-in checkpoint system for agent state persistence and recovery. |
Milestone Project

Take your M19 research agent and add the full production-hardening layer from this module.
Requirements
- AgentGuardian — loop detection via tool call fingerprinting, max_turns, consecutive error counter
- AgentCostCircuitBreaker — session limit ($1), daily limit ($10), per-call limit ($0.10)
- Tool validation — validate all tool names and arg types before execution
- Structured logging — every turn, tool call, failure, and completion logged as JSON
- Recovery hints — inject alternative strategy prompt after 3 consecutive tool errors
- Partial result extraction — on any stop (limit/loop/cost), extract and return what was learned
- Checkpoint/resume — save state after each turn, auto-resume if session_id provided
Testing
- Trigger every failure mode deliberately and verify each guard works
- Run 10 real tasks and review the structured logs — identify any unexpected failure patterns
Skills: AgentGuardian, cost circuit breaker, tool validation, structlog, recovery patterns, checkpoint/resume
Lab 1: Trigger and Detect Every Failure Mode
Objective: Deliberately trigger all 5 failure modes and verify the AgentGuardian catches each one.
Lab 2: Structured Log Analysis
Objective: Practice querying structured logs to diagnose agent failures post-hoc.
Lab 3: Checkpoint and Resume
Objective: Verify that checkpointing allows agent recovery from crashes without losing work.
P6-M21 MASTERY CHECKLIST
- Can name all 5 agent failure modes: infinite loop, stuck state, hallucinated tool calls, runaway cost, silent partial failure
- Can implement tool call fingerprinting using content hash to detect repeated identical calls
- Can implement AgentGuardian that checks max_turns, max_repeated_calls, consecutive errors, and cost before every turn
- All agents return structured results on failure — never Python exceptions propagating to the user
- Can validate tool name and argument types before execution using the tool's JSON schema
- Can validate agent output against the original goal using a cheap LLM checker
- Can implement cost circuit breaker with session, daily, and per-call limits using SQLite
- Can set up structlog for JSON-structured logging with turn, tool, cost, and latency fields
- Can implement the recovery hint pattern: inject alternative strategy prompt after repeated errors
- Can extract a partial result from conversation history when an agent hits a limit
- Can escalate to human with a structured ticket containing partial results and failure context
- Can implement checkpoint/resume with pickle or LangGraph's built-in checkpointer
- Can query JSONL structured logs to compute success rate, failure distribution, and most-called tools
- Completed Lab 1: all 5 failure modes triggered and verified
- Completed Lab 2: structured log analysis with success rate and failure debugging
- Completed Lab 3: checkpoint/resume verified end-to-end
- Milestone project: hardened agent with all guards pushed to GitHub
✅ When complete: Move to P6-M22 — Evaluation Harnesses. You now have agents that fail safely. M22 covers how to measure and improve agent quality systematically with DeepEval, Ragas, and LLM-as-judge.