agent-contracts

Troubleshooting

Common issues and their solutions

Validation Errors

“Unknown slice ‘X’ in node ‘Y’ reads/writes”

Cause: The slice name is not registered in the registry.

Solution:

# Option 1: Add the slice to the registry
registry.add_valid_slice("your_slice_name")

# Option 2: Check for typos
# Maybe "shoping" should be "shopping"

Prevention:

# Define slice names as constants
SLICE_ORDERS = "orders"
SLICE_WORKFLOW = "workflow"

# Use constants in contracts
reads=[SLICE_ORDERS]
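To catch a typo before it reaches the validator, a small helper can suggest the closest known name. This is a minimal sketch using only the standard library; `KNOWN_SLICES` and `suggest_slice` are hypothetical names for illustration and are not part of agent-contracts.

```python
from difflib import get_close_matches

# Hypothetical set of slice names your application registers.
KNOWN_SLICES = {"request", "response", "orders", "shopping", "workflow"}


def suggest_slice(name, known=KNOWN_SLICES):
    """Return the name itself if known, the closest known name, or None."""
    if name in known:
        return name
    matches = get_close_matches(name, sorted(known), n=1, cutoff=0.8)
    return matches[0] if matches else None


print(suggest_slice("shoping"))  # → shopping
```

Running this over every name in a contract's reads and writes during development points at misspellings directly.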

“Node requires LLM but not provided”

Cause: Contract has requires_llm=True but no LLM was injected.

Solution:

# Provide LLM when instantiating
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
node = MyNode(llm=llm)

# Or when building graph
graph = build_graph_from_registry(
    registry=registry,
    llm=llm,  # Passed to all nodes
)

“Unknown service ‘X’ required by node ‘Y’”

Cause: The contract lists the service under services, but no service with that name was provided.

Solution:

# Provide all required services
db_service = DatabaseService()
cache_service = CacheService()

node = MyNode(
    llm=llm,
    db_service=db_service,
    cache_service=cache_service,
)

Prevention:

# Validate with known services
validator = ContractValidator(
    registry,
    known_services={"db_service", "cache_service"},
)

Routing Issues

“Node is never called”

Possible causes and solutions:

No matching TriggerCondition

# Check: Is your 'when' condition correct?
when={"request.action": "serch"}  # Typo! Should be "search"

Priority too low

# Another node with higher priority is matching first
# Use decide_with_trace() to debug
decision = await supervisor.decide_with_trace(state)
print(decision.reason.matched_rules)

Missing from supervisor

# Check: Is the node registered to the correct supervisor?
supervisor="main"  # Must match supervisors= in build_graph_from_registry

Unreachable (no trigger conditions)

# Add a trigger condition
trigger_conditions=[
    TriggerCondition(priority=10, when={"request.action": "my_action"})
]
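When a condition looks right but never matches, it helps to compare each expected value with what the state actually holds. The sketch below assumes the state is a nested dict and that a `when` key such as "request.action" is a dotted path compared by equality; the library's real matching rules may differ, and `lookup` and `explain_when` are hypothetical helpers.

```python
def lookup(state, dotted_key):
    """Resolve a key such as "request.action" against nested dicts."""
    value = state
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def explain_when(state, when):
    """Return {key: (expected, actual)} for every condition that does not match."""
    return {
        key: (expected, lookup(state, key))
        for key, expected in when.items()
        if lookup(state, key) != expected
    }


state = {"request": {"action": "search"}}
print(explain_when(state, {"request.action": "serch"}))
# → {'request.action': ('serch', 'search')}
```

An empty result means every condition matched, so the cause lies elsewhere (priority, supervisor name, or a missing trigger).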

“Wrong node is being selected”

Debug with traceable routing:

decision = await supervisor.decide_with_trace(state)

print(f"Selected: {decision.selected_node}")
print(f"Type: {decision.reason.decision_type}")
print("Matched rules:")
for rule in decision.reason.matched_rules:
    print(f"  P{rule.priority}: {rule.node} - {rule.condition}")

Common fixes:

Adjust priority values
Make when conditions more specific
Add when_not to exclude unwanted matches
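The three fixes above interact, so a toy model can make the effect easier to reason about. This sketch is not the library's routing code: `Rule` is a hypothetical stand-in for TriggerCondition, the state is flattened to dotted keys, the highest priority is assumed to win, and `when_not` is assumed to exclude a rule when any listed pair matches.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:  # Hypothetical stand-in, not the library's TriggerCondition
    node: str
    priority: int
    when: dict = field(default_factory=dict)
    when_not: dict = field(default_factory=dict)


def matches(rule, flat_state):
    """True if every `when` pair holds and no `when_not` pair holds."""
    wanted = all(flat_state.get(k) == v for k, v in rule.when.items())
    excluded = any(flat_state.get(k) == v for k, v in rule.when_not.items())
    return wanted and not excluded


def select(rules, flat_state):
    """Return the node of the highest-priority matching rule, or None."""
    hits = [r for r in rules if matches(r, flat_state)]
    return max(hits, key=lambda r: r.priority).node if hits else None


rules = [
    Rule("search", 10, when={"request.action": "search"}),
    Rule("recommend", 20, when={"request.action": "search"},
         when_not={"request.mode": "exact"}),
]

print(select(rules, {"request.action": "search"}))  # → recommend
print(select(rules, {"request.action": "search", "request.mode": "exact"}))  # → search
```

Without the `when_not` entry, "recommend" would win both cases because of its higher priority.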

“LLM routing is unpredictable”

Solutions:

Improve llm_hints

# Bad
llm_hint="Search"
   
# Good
llm_hint="Use when user explicitly asks to search for products. Do NOT use for browsing or recommendations."

Use rule-based for clear actions

# If the action is explicit, use rules instead of LLM
when={"request.action": "search"}  # Clear intent

Increase priority for critical paths

priority=100  # Force selection before LLM decides

Execution Issues

“Infinite loop / Max iterations reached”

Cause: Nodes keep routing back without reaching END.

Solutions:

Check terminal states

# Make sure your response types are in terminal_states
terminal_response_types={"question", "results", "error"}
   
# And your node outputs matching types
return NodeOutputs(
    response={
        "response_type": "results",
        "response_data": {"items": [1, 2, 3]},
    }
)

Set is_terminal on appropriate nodes

class ResultNode(ModularNode):
    CONTRACT = NodeContract(
        name="result",
        description="Returns the final result and ends the flow",
        reads=["request"],
        writes=["response"],
        supervisor="main",
        trigger_conditions=[TriggerCondition(priority=10)],
        is_terminal=True,  # Force END after this node
    )

Increase max_iterations during debugging

supervisor = GenericSupervisor(
    max_iterations=50,  # Increase to find the issue
)

“State updates not persisting”

Cause: Node outputs don’t match contract writes.

Solution:

# Contract declares
writes=["orders"]

# Execute must return matching slice
return NodeOutputs(
    orders={"cart": [...]},  # ✅ Correct
    # Not: response={"cart": [...]}  # ❌ Wrong slice
)

“NodeInputs missing expected data”

Cause: Contract reads don’t include the needed slice.

Solution:

# If you need data from 'context' slice
CONTRACT = NodeContract(
    name="my_node",
    description="Example node that needs context slice",
    reads=["request", "context"],  # Include 'context'
    writes=["response"],
    supervisor="main",
)

async def execute(self, inputs, config=None):
    context = inputs.get_slice("context")  # Now available

Configuration Issues

“Config not loading”

Check file path:

from agent_contracts.config import load_config, set_config

# Absolute path
config = load_config("/path/to/agent_config.yaml")

# Or relative to your working directory
config = load_config("./config/agent_config.yaml")

set_config(config)

Check YAML syntax:

# Valid YAML
supervisor:
  max_iterations: 10

response_types:
  terminal_states:
    - question
    - results

“Terminal states not working”

Check configuration:

# In agent_config.yaml
response_types:
  terminal_states:
    - question    # Must match response_type exactly
    - results
    - error

Check response_type formatting:

# Must match exactly
return NodeOutputs(
    response={
        "response_type": "question",  # Exact match
        # Not "Question" or "QUESTION"
    }
)
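A case or whitespace mismatch is easy to miss by eye. The following sketch is a standalone check you could run in a test; `TERMINAL_STATES` mirrors the YAML above, and `check_response_type` is a hypothetical helper rather than a library function.

```python
TERMINAL_STATES = {"question", "results", "error"}  # mirrors agent_config.yaml


def check_response_type(response_type, terminal_states=TERMINAL_STATES):
    """Return None if the value is terminal, otherwise a hint explaining why not."""
    if response_type in terminal_states:
        return None
    normalized = response_type.strip().casefold()
    for state in sorted(terminal_states):
        if state.casefold() == normalized:
            return f"{response_type!r} differs from {state!r} only by case or whitespace"
    return f"{response_type!r} is not a terminal state"


print(check_response_type("Question"))
# → 'Question' differs from 'question' only by case or whitespace
```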

Contract I/O Issues

“Undeclared slice read/write”

If you see warnings like Undeclared slice read / Undeclared slice write(s), a node is accessing slices not listed in its NodeContract.reads/writes.

Options:

Update the node’s NodeContract to declare the slice(s)
Or configure runtime enforcement:

io:
  strict: false                 # true: raise ContractViolationError
  warn: true                    # warning logs
  drop_undeclared_writes: true  # drop undeclared writes
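The sketch below shows one plausible reading of how the three flags combine for writes. It is an illustration only: the order in which the library checks the flags is an assumption, `enforce_writes` is a hypothetical function, and `ContractViolationError` here is a local stand-in for the library's error of the same name.

```python
import logging

logger = logging.getLogger("io_policy_sketch")


class ContractViolationError(Exception):
    """Local stand-in for the library's error of the same name."""


def enforce_writes(outputs, declared, *, strict=False, warn=True,
                   drop_undeclared_writes=True):
    """Apply the assumed io policy to a dict of slice name -> update."""
    undeclared = sorted(set(outputs) - set(declared))
    if not undeclared:
        return dict(outputs)
    if strict:
        raise ContractViolationError(f"Undeclared slice write(s): {undeclared}")
    if warn:
        logger.warning("Undeclared slice write(s): %s", undeclared)
    if drop_undeclared_writes:
        return {k: v for k, v in outputs.items() if k in declared}
    return dict(outputs)


print(enforce_writes({"orders": 1, "response": 2}, ["orders"]))  # → {'orders': 1}
```

Under this reading, a dropped write would also explain a state update that silently fails to persist, which is why declaring the slice in the contract is usually the better fix.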

Testing Issues

“Async tests failing”

Use pytest-asyncio:

import pytest

@pytest.mark.asyncio
async def test_node_execution():
    node = MyNode(llm=mock_llm)
    inputs = NodeInputs(request={"action": "test"})
    
    result = await node.execute(inputs)
    
    assert result.response is not None

Configure pytest:

# pyproject.toml
[tool.pytest.ini_options]
asyncio_mode = "strict"  # async tests need @pytest.mark.asyncio; "auto" marks them for you
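The test above uses mock_llm without showing how to build it. One option is `unittest.mock.AsyncMock` from the standard library. This sketch assumes the node awaits `llm.ainvoke(...)`; `EchoNode` is a hypothetical stand-in so the example runs without agent-contracts installed, and `asyncio.run` replaces the pytest marker for the same reason.

```python
import asyncio
from unittest.mock import AsyncMock


class EchoNode:  # Hypothetical stand-in for a node that calls its LLM
    def __init__(self, llm):
        self.llm = llm

    async def execute(self, inputs):
        reply = await self.llm.ainvoke(inputs["request"]["action"])
        return {"response": {"response_type": "results", "response_data": reply}}


async def run_once():
    mock_llm = AsyncMock()
    mock_llm.ainvoke.return_value = "stubbed reply"  # no network call is made
    result = await EchoNode(mock_llm).execute({"request": {"action": "test"}})
    mock_llm.ainvoke.assert_awaited_once_with("test")
    return result


result = asyncio.run(run_once())
print(result["response"]["response_data"])  # → stubbed reply
```

In a real test, the body of `run_once` would sit directly inside the `@pytest.mark.asyncio` test function.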

“Registry state leaking between tests”

Reset registry in fixtures:

import pytest
from agent_contracts import reset_registry


@pytest.fixture(autouse=True)
def clean_registry():
    reset_registry()
    yield
    reset_registry()

Performance Issues

“LLM calls are slow”

Solutions:

Use lighter models for routing

# Use GPT-3.5 for supervisor, GPT-4 for nodes
routing_llm = ChatOpenAI(model="gpt-3.5-turbo")
execution_llm = ChatOpenAI(model="gpt-4")
   
supervisor = GenericSupervisor(llm=routing_llm)

Rely more on rule-based routing

# If the action is explicit, no LLM call is needed
when={"request.action": "search"}

Remove LLM from simple graphs

# Pure rule-based, no LLM overhead
supervisor = GenericSupervisor(
    llm=None,  # Rule-based only
)

Token Consumption Issues

“Unexpectedly high token usage with Supervisor”

Cause: Large data in state slices (especially base64 images) is being sent to the LLM for routing decisions.

Symptoms:

Routing decisions consuming thousands of tokens
Slow supervisor responses
High API costs

Solution:

Verify data sanitization is working (v0.3.3+)

# GenericSupervisor automatically sanitizes:
# - Image data → "[IMAGE_DATA]"
# - Long strings → truncated, keeping the beginning
   
# Default max_field_length is 10000 chars
supervisor = GenericSupervisor(
    supervisor_name="main",
    llm=llm,
    max_field_length=10000,  # Adjust if needed
)

Check for image data in request slice

# If you're storing base64 images in state:
request = {
    "action": "analyze",
    "image": "data:image/png;base64,iVBORw0KG..."  # Automatically sanitized
}

Review context_builder implementation

def my_context_builder(state, candidates):
    # Don't include slices with large data
    return {
        "slices": {"request", "response", "_internal"},  # Minimal context
        # Avoid: {"request", "response", "raw_data", "images"}
    }

Monitor token usage

# Enable debug logging to see what's sent to LLM
import logging
logging.getLogger("agent_contracts").setLevel(logging.DEBUG)
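To estimate how much a given state would shrink, you can approximate the sanitization rules locally. This is a rough sketch, not GenericSupervisor's implementation: the data-URI pattern and the "...[truncated]" marker are assumptions, and `sanitize` is a hypothetical helper.

```python
import re

# Assumed pattern for inline base64 images, e.g. "data:image/png;base64,...".
IMAGE_DATA_URI = re.compile(r"^data:image/[\w.+-]+;base64,")


def sanitize(value, max_field_length=10000):
    """Recursively replace image data and truncate long strings."""
    if isinstance(value, dict):
        return {k: sanitize(v, max_field_length) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize(v, max_field_length) for v in value]
    if isinstance(value, str):
        if IMAGE_DATA_URI.match(value):
            return "[IMAGE_DATA]"
        if len(value) > max_field_length:
            return value[:max_field_length] + "...[truncated]"
    return value


request = {"action": "analyze", "image": "data:image/png;base64,iVBORw0KG"}
print(sanitize(request))  # → {'action': 'analyze', 'image': '[IMAGE_DATA]'}
```

Comparing `len(str(state))` before and after gives a quick indication of whether large fields are the source of the token usage.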

Prevention:

Store large data (images, files) outside state slices when possible
Use references/URLs instead of embedding data
Leverage automatic sanitization (v0.3.3+)
Use minimal context in context_builder

Migration Issues

“TypeError: ‘TriggerMatch’ object is not subscriptable” (v0.4.0)

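Cause: Code that calls evaluate_triggers() directly still assumes the v0.3.x tuple format. In v0.4.0 each match is a TriggerMatch object, which cannot be unpacked or indexed.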

Symptoms:

# v0.3.x format code
matches = registry.evaluate_triggers("main", state)
priority, node_name = matches[0]  # Error!

Solution:

# Update to v0.4.0 format
matches = registry.evaluate_triggers("main", state)
match = matches[0]
priority = match.priority
node_name = match.node_name
condition_index = match.condition_index  # New feature!
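If the same code has to run against both versions during a migration, a small adapter can accept either format. This is a sketch under assumptions: `FakeTriggerMatch` is a stand-in carrying the attribute names shown above, and `unpack_match` is a hypothetical helper, not a library function.

```python
from dataclasses import dataclass


@dataclass
class FakeTriggerMatch:  # Stand-in with the attributes named above
    priority: int
    node_name: str
    condition_index: int = 0


def unpack_match(match):
    """Return (priority, node_name) for either result format."""
    if isinstance(match, tuple):  # v0.3.x: (priority, node_name)
        return match[0], match[1]
    return match.priority, match.node_name  # v0.4.0: attribute access


print(unpack_match((10, "search")))                   # → (10, 'search')
print(unpack_match(FakeTriggerMatch(10, "search")))   # → (10, 'search')
```

Once every environment is on v0.4.0, the adapter can be removed in favor of plain attribute access.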

Affected Code:

Direct calls to evaluate_triggers()
Processing results of registry.evaluate_triggers()

Unaffected Code:

Using GenericSupervisor only
Using decide() or decide_with_trace() only

Getting Help

If you’re stuck:

Check the examples: examples/ directory
Use validation: ContractValidator.validate()
Use tracing: decide_with_trace()

Enable debug logging:

import logging
logging.getLogger("agent_contracts").setLevel(logging.DEBUG)

Related Docs

📚 Core Concepts - Understanding the architecture
🎯 Best Practices - Design patterns
