Beyond n8n for Workflow Automation: Agent Graphs as Your Universal Agent Harness

Published March 20, 2026


by Scarlett Attensil

Hardcoded multi-agent orchestration is brittle: topology lives in framework-specific code, changes require redeploys, and bottlenecks are hard to see. Agent Graphs externalize that topology into LaunchDarkly, while your application continues to own execution.

In this tutorial, you’ll build a small multi-agent workflow, traverse it with the SDK, monitor per-node latency on the graph itself, and update a slow node’s model without changing application code.

How Agent Graphs work
  • Node = AI Config (model, instructions, tools)
  • Edge = handoff metadata (routing contract you define)
  • Graph = topology (which nodes connect)
  • Your app = execution + interpretation

LaunchDarkly provides graph structure, config, and observability. Your application owns execution semantics: you write the code that interprets edges and runs agents.
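The split is easiest to see in code. Here is a minimal, purely illustrative sketch (plain Python dicts, not the SDK's actual objects; `run_agent` is a hypothetical stand-in for your framework call) of an app that executes a topology it did not hardcode:

```python
# Hypothetical topology as it might arrive from a config service.
# The real SDK objects differ; this only illustrates the division of labor.
graph = {
    "root": "supervisor",
    "edges": {"supervisor": ["support"], "support": []},
}

def run_agent(node_key: str, user_input: str) -> str:
    # Your app owns execution: call your model/framework here.
    return f"{node_key} handled: {user_input}"

def execute(graph: dict, user_input: str) -> list[str]:
    """Walk the externally defined topology; execution stays local."""
    outputs, node = [], graph["root"]
    while node is not None:
        outputs.append(run_agent(node, user_input))
        children = graph["edges"][node]
        node = children[0] if children else None
    return outputs

print(execute(graph, "hello"))
# ['supervisor handled: hello', 'support handled: hello']
```

Change the `edges` dict and the same `execute` loop follows the new path; that is the property Agent Graphs give you, with LaunchDarkly as the source of the topology.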


Agent Graphs: Visual orchestration with per-node metrics overlaid directly on your workflow

What You’ll Build

In this tutorial, you’ll add Agent Graphs to an existing multi-agent workflow:

  1. Build a graph visually in the LaunchDarkly UI
  2. Connect it to your code with a few lines of SDK integration
  3. Run your agents and see the graph in action
  4. Monitor performance with per-node latency and invocation tracking
  5. Fix a slow agent by swapping models from the dashboard

By the end, you’ll have a multi-agent system where topology changes happen in the UI and are picked up by your traversal code on the next request.

Prerequisites

  • LaunchDarkly account with AI Configs access (sign up here)
  • Python 3.9+
  • An existing agent workflow (or use our sample repo)

The Problem with Hardcoded Orchestration

Every multi-agent framework handles orchestration differently:

# LangGraph - topology hardcoded in graph setup
workflow = StateGraph(AgentState)
workflow.add_node("supervisor", supervisor_node)
workflow.add_node("security", security_node)
workflow.add_node("support", support_node)
workflow.set_entry_point("supervisor")
# Routing logic buried in node functions or conditional edges

# OpenAI Agents SDK - handoffs defined per agent
security_agent = Agent(name="Security", instructions="...")
support_agent = Agent(name="Support", instructions="...")
supervisor = Agent(
    name="Supervisor",
    handoffs=[security_agent, support_agent]  # Topology locked in code
)

The topology is scattered across code. Agent Graphs make it visible: you see the entire workflow in one view, edit connections in the UI, and traverse it with graph-aware SDK methods.

Why Externalizing Topology Helps

If you’ve built multi-agent systems with LangGraph, OpenAI Swarm, or Strands, you’ve hit these walls:

  • Config duplication: Agent definitions scattered across framework-specific formats
  • Silent failures: An agent times out and you don’t know until users complain
  • No topology visibility: The workflow exists only in code
  • Custom observability: Getting consistent per-agent metrics means reconciling different trace formats and data schemas across frameworks

Deep dive on orchestrators

For a detailed comparison of LangGraph, OpenAI Swarm, and Strands, see Compare AI orchestrators. Agent Graphs work with multiple agent frameworks.

Agent Graphs solve these by giving you a visual graph builder where you:

  • See your entire workflow at a glance, not buried in code
  • Monitor per-node metrics overlaid directly on the graph (latency, invocations, tool calls)
  • Add or remove agents without changing traversal logic, provided your runtime supports the node’s tools and output contract
  • Inspect routing logic on edges, with handoff data visible in the UI
  • Use graph-aware SDK methods like is_terminal(), is_root(), and get_edges() instead of manual tracking

Step 1: Create AI Configs for Your Agents

Before building a graph, you need AI Configs for each agent. If you already have AI Configs, skip to Step 2.

New to AI Configs?

See the AI Configs quickstart or run the bootstrap script in our sample repo:

$ git clone https://github.com/launchdarkly-labs/devrel-agents-tutorial
$ cd devrel-agents-tutorial
$ git checkout tutorial/agent-graphs
$ uv sync
$ cp .env.example .env  # Add your LD_SDK_KEY, LD_API_KEY, OPENAI_API_KEY
$ uv run python bootstrap/create_configs.py

For this tutorial, we’ll use three configs:

  • supervisor-agent: Orchestrates the workflow and routes queries based on PII pre-screening
  • security-agent: Detects and redacts personally identifiable information (PII)
  • support-agent: Answers questions using dynamically loaded tools (search, RAG)

Step 2: Build the Graph in the UI

This is where Agent Graphs diverge from code-based orchestration. Instead of writing add_edge() calls, you’ll see your topology and modify it visually.

Open your LaunchDarkly dashboard and navigate to AI > Agent graphs.

  1. You’ll see the first-time setup wizard. Since you already created AI Configs in Step 1, expand Create a graph at the bottom.

First-time setup wizard. Expand 'Create a graph' since you already have AI Configs.

  2. Name your graph chatbot-flow and click Create graph.

Creating your first Agent Graph in the LaunchDarkly UI.

  3. Add your first node: click Add node and select supervisor-agent
  4. Set it as the root: click the node and toggle Root node
  5. Add security-agent and support-agent as nodes

Adding the security-agent node to the graph.

Adding the support-agent node to complete the workflow.

  6. Draw edges: drag from supervisor-agent to both child agents
  7. Add handoff data to each edge to define routing logic:

supervisor-agent → security-agent:

{
  "action": "sanitize",
  "reason": "PII detected",
  "route": "security"
}


Edge from supervisor to security with route: security handoff data.

supervisor-agent → support-agent:

{
  "action": "direct",
  "reason": "Clean input",
  "route": "support"
}


Edge from supervisor to support with route: support handoff data.

security-agent → support-agent:

{
  "action": "proceed",
  "reason": "Input sanitized",
  "route": "continue"
}


Edge from security to support with route: continue handoff data.

Notice what you’re seeing: the entire workflow topology in one view. This graph is your architecture diagram, always current. Each node shows which AI Config variation it serves. The edges show routing logic that would otherwise be buried in conditional statements. When you need to add a new agent or change routing, you do it here, not in code.

The graph defines structure, your code defines behavior

LaunchDarkly doesn’t execute your graph. It provides:

  • Topology: Which nodes exist and how they connect
  • Handoff metadata: Whatever JSON you put on edges
  • Per-node AI Config: Model, instructions, tools for each agent

Your code:

  • Decides which edges to follow based on agent decisions
  • Interprets handoff data however you want (the schema is yours)
  • Executes the actual agents

The handoff JSON is arbitrary metadata. You define the schema, you interpret it. LaunchDarkly stores and delivers it.
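Because the schema is yours, interpretation can be as simple as a small helper. This sketch (plain Python, not part of the SDK; the `"route"` key matches the edge JSON defined in this tutorial but is only a convention) shows one way to read the metadata defensively:

```python
from typing import Optional

# One way to interpret edge handoff data. The "action"/"reason"/"route"
# schema below matches the edges defined in this tutorial, but it is your
# convention, not something LaunchDarkly enforces.
def interpret_handoff(handoff: Optional[dict]) -> Optional[str]:
    """Return the normalized route name from an edge's handoff metadata, if any."""
    if not handoff:
        return None
    route = str(handoff.get("route", "")).strip().lower()
    return route or None

assert interpret_handoff({"action": "sanitize", "route": "security"}) == "security"
assert interpret_handoff({"action": "direct"}) is None
assert interpret_handoff(None) is None
```

Normalizing case and whitespace here keeps route matching tolerant of hand-edited JSON in the UI.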

Step 3: Add the SDK to Your Project

Install the LaunchDarkly AI SDK:

$ uv add launchdarkly-server-sdk launchdarkly-server-sdk-ai

Initialize the clients in your code:

# config_manager.py - Initialize LaunchDarkly clients
def _initialize_launchdarkly_client(self):
    """Initialize LaunchDarkly client and AI client"""
    config = ldclient.Config(self.sdk_key)
    ldclient.set_config(config)
    self.ld_client = ldclient.get()

    # Block until client is initialized (max 10 seconds)
    self.ld_client.start_wait(10)

    if not self.ld_client.is_initialized():
        raise RuntimeError("LaunchDarkly client initialization failed")

    self.ai_client = LDAIClient(self.ld_client)

Build a context for targeting and tracking:

# config_manager.py - Build context for targeting
def build_context(self, user_id: str, user_context: dict = None) -> Context:
    """Build a LaunchDarkly context with consistent attributes."""
    context_builder = Context.builder(user_id).kind('user')

    if user_context:
        for key, value in user_context.items():
            context_builder.set(key, value)

    return context_builder.build()

Step 4: Integrate with Your Framework

This section walks through the integration code, starting with the building block (what runs at each node), then showing how nodes are orchestrated.

The Generic Agent Pattern

The key to dynamic execution is create_generic_agent. Every node uses the same implementation—no agent registry, no hardcoded agent types:

# agents/generic_agent.py
def create_generic_agent(agent_config, config_manager, valid_routes: List[str] = None):
    """Create a generic agent from LaunchDarkly AI Config."""

    class GenericAgent:
        def __init__(self):
            self.valid_routes = valid_routes or []

        async def ainvoke(self, state: dict) -> dict:
            """Execute the agent using LaunchDarkly config."""
            if not agent_config.enabled:
                return {"response": "", "_skipped": True}

            # Create model from config
            model = create_model_for_config(
                provider=agent_config.provider.name,
                model=agent_config.model.name,
                config_manager=config_manager
            )

            # Load tools from LaunchDarkly config
            tools = create_dynamic_tools_from_launchdarkly(agent_config)

            # Get instructions from config
            instructions = agent_config.instructions or "Process the input."

            # Inject route options into instructions
            if self.valid_routes:
                route_instruction = f"\n\nSelect one of these routes: {self.valid_routes}. Return: {{\"route\": \"<selected_route>\"}}"
                instructions = instructions + route_instruction

            # Execute and extract routing decision
            result = await self._execute(model, instructions, tools, state)
            result["routing_decision"] = self._extract_route(result.get("response", ""))

            # Track metrics
            agent_config.tracker.track_success()
            return result

    return GenericAgent()

Why this enables adding agents without code changes

The generic agent pattern means:

  • No agent registry: Every node uses the same create_generic_agent function
  • Config-driven behavior: Model, instructions, and tools all come from LaunchDarkly
  • Dynamic routing: Valid routes are injected from graph edges, not hardcoded
  • Minimal code changes: Add a new agent in LaunchDarkly, create its AI Config, add it to your graph, and it works—provided your runtime supports the node’s tools and output contract
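The `_extract_route` helper used in `ainvoke` isn't shown above. As a sketch only, one plausible implementation (assuming the agent returns the `{"route": ...}` JSON that the injected route instruction asks for, possibly embedded in surrounding prose) could look like:

```python
import json
import re

def extract_route(response: str):
    """Pull a {"route": "..."} object out of an agent's free-text reply.

    Hypothetical helper, not the sample repo's exact code: scan for the
    first small JSON object containing a "route" key and parse it.
    """
    match = re.search(r'\{[^{}]*"route"[^{}]*\}', response)
    if not match:
        return None
    try:
        return json.loads(match.group(0)).get("route")
    except json.JSONDecodeError:
        return None
```

Keeping the parse tolerant matters because models often wrap the requested JSON in explanatory text.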

The AgentService Class

The AgentService class is the entry point for processing messages through your Agent Graph:

# api/services/agent_service.py
class AgentService:
    """Multi-Agent Orchestration using LaunchDarkly Agent Graph."""

    def __init__(self):
        self.config_manager = ConfigManager()
        self.config_manager.flush()

    async def process_message(
        self,
        user_id: str,
        message: str,
        user_context: dict = None
    ) -> ChatResponse:
        """Process message using LaunchDarkly Agent Graph."""
        result = await self._execute_graph(
            graph_key=os.getenv("AGENT_GRAPH_KEY", "chatbot-flow"),
            user_id=user_id.strip() or "anonymous",
            user_input=message,
            user_context=user_context or {}
        )

        return ChatResponse(
            response=result.get("final_response", ""),
            tool_calls=result.get("tool_calls", []),
            # ... other fields
        )

Executing the Graph

The _execute_graph method fetches the graph from LaunchDarkly and uses traverse() with skip logic for conditional routing:

# api/services/agent_service.py
async def _execute_graph(
    self,
    graph_key: str,
    user_id: str,
    user_input: str,
    user_context: dict = None
) -> Dict[str, Any]:
    """Execute agents using SDK's traverse() with skip logic."""
    ld_context = self.config_manager.build_context(user_id, user_context)
    graph = self.config_manager.ai_client.agent_graph(graph_key, ld_context)

    if not graph.is_enabled():
        raise ValueError(f"Agent Graph '{graph_key}' is not enabled")

    ctx = {
        "user_input": user_input,
        "messages": [HumanMessage(content=user_input)],
        "processed_input": user_input,
        "final_response": "",
        "tool_calls": [],
        # Skip logic: track which nodes should execute
        "_routed_to": {graph.root().get_key()},
        "_path": [],
        "_prev_key": None,
    }

    tracker = graph.get_tracker()

    # Define the node callback (see next section)
    def execute_node(node, exec_ctx):
        # ... node execution logic
        pass

    # Use SDK's traverse() - it handles traversal order
    graph.traverse(execute_node, ctx)

    # Track graph completion
    if tracker:
        tracker.track_path(ctx.get("_path", []))
        tracker.track_invocation_success()

    return ctx

Skip Logic for Conditional Routing

The execute_node callback implements skip logic—the core pattern that enables conditional routing:

# api/services/agent_service.py - inside _execute_graph
def execute_node(node, exec_ctx):
    """Execute a single node if it was routed to."""
    key = node.get_key()

    # Skip logic: only execute if parent routed to this node
    if key not in exec_ctx.get("_routed_to", set()):
        return {"_skipped": True}

    exec_ctx["_path"].append(key)

    # Track node invocation
    if tracker:
        tracker.track_node_invocation(key)
        if exec_ctx.get("_prev_key"):
            tracker.track_handoff_success(exec_ctx["_prev_key"], key)

    # Get edges and valid routes for this node
    edges = node.get_edges()
    valid_routes = [e.handoff.get("route") for e in edges if e.handoff and e.handoff.get("route")]

    # Execute agent with config from this node
    agent = create_generic_agent(node.get_config(), self.config_manager, valid_routes=valid_routes)
    result = _run_async(agent.ainvoke(exec_ctx))

    # Track tool calls
    if tracker and result.get("tool_calls"):
        for tool in result["tool_calls"]:
            tracker.track_tool_call(key, tool)

    # Route to next node: add to _routed_to set
    if edges:
        next_key = self._select_next_node(edges, result, tracker)
        if next_key:
            exec_ctx["_routed_to"].add(next_key)

    exec_ctx["_prev_key"] = key
    return result

How skip logic enables conditional routing

The _routed_to set tracks which nodes should execute:

  1. Start: Add root node to _routed_to
  2. traverse() visits each node: If node is in _routed_to, execute it; otherwise skip
  3. After execution: Add the next node (based on routing decision) to _routed_to

This enables conditional routing: the supervisor routes to either security OR support, and only the chosen path executes.
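The mechanism can be traced in isolation. This toy walkthrough (plain Python; the fixed visit order and hypothetical routing decision stand in for `traverse()` and a real agent) shows why the unchosen branch never runs:

```python
# Toy trace of skip logic: nodes are visited in a fixed order, the way
# traverse() would visit them, but only routed-to nodes execute.
nodes = ["supervisor", "security", "support"]
routed_to = {"supervisor"}             # start: only the root may run
decisions = {"supervisor": "support"}  # supervisor routes past security

executed = []
for key in nodes:
    if key not in routed_to:
        continue                       # skip: no parent routed here
    executed.append(key)
    nxt = decisions.get(key)
    if nxt:
        routed_to.add(nxt)

print(executed)  # ['supervisor', 'support'] — security was skipped
```

Flip the supervisor's decision to `"security"` and the other branch executes instead, with no change to the loop itself.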

Routing Between Nodes

The _select_next_node method determines which node to route to based on the agent’s routing decision:

# api/services/agent_service.py
def _select_next_node(self, edges, result: dict, tracker=None):
    """Select next node key based on routing decision."""
    routing = result.get("routing_decision", "").lower().strip() if result.get("routing_decision") else None

    # Build route map: route -> target_config
    route_map = {}
    for edge in edges:
        route = (edge.handoff.get("route", "") if edge.handoff else "").lower().strip()
        if route:
            route_map[route] = edge.target_config

    # Exact match
    if routing and routing in route_map:
        return route_map[routing]
    elif routing:
        if tracker:
            tracker.track_handoff_failure()

    # Default: first edge
    if edges:
        return edges[0].target_config

    return None

The key insight: your graph topology comes from LaunchDarkly, not hardcoded orchestration. Change the graph in the UI, and your code picks up the new structure on the next request.

Step 5: Run It

With the AgentService wired up (as shown in Step 4), you can now process messages through your Agent Graph. The service handles:

  1. Building the LaunchDarkly context for targeting
  2. Fetching the graph and executing nodes via traverse()
  3. Tracking metrics for monitoring
  4. Returning the final response

Test it by sending a message:

service = AgentService()
response = await service.process_message(
    user_id="user-123",
    message="What's the status of my order?",
    user_context={"plan": "premium"}
)
print(response.response)

Now go back to the LaunchDarkly UI. Add a new node or change an edge. Run your code again. Topology changes are picked up by your traversal code on subsequent SDK evaluations.

Step 6: Monitor Agent Performance

This is the key differentiator: monitoring happens on the graph itself, not in a separate dashboard. You see metrics overlaid on the same visual topology you built, so bottlenecks are immediately obvious.

The sample repo includes full instrumentation: calls to tracker.track_success(), tracker.track_error(), and tracker.track_tool_call() in the agent execution path. After running some traffic, open your Agent Graph to see the results.

Navigate to AI > Agent graphs > chatbot-flow. You’ll see a metrics bar at the top of the graph view where you can toggle different metrics on and off.

Metrics on the graph

Here’s what makes this different from traditional APM: the metrics appear directly on your workflow visualization. No mental mapping between a dashboard and your code. No correlating trace IDs. The slow node lights up on the graph.

Turn on Latency to see duration data overlaid directly on your graph:

  • Total duration: The combined time for the entire graph invocation
  • Per-node duration: How long each individual agent takes

Turn on Invocations to see how often each node is reached. This reveals which paths your users take most frequently. In a routing graph, you’ll quickly see whether most queries go through security or skip directly to support.

Turn on Tool calls to see the average number of tool invocations per node. If an agent is calling tools excessively, you’ll spot it here.

Monitoring page

Click Monitoring to see all metrics over time. This view shows:

  • Latency trends: Duration per node over hours, days, or weeks
  • Invocation patterns: Traffic flow through your graph
  • Tool call breakdown: Which specific tools are being called and how often


Node-level metrics broken down by agent, showing invocations, tool calls, and latency over time.

Instrument tool tracking

To see which specific tools are called, you need to track them in your code using the tracker. The SDK sends this data to LaunchDarkly, which displays it in the monitoring view.

Generate traffic to see metrics

Run the traffic generator from the sample repo to send queries through your graph:

$ uv run python tools/traffic_generator.py --queries 20 --delay 2

This sends a mix of queries (some with PII, some without) to exercise both the security and support paths. After a few minutes, you’ll see metrics populate on the graph.

Detecting a slow agent

With traffic flowing, suppose the security-agent starts averaging 5 seconds per call. With latency metrics enabled on the graph, you see it immediately: the security-agent node shows a high duration value while other nodes stay fast.

The invocation numbers also tell a story. If security-agent shows 50 invocations and support-agent shows 80, you know ~30 queries are bypassing security (the clean path). This helps you understand whether the slow agent is affecting most users or just a subset.

Without Agent Graphs, you’d need custom logging, Datadog queries, and manual correlation. With Agent Graphs, you see the problem in 30 seconds.

Step 7: Fix Without Deploying

The security-agent is slow because it’s using claude-sonnet-4 for PII detection. A smaller, faster model may be sufficient for this task.

In the LaunchDarkly dashboard, update the pii-detector variation:

  • Change model from Anthropic.claude-sonnet-4-20250514 to Anthropic.claude-3-haiku-20240307

Or use Agent Skills to make the change from your coding assistant:

The security-agent pii-detector variation is averaging 5 seconds.
Change the model to claude-3-haiku-20240307.

No code changes. No deploy. Changes are picked up on subsequent SDK evaluations.

Run the traffic generator again and watch the latency drop.

What just happened

  1. Traffic generator sent queries through the graph
  2. Monitoring showed the slow agent on the graph
  3. Model swap happened in the UI (or via Agent Skills)
  4. Your code automatically used the new configuration

No deploys. No PRs. The fix is live.

OpenAI Agents SDK Integration (Conceptual)

Agent Graphs work with multiple frameworks. This conceptual example shows how the pattern translates to OpenAI Agents SDK:

# Conceptual example showing how Agent Graph SDK methods work with OpenAI Agents
from agents import Agent, Runner

def handle_traversal(node, state):
    config = node.get_config()
    tracker = config.tracker
    edges = node.get_edges()

    # Child agents are already in state (reverse traversal builds bottom-up)
    handoffs = [state[edge.target_config] for edge in edges]

    def on_handoff(ctx):
        # Track handoff events
        return ctx

    return Agent(
        name=config.key,
        instructions=config.instructions,
        handoffs=handoffs,
        on_handoff=on_handoff,
    )

if agent_graph.is_enabled():
    root = agent_graph.reverse_traverse(handle_traversal, {})
    result = await Runner.run(root, "Tell me about your engineering team")

Same graph definition, adapted to each framework’s execution model. The topology metadata lives in LaunchDarkly; your code interprets and executes it.

Best Practices

Start simple: Begin with a linear graph (A → B → C) before adding conditional routing.

Use handoff data for context passing: Include metadata like action type, reason, or state that the next agent needs to continue the workflow.

Track everything: Call tracker.track_success() and tracker.track_error() in every node for complete visibility. Use graph_tracker.track_tool_call(tool_name) to track which tools agents invoke.

Test with targeting: Use LaunchDarkly targeting to route test users to experimental graph configurations.

Handle missing edges: Decide what happens when no edge matches a routing decision or when a target node is disabled. A sensible default is to fail closed, log diagnostics, and track routing failures.
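As one concrete option, here is a sketch of fail-closed edge selection. It mirrors the shape of `_select_next_node` from Step 4 but raises instead of defaulting to the first edge; `RoutingError` and the plain-dict edges are hypothetical illustrations, not SDK types:

```python
# Fail-closed variant of edge selection: if no edge matches the routing
# decision, raise instead of silently falling back to the first edge.
class RoutingError(Exception):
    pass

def select_next_fail_closed(edges, routing):
    route_map = {
        e["handoff"]["route"]: e["target"]
        for e in edges
        if e.get("handoff", {}).get("route")
    }
    if routing and routing in route_map:
        return route_map[routing]
    raise RoutingError(f"No edge matches route {routing!r}; refusing to guess")

edges = [{"handoff": {"route": "security"}, "target": "security-agent"}]
assert select_next_fail_closed(edges, "security") == "security-agent"
try:
    select_next_fail_closed(edges, "unknown")
except RoutingError:
    pass  # log diagnostics and track the failure in a real system
```

Failing closed turns a silent misroute into a visible, trackable error, which matters most when one branch handles PII.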

Keep execution state request-scoped: Store execution state inside the context object (ctx) passed through traversal, not in instance-level variables. Treat graph traversal as request-scoped to avoid concurrency issues.

What You’ve Built

You now have a multi-agent system where:

  • Graph topology is externalized and self-documenting
  • Routing logic is visible on edges, not buried in code
  • Monitoring appears on the graph itself, not a separate dashboard
  • Node-level control lets you disable a single agent without touching others, provided your executor checks node availability
  • Multiple frameworks can consume the same graph metadata

When you spot a slow agent in monitoring, you can swap the model from the dashboard without a deploy.

Conclusion

Hardcoded orchestration was fine when you had one agent. With multi-agent systems, it becomes a liability. Every change requires a deploy. Every incident requires a developer.

Agent Graphs flip this. Define your workflow in LaunchDarkly, integrate it with your framework, and fix many problems without touching code. Your agents become as dynamic as your feature flags.

Ready to stop hardcoding? Get started with AI Configs and create your first Agent Graph.