Beyond n8n for Workflow Automation: Agent Graphs as Your Universal Agent Harness

Published March 20, 2026


by Scarlett Attensil

Hardcoded multi-agent orchestration is brittle: topology lives in framework-specific code, changes require redeploys, and bottlenecks are hard to see. Agent Graphs externalize that topology into LaunchDarkly, while your application continues to own execution.

In this tutorial, you’ll build a small multi-agent workflow, traverse it with the SDK, monitor per-node latency on the graph itself, and update a slow node’s model without changing application code.

How Agent Graphs work
  • Node = AI Config (model, instructions, tools)
  • Edge = handoff metadata (routing contract you define)
  • Graph = topology (which nodes connect)
  • Your app = execution + interpretation

LaunchDarkly provides graph structure, config, and observability. Your application owns execution semantics: you write the code that interprets edges and runs agents.
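The split is easiest to see in code. Here is a minimal, purely illustrative sketch (plain Python dicts, not the SDK's actual objects; `run_agent` is a hypothetical stand-in for your framework call) of an app that executes a topology it did not hardcode:

```python
# Hypothetical topology as it might arrive from a config service.
# The real SDK objects differ; this only illustrates the division of labor.
graph = {
    "root": "supervisor",
    "edges": {"supervisor": ["support"], "support": []},
}

def run_agent(node_key: str, user_input: str) -> str:
    # Your app owns execution: call your model/framework here.
    return f"{node_key} handled: {user_input}"

def execute(graph: dict, user_input: str) -> list[str]:
    """Walk the externally defined topology; execution stays local."""
    outputs, node = [], graph["root"]
    while node is not None:
        outputs.append(run_agent(node, user_input))
        children = graph["edges"][node]
        node = children[0] if children else None
    return outputs

print(execute(graph, "hello"))
# ['supervisor handled: hello', 'support handled: hello']
```

Change the `edges` dict and the same `execute` loop follows the new path; that is the property Agent Graphs give you, with LaunchDarkly as the source of the topology.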


Agent Graphs: Visual orchestration with per-node metrics overlaid directly on your workflow

What You’ll Build

In this tutorial, you’ll add Agent Graphs to an existing multi-agent workflow:

  1. Build a graph visually in the LaunchDarkly UI
  2. Connect it to your code with a few lines of SDK integration
  3. Run your agents and see the graph in action
  4. Monitor performance with per-node latency and invocation tracking
  5. Fix a slow agent by swapping models from the dashboard

By the end, you’ll have a multi-agent system where topology changes happen in the UI and are picked up by your traversal code on the next request.

Prerequisites

  • LaunchDarkly account with AI Configs access (sign up here)
  • Python 3.9+
  • An existing agent workflow (or use our sample repo)

The Problem with Hardcoded Orchestration

Every multi-agent framework handles orchestration differently:

# LangGraph - topology hardcoded in graph setup
workflow = StateGraph(AgentState)
workflow.add_node("supervisor", supervisor_node)
workflow.add_node("security", security_node)
workflow.add_node("support", support_node)
workflow.set_entry_point("supervisor")
# Routing logic buried in node functions or conditional edges

# OpenAI Agents SDK - handoffs defined per agent
security_agent = Agent(name="Security", instructions="...")
support_agent = Agent(name="Support", instructions="...")
supervisor = Agent(
    name="Supervisor",
    handoffs=[security_agent, support_agent]  # Topology locked in code
)

The topology is scattered across code. Agent Graphs make it visible: you see the entire workflow in one view, edit connections in the UI, and traverse it with graph-aware SDK methods.

Why Externalizing Topology Helps

If you’ve built multi-agent systems with LangGraph, OpenAI Swarm, or Strands, you’ve hit these walls:

  • Config duplication: Agent definitions scattered across framework-specific formats
  • Silent failures: An agent times out and you don’t know until users complain
  • No topology visibility: The workflow exists only in code
  • Custom observability: Getting consistent per-agent metrics means reconciling different trace formats and data schemas across frameworks

Deep dive on orchestrators

For a detailed comparison of LangGraph, OpenAI Swarm, and Strands, see Compare AI orchestrators. Agent Graphs work with multiple agent frameworks.

Agent Graphs solve these by giving you a visual graph builder where you:

  • See your entire workflow at a glance, not buried in code
  • Monitor per-node metrics overlaid directly on the graph (latency, invocations, tool calls)
  • Add or remove agents without changing traversal logic, provided your runtime supports the node’s tools and output contract
  • Inspect routing logic on edges, with handoff data visible in the UI
  • Use graph-aware SDK methods like is_terminal(), is_root(), and get_edges() instead of manual tracking

Step 1: Create AI Configs for Your Agents

Before building a graph, you need AI Configs for each agent. If you already have AI Configs, skip to Step 2.

New to AI Configs?

See the AI Configs quickstart or run the bootstrap script in our sample repo:

$ git clone https://github.com/launchdarkly-labs/devrel-agents-tutorial
$ cd devrel-agents-tutorial
$ git checkout tutorial/agent-graphs
$ uv sync
$ cp .env.example .env  # Add your LD_SDK_KEY, LD_API_KEY, OPENAI_API_KEY
$ uv run python bootstrap/create_configs.py

For this tutorial, we’ll use three configs:

  • supervisor-agent: Orchestrates the workflow and routes queries based on PII pre-screening
  • security-agent: Detects and redacts personally identifiable information (PII)
  • support-agent: Answers questions using dynamically loaded tools (search, RAG)

Step 2: Build the Graph in the UI

This is where Agent Graphs diverge from code-based orchestration. Instead of writing add_edge() calls, you’ll see your topology and modify it visually.

Open your LaunchDarkly dashboard and navigate to AI > Agent graphs.

  1. You’ll see the first-time setup wizard. Since you already created AI Configs in Step 1, expand Create a graph at the bottom.

First-time setup wizard. Expand 'Create a graph' since you already have AI Configs.

  2. Name your graph chatbot-flow and click Create graph.

Creating your first Agent Graph in the LaunchDarkly UI.

  3. Add your first node: click Add node and select supervisor-agent
  4. Set it as the root: click the node and toggle Root node
  5. Add security-agent and support-agent as nodes

Adding the security-agent node to the graph.

Adding the support-agent node to complete the workflow.

  6. Draw edges: drag from supervisor-agent to both child agents
  7. Add handoff data to each edge to define routing logic:

supervisor-agent → security-agent:

{
  "action": "sanitize",
  "reason": "PII detected",
  "route": "security"
}


Edge from supervisor to security with route: security handoff data.

supervisor-agent → support-agent:

{
  "action": "direct",
  "reason": "Clean input",
  "route": "support"
}


Edge from supervisor to support with route: support handoff data.

security-agent → support-agent:

{
  "action": "proceed",
  "reason": "Input sanitized",
  "route": "continue"
}


Edge from security to support with route: continue handoff data.

Notice what you’re seeing: the entire workflow topology in one view. This graph is your architecture diagram, always current. Each node shows which AI Config variation it serves. The edges show routing logic that would otherwise be buried in conditional statements. When you need to add a new agent or change routing, you do it here, not in code.

The graph defines structure, your code defines behavior

LaunchDarkly doesn’t execute your graph. It provides:

  • Topology: Which nodes exist and how they connect
  • Handoff metadata: Whatever JSON you put on edges
  • Per-node AI Config: Model, instructions, tools for each agent

Your code:

  • Decides which edges to follow based on agent decisions
  • Interprets handoff data however you want (the schema is yours)
  • Executes the actual agents

The handoff JSON is arbitrary metadata. You define the schema, you interpret it. LaunchDarkly stores and delivers it.
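Because the schema is yours, interpretation can be as simple as a small helper. This sketch (plain Python, not part of the SDK; the `"route"` key matches the edge JSON defined in this tutorial but is only a convention) shows one way to read the metadata defensively:

```python
from typing import Optional

# One way to interpret edge handoff data. The "action"/"reason"/"route"
# schema below matches the edges defined in this tutorial, but it is your
# convention, not something LaunchDarkly enforces.
def interpret_handoff(handoff: Optional[dict]) -> Optional[str]:
    """Return the normalized route name from an edge's handoff metadata, if any."""
    if not handoff:
        return None
    route = str(handoff.get("route", "")).strip().lower()
    return route or None

assert interpret_handoff({"action": "sanitize", "route": "security"}) == "security"
assert interpret_handoff({"action": "direct"}) is None
assert interpret_handoff(None) is None
```

Normalizing case and whitespace here keeps route matching tolerant of hand-edited JSON in the UI.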

Step 3: Add the SDK to Your Project

Install the LaunchDarkly AI SDK:

$ uv add launchdarkly-server-sdk launchdarkly-server-sdk-ai

Initialize the clients in your code:

# config_manager.py - Initialize LaunchDarkly clients
def _initialize_launchdarkly_client(self):
    """Initialize LaunchDarkly client and AI client"""
    config = ldclient.Config(self.sdk_key)
    ldclient.set_config(config)
    self.ld_client = ldclient.get()

    # Block until client is initialized (max 10 seconds)
    self.ld_client.start_wait(10)

    if not self.ld_client.is_initialized():
        raise RuntimeError("LaunchDarkly client initialization failed")

    self.ai_client = LDAIClient(self.ld_client)

Build a context for targeting and tracking:

# config_manager.py - Build context for targeting
def build_context(self, user_id: str, user_context: dict = None) -> Context:
    """Build a LaunchDarkly context with consistent attributes."""
    context_builder = Context.builder(user_id).kind('user')

    if user_context:
        for key, value in user_context.items():
            context_builder.set(key, value)

    return context_builder.build()

Step 4: Integrate with Your Framework

This section walks through the integration code, starting with the building block (what runs at each node), then showing how nodes are orchestrated.

The Generic Agent Pattern

The key to dynamic execution is create_generic_agent. Every node uses the same implementation—no agent registry, no hardcoded agent types:

# agents/generic_agent.py
def create_generic_agent(agent_config, config_manager, valid_routes: List[str] = None):
    """Create a generic agent from LaunchDarkly AI Config."""

    class GenericAgent:
        def __init__(self):
            self.valid_routes = valid_routes or []

        async def ainvoke(self, state: dict) -> dict:
            """Execute the agent using LaunchDarkly config."""
            if not agent_config.enabled:
                return {"response": "", "_skipped": True}

            # Create model from config
            model = create_model_for_config(
                provider=agent_config.provider.name,
                model=agent_config.model.name,
                config_manager=config_manager
            )

            # Load tools from LaunchDarkly config
            tools = create_dynamic_tools_from_launchdarkly(agent_config)

            # Get instructions from config
            instructions = agent_config.instructions or "Process the input."

            # Inject route options into instructions
            if self.valid_routes:
                route_instruction = f"\n\nSelect one of these routes: {self.valid_routes}. Return: {{\"route\": \"<selected_route>\"}}"
                instructions = instructions + route_instruction

            # Execute and extract routing decision
            result = await self._execute(model, instructions, tools, state)
            result["routing_decision"] = self._extract_route(result.get("response", ""))

            # Track metrics
            agent_config.tracker.track_success()
            return result

    return GenericAgent()

Why this enables adding agents without code changes

The generic agent pattern means:

  • No agent registry: Every node uses the same create_generic_agent function
  • Config-driven behavior: Model, instructions, and tools all come from LaunchDarkly
  • Dynamic routing: Valid routes are injected from graph edges, not hardcoded
  • Minimal code changes: Add a new agent in LaunchDarkly, create its AI Config, add it to your graph, and it works—provided your runtime supports the node’s tools and output contract
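The `_extract_route` helper used in `ainvoke` isn't shown above. As a sketch only, one plausible implementation (assuming the agent returns the `{"route": ...}` JSON that the injected route instruction asks for, possibly embedded in surrounding prose) could look like:

```python
import json
import re

def extract_route(response: str):
    """Pull a {"route": "..."} object out of an agent's free-text reply.

    Hypothetical helper, not the sample repo's exact code: scan for the
    first small JSON object containing a "route" key and parse it.
    """
    match = re.search(r'\{[^{}]*"route"[^{}]*\}', response)
    if not match:
        return None
    try:
        return json.loads(match.group(0)).get("route")
    except json.JSONDecodeError:
        return None
```

Keeping the parse tolerant matters because models often wrap the requested JSON in explanatory text.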

The AgentService Class

The AgentService class is the entry point for processing messages through your Agent Graph:

# api/services/agent_service.py
class AgentService:
    """Multi-Agent Orchestration using LaunchDarkly Agent Graph."""

    def __init__(self):
        self.config_manager = ConfigManager()
        self.config_manager.flush()

    async def process_message(
        self,
        user_id: str,
        message: str,
        user_context: dict = None
    ) -> ChatResponse:
        """Process message using LaunchDarkly Agent Graph."""
        result = await self._execute_graph(
            graph_key=os.getenv("AGENT_GRAPH_KEY", "chatbot-flow"),
            user_id=user_id.strip() or "anonymous",
            user_input=message,
            user_context=user_context or {}
        )

        return ChatResponse(
            response=result.get("final_response", ""),
            tool_calls=result.get("tool_calls", []),
            # ... other fields
        )

Executing the Graph

The _execute_graph method fetches the graph from LaunchDarkly and uses traverse() with skip logic for conditional routing:

# api/services/agent_service.py
async def _execute_graph(
    self,
    graph_key: str,
    user_id: str,
    user_input: str,
    user_context: dict = None
) -> Dict[str, Any]:
    """Execute agents using SDK's traverse() with skip logic."""
    ld_context = self.config_manager.build_context(user_id, user_context)
    graph = self.config_manager.ai_client.agent_graph(graph_key, ld_context)

    if not graph.is_enabled():
        raise ValueError(f"Agent Graph '{graph_key}' is not enabled")

    ctx = {
        "user_input": user_input,
        "messages": [HumanMessage(content=user_input)],
        "processed_input": user_input,
        "final_response": "",
        "tool_calls": [],
        # Skip logic: track which nodes should execute
        "_routed_to": {graph.root().get_key()},
        "_path": [],
        "_prev_key": None,
    }

    tracker = graph.get_tracker()

    # Define the node callback (see next section)
    def execute_node(node, exec_ctx):
        # ... node execution logic
        pass

    # Use SDK's traverse() - it handles traversal order
    graph.traverse(execute_node, ctx)

    # Track graph completion
    if tracker:
        tracker.track_path(ctx.get("_path", []))
        tracker.track_invocation_success()

    return ctx

Skip Logic for Conditional Routing

The execute_node callback implements skip logic—the core pattern that enables conditional routing:

# api/services/agent_service.py - inside _execute_graph
def execute_node(node, exec_ctx):
    """Execute a single node if it was routed to."""
    key = node.get_key()

    # Skip logic: only execute if parent routed to this node
    if key not in exec_ctx.get("_routed_to", set()):
        return {"_skipped": True}

    exec_ctx["_path"].append(key)

    # Track node invocation
    if tracker:
        tracker.track_node_invocation(key)
        if exec_ctx.get("_prev_key"):
            tracker.track_handoff_success(exec_ctx["_prev_key"], key)

    # Get edges and valid routes for this node
    edges = node.get_edges()
    valid_routes = [e.handoff.get("route") for e in edges if e.handoff and e.handoff.get("route")]

    # Execute agent with config from this node
    agent = create_generic_agent(node.get_config(), self.config_manager, valid_routes=valid_routes)
    result = _run_async(agent.ainvoke(exec_ctx))

    # Track tool calls
    if tracker and result.get("tool_calls"):
        for tool in result["tool_calls"]:
            tracker.track_tool_call(key, tool)

    # Route to next node: add to _routed_to set
    if edges:
        next_key = self._select_next_node(edges, result, tracker)
        if next_key:
            exec_ctx["_routed_to"].add(next_key)

    exec_ctx["_prev_key"] = key
    return result

How skip logic enables conditional routing

The _routed_to set tracks which nodes should execute:

  1. Start: Add root node to _routed_to
  2. traverse() visits each node: If node is in _routed_to, execute it; otherwise skip
  3. After execution: Add the next node (based on routing decision) to _routed_to

This enables conditional routing: the supervisor routes to either security OR support, and only the chosen path executes.
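The mechanism can be traced in isolation. This toy walkthrough (plain Python; the fixed visit order and hypothetical routing decision stand in for `traverse()` and a real agent) shows why the unchosen branch never runs:

```python
# Toy trace of skip logic: nodes are visited in a fixed order, the way
# traverse() would visit them, but only routed-to nodes execute.
nodes = ["supervisor", "security", "support"]
routed_to = {"supervisor"}             # start: only the root may run
decisions = {"supervisor": "support"}  # supervisor routes past security

executed = []
for key in nodes:
    if key not in routed_to:
        continue                       # skip: no parent routed here
    executed.append(key)
    nxt = decisions.get(key)
    if nxt:
        routed_to.add(nxt)

print(executed)  # ['supervisor', 'support'] — security was skipped
```

Flip the supervisor's decision to `"security"` and the other branch executes instead, with no change to the loop itself.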

Routing Between Nodes

The _select_next_node method determines which node to route to based on the agent’s routing decision:

# api/services/agent_service.py
def _select_next_node(self, edges, result: dict, tracker=None):
    """Select next node key based on routing decision."""
    routing = result.get("routing_decision", "").lower().strip() if result.get("routing_decision") else None

    # Build route map: route -> target_config
    route_map = {}
    for edge in edges:
        route = (edge.handoff.get("route", "") if edge.handoff else "").lower().strip()
        if route:
            route_map[route] = edge.target_config

    # Exact match
    if routing and routing in route_map:
        return route_map[routing]
    elif routing:
        if tracker:
            tracker.track_handoff_failure()

    # Default: first edge
    if edges:
        return edges[0].target_config

    return None

The key insight: your graph topology comes from LaunchDarkly, not hardcoded orchestration. Change the graph in the UI, and your code picks up the new structure on the next request.

Step 5: Run It

With the AgentService wired up (as shown in Step 4), you can now process messages through your Agent Graph. The service handles:

  1. Building the LaunchDarkly context for targeting
  2. Fetching the graph and executing nodes via traverse()
  3. Tracking metrics for monitoring
  4. Returning the final response

Test it by sending a message:

service = AgentService()
response = await service.process_message(
    user_id="user-123",
    message="What's the status of my order?",
    user_context={"plan": "premium"}
)
print(response.response)

Now go back to the LaunchDarkly UI. Add a new node or change an edge. Run your code again. Topology changes are picked up by your traversal code on subsequent SDK evaluations.

Step 6: Monitor Agent Performance

This is the key differentiator: monitoring happens on the graph itself, not in a separate dashboard. You see metrics overlaid on the same visual topology you built, so bottlenecks are immediately obvious.

The sample repo includes full instrumentation: calls to tracker.track_success(), tracker.track_error(), and tracker.track_tool_call() in the agent execution path. After running some traffic, open your Agent Graph to see the results.

Navigate to AI > Agent graphs > chatbot-flow. You’ll see a metrics bar at the top of the graph view where you can toggle different metrics on and off.

Metrics on the graph

Here’s what makes this different from traditional APM: the metrics appear directly on your workflow visualization. No mental mapping between a dashboard and your code. No correlating trace IDs. The slow node lights up on the graph.

Turn on Latency to see duration data overlaid directly on your graph:

  • Total duration: The combined time for the entire graph invocation
  • Per-node duration: How long each individual agent takes

Turn on Invocations to see how often each node is reached. This reveals which paths your users take most frequently. In a routing graph, you’ll quickly see whether most queries go through security or skip directly to support.

Turn on Tool calls to see the average number of tool invocations per node. If an agent is calling tools excessively, you’ll spot it here.

Monitoring page

Click Monitoring to see all metrics over time. This view shows:

  • Latency trends: Duration per node over hours, days, or weeks
  • Invocation patterns: Traffic flow through your graph
  • Tool call breakdown: Which specific tools are being called and how often


Node-level metrics broken down by agent, showing invocations, tool calls, and latency over time.

Instrument tool tracking

To see which specific tools are called, you need to track them in your code using the tracker. The SDK sends this data to LaunchDarkly, which displays it in the monitoring view.

Generate traffic to see metrics

Run the traffic generator from the sample repo to send queries through your graph:

$ uv run python tools/traffic_generator.py --queries 20 --delay 2

This sends a mix of queries (some with PII, some without) to exercise both the security and support paths. After a few minutes, you’ll see metrics populate on the graph.

Detecting a slow agent

With traffic flowing, suppose the security-agent starts averaging 5 seconds per call. With latency metrics enabled on the graph, you see it immediately: the security-agent node shows a high duration value while other nodes stay fast.

The invocation numbers also tell a story. If security-agent shows 50 invocations and support-agent shows 80, you know ~30 queries are bypassing security (the clean path). This helps you understand whether the slow agent is affecting most users or just a subset.

Without Agent Graphs, you’d need custom logging, Datadog queries, and manual correlation. With Agent Graphs, you see the problem in 30 seconds.

Step 7: Fix Without Deploying

The security-agent is slow because it’s using claude-sonnet-4 for PII detection. A smaller, faster model may be sufficient for this task.

In the LaunchDarkly dashboard, update the pii-detector variation:

  • Change model from Anthropic.claude-sonnet-4-20250514 to Anthropic.claude-3-haiku-20240307

Or use Agent Skills to make the change from your coding assistant:

The security-agent pii-detector variation is averaging 5 seconds.
Change the model to claude-3-haiku-20240307.

No code changes. No deploy. Changes are picked up on subsequent SDK evaluations.

Run the traffic generator again and watch the latency drop.

What just happened

  1. Traffic generator sent queries through the graph
  2. Monitoring showed the slow agent on the graph
  3. Model swap happened in the UI (or via Agent Skills)
  4. Your code automatically used the new configuration

No deploys. No PRs. The fix is live.

OpenAI Agents SDK Integration (Conceptual)

Agent Graphs work with multiple frameworks. This conceptual example shows how the pattern translates to OpenAI Agents SDK:

# Conceptual example showing how Agent Graph SDK methods work with OpenAI Agents
from agents import Agent, Runner

def handle_traversal(node, state):
    config = node.get_config()
    tracker = config.tracker
    edges = node.get_edges()

    # Child agents are already in state (reverse traversal builds bottom-up)
    handoffs = [state[edge.target_config] for edge in edges]

    def on_handoff(ctx):
        # Track handoff events
        return ctx

    return Agent(
        name=config.key,
        instructions=config.instructions,
        handoffs=handoffs,
        on_handoff=on_handoff,
    )

if agent_graph.is_enabled():
    root = agent_graph.reverse_traverse(handle_traversal, {})
    result = await Runner.run(root, "Tell me about your engineering team")

Same graph definition, adapted to each framework’s execution model. The topology metadata lives in LaunchDarkly; your code interprets and executes it.

Best Practices

Start simple: Begin with a linear graph (A → B → C) before adding conditional routing.

Use handoff data for context passing: Include metadata like action type, reason, or state that the next agent needs to continue the workflow.

Track everything: Call tracker.track_success() and tracker.track_error() in every node for complete visibility. Use graph_tracker.track_tool_call(tool_name) to track which tools agents invoke.

Test with targeting: Use LaunchDarkly targeting to route test users to experimental graph configurations.

Handle missing edges: Decide what happens when no edge matches a routing decision or when a target node is disabled. A sensible default is to fail closed, log diagnostics, and track routing failures.
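As one concrete option, here is a sketch of fail-closed edge selection. It mirrors the shape of `_select_next_node` from Step 4 but raises instead of defaulting to the first edge; `RoutingError` and the plain-dict edges are hypothetical illustrations, not SDK types:

```python
# Fail-closed variant of edge selection: if no edge matches the routing
# decision, raise instead of silently falling back to the first edge.
class RoutingError(Exception):
    pass

def select_next_fail_closed(edges, routing):
    route_map = {
        e["handoff"]["route"]: e["target"]
        for e in edges
        if e.get("handoff", {}).get("route")
    }
    if routing and routing in route_map:
        return route_map[routing]
    raise RoutingError(f"No edge matches route {routing!r}; refusing to guess")

edges = [{"handoff": {"route": "security"}, "target": "security-agent"}]
assert select_next_fail_closed(edges, "security") == "security-agent"
try:
    select_next_fail_closed(edges, "unknown")
except RoutingError:
    pass  # log diagnostics and track the failure in a real system
```

Failing closed turns a silent misroute into a visible, trackable error, which matters most when one branch handles PII.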

Keep execution state request-scoped: Store execution state inside the context object (ctx) passed through traversal, not in instance-level variables. Treat graph traversal as request-scoped to avoid concurrency issues.

What You’ve Built

You now have a multi-agent system where:

  • Graph topology is externalized and self-documenting
  • Routing logic is visible on edges, not buried in code
  • Monitoring appears on the graph itself, not a separate dashboard
  • Node-level control lets you disable a single agent without touching others, provided your executor checks node availability
  • Multiple frameworks can consume the same graph metadata

When you spot a slow agent in monitoring, you can swap the model from the dashboard without a deploy.

Conclusion

Hardcoded orchestration was fine when you had one agent. With multi-agent systems, it becomes a liability. Every change requires a deploy. Every incident requires a developer.

Agent Graphs flip this. Define your workflow in LaunchDarkly, integrate it with your framework, and fix many problems without touching code. Your agents become as dynamic as your feature flags.

Ready to stop hardcoding? Get started with AI Configs and create your first Agent Graph.