
Migrate a Hardcoded LangGraph Agent to AgentControl in 20 Minutes


Published April 20th, 2026

by Scarlett Attensil

In this tutorial you’ll run a small LangGraph agent locally, then migrate its hardcoded prompts, model choice, and tools into AgentControl. After the migration, every prompt tweak, model swap, or tool change ships as a LaunchDarkly update instead of a code deploy. The migration takes about 20 minutes.

When you finish, the codebase will:

  • Pull its system prompt, model name, and parameters from a config on every request.
  • Load its Tavily search tool definition from the same config instead of a hardcoded module-level list.
  • Emit duration, token, success, and error metrics to LaunchDarkly on each user turn.
  • Have one offline-eval dataset staged for pre-rollout regression testing in the LaunchDarkly Playground.
  • Fail gracefully by falling back to the original hardcoded values if LaunchDarkly is unreachable.
  • Run A/B tests on models, prompts, parameters, and tool sets by creating variations and targeting them at user segments.

Tutorial summary

The agent you’ll run is the official langchain-ai/react-agent template: a single-node ReAct agent that uses Claude Sonnet and a Tavily search tool. The migration moves the hardcoded values from three files into LaunchDarkly:

  • The prompt in prompts.py,
  • The model name in context.py, and
  • The tool list in tools.py.

The migrate agent skill completes the work in five stages (audit, wrap, move tools, instrument, and attach evaluators). It pauses at the end of each stage for you to review.

The provider call and the routing logic stay where they are. react-agent is one LLM that decides, one ToolNode that runs the tools the LLM asks for, and one conditional edge that loops between them. When you add a second agent with a handoff, you move the topology into a LaunchDarkly Agent Graph.
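
To make that topology concrete, here is a minimal, framework-free sketch of the loop: one decision step, one tool step, and one conditional route between them. The names fake_llm, run_tools, route, and run_turn are hypothetical stand-ins, not LangGraph or LaunchDarkly APIs. The real template wires the same shape with a StateGraph.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


def fake_llm(messages: List[Message]) -> Message:
    """Stand-in for the model: request one search, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        call = {"name": "search", "args": {"query": "weather"}}
        return {"role": "assistant", "content": "", "tool_calls": [call]}
    return {"role": "assistant", "content": "It is sunny.", "tool_calls": []}


def run_tools(message: Message, tools: Dict[str, Callable[..., str]]) -> List[Message]:
    """Stand-in for the tool node: run each tool the model asked for."""
    return [
        {"role": "tool", "content": tools[call["name"]](**call["args"])}
        for call in message["tool_calls"]
    ]


def route(message: Message) -> str:
    """Conditional edge: go to the tools while the model keeps requesting them."""
    return "tools" if message["tool_calls"] else "end"


def run_turn(question: str, tools: Dict[str, Callable[..., str]]) -> List[Message]:
    """Loop between the model and the tools until the model stops asking."""
    messages: List[Message] = [{"role": "user", "content": question}]
    while True:
        reply = fake_llm(messages)
        messages.append(reply)
        if route(reply) == "end":
            return messages
        messages.extend(run_tools(reply, tools))
```

Nothing in this loop depends on where the prompt, model name, or tool list come from, which is why the migration can leave it in place.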

This is a reviewer’s workflow, not a coding exercise. You ask your agent to run the migrate skill, then read the diffs and verify the skill got the audit, fallback, and tool schemas right. Every code sample below is an example of what your agent should produce, not something you should copy and paste.

If you’d rather compare your migration to a finished one, the aiconfig-migrate branch of launchdarkly-labs/react-agent is the reference end state for this tutorial: the five stages applied against the upstream template, with config-driven model, prompt, and tool wiring already in place.

Prerequisites

You’ll need:

  • Python 3.11 or higher with uv
  • A LaunchDarkly account with an AI project and access to an active LaunchDarkly SDK key
  • An Anthropic API key for Claude Sonnet
  • A Tavily API key for the search tool
  • Claude Code (or another Claude Agent SDK client) with the LaunchDarkly agent skills installed and the LaunchDarkly MCP server configured. If you haven’t used skills before, the agent skills quickstart completes the setup in under 10 minutes.

Clone the hardcoded starting point

$ git clone https://github.com/langchain-ai/react-agent
$ cd react-agent
$ uv sync
$ cp .env.example .env

Specify an ANTHROPIC_API_KEY and TAVILY_API_KEY in .env.

Then identify what’s hardcoded. The migrate skill’s first step is a read-only audit. Knowing the shape from the beginning makes the audit output easier to read. Here’s a table of the hardcoded values in react-agent:

| Title | File:line | Current value |
| --- | --- | --- |
| System prompt | src/react_agent/prompts.py:3 | "You are a helpful AI assistant.\n\nSystem time: {system_time}" |
| Default model | src/react_agent/context.py:25 | "anthropic/claude-sonnet-4-5-20250929" |
| max_search_results | src/react_agent/context.py:33 | 10 |
| Tool | src/react_agent/tools.py:17 | Tavily search function |
| .bind_tools(TOOLS) | src/react_agent/graph.py:37 | Binds the module-level list |
| ToolNode(TOOLS) | src/react_agent/graph.py:73 | Runs the same list |

Skill stage 1: Audit the hardcoded values

Open Claude Code inside the cloned repo and run:

Migrate this app to AgentControl using the migrate skill.

The skill starts by performing a read-only audit. It scans for hardcoded model and prompt values, identifies your package manager and provider, and produces a structured summary. For react-agent the summary will look similar to this example:

Language: Python 3.11+
Package manager: uv
LLM provider: LangChain (init_chat_model) -> Anthropic
Existing LD SDK: none
Target mode: agent (LangGraph custom StateGraph)
Hardcoded targets:
- src/react_agent/prompts.py:3 SYSTEM_PROMPT (templated with {system_time})
- src/react_agent/context.py:25 model = "anthropic/claude-sonnet-4-5-20250929"
- src/react_agent/context.py:33 max_search_results = 10
- src/react_agent/tools.py:29 TOOLS = [search]
- src/react_agent/graph.py:37 .bind_tools(TOOLS)
- src/react_agent/graph.py:73 ToolNode(TOOLS)
Proposed plan:
- Single config key `react-agent` in agent mode
- Stage 3 (tools) required, one tool (search) with schema extracted from the
function signature via StructuredTool.from_function
- Stage 4 (tracking) inline via LangChain callback handler
- Stage 5 (evals) attached programmatically via create_judge
- Existing Context dataclass becomes the fallback shape

The skill stops here. Reply “continue” (or any affirmative response that fits your setup) to begin Stage 2.

Audit output can vary

If your audit output doesn’t match this, don’t continue until you’ve corrected it. The skill is designed to adapt. Read what it produces, reconcile that output against the table of hardcoded values in the previous section, and tell the skill where it’s wrong. Iterate until the audit output addresses all the hardcoded values in the table.
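
If you’d like a quick mechanical check alongside reading the audit, the sketch below compares the files named in the audit’s “Hardcoded targets” list against the files in the table. It’s a rough aid, not part of the skill: EXPECTED_FILES, audited_files, and missing_files are hypothetical names, and the regular expression assumes the audit lists targets as “- path.py:line” bullets like the example above. It compares at file level because a table row and an audit row can point at different lines of the same file.

```python
import re
from typing import List, Set

# Files that appear in this tutorial's table of hardcoded values.
EXPECTED_FILES: Set[str] = {
    "src/react_agent/prompts.py",
    "src/react_agent/context.py",
    "src/react_agent/tools.py",
    "src/react_agent/graph.py",
}

# Matches audit bullets such as "- src/react_agent/prompts.py:3 SYSTEM_PROMPT".
TARGET_RE = re.compile(r"^-\s+(\S+\.py):(\d+)\b", re.MULTILINE)


def audited_files(audit_text: str) -> Set[str]:
    """Collect every file path the audit lists as a hardcoded target."""
    return {path for path, _line in TARGET_RE.findall(audit_text)}


def missing_files(audit_text: str) -> List[str]:
    """Return expected files that the audit never mentions, sorted by name."""
    return sorted(EXPECTED_FILES - audited_files(audit_text))
```

An empty list from missing_files means every file in the table shows up in the audit. You still need to read the individual values yourself.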

Skill stage 2: Wrap the call in the AI SDK

This is the first stage where the skill writes code. It installs the SDK, creates the config in LaunchDarkly, rewrites the hardcoded prompt to Mustache syntax, and adds a new ld_client.py module. To read the finished file, visit ld_client.py.
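
As a rough illustration of what the Mustache rewrite and the later interpolation amount to, here is a small sketch. The helpers to_mustache and render are hypothetical and regex-based. The SDK performs the real interpolation, and this sketch only handles simple {{ name }} variables.

```python
import re
from typing import Mapping

# A single-brace Python format field that isn't already a Mustache tag.
FORMAT_FIELD = re.compile(r"(?<!\{)\{(\w+)\}(?!\})")
# A simple Mustache variable tag, with optional inner whitespace.
MUSTACHE_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def to_mustache(template: str) -> str:
    """Rewrite {name} placeholders as {{ name }} Mustache variables."""
    return FORMAT_FIELD.sub(r"{{ \1 }}", template)


def render(template: str, variables: Mapping[str, object]) -> str:
    """Substitute each {{ name }} tag with the matching variable's value."""
    return MUSTACHE_VAR.sub(lambda m: str(variables[m.group(1)]), template)


original = "You are a helpful AI assistant.\n\nSystem time: {system_time}"
migrated = to_mustache(original)
```

Here migrated ends with “System time: {{ system_time }}”, and render(migrated, {"system_time": ...}) fills in the value.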

Three things to check in the diff:

  • The fallback mirrors the audit exactly. Every value captured in the Stage 1 audit appears in FALLBACK with the same model name, provider, instruction text, and knob values. A drifted fallback silently changes behavior when LaunchDarkly is unreachable. Also check where each knob lives: max_search_results belongs in ModelConfig(custom={...}), not parameters={...}, because parameters is forwarded to the provider SDK, and Anthropic, OpenAI, and Gemini all reject unknown kwargs.
  • Model construction goes through create_langchain_model(ai_config), not a hand-rolled init_chat_model or load_chat_model wrapper. Hand-rolled builders only pass the model name, so variation parameters such as temperature, max_tokens, and top_p silently drop. If the template’s utils.load_chat_model is still present, have the skill delete it.
  • {{ system_time }} interpolation goes through the SDK, not a manual .replace(). The fourth argument to agent_config(...) is {"system_time": system_time}. If you see .replace("{{ system_time }}", ...) at the call site, the skill missed the built-in interpolation.
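
The split between parameters and custom in the first check can be sketched without the SDK. FallbackModel below is a hypothetical stand-in for the SDK’s ModelConfig, and fake_provider is a stand-in for a provider client that, like the real ones, rejects keyword arguments it doesn’t know. Neither is a LaunchDarkly or provider API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FallbackModel:
    """Hypothetical stand-in for the SDK's ModelConfig (illustration only)."""

    name: str
    parameters: Dict[str, Any] = field(default_factory=dict)  # forwarded to the provider
    custom: Dict[str, Any] = field(default_factory=dict)  # app-only knobs, never forwarded


def fake_provider(*, model: str, temperature: float = 1.0, max_tokens: int = 1024) -> str:
    """Stand-in provider call that accepts only the kwargs it declares."""
    return f"{model} t={temperature} max={max_tokens}"


def call_provider(cfg: FallbackModel) -> str:
    """Forward the model name and parameters, but never custom."""
    return fake_provider(model=cfg.name, **cfg.parameters)


def is_rejected(cfg: FallbackModel) -> bool:
    """Report whether the provider refuses this config's parameters."""
    try:
        call_provider(cfg)
    except TypeError:
        return True
    return False


MODEL = "anthropic/claude-sonnet-4-5-20250929"
good = FallbackModel(MODEL, parameters={"temperature": 0.2}, custom={"max_search_results": 10})
bad = FallbackModel(MODEL, parameters={"temperature": 0.2, "max_search_results": 10})
```

With these stand-ins, is_rejected(good) is False and is_rejected(bad) is True. That is the failure the first check guards against.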

Verify both paths run before continuing. The skill won’t move to Stage 3 until both work. Here’s how to do that:

In one terminal, start the dev server with your SDK key:

$ LD_SDK_KEY=sdk-... uv run --with "langgraph-cli[inmem]" langgraph dev --no-browser

In a second terminal, invoke the graph once via the local API:

$ curl -s http://127.0.0.1:2024/runs/wait \
    -H "Content-Type: application/json" \
    -d '{
      "assistant_id": "agent",
      "input": {"messages": [{"role": "user", "content": "What is the weather in San Francisco?"}]}
    }' | jq '.messages[-1].content'

A natural-language answer should appear. To make the LaunchDarkly-served path visually distinct from the fallback path, open the react-agent config in LaunchDarkly, edit the default variation’s instructions, and append a sentence like:

Always respond in over-the-top 1980s slang. Use words like “totally,” “rad,” “gnarly,” and “tubular.” Drop a “righteous!” somewhere.

Save the variation, then re-run the curl command. Within a few seconds you should see the answer come back with added 80s slang. That’s proof the LaunchDarkly-served prompt is winning over the hardcoded fallback.

Next, stop the server, unset LD_SDK_KEY, restart it, and run the same curl call again. The slang should disappear and the answer should read in the original neutral voice. That’s proof the fallback, which still follows the pre-migration prompt exactly, runs when LaunchDarkly is unreachable.
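
The behavior those two checks exercise can be summarized in a small sketch. It’s a conceptual model only: resolve_instructions, unreachable, and FALLBACK_INSTRUCTIONS are hypothetical names, and the real SDK takes the fallback as a default value and handles the failure internally.

```python
from typing import Callable, Optional

# Mirrors the pre-migration prompt, rewritten to Mustache syntax.
FALLBACK_INSTRUCTIONS = "You are a helpful AI assistant.\n\nSystem time: {{ system_time }}"


def resolve_instructions(
    fetch_served: Callable[[], Optional[str]],
    fallback: str = FALLBACK_INSTRUCTIONS,
) -> str:
    """Prefer the served instructions; fall back if the fetch fails or is empty."""
    try:
        served = fetch_served()
    except ConnectionError:
        return fallback
    return served or fallback


def unreachable() -> Optional[str]:
    """Simulate the case where the config service can't be reached."""
    raise ConnectionError("config service unreachable")
```

Calling resolve_instructions(unreachable) returns the fallback text unchanged, while a fetch that returns edited instructions wins over the fallback.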

If you’d rather click through a chat UI, LangGraph Studio (free LangSmith login) and the hosted Agent Chat UI (point it at http://127.0.0.1:2024 with the graph id agent) both work against the same local server.

Skill stage 3: Move the tool into the config

Stage 3 attaches the tool schema to the LaunchDarkly variation and rewires graph.py and tools.py to read the tool list from the config using the skill’s tool factory pattern. Each tool is built by a factory that takes the per-run ai_config and returns a closure. The closure captures max_search_results, or any other model.custom knob, one time at the start of the turn, so the tool body never re-evaluates the config. For the finished shape, visit tools.py and graph.py.

The pattern, drawn verbatim from the reference repo:

# Source of truth: launchdarkly-labs/react-agent@aiconfig-migrate src/react_agent/tools.py:15-42
def make_search(ai_config: AIAgentConfig) -> Callable[..., Any]:
    """Build a search tool that closes over this run's max_search_results.

    Capturing the value at run setup keeps it stable across the turn, so a
    mid-run flag flip won't change it between two tool calls. The tool body
    never re-evaluates the config, which would emit an
    extra $ld:ai:agent_config event per tool call.
    """
    max_results = ai_config.model.get_custom("max_search_results") or 10

    async def search(query: str) -> dict:
        """Search for general web results.

        This function performs a search using the Tavily search engine, which is designed
        to provide comprehensive, accurate, and trusted results. It's particularly useful
        for answering questions about current events.
        """
        return await TavilySearch(max_results=max_results).ainvoke({"query": query})

    return search


# Registry of tool factories keyed by the LD AI Tool name. Each factory takes
# the per-run config and returns the actual callable. graph.py materializes
# this into {name: callable} on the first call_model tick.
TOOL_FACTORIES: Dict[str, Callable[[AIAgentConfig], Callable[..., Any]]] = {
    "search": make_search,
}

graph.py materializes the factories inside call_model’s first-tick branch: built = {name: factory(ai_config) for name, factory in TOOL_FACTORIES.items()}, then update["tools"] = build_structured_tools(ai_config, built). Subsequent ticks read state.tools and pass it to create_langchain_model(ai_config).bind_tools(tools). For an exact sample, visit graph.py:50-63.

Verify three things:

  • The registry exports TOOL_FACTORIES and not a plain TOOL_REGISTRY of callables,
  • Each factory returns a closure that reads model.custom values at construction time, not from inside the tool body, and
  • bind_tools reads the materialized tool list off state instead of referencing the registry directly. build_structured_tools from ldai_langchain.langchain_helper wraps each built callable as a LangChain StructuredTool with the LD-served schema.

Why the factory pattern matters

Reading ai_config.model.get_custom(...) from inside a tool body fires get_agent_config() on every tool invocation. That inflates $ld:ai:agent_config event counts in proportion to tool-call volume, and it lets a mid-turn flag change swap max_search_results between the first and second tool call. The factory captures the value one time at the start of the turn, which preserves turn-level atomicity and keeps agent_config evaluations at one per turn.
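
The difference is easy to see with a stand-in config that counts its own reads. CountingConfig and search_reading_in_body are hypothetical, the read counter only models event emission, and the tools are synchronous here for brevity where the real ones are async.

```python
from typing import Any, Callable, Dict


class CountingConfig:
    """Hypothetical stand-in for a served config that counts every read."""

    def __init__(self, custom: Dict[str, Any]) -> None:
        self.custom = dict(custom)
        self.reads = 0

    def get_custom(self, key: str) -> Any:
        self.reads += 1
        return self.custom.get(key)


def make_search(cfg: CountingConfig) -> Callable[[str], Dict[str, Any]]:
    """Factory: read the knob once, then close over the captured value."""
    max_results = cfg.get_custom("max_search_results") or 10

    def search(query: str) -> Dict[str, Any]:
        # The tool body never touches cfg again.
        return {"query": query, "max_results": max_results}

    return search


def search_reading_in_body(cfg: CountingConfig, query: str) -> Dict[str, Any]:
    """Anti-pattern: re-read the knob on every tool call."""
    return {"query": query, "max_results": cfg.get_custom("max_search_results") or 10}
```

If the served value changes between two calls in one turn, the closure from make_search keeps returning the captured value with a single read. The in-body version picks up the new value and adds a read for every call.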

Skill stage 4: Wire the tracker

This is the stage where the graph topology changes. The migration adds a finalize node so every metric event for a user turn shares one runId, the unit LaunchDarkly bills and groups by in the Monitoring tab. A ReAct agent turn loops through call_model several times to pick a tool, execute, and summarize. The at-most-once events, such as duration, tokens, success, and error, fire one time across that whole loop, not one time per tick.

The three things to understand:

  • Run-scoped state. On the first call_model tick of a turn, the migration resolves the config, mints one tracker with ai_config.create_tracker(), materializes the tool factories into concrete callables, starts a perf_counter_ns timer, and stashes all of it on state. Every subsequent tick reuses what’s on state. Because the same tracker carries the same runId, results appear in one row per turn in Monitoring.
  • Per-step events stay in call_model. tracker.track_tool_calls(...) is explicitly not at-most-once. It runs every tick the LLM dispatches tools. Token usage accumulates into Annotated[int, add] state fields across ticks.
  • Run-level events move to a new finalize node. track_duration, track_tokens, track_success, and track_error all fire there, one time per turn, reading totals off state.
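
The three points above can be sketched with plain Python. FakeTracker and RunState are hypothetical stand-ins. The method names echo the tracker calls mentioned here, but the signatures are simplified for illustration and aren’t the SDK’s. The real graph accumulates tokens through LangGraph state reducers instead of direct mutation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class FakeTracker:
    """Hypothetical stand-in for the SDK tracker; it only records calls."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, int]] = []

    def track_tool_calls(self, count: int) -> None:  # simplified signature
        self.events.append(("tool_calls", count))

    def track_tokens(self, total: int) -> None:  # simplified signature
        self.events.append(("tokens", total))

    def track_success(self) -> None:
        self.events.append(("success", 1))


@dataclass
class RunState:
    """Run-scoped fields that live for one user turn."""

    tracker: Optional[FakeTracker] = None
    total_tokens: int = 0


def call_model(state: RunState, tokens_used: int, tool_calls: int) -> None:
    """One tick: mint the tracker lazily, accumulate tokens, track per-step events."""
    if state.tracker is None:  # first tick of the turn only
        state.tracker = FakeTracker()
    state.total_tokens += tokens_used
    if tool_calls:
        state.tracker.track_tool_calls(tool_calls)


def finalize(state: RunState) -> None:
    """Run-level events fire once per turn, reading totals off state."""
    if state.tracker is None:
        return
    state.tracker.track_tokens(state.total_tokens)
    state.tracker.track_success()
```

Two ticks followed by one finalize call produce one tokens event carrying the summed total, no matter how many times call_model ran.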

Read state.py for the run-scoped fields (ai_config, tracker, tools, start_perf_ns, three token counters, errored) and graph.py for the lazy-init prelude in call_model, the finalize node, and other details.

Two SDK details you should know

ai_config.create_tracker() is a factory method as of launchdarkly-server-sdk-ai 0.18.0. If your skill emits ai_config.tracker instead of ai_config.create_tracker, regenerate. This migration workflow uses get_ai_usage_from_response rather than get_ai_metrics_from_response so the graph can accumulate tokens across ticks into state fields rather than tracking them synchronously per-call.

Test this yourself by sending one request through the graph, then opening the config in LaunchDarkly and reviewing the Monitoring tab. Within one or two minutes you should see one row per user question with non-zero duration and token counts. If the tab fills up with multiple rows per question, the skill minted a tracker inside call_model instead of threading one through state.

The Monitoring tab showing duration, token, and generation metrics for a migrated config.

Two simplifications compared to the skill

This repo collapses the setup steps of resolving the config, minting the tracker, and building the tools into the first tick of call_model instead of a dedicated setup_run node. It also skips track_metrics_of_async around ainvoke, which would fire duration and success per call rather than per turn. This helps produce a legible code diff, but production code should follow the skill’s setup_run and finalize factoring.

If your app has a thumbs-up/down UI, the skill will also wire tracker.track_feedback(...). Feedback usually arrives in a later request from a different process, so pass tracker.resumption_token out to your frontend at call time and rebuild the tracker with LDAIClient.create_tracker(token, context) in the feedback handler. react-agent doesn’t have a feedback UI, so we’ve intentionally skipped this step.

Keep going

The migration is done. The payoff is what you can do next without another code deploy:

  • Reference implementation. Diff your own run against launchdarkly-labs/react-agent on the aiconfig-migrate branch to validate fallback shape, tool wiring, and tracker placement.
  • Regression-test before rollout. Agent-mode Configs don’t support UI-attached automatic judges, so run an offline evaluation against a fixed dataset. The skill generates a starter datasets/react-agent-tests.csv from your audit; take it to the Offline Evaluation of RAG-Grounded Answers tutorial. The Accuracy judge at threshold 0.85, on a different model family than the agent, is the right starting point.
  • Zero-code changes in production. Swap models per cohort, A/B test prompts or tool sets on 50/50 traffic, disable a tool for a segment, or watch duration, token spend, and eval scores land in the Monitoring tab in real time. All from the LaunchDarkly UI.
  • Scale to a second agent. The moment you add a supervisor plus specialists or any routing handoff, move the topology itself into LaunchDarkly via ai_client.agent_graph("key", ld_context). The Beyond n8n tutorial walks the full pattern, and launchdarkly-labs/devrel-agents-tutorial (agent-skills branch) is the production-grade reference with three agents, per-user targeting, and dynamic routing.
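
As one way to picture the regression gate from the second bullet, the sketch below applies the 0.85 threshold to a set of judge scores. The row names, the scores, and the helpers failing_rows and passes_gate are hypothetical, and applying the threshold per row is an assumption. The offline evaluation tutorial defines the real workflow.

```python
from typing import Dict, List

ACCURACY_THRESHOLD = 0.85  # threshold named in this tutorial


def failing_rows(scores: Dict[str, float], threshold: float = ACCURACY_THRESHOLD) -> List[str]:
    """Return dataset rows whose judge score falls below the threshold."""
    return sorted(row for row, score in scores.items() if score < threshold)


def passes_gate(scores: Dict[str, float], threshold: float = ACCURACY_THRESHOLD) -> bool:
    """Pass the gate only when no row scores below the threshold."""
    return not failing_rows(scores, threshold)


# Hypothetical judge scores keyed by dataset row.
example_scores = {"weather-sf": 0.92, "capital-fr": 0.80}
```

With these example scores, failing_rows reports the capital-fr row and passes_gate returns False.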