
The AI Iteration Loop for Deploying Reliable Agents with LangGraph


Published May 21, 2026

by Alexis Roberson

Deterministic CI/CD compared to the AI iteration loop.

Engineers know the rhythm of continuous integration and continuous deployment (CI/CD): write the code, run tests, deploy, monitor.

The AI iteration loop follows the same structure, but adapts to systems like LLMs where nondeterministic outputs are the expected norm. In these situations, instead of asserting exact outputs for LLM responses, you evaluate the quality of the response on a spectrum.
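
To make "quality on a spectrum" concrete, here is a minimal sketch that scores a response against a small rubric and compares the score to a threshold, rather than asserting an exact string. Everything in it is an assumption for illustration: the rubric checks, the `score_response` and `passes_quality_bar` helpers, and the 0.75 threshold are hypothetical and not part of the tutorial's code.

```python
from typing import Callable, Dict

# Hypothetical rubric: each check returns True when the response meets one criterion.
Rubric = Dict[str, Callable[[str], bool]]

RUBRIC: Rubric = {
    "mentions_a_sofa": lambda text: "sofa" in text.lower(),
    "gives_a_dimension": lambda text: "inch" in text.lower(),
    "commits_to_a_pick": lambda text: "it depends" not in text.lower(),
    "stays_short": lambda text: len(text.split()) <= 150,
}


def score_response(text: str, rubric: Rubric = RUBRIC) -> float:
    """Return the fraction of rubric checks the response passes (0.0 to 1.0)."""
    passed = sum(1 for check in rubric.values() if check(text))
    return passed / len(rubric)


def passes_quality_bar(text: str, threshold: float = 0.75) -> bool:
    """Accept any response whose rubric score meets the (assumed) threshold."""
    return score_response(text) >= threshold


if __name__ == "__main__":
    # Two differently worded answers are scored against the same rubric.
    answer_a = "Go with a 72-inch sofa in oatmeal linen; it leaves room to walk."
    answer_b = "A 72-inch linen sofa fits best and keeps the traffic path open."
    print(score_response(answer_a), passes_quality_bar(answer_a))
    print(score_response(answer_b), passes_quality_bar(answer_b))
```

The point of the design is that the test tolerates wording differences between runs while still failing a response that drops below the bar.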

A decline in code quality can usually be traced back to the latest git commit. With LLMs, there are multiple points of drift, including changes to the model provider, weight updates, and shifts in user behavior.

This means we need a continuous loop that builds, evaluates on test data, releases, evaluates again on live data, and monitors. Each pass through the loop feeds the next, which is what makes it an iteration loop.

A good way to set up a complete loop is to use feature flags as the kill switch, AgentControl for managing model changes, variations and targeting for selecting which users experience a specific variation, offline and online evaluations for comparing variation impact, guarded rollouts, and observability.

In this tutorial, we’ll explore the first steps in the AI iteration loop for an interior design agent: feature flags, AgentControl, and variations and targeting.

LaunchDarkly’s AI configs product was renamed to AgentControl this week. The MCP server endpoints, slash commands, and some skill names still use the aiconfigs slug, and the resource itself is still commonly called an “AI config.” This tutorial uses the current product name (AgentControl) and keeps the legacy slugs in code and commands where they still apply.

The Decor Agent

The Decor Agent chat UI.

The Decor Agent is a LangGraph agent that gives confident, specific interior design advice. Users ask Decora, a senior interior design advisor, about colors, layouts, and trends. The assistant routes each question to one of three specialist tools, synthesizes a short opinionated response, and returns it.

The main agent uses three skills:

  • the style advisor to help identify decor themes,
  • the room planner for understanding the right decor based on room dimensions, and
  • the trendspotter, which discovers emerging design trends outside traditional styles, such as whimsical or dopamine decor.

Prerequisites

To complete this tutorial, you need:

  • A LaunchDarkly account with an AI project and access to your LaunchDarkly SDK key
  • A hosted MCP server configured to use feature management, AgentControl Configs, and observability. To learn how to install a hosted MCP server, read LaunchDarkly hosted MCP server.
  • An Anthropic API key for Claude Sonnet
  • Claude Code (or another Claude Agent SDK client) with the LaunchDarkly agent skills installed and the LaunchDarkly MCP server configured.

This setup assumes you’ll be working in Claude Code Desktop, but feel free to use other coding assistants.

All code for this tutorial can be found here.

Before getting started, use the docs to set up LaunchDarkly’s hosted MCP servers for feature management, AgentControl configs, and observability.

Now, you can add the SDKs and dependencies you’ll need to run through the demo.

Install LaunchDarkly’s Python SDKs

Install the SDKs
$ cd decor-agent
$ source venv/bin/activate
$ pip install launchdarkly-server-sdk launchdarkly-server-sdk-ai

Then update the requirements.txt file with these packages:

requirements.txt
launchdarkly-server-sdk
launchdarkly-server-sdk-ai

Optionally, you can test the app yourself by starting the server. Here’s how:

Run the server
# http://localhost:8000
$ python server.py

Demo walkthrough

For the demo, we’ll create a LaunchDarkly project, then run a series of prompts in Claude Code to create the decor-agent-enabled feature flag, the AI Configs, and the variations and targeting that control which responses users on the “free” and “premium” plans receive.

1. Create the LaunchDarkly Project

Navigate to the LaunchDarkly UI and create the project with a project key.

Grab the project’s SDK key and LD project key and place them in your .env file before we continue prompting in the Claude Code chat.

.env project keys
LD_PROJECT_KEY=decor-agent-demo-v2
LD_SDK_KEY=sdk-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
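
As a rough sketch of how an app might consume these two values, the snippet below reads them from the environment and fails early if either is missing. The `LaunchDarklySettings` class and `load_settings` helper are hypothetical names, and the sketch assumes the .env values have already been exported into the process environment (for example, by a dotenv loader); the tutorial's repository may handle this differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LaunchDarklySettings:
    """Hypothetical container for the two keys stored in .env."""
    project_key: str
    sdk_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> LaunchDarklySettings:
    """Read both keys from the environment and fail early if either is missing."""
    missing = [name for name in ("LD_PROJECT_KEY", "LD_SDK_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return LaunchDarklySettings(
        project_key=env["LD_PROJECT_KEY"],
        sdk_key=env["LD_SDK_KEY"],
    )
```

Failing at startup keeps a missing key from surfacing later as a confusing flag-evaluation error.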

2. Feature flags as agent gates

Feature flags have long been a staple of traditional software, used to control rollouts and ship with confidence.

In agentic systems, they take on a new role as an access gate for the agent itself. Wrapping an agent or skill in a flag gives you a runtime switch that decides whether that capability is reachable at all.

When it’s off, the call short-circuits before it hits the model, and the user sees a graceful fallback. The flag becomes a kill switch you can flip without redeploying.
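
The short-circuit described above can be sketched in a few lines. This is an illustration under assumptions, not the tutorial's actual handler: `handle_chat`, `FALLBACK_MESSAGE`, and the injected `is_flag_on` and `run_agent` callables are hypothetical. In a real app, `is_flag_on` would wrap a boolean flag evaluation from the LaunchDarkly Python SDK; it is injected here so the gate logic can be read and tested without a network connection.

```python
from typing import Callable

FLAG_KEY = "decor-agent-enabled"
# Hypothetical fallback text shown to users while the flag is off.
FALLBACK_MESSAGE = "The design assistant is temporarily unavailable. Please check back soon."


def handle_chat(
    message: str,
    is_flag_on: Callable[[str], bool],
    run_agent: Callable[[str], str],
) -> str:
    """Return the agent's reply, or a graceful fallback when the kill switch is off."""
    if not is_flag_on(FLAG_KEY):
        # Short-circuit: the model is never called while the flag is off.
        return FALLBACK_MESSAGE
    return run_agent(message)


if __name__ == "__main__":
    # Stand-in agent that records whether it was invoked.
    def echo_agent(msg: str) -> str:
        return f"Decora says: {msg}"

    print(handle_chat("Pick a sofa", lambda key: True, echo_agent))
    print(handle_chat("Pick a sofa", lambda key: False, echo_agent))
```

Because the check happens before `run_agent`, turning the flag off stops model calls without a redeploy.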

Let’s execute this prompt in the chat to create the feature flag:

Create the decor-agent-enabled flag
/launchdarkly-flag-create
Tell the skill:
- Flag key: `decor-agent-enabled`
- Type: boolean
- Description: "Kill switch for the decor agent chat endpoint"
- Default: on

Output:

Create the decor-agent-enabled flag
Done. Flag decor-agent-enabled created in decor-agent-demo-v2 and turned on in production.
• Kind: boolean (variations: true / false)
• On variation: true (feature enabled)
• Off variation: false (kill switch engaged)
• Description: "Kill switch for the decor agent chat endpoint"
• Permanent flag
• State: on in production

Verify the flag has been created and is toggled on in the LaunchDarkly UI.

The decor-agent-enabled feature flag in the LaunchDarkly dashboard.

You can test this gate by toggling the feature flag off. You should see a maintenance message showing access to the chat assistant has been removed for now.

This is a basic representation of the power of gating your agents.

The decor agent UI when the feature flag is off.

Click to enable the flag again and continue the demo.

3. Setting up AgentControl Configs

Now, create four AI Configs to represent the overall agent and the three agent skills: style-advisor, room-planner, and trend-spotter.

| AI Config key | Purpose | Variation name | Initial model |
| --- | --- | --- | --- |
| decor-agent-main | Orchestrator/router | v1-baseline | claude-sonnet-4 |
| decor-style-advisor | Style specialist tool | v1-baseline | claude-sonnet-4 |
| decor-room-planner | Room planner specialist tool | v1-baseline | claude-sonnet-4 |
| decor-trend-spotter | Trend specialist tool | v1-baseline | claude-sonnet-4 |

Use AI Configs in completion mode

This demo uses completion mode to keep the focus on the iteration loop foundation: flags, variations, and targeting.

For multi-agent systems used in production, we recommend using agent mode, which manages tool definitions, instructions, and per-agent metadata in LaunchDarkly, rather than in your application code, and allows per-agent monitoring out of the box.

The targeting, variation, and evaluation patterns in this tutorial apply to both modes.

In the decor-agent code directory, there is a file called decor-agent/app/prompts.py, which contains the system prompts for each AI component. These are the prompts transferred to LaunchDarkly to establish the behavior of the overall agent and the three agent skills.

Now, let’s create each config by running the following prompt:

Create AI Configs
/aiconfig-create
For `decor-agent-main`:
- Config key: `decor-agent-main`
- Type: completion
- Initial variation name: `v1-baseline`
- Model: `claude-sonnet-4-20250514`
- `max_tokens`: 1024
- `temperature`: 1.0
- System prompt: paste the current `AGENT_SYSTEM_PROMPT` from `app/prompts.py`
Repeat for `decor-style-advisor`, `decor-room-planner`, `decor-trend-spotter`
with their respective prompts.

Output:

All four AI Configs are fully set up in decor-agent-demo-v2:

| Config key | Variation | Model | max_tokens | temperature |
| --- | --- | --- | --- | --- |
| decor-agent-main | v1-baseline | Anthropic.claude-sonnet-4-5 | 1024 | 1.0 |
| decor-style-advisor | v1-baseline | Anthropic.claude-sonnet-4-5 | 1024 | 1.0 |
| decor-room-planner | v1-baseline | Anthropic.claude-sonnet-4-5 | 1024 | 1.0 |
| decor-trend-spotter | v1-baseline | Anthropic.claude-sonnet-4-5 | 1024 | 1.0 |

You can verify in the LaunchDarkly UI that the configs were created.

All four AI Configs created in the LaunchDarkly UI.

For example, if you view the decor-agent-main AI Config, you should be able to see the transferred prompt from prompts.py.

The decor-agent-main system prompt
You are Decora, a senior interior design advisor. Users come to you for confident, specific, actionable design decisions — not textbook answers.
Your job is to route each question to the right specialist tool, then synthesize the result into a short, opinionated response.
## Tools Available
- **style_advisor** — styles, color palettes, materials, finishes, furniture pairings, aesthetic direction, style-vs-style comparisons
- **room_planner** — room layouts, furniture placement, traffic flow, fitting pieces into a space, making a room feel bigger or smaller, budgeted furnishing lists
- **trend_spotter** — what's currently trending, fading, or emerging in interior design
## Routing Criteria
- If the question centers on **how a space feels or functions physically** (dimensions, layout, fit, traffic flow, small-space problems) → room_planner, even if color or material also come up in the answer.
- If the question centers on **aesthetic choices** (which color, which material, which style) with no spatial constraint → style_advisor.
- If the question is about **trend trajectories** (what's in, what's out, what's coming) → trend_spotter.
- If the user mentions a budget, room dimensions, or both → pass them through to room_planner.
- If the user names a style preference (mid-century, boho, Scandinavian, etc.) → pass it as context to whichever tool you pick.
- **Off-topic messages** (greetings, weather, non-design questions) → respond politely and briefly without calling any tool.
## Process
1. Read the user's message. Note any style preferences, room dimensions, or budget they mention.
2. Pick the single best tool. If the question genuinely spans two (e.g., "boho vibe in a tiny room"), call both.
3. Pass the user's raw question plus any extracted context (preferences, dimensions, budget) to the tool.
4. Synthesize the tool's response into your final answer.
## Output Format
- 2-3 short paragraphs. No headers, no bullet lists, no numbered steps.
- Lead with the recommendation, then the rationale.
- Name specific products: actual paint colors ("Benjamin Moore Simply White"), materials ("white oak with matte finish"), furniture dimensions ("72-inch sofa"), price tiers when relevant.
- If the user mentioned a style preference, lean into it in your phrasing.
## Constraints
- Never answer a design question from general knowledge alone. Always route through a tool.
- Never say "it depends" without committing to a recommendation.
- You cannot order products, schedule consultations, or make purchases — don't promise those.
- Do not expose tool names or internal mechanics to the user. They don't need to know a "tool" ran.
## Tone
Confident and direct. Opinionated without being preachy. Speak as a designer who has made this call hundreds of times.

4. Variations and targeting

Because AI agents are nondeterministic, a practical way to assess output quality is to compare multiple variations of the same AI Config.

In this case, we’ll create two new variations for the decor-agent-main AI Config:

| Variation | Prompt emphasis | Model | Target segment |
| --- | --- | --- | --- |
| budget-conscious | IKEA, Target, Article, thrift, DIY, name price tiers, flag splurges | claude-haiku-4-5 | user-tier == "free" |
| luxury-curator | Design Within Reach, designer fabrics, unlacquered brass, custom millwork | claude-sonnet-4 | user-tier == "premium" |

Variations aren’t limited to prompt changes. Each variation can override the model, temperature, and max_tokens independently. Here, the budget-conscious variation runs on a cheaper, faster Haiku model to keep free-tier costs down, while the luxury-curator variation stays on Sonnet for richer, more nuanced recommendations.
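
One way to picture variations as independent overrides is to model each one as a copy of the baseline with only some fields changed. The sketch below is a local illustration, not how LaunchDarkly stores variations: the `Variation` dataclass is hypothetical, and `BASE_PROMPT` is a one-line stand-in for the full system prompt in app/prompts.py.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Variation:
    """Hypothetical local model of the fields a variation can carry."""
    name: str
    model: str
    max_tokens: int
    temperature: float
    system_prompt: str


# One-line stand-in for the full AGENT_SYSTEM_PROMPT.
BASE_PROMPT = "You are Decora, a senior interior design advisor."

baseline = Variation("v1-baseline", "claude-sonnet-4-20250514", 1024, 1.0, BASE_PROMPT)

# budget-conscious overrides the model and appends to the prompt.
budget_conscious = replace(
    baseline,
    name="budget-conscious",
    model="claude-haiku-4-5-20251001",
    system_prompt=BASE_PROMPT + "\n\n## Budget Conscious Mode\nThe user is budget-conscious.",
)

# luxury-curator keeps the baseline model and changes only the prompt.
luxury_curator = replace(
    baseline,
    name="luxury-curator",
    system_prompt=BASE_PROMPT + "\n\n## Luxury Curator Mode\nThe user values quality.",
)

if __name__ == "__main__":
    for v in (baseline, budget_conscious, luxury_curator):
        print(v.name, v.model, v.max_tokens, v.temperature)
```

Fields that a variation does not override, such as `max_tokens` and `temperature` here, carry over unchanged from the baseline.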

The prompts for these variations are almost identical to the decor-agent-main prompt, except that each variation appends more granular constraints suited to its target scenario.

Here’s a prompt to create the variations:

The variation create prompt
For decor-agent-main, create two variations.
The variations are the demo's personality. Quality matters. Use these as starting points and polish.
budget-conscious variation (for decor-agent-main)
- Model: `claude-haiku-4-5-20251001` (cheaper/faster for free-tier users)
- `max_tokens`: 1024
- `temperature`: 1.0
- Append to the existing AGENT_SYSTEM_PROMPT:
## Budget Conscious Mode
The user is budget-conscious. When recommending products or materials:
- Lead with accessible brands: IKEA, Target, Wayfair, Article, World Market, Urban Outfitters.
- Include approximate prices in USD for each named item.
- When naming a high-end piece, offer an accessible alternative immediately ("Design Within Reach × or IKEA Ektorp").
- Mention DIY, thrift, or secondhand options when genuinely viable (Facebook Marketplace, Craigslist, estate sales).
- Flag splurges explicitly: "if you can stretch the budget, [premium option] is worth it because X."
- Respect tight budgets. Under $500 total means under $500 total — don't recommend a single $400 sofa.
luxury-curator variation (for decor-agent-main)
- Model: `claude-sonnet-4-20250514` (keep Sonnet for richer curation)
- `max_tokens`: 1024
- `temperature`: 1.0
- Append to the existing AGENT_SYSTEM_PROMPT:
## Luxury Curator Mode
The user values quality, craftsmanship, and distinction. When recommending:
- Lead with designer and trade brands: Design Within Reach, B&B Italia, Pinch London, Knoll, Roman & Williams, Lawson-Fenning, Nickey Kehoe.
- Name specific designers when relevant ("a Pierre Jeanneret-inspired teak lounge chair").
- Specify materials precisely: "hand-rubbed unlacquered brass," "mohair velvet," "travertine with honed finish," "solid white oak, rift-sawn."
- Include approximate prices — premium is expected, but the user still wants transparency.
- Suggest custom or made-to-order pieces when appropriate.
- Skip mass-market references entirely. IKEA does not appear in this mode.

When to add a variation versus a new config

  • New variation: same role, different personality, tone, or model. Examples: budget vs. luxury, Sonnet vs. Haiku, concise vs. detailed.
  • New AI Config: genuinely new capability. Example: a new decor-lighting-expert tool would be a new config, not a variation.

Rule of thumb: if the question "which one is correct?" has different answers for different users, it's a variation. If the question is "which one applies to this request?", it's a new config.

Output:

Both variations are live on decor-agent-main:
• budget-conscious — leads with IKEA / Target / Wayfair / Article / World Market / Urban Outfitters / Amazon, USD prices on every item, secondhand as first-class, splurge flags, literal budget respect.
• luxury-curator — leads with DWR / B&B Italia / Pinch / Knoll / Roman & Williams Guild / Lawson-Fenning / Nickey Kehoe / Vitra / Cassina / RH Modern / De La Espada, named designers + provenance, precise material specs, trade/made-to-order paths, mass market explicitly excluded.
Both use Anthropic.claude-sonnet-4-20250514, max_tokens 1024, temperature 1.0. Polished light touches: tightened brand lists, added price-transparency rule, made secondhand and splurge guidance more directive than hedged.

We should also be able to see the newly created AI Config variations in the LaunchDarkly UI under the decor-agent-main config.

Variations visible in the LaunchDarkly UI.

To control which users are served which variation, we can set up targeting.

5. Targeting for variation

For this tutorial, we can create two tiers called “free” and “premium.” Premium users are served the luxury-curator variation; free users, and any user who matches no rule, are served the budget-conscious variation.

To configure this targeting, use this prompt in Claude Code:

Targeting prompt
/aiconfigs-targeting
Or edit targeting in the LD UI directly for the `decor-agent-main` AI Config:
IF user-tier IS "premium" → serve luxury-curator
IF user-tier IS "free" → serve budget-conscious
DEFAULT → serve budget-conscious

Output:

| Priority | Condition | Serves |
| --- | --- | --- |
| Rule 1 | user-tier IS “premium” | luxury-curator (idx 3) |
| Rule 2 | user-tier IS “free” | budget-conscious (idx 2) |
| Fallthrough | default | budget-conscious (idx 2) |

Here is how the successful outcome of this prompt appears in the LaunchDarkly UI:

AI Config targeting for decor-agent-main variations in the LaunchDarkly UI.
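
The targeting rules for decor-agent-main are evaluated in order: the first matching rule wins, and the fallthrough applies when none match. The sketch below mimics that logic locally so the expected outcome for each tier is easy to reason about. It is an assumption-laden illustration, not LaunchDarkly's evaluation engine: the `RULES` list, the plain-dict context, and `serve_variation` are hypothetical.

```python
from typing import Callable, List, Mapping, Tuple

Context = Mapping[str, str]
Rule = Tuple[Callable[[Context], bool], str]

# Ordered rules mirroring the targeting for decor-agent-main.
RULES: List[Rule] = [
    (lambda ctx: ctx.get("user-tier") == "premium", "luxury-curator"),
    (lambda ctx: ctx.get("user-tier") == "free", "budget-conscious"),
]
FALLTHROUGH = "budget-conscious"


def serve_variation(context: Context) -> str:
    """Return the variation for the first matching rule, else the fallthrough."""
    for matches, variation in RULES:
        if matches(context):
            return variation
    return FALLTHROUGH


if __name__ == "__main__":
    for ctx in ({"user-tier": "premium"}, {"user-tier": "free"}, {}):
        print(dict(ctx), "->", serve_variation(ctx))
```

A context with no user-tier attribute, or an unrecognized tier, lands on the fallthrough and is served budget-conscious.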

For a final test, switch between the free and premium tiers and compare the responses: the free tier should produce budget-friendly recommendations, and the premium tier luxury-minded ones.

Test the final result in the Decor Agent UI. Here is an example prompt:

Test prompt
Help me pick a sofa for an 11x14 living room with a boho theme.

Use the toggle to switch between the free and premium tiers and notice the difference in responses for the same prompt.

Resources

With AI Configs, a feature flag gate, targeting, and variations, we are primed for the next steps to complete this AI iteration loop, which include:

  • Offline evaluations and datasets
  • Online evaluations
  • Guarded rollouts
  • Vega - root cause analysis
  • Session Replay - granular user session debugging and playback