Optimize AI performance and cost with AgentControl configs

This guide shows how to use LaunchDarkly AgentControl configs to define prompt and model variations, evaluate them against real production traffic, and route to the best-performing option without a redeploy.

Prerequisites

To complete this guide, you need the following:

A LaunchDarkly account.
A LaunchDarkly SDK installed and initialized in your application. To learn more, read SDK overview.
At least one AI-powered feature in production or ready to deploy.

How AI optimization works in LaunchDarkly

LaunchDarkly treats AI configuration the same way it treats feature releases. An optimization loop built on LaunchDarkly AgentControl features lets you iterate on AI model use in production, with measurement and control at every stage.

The optimization loop has four phases:

Define variations to test. Use AgentControl configs to create the prompt, model, or parameter variations you want to compare.
Validate variations offline. Use offline evaluations to test your config variations against a benchmark dataset before releasing them to production.
Experiment in production. Expose variations to real users and measure cost, latency, and output quality under real conditions.
Promote and iterate. Route traffic to the winning configuration and keep the loop running as your usage patterns change.

Step 1: Create an AgentControl config with variations

In LaunchDarkly, create an AgentControl config for the agent or feature you want to optimize. Add variations for each configuration you want to compare, such as different models, prompts, temperature settings, or any parameter that affects performance or cost. To learn more, read AgentControl.

Do not hard-code a config variation in your application. When your code is agnostic about which variation a config serves, you can adjust behavior by changing the config in LaunchDarkly, and your code only executes the configuration it receives.
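The following is a minimal sketch of variation-agnostic application code, not LaunchDarkly's SDK API. The names ServedConfig, get_config, and call_model are hypothetical stand-ins: in a real application, get_config would wrap whatever call your LaunchDarkly SDK provides for evaluating the config, and call_model would wrap your model provider.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ServedConfig:
    """Hypothetical shape of the configuration the application receives."""
    model: str
    system_prompt: str
    temperature: float


def answer(question: str,
           get_config: Callable[[str], ServedConfig],
           call_model: Callable[[ServedConfig, str], str],
           user_key: str) -> str:
    # The application never names a variation. It asks for the config
    # for this user and executes whatever comes back.
    config = get_config(user_key)
    return call_model(config, question)


# Stand-ins so the sketch runs without an SDK or a model provider.
def fake_get_config(user_key: str) -> ServedConfig:
    return ServedConfig(model="model-a", system_prompt="Be concise.", temperature=0.2)


def fake_call_model(config: ServedConfig, question: str) -> str:
    return f"[{config.model} @ {config.temperature}] {question}"


print(answer("What is a flag?", fake_get_config, fake_call_model, "user-123"))
```

Because answer never branches on a variation name, serving a different model or prompt only requires get_config to return different values.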

Start with the highest-impact axes

Start with the axes that have the biggest cost or quality impact, such as model selection and system prompt. Add parameter tuning in later iterations.

Step 2: Plan and test with offline evaluations

Before you send any production traffic to new variations, validate them against a benchmark dataset. LaunchDarkly’s offline evaluation tools let you run evaluations with defined judges, such as quality, cost, correctness, and safety, against each variation. This surfaces differences in outputs and lets you adjust the variations if necessary, without involving any real users.

Use this phase to eliminate variations that are clearly underperforming. Doing so avoids the cost and poor user experience of exposing weak variations to real users in production.
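To illustrate the idea of offline elimination, here is a small self-contained sketch. It does not use LaunchDarkly's evaluation tools. The dataset, the exact-match judge, the variation names, and the 0.6 cutoff are all assumptions chosen for illustration, and each variation is modeled as a plain function from input to output.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Dataset = List[Tuple[str, str]]  # (input, expected output) pairs


def exact_match_judge(output: str, expected: str) -> float:
    """Hypothetical correctness judge: 1.0 on a case-insensitive match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(variations: Dict[str, Callable[[str], str]],
             dataset: Dataset,
             judge: Callable[[str, str], float]) -> Dict[str, float]:
    # Mean judge score per variation over the whole benchmark dataset.
    return {
        name: mean(judge(run(prompt), expected) for prompt, expected in dataset)
        for name, run in variations.items()
    }


def survivors(scores: Dict[str, float], min_score: float) -> List[str]:
    # Keep only variations that meet the cutoff.
    return sorted(name for name, score in scores.items() if score >= min_score)


dataset = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
lookup = dict(dataset)
variations = {
    "baseline": lambda q: lookup[q],
    "terse-prompt": lambda q: "6" if q == "3*3" else lookup[q],  # one wrong answer
    "broken": lambda q: "unknown",
}

scores = evaluate(variations, dataset, exact_match_judge)
print(survivors(scores, min_score=0.6))
```

In this toy run, the "broken" variation scores 0.0 and is dropped before it could reach production traffic.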

To learn more, read Datasets.

Step 3: Experiment in production

Roll your surviving variations out to a percentage of real traffic with LaunchDarkly’s Experimentation. Define the metrics that constitute a win for your use case:

Metric	Example
Cost	Tokens per request, spend per 1,000 calls
Latency	Time to first token, total response time
Quality	Judge scores for correctness, relevance, and safety
Business	Task completion rate, user satisfaction signal

LaunchDarkly measures these metrics for each variation across real production traffic and displays results as they accumulate. To learn more, read Proving ROI with data-driven AI agent experiments.
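As a rough illustration of how the cost, latency, and business metrics in the table relate to raw request data, the sketch below aggregates per-request logs by variation. The RequestLog shape, the variation names, and the per-token prices are hypothetical. LaunchDarkly computes its own results, so this only shows the arithmetic.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class RequestLog:
    """Hypothetical per-request record emitted by the application."""
    variation: str
    total_tokens: int
    latency_ms: float
    task_completed: bool


def summarize(logs: List[RequestLog],
              usd_per_1k_tokens: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    by_variation: Dict[str, List[RequestLog]] = {}
    for log in logs:
        by_variation.setdefault(log.variation, []).append(log)

    summary = {}
    for name, rows in by_variation.items():
        tokens_per_request = mean(r.total_tokens for r in rows)
        summary[name] = {
            "tokens_per_request": tokens_per_request,
            # (tokens / 1000) * price is the cost of one call;
            # multiplying by 1,000 calls cancels the division.
            "usd_per_1000_calls": tokens_per_request * usd_per_1k_tokens[name],
            "mean_latency_ms": mean(r.latency_ms for r in rows),
            "completion_rate": mean(1.0 if r.task_completed else 0.0 for r in rows),
        }
    return summary


logs = [
    RequestLog("large-model", 400, 800.0, True),
    RequestLog("large-model", 600, 1200.0, True),
    RequestLog("small-model", 300, 300.0, True),
    RequestLog("small-model", 500, 500.0, False),
]
prices = {"large-model": 0.01, "small-model": 0.002}  # assumed USD per 1,000 tokens

for name, stats in summarize(logs, prices).items():
    print(name, stats)
```

Here the smaller model is cheaper and faster but completes fewer tasks, which is the kind of trade-off the experiment metrics are meant to expose.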

Step 4: Promote the winner and keep iterating

When a variation shows a statistically significant improvement on the metrics you defined, promote it as the new baseline. In LaunchDarkly, this updates the AgentControl config without a code change or redeploy. Traffic routes to the winning configuration immediately.
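For intuition about what statistical significance means for a metric such as task completion rate, the sketch below runs a standard two-proportion z-test. This is one common textbook method and is not necessarily how LaunchDarkly Experimentation computes its results. The counts are made-up illustration values.

```python
from math import erf, sqrt
from typing import Tuple


def two_proportion_z_test(successes_a: int, n_a: int,
                          successes_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates, B minus A."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function, then a two-sided tail.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Made-up counts: baseline completes 400 of 1,000 tasks, candidate 460 of 1,000.
z, p_value = two_proportion_z_test(400, 1000, 460, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely to be noise, but the threshold and the method are choices to make before the experiment starts, not after.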

Don’t treat promotion as the end of the loop. Schedule the next iteration. New model releases, prompt refinements, and changed usage patterns are all reasons to run the cycle again.

Review your configs and experiments regularly

Review your configs on a regular cadence, such as monthly, and again after any significant change in traffic volume or user behavior, to keep them optimized as conditions evolve.

Next steps

To continue, explore the following topics:

AgentControl for full configuration and variation setup.
Experimentation to measure variations against production metrics.
Metrics to feed cost and quality signals into your experiments.
Guarded releases to add automated rollback thresholds to your AgentControl config promotions.