Ship AI-built code with AgentControl or CodeControl
AI generates code faster than teams can validate it. This guide shows how to use LaunchDarkly to release AI-built changes safely. You get gradual rollouts with defined guardrails, automated detection, and instant remediation if something goes wrong, all without a redeploy.
To complete this guide, you need the following:
The pattern is the same whether you ship AI-generated application code or deploy an AI coding agent:
The section that applies to you depends on what you ship:
This section explains how to ship AI-generated application code safely.
Every AI-generated change that touches production should run behind a feature flag. Flags increase safety because they make automatic remediation possible. Without a flag, you cannot easily disable the change or switch to another version of your code in production.
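As a minimal sketch of this pattern, the following Python gates an AI-generated code path behind a boolean flag. The `FlagClient` protocol, the `InMemoryFlags` stub, the flag key `ai-generated-slugify`, and both slug functions are hypothetical stand-ins for illustration. In a real service, a LaunchDarkly SDK client would take the place of the stub.

```python
import re
from typing import Protocol


class FlagClient(Protocol):
    """Hypothetical minimal interface; a real SDK client would fill this role."""

    def variation(self, key: str, context: dict, default: bool) -> bool: ...


class InMemoryFlags:
    """Stand-in flag store for local runs (an assumption, not a LaunchDarkly API)."""

    def __init__(self, flags: dict[str, bool]) -> None:
        self._flags = flags

    def variation(self, key: str, context: dict, default: bool) -> bool:
        # Return the caller's default when the flag is unknown.
        return self._flags.get(key, default)


def slugify_legacy(title: str) -> str:
    """Known-good path that stays in place as the fallback."""
    return title.lower().replace(" ", "-")


def slugify_ai_generated(title: str) -> str:
    """AI-generated replacement under evaluation."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def slugify(title: str, flags: FlagClient, context: dict) -> str:
    # Default to False so a missing or unreachable flag serves the old path.
    if flags.variation("ai-generated-slugify", context, False):
        return slugify_ai_generated(title)
    return slugify_legacy(title)
```

Because both paths stay deployed, turning the flag off returns every request to `slugify_legacy` without shipping new code.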
To learn more, read Creating new flags.
You can use metrics as guardrails that prevent or mitigate problems. Connect your flag to the metrics that matter for this change and set thresholds that will indicate if a problem occurs. In LaunchDarkly, configure these settings under your flag’s Guarded releases:
Start at a small percentage of traffic, such as 5% to 10%, rather than releasing to everyone at once. LaunchDarkly monitors your connected metrics as the rollout progresses. If a threshold is crossed, LaunchDarkly automatically executes the remediation action you defined in step 2.
You do not need to monitor the rollout yourself. By defining the metric guardrails and remediation thresholds in advance of the release, you ensure that corrective action occurs automatically if needed.
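The rollout-and-guardrail loop described above can be illustrated with a small simulation. This is a hypothetical sketch, not LaunchDarkly's implementation: the hash-based bucketing, the `Guardrail` and `Rollout` classes, and the threshold values are all assumptions chosen to show the mechanics.

```python
import hashlib
from dataclasses import dataclass


def bucket(flag_key: str, user_key: str) -> float:
    """Map a user to a stable value in [0, 100) so assignment is deterministic."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode()).hexdigest()
    return int(digest[:8], 16) / 0x100000000 * 100


@dataclass
class Guardrail:
    """Hypothetical guardrail: trips once the error rate exceeds a threshold."""

    max_error_rate: float
    min_samples: int = 50
    requests: int = 0
    errors: int = 0

    def record(self, failed: bool) -> None:
        self.requests += 1
        self.errors += int(failed)

    def breached(self) -> bool:
        # Wait for enough samples so one early error does not trip the rollback.
        if self.requests < self.min_samples:
            return False
        return self.errors / self.requests > self.max_error_rate


@dataclass
class Rollout:
    flag_key: str
    percentage: float  # share of traffic on the new code, 0 to 100
    guardrail: Guardrail

    def serves_new_code(self, user_key: str) -> bool:
        return bucket(self.flag_key, user_key) < self.percentage

    def report(self, failed: bool) -> None:
        """Record one outcome from the new code path and remediate if needed."""
        self.guardrail.record(failed)
        if self.guardrail.breached():
            # Remediation: send all traffic back to the old path.
            self.percentage = 0.0
```

The point of the sketch is that the threshold and the remediation are both fixed before the rollout starts, so no one has to watch a dashboard to trigger the rollback.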
This section explains how to deploy AI coding agents safely.
Define your agent’s behavior, including prompts, model selection, and parameters, in AgentControl configs rather than hardcoding them. Connect evaluation metrics that reflect the outcomes you want, such as code quality scores, test pass rates, accuracy, or cost. To learn more, read AgentControl.
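To illustrate what keeping agent behavior out of code can look like, here is a minimal sketch that treats the agent's settings as data loaded at runtime. The `AgentConfig` fields, the JSON payload shape, and the default values are assumptions made for this example; they do not describe the actual AgentControl config format.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical shape for an externally managed agent config."""

    model: str
    system_prompt: str
    temperature: float
    max_tokens: int


# Safe defaults used when no remote config is available (placeholder values).
DEFAULT_CONFIG = AgentConfig(
    model="example-model-small",
    system_prompt="You are a careful coding assistant.",
    temperature=0.0,
    max_tokens=1024,
)


def load_agent_config(raw: Optional[str]) -> AgentConfig:
    """Parse a JSON config payload, falling back to defaults on any problem."""
    if raw is None:
        return DEFAULT_CONFIG
    try:
        data = json.loads(raw)
        return AgentConfig(
            model=str(data["model"]),
            system_prompt=str(data["system_prompt"]),
            temperature=float(data["temperature"]),
            max_tokens=int(data["max_tokens"]),
        )
    except (ValueError, KeyError, TypeError):
        # Malformed or incomplete payloads never reach the agent.
        return DEFAULT_CONFIG
```

With this separation, changing the prompt or model means changing the payload, not the agent's source code.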
Expose the agent to real workloads and monitor its behavior, including traces, outputs, and evaluation scores, across actual interactions. Use percentage rollouts here as well. Start narrow, then expand as evaluation scores confirm the agent performs as expected.
If evaluation scores degrade or error thresholds are crossed, LaunchDarkly switches the active config. The switch reroutes traffic to a fallback model, reverts to a known-good prompt, or disables the agent path entirely. The change takes effect in milliseconds, without a redeploy.
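The switching behavior can be sketched as a controller that steps down a fallback chain when a rolling average of evaluation scores drops below a threshold. Everything here is hypothetical: the `ConfigSwitcher` class, the config names, and the rolling-mean rule are illustrative assumptions, not how LaunchDarkly decides when to switch.

```python
from collections import deque
from statistics import mean


class ConfigSwitcher:
    """Hypothetical controller that steps down a fallback chain when scores degrade."""

    def __init__(self, chain: list[str], min_score: float, window: int = 20) -> None:
        # chain[0] is the preferred config; later entries are safer fallbacks.
        self._chain = chain
        self._index = 0
        self._min_score = min_score
        self._scores = deque(maxlen=window)

    @property
    def active(self) -> str:
        return self._chain[self._index]

    def record_score(self, score: float) -> str:
        """Record one evaluation score and return the config to use next."""
        self._scores.append(score)
        window_full = len(self._scores) == self._scores.maxlen
        if window_full and mean(self._scores) < self._min_score:
            if self._index < len(self._chain) - 1:
                self._index += 1
                # Start a fresh window so the fallback is judged on its own scores.
                self._scores.clear()
        return self.active
```

A chain such as `["primary", "known-good-prompt", "agent-disabled"]` mirrors the three remediation options named above, with the last entry acting as the off switch.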
Test your guardrails in a non-production environment before relying on them in production. Here’s how:
If remediation doesn’t fire, check that your metrics source is connected and reporting, and that flag evaluation uses the correct context.
To continue, explore the following topics: