This guide summarizes the LaunchDarkly patterns for managing agent behavior outside of code, monitoring it continuously, and correcting it automatically when it goes wrong.
Agents can make autonomous decisions in production. Without runtime control, you can observe the results but cannot intervene. By the time drift or degradation is visible, customers may have already felt it. LaunchDarkly offers remediation and protection options to prevent or mitigate bad outcomes from misbehaving agents.
To complete this guide, you need the following:
When you have agents with hardcoded configurations running in production, three things can go wrong:
LaunchDarkly addresses all three by moving agent configuration, including prompts, models, and parameters, out of code and into a centrally managed, auditable runtime layer called AgentControl.
The foundational step is to replace hardcoded prompts, model selections, and behavioral parameters with LaunchDarkly AgentControl configs. Everything else depends on the ability to change agent behavior without touching code.
To learn more, read AgentControl.
Every team that deploys agents should reference the same AgentControl config structure. This gives you a single place to update behavior, a shared audit trail of what changed and when, and consistent governance across deployments.
Here is an example of retrieving an agent’s active configuration in Python:
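The sketch below illustrates the general pattern only: look up a config by key at runtime and fall back to a safe default when the key is missing. `AgentConfig`, `InMemoryConfigStore`, the key `support-agent`, and the model names are hypothetical stand-ins, not the LaunchDarkly SDK. Consult the SDK reference for the actual client and method names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical config shape: prompt, model, and parameters."""
    prompt: str
    model: str
    parameters: dict = field(default_factory=dict)


# Safe default used when the requested config cannot be found.
FALLBACK = AgentConfig(prompt="You are a helpful assistant.", model="fallback-model")


class InMemoryConfigStore:
    """Stand-in for a runtime config service (NOT the LaunchDarkly SDK)."""

    def __init__(self, configs: dict[str, AgentConfig]):
        self._configs = configs

    def get(self, key: str, default: AgentConfig) -> AgentConfig:
        # Return the stored config, or the caller's default if the key is unknown.
        return self._configs.get(key, default)


store = InMemoryConfigStore({
    "support-agent": AgentConfig(
        prompt="Answer billing questions concisely.",
        model="example-model-v2",
        parameters={"temperature": 0.2},
    )
})

# The agent reads its behavior at runtime instead of hardcoding it.
active = store.get("support-agent", FALLBACK)
missing = store.get("unknown-agent", FALLBACK)
```

Because the agent code only ever calls `store.get(...)`, changing the prompt or model is a config change rather than a code deploy, and the fallback keeps the agent functional if the lookup fails.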
Connect each AgentControl config to the metrics that reflect acceptable agent behavior. Define the thresholds that indicate drift or degradation before you release to production, not after.
The following table shows metrics worth monitoring for most production agents:
Set thresholds for each. When a threshold is crossed, LaunchDarkly executes the remediation action you configure, such as rerouting traffic, reverting to a known-good config, or disabling a behavior entirely.
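The mapping from thresholds to remediation actions can be sketched as a small lookup. This is an illustration of the idea, not LaunchDarkly's implementation. The metric names, limits, and action names below are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Threshold:
    """Hypothetical threshold rule: a metric, a limit, and the action to take."""
    metric: str
    limit: float
    breached: Callable[[float, float], bool]  # (observed, limit) -> crossed?
    action: str


# Illustrative values only; pick limits that reflect your own baseline.
THRESHOLDS = [
    Threshold("error_rate", 0.05, lambda v, lim: v > lim, "revert_to_known_good"),
    Threshold("eval_score", 0.80, lambda v, lim: v < lim, "reroute_traffic"),
    Threshold("p95_latency_ms", 4000, lambda v, lim: v > lim, "disable_behavior"),
]


def actions_for(observed: dict[str, float]) -> list[str]:
    """Return the actions whose thresholds the observed metrics cross."""
    return [
        t.action
        for t in THRESHOLDS
        if t.metric in observed and t.breached(observed[t.metric], t.limit)
    ]
```

For example, `actions_for({"error_rate": 0.08, "eval_score": 0.7})` returns both the revert and reroute actions, while healthy metrics return an empty list. Note that some metrics degrade upward (error rate, latency) and others downward (evaluation score), so each rule carries its own comparison.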
Roll new agent configurations out to a percentage of traffic, rather than all users at once. As the rollout progresses, LaunchDarkly measures your connected metrics for each config variation and tracks behavioral changes across real interactions, including traces, outputs, and evaluation scores.
This gives you the data to make a confident promotion decision, and it limits exposure if a new config degrades in production.
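To make the percentage rollout concrete, here is a minimal sketch of deterministic bucketing by hash. It is a generic technique and an assumption for illustration, not a description of how LaunchDarkly assigns variations. The function names and the `candidate`/`baseline` labels are hypothetical.

```python
import hashlib


def bucket(context_key: str, config_key: str) -> float:
    """Map a context to a stable value in [0, 100) for a given config."""
    digest = hashlib.sha256(f"{config_key}:{context_key}".encode()).hexdigest()
    # Use the first 32 bits of the hash, reduced to two decimal places.
    return int(digest[:8], 16) % 10000 / 100


def variation_for(context_key: str, config_key: str, rollout_percent: float) -> str:
    """Serve the new config to contexts whose bucket falls under the rollout."""
    if bucket(context_key, config_key) < rollout_percent:
        return "candidate"
    return "baseline"
```

Because the bucket depends only on the keys, a given context always receives the same variation at a given percentage, and raising the percentage only adds contexts to the candidate group. That stability is what makes per-variation metric comparisons meaningful during the rollout.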
Treat every agent config update as a release, even prompt changes that feel minor. Small prompt edits can produce significant behavioral shifts at scale.
When evaluation scores degrade or thresholds are crossed, LaunchDarkly acts without waiting for human intervention. It can take one or several of these actions:
The change takes effect in milliseconds. Most users never encounter the degraded behavior.
For issues that require human review before remediation, configure an alert action instead. LaunchDarkly notifies your on-call channel while holding the current config stable.
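The difference between automatic remediation and an alert action can be sketched as follows. `AgentState`, `remediate`, and the mode names `auto` and `alert` are hypothetical, introduced only to illustrate the two behaviors described above.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical view of an agent's config state."""
    active_config: str
    known_good_config: str
    alerts: list[str] = field(default_factory=list)


def remediate(state: AgentState, breached_metric: str, mode: str) -> AgentState:
    """Apply a remediation policy when a threshold is crossed."""
    if mode == "auto":
        # Act immediately: revert to the last known-good config.
        state.active_config = state.known_good_config
    elif mode == "alert":
        # Hold the current config stable and notify a human instead.
        state.alerts.append(
            f"{breached_metric} crossed its threshold on {state.active_config}"
        )
    else:
        raise ValueError(f"unknown remediation mode: {mode}")
    return state
```

In `auto` mode the active config changes and no one is paged. In `alert` mode the active config is left untouched and the on-call channel receives the message, which suits changes where a rollback itself carries risk.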
Before relying on automated governance in production, validate the full loop in a non-production environment. Here’s how:
If remediation doesn’t fire, verify that your metrics source is connected and reporting data, and that your agent evaluates the AgentControl config with the correct context key.
To continue, explore the following topics: