Fine-grained control over releases is table stakes for any modern software delivery team. Without change monitoring, however, you're exercising that control while flying blind.
With LaunchDarkly Guarded Releases, you gain the ability to perform a guarded rollout, a type of controlled release where a new feature or change is gradually rolled out to a portion of users. You can monitor key metrics in real time and automatically revert the change if those metrics indicate a problem.
In a guarded rollout, LaunchDarkly compares the new variation to the original variation and checks whether there's high confidence that the new variation performs worse. If there is, it automatically notifies you of the failed change and pauses or rolls back the release.
Understanding regression thresholds helps you fine-tune your risk tolerance in a guarded rollout so that you only roll back when things are truly broken.
A lower threshold means you're being more careful about any drop in performance, while a higher threshold means you're okay with the new version doing a little worse than the original one—within a threshold you’ve set. Thresholds let you set your own custom definition of “worse” for the guarded rollout.
Now, imagine you’re setting up a guarded rollout in LaunchDarkly. Before choosing your regression threshold, it’s helpful to understand how regression thresholds work and what role they play in detecting regressions during rollout.
A regression threshold is a relative concept, not an absolute one
Let’s say we’re measuring an error rate metric: the percentage of users who encounter at least one error.
In this case, a lower error rate is better. This means a higher error rate is worse, so we want to detect a regression when the new variation has a significantly higher error rate than the original variation's.
One common misconception about regression thresholds is thinking they refer to a fixed percentage. They do not; the threshold is measured relative to the original variation.
For instance, 500ms latency might be perfectly acceptable in a dashboard app but completely unacceptable in a real-time trading system. What matters in this context isn’t whether 500ms is objectively fast or slow, but whether your change has pushed latency meaningfully beyond what the original variation experienced.
This highlights a key idea: regression thresholds aren’t absolute judgments of what’s “good” or “bad.” Instead, they define a buffer zone—how much worse the new variation can be, relative to the original variation, before it’s considered a meaningful regression.
Here are a couple of examples to illustrate this concept:
❌ FALSE: "If I set the regression threshold to 10%, LaunchDarkly will detect a regression when the new variation’s error rate is 10%."
This is incorrect because it treats the percentage as a fixed value rather than as relative to how the original variation is performing.
❌ FALSE: "If I set the regression threshold to 10%, LaunchDarkly will detect a regression when the new variation’s error rate is 10% of the original variation’s error rate."
Saying "10% of the original variation's error rate" suggests an absolute fraction rather than a relative increase. If the original variation's error rate is 5%, then 10% of that is just 0.5%.
This would mean the new variation is considered to perform worse only if its error rate drops to 0.5%, which is actually an improvement—not a regression!
The correct way to think about regression thresholds
When you set a 10% regression threshold, it actually means:
- The new variation is considered worse only if its error rate is more than 110% of the original variation’s error rate (i.e., 10% higher than original variation’s).
- Using the same 5% original error rate example, the new variation is considered to perform worse only if its error rate exceeds 5.5% (5% × 110% = 5% × 1.1 = 5.5%). An error rate above 5% but below 5.5% is not considered worse.
When the regression threshold is left at its default of 0%, it means:
- You’re being very strict. The new variation is considered worse as soon as its error rate is any amount higher than the original variation’s error rate.
- Using the same 5% original error rate example, the new variation is considered to perform worse if its error rate exceeds 5%. Even an error rate of 5.0001% would be considered worse.
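The arithmetic above can be sketched in a few lines of Python. This is an illustrative helper under assumed names, not the LaunchDarkly SDK or its actual decision logic:

```python
def is_regression(original_rate, new_rate, threshold):
    """Return True when the new variation's error rate exceeds the
    original's by more than the relative regression threshold."""
    return new_rate > original_rate * (1 + threshold)

original = 0.05  # the 5% original error rate from the example

# 10% threshold: a regression only when the error rate exceeds 5.5%
assert is_regression(original, 0.056, threshold=0.10)
assert not is_regression(original, 0.054, threshold=0.10)

# 0% (default) threshold: any increase at all counts as worse
assert is_regression(original, 0.050001, threshold=0.0)
```

Note that the threshold multiplies the original rate; nothing in the check depends on whether 5% is a "good" error rate in absolute terms.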
The concept of “worse” in guarded rollouts
Under the hood, LaunchDarkly guarded rollouts use a concept called “probability to be worse” to decide whether to roll out or roll back a feature.
But before diving into the math, let’s take a step back:
How do we define “worse”? It partly depends on the success criterion specified in the metric definition—whether higher is better or lower is better. That sets the direction for what counts as improvement or degradation.
But not every drop in performance counts as a meaningful regression. This is where the regression threshold comes in: it establishes the additional level of underperformance you're willing to tolerate in the new variation compared to the original variation.
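Combining the two ingredients above, the metric's success direction and the regression threshold, a direction-aware "worse" check might look like this. The helper and its parameters are hypothetical, for illustration only:

```python
def is_worse(original, new, threshold, lower_is_better=True):
    """Decide whether `new` underperforms `original` by more than the
    relative `threshold`, given the metric's success criterion."""
    if lower_is_better:
        # e.g. error rate or latency: worse means meaningfully higher
        return new > original * (1 + threshold)
    # e.g. conversion rate (higher is better): worse means meaningfully lower
    return new < original * (1 - threshold)

# Error rate (lower is better): 5% -> 5.6% breaches a 10% threshold
assert is_worse(0.05, 0.056, threshold=0.10)

# Conversion rate (higher is better): 20% -> 18.5% stays within
# a 10% threshold, since the floor is 20% x 0.9 = 18%
assert not is_worse(0.20, 0.185, threshold=0.10, lower_is_better=False)
```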
Understanding regression thresholds in practice
Now that we’ve cleared up the misinterpretations, let’s go back to what setting a 10% regression threshold actually means for error rate:
- You’re defining “worse” as when the new variation’s error rate exceeds 110% of the original variation’s error rate.
- In other words, you’re okay with the new variation having a slightly higher error rate than the original variation’s—but not more than 10% higher.
At its core, the regression threshold is based on the relative difference from the original variation.
So when you set a 10% regression threshold, you're essentially saying:
“I consider the new variation to be worse if its error rate is more than 10% higher than the original variation’s error rate.”
Or in formula terms:
Relative difference from the original variation > regression threshold
For an error rate metric, this means:
(New Variation's Error Rate − Original Variation's Error Rate) / Original Variation's Error Rate > regression threshold
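That formula translates directly into code. The function name and numbers below are illustrative, reusing the 5% error rate example from earlier:

```python
def relative_difference(original_rate, new_rate):
    """Relative change of the new variation versus the original."""
    return (new_rate - original_rate) / original_rate

threshold = 0.10  # a 10% regression threshold

# 5% -> 5.5% is exactly a 10% relative increase: right at the boundary
assert abs(relative_difference(0.05, 0.055) - 0.10) < 1e-9

# A regression is flagged only when the relative difference
# strictly exceeds the threshold
assert relative_difference(0.05, 0.06) > threshold       # +20%: regression
assert not relative_difference(0.05, 0.052) > threshold  # +4%: tolerated
```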
Learn more about Guarded Releases metrics
Want to dive deeper into the metrics you need to identify to get started with Guarded Releases? Check out a demo of the product from LaunchDarkly Product Manager Tim Cook or download our guide, Metrics-Driven Guarded Releases: Your Guide to Confident Feature Rollouts.