What Is Canary Testing? Benefits, Challenges & How to Get Started

Canary testing is a powerful way of testing new features and functionality in production with minimal impact on users. People often use the terms canary testing, canary release, and canary deployment interchangeably. In this article, a canary release (or canary deployment) is a software release that’s being used for canary testing. Canary testing, then, is the act of using canary releases to test new features or new software versions with real users in a live production environment.

Why do we call it canary testing?

We can thank coal mining for the term canary testing. Up until the 1980s, coal miners would take a canary with them into the mine to detect odorless but toxic gases. If the canary died, the miners knew that gas was building up to dangerous levels and it was time to evacuate.

Similarly, but luckily in a less lethal situation, canary releases give you an early warning of things going wrong.

Let’s talk a little more about the idea behind canary testing.

The idea behind canary testing

Hopefully, you already have a procedure for testing your software changes. You may even use techniques from the DevOps world such as A/B testing and blue-green deployments. At a minimum, most teams rely on these two practices:

Developers write automated tests for the changes they’re making.
The changes are deployed to a testing environment where someone else can play around with the new features.

If all is well, the update is pushed to the production environment, and the end users can enjoy the new feature.

But software being as it is, bugs inevitably find a way to production. You’re human, after all. You can’t think about every edge case that might happen. And you work under the pressure of deadlines and budgets.

Canary testing is a technique to let those production bugs affect only a small subset of your users. Traditionally, in canary testing, you would first need two identical production environments. This doesn’t have to mean two separate servers. You could have two web applications running on the same server, for example.

When you have a new release ready, you deploy it to one environment. Then you route a small subset of your users to this canary release. About 5% is a reasonable starting point. These users will see the new features, while everyone else sees no changes.
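To make the routing step concrete, here is a minimal sketch of one way to pick the canary subset. It assumes each user has a stable string ID and hashes that ID into one of 100 buckets, so the same user always lands in the same environment. The names `bucket_for` and `choose_environment` and the environment labels are illustrative assumptions; in practice this decision usually lives in a load balancer or router rather than in application code.

```python
import hashlib

CANARY_PERCENT = 5  # share of users routed to the canary release


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_environment(user_id: str, canary_percent: int = CANARY_PERCENT) -> str:
    """Return "canary" for users in the lowest buckets, "stable" otherwise."""
    return "canary" if bucket_for(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    # Simulate 10,000 hypothetical users and report the observed split.
    users = [f"user-{n}" for n in range(10_000)]
    canary_users = [u for u in users if choose_environment(u) == "canary"]
    print(f"{len(canary_users) / len(users):.1%} of users routed to canary")
```

Hashing rather than picking users at random on every request keeps the experience consistent: a user who saw the new version yesterday sees it again today.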

Now you can monitor this canary release and fix any bugs that occur. The goal is not to eliminate production bugs. Instead, you’re trying to minimize their impact. If you have a bug in the new version, only 5% of your users are affected. While the bug still needs fixing, the pressure on you may be less than it would be if all users encountered the bug.
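Monitoring a canary often comes down to comparing it against the stable release. The sketch below shows one simple way that comparison might look, assuming you can collect request and error counts per release. The `ReleaseStats` type, the `canary_verdict` function, and the thresholds are hypothetical and would need tuning for a real system.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    """Request and error counts collected for one release."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(
    stable: ReleaseStats,
    canary: ReleaseStats,
    min_requests: int = 500,   # assumed minimum sample before judging
    tolerance: float = 0.01,   # assumed acceptable extra error rate
) -> str:
    """Return "wait", "rollback", or "promote" for the canary release."""
    if canary.requests < min_requests:
        return "wait"  # not enough traffic yet to draw a conclusion
    if canary.error_rate > stable.error_rate + tolerance:
        return "rollback"  # canary is clearly worse than the stable release
    return "promote"


if __name__ == "__main__":
    stable = ReleaseStats(requests=95_000, errors=190)
    canary = ReleaseStats(requests=5_000, errors=150)
    print(canary_verdict(stable, canary))
```

Error rate is only one possible signal. Latency, resource usage, and user feedback could be compared in the same way.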

We’ve just described a canary deployment using two separate production environments. But with a feature management platform like LaunchDarkly, you can simply roll out a feature to a canary group in a single production environment. Here’s how it works.

Running canary tests with feature flags

With feature flags, you don’t need multiple environments and complicated routing configurations. You can expose the new code to that small subset of users and have everyone else use the application as it was—all within your main production environment.

This is what it could look like. First, you do a rollout of your application with the new feature, but without exposing it to any users yet.

Then, you could use LaunchDarkly to expose the new feature to a small group of users. This means the majority of your users will use the application without the new feature. But the canary users will have access to it.

Once you’re happy with how everything’s going, you can expose the feature to more and more users at your own pace until you reach 100% of your user base.

After that, you can remove the feature flag, both in code and in LaunchDarkly. And that’s how to successfully complete a canary deployment using feature flags!
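The steps above can be sketched in a few lines of code. This is not LaunchDarkly’s API. It is a hypothetical in-memory flag, written only to show the shape of the idea: both code paths ship in one release, and a flag decides at runtime which users see the new one. The `User` and `FeatureFlag` types, the `checkout_page` function, and the segment names are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    segments: set[str] = field(default_factory=set)


@dataclass
class FeatureFlag:
    """Hypothetical in-memory flag; a real platform would store this remotely."""
    key: str
    enabled_segments: set[str] = field(default_factory=set)

    def is_enabled_for(self, user: User) -> bool:
        # The feature is on if the user belongs to any targeted segment.
        return bool(user.segments & self.enabled_segments)


def checkout_page(user: User, flag: FeatureFlag) -> str:
    # Both code paths are deployed; the flag picks one at runtime.
    return "new checkout" if flag.is_enabled_for(user) else "old checkout"


# Each stage widens the audience: nobody, canary group, regional, everyone.
STAGES = [set(), {"canary"}, {"canary", "regional"}, {"canary", "regional", "global"}]

if __name__ == "__main__":
    users = [
        User("alice", {"canary", "regional", "global"}),
        User("bob", {"regional", "global"}),
        User("carol", {"global"}),
    ]
    flag = FeatureFlag(key="new-checkout")
    for stage in STAGES:
        flag.enabled_segments = stage
        print(sorted(stage), [checkout_page(u, flag) for u in users])
```

Once the last stage is stable, the flag check and the old code path can be deleted, which corresponds to removing the flag in code.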

Figure: Using feature flags to release a new feature to a small subset of users, the canary group.

Figure: When the feature has successfully met all the performance requirements of the canary test, you then release the feature to a larger audience (in this case, all “Regional Users”).

Figure: When the new feature has met its performance requirements after having been released to the Regional User segment, you can then release it to yet a larger audience, All Global Users.

Now let’s consider the advantages and possible challenges of canary testing.

Advantages of canary tests

Canary testing has several advantages.

Of course, the first is that a bug will affect only a small number of users. This significantly reduces the risk to your organization. Launching a new feature that contains bugs often angers users. If this happens often enough, your reputation is at stake. And it won’t be long before sales suffer.

Thanks to canary testing, this risk is reduced. This means a new release is a lot less stressful for both the team and management. This is even more true if you’re adhering to DevOps principles where the development team is also responsible for maintaining and supporting the application.

Apart from that, there are some other advantages:

You can monitor the performance of new code as you increase the load.
Your QA testers can be the first group of users to test the new feature. This allows them to test in the production environment, which means they’re testing in the environment that the real users will be using too. This partly solves the problem of bugs occurring in production but not in the staging environment.
You can quickly remove the feature if it’s buggy, degrades the application’s performance, or draws negative user feedback.
It’s possible to set up a beta program with users who want to try the latest and greatest features of your application. You can inform them that new features may still have issues, and you can ask them for feedback.

Because of these advantages, you will have more confidence to release new features at a higher cadence. This brings you a step closer to implementing continuous delivery, i.e., bringing new features to production faster to shorten the feedback loop and react to the feedback faster.

Challenges of canary tests

There are two sides to every story, and canary releases are no different. I wouldn’t really call these “disadvantages” of canary testing, though. Rather, they’re challenges you can solve with feature management.

Mobile applications

Having two environments is all well and good, but what if you have only one environment: the user’s device? Mobile apps are typically distributed through an app store, and you can’t choose which user gets a newer version of your app.

Luckily, feature flags can help here again. You could add the feature to the new version of your app but enable it only for a small group of users. Thus, with feature flags, you can do canary deployments with a single production instance of your application.

Canary release management

Things can get tricky if you’re releasing new features rapidly. In my examples above, I’ve always talked about a single new feature for which you need one extra environment. But what if you want to test two new features? Then you’d need three environments: one for the majority of the users and one environment for each of the two new features. And maybe even a fourth for the combination of the two new features? This will become hard to manage!
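To see how quickly this grows, the short sketch below enumerates every combination of features that would need its own environment if you wanted to test each combination in isolation. The function name `environments_needed` is an illustrative assumption. With n independent features the count is 2^n, which matches the four environments mentioned above for two features.

```python
from itertools import combinations


def environments_needed(features: list[str]) -> list[tuple[str, ...]]:
    """List every combination of features, including the empty baseline."""
    combos: list[tuple[str, ...]] = []
    for size in range(len(features) + 1):
        combos.extend(combinations(features, size))
    return combos


if __name__ == "__main__":
    for count in (1, 2, 3, 5):
        features = [f"feature-{n}" for n in range(1, count + 1)]
        total = len(environments_needed(features))
        print(f"{count} feature(s) -> {total} environments")
```

Five features in flight at once would already mean 32 environments under this approach.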

Again, using feature flags to expose a feature to a percentage of users can alleviate this pain. With a good feature flag management platform, you can easily enable one or more features for a group of users. And when things go as planned, you can just increase the percentages until every user is enjoying the new version of your software to its fullest.
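As a rough illustration of percentage rollouts for several features at once, the sketch below evaluates each flag independently for a user within one environment. It is a simplified, hypothetical model and not how any particular platform is implemented. The flag keys, the percentages, and the function names `in_rollout` and `enabled_features` are assumptions.

```python
import hashlib
from collections import Counter

# Hypothetical flags and the percentage of users who should see each one.
ROLLOUTS = {"new-search": 5, "new-checkout": 10}


def in_rollout(flag_key: str, user_id: str, percent: int) -> bool:
    """Decide whether a user falls inside a flag's rollout percentage."""
    # Including the flag key in the hash gives each flag its own user split.
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def enabled_features(user_id: str, rollouts: dict[str, int] = ROLLOUTS) -> set[str]:
    """Return the set of flag keys that are switched on for this user."""
    return {
        key for key, percent in rollouts.items()
        if in_rollout(key, user_id, percent)
    }


if __name__ == "__main__":
    # Count how many simulated users end up with each combination of features.
    counts = Counter(
        frozenset(enabled_features(f"user-{n}")) for n in range(10_000)
    )
    for combo, total in counts.most_common():
        label = ", ".join(sorted(combo)) or "no new features"
        print(f"{label}: {total} users")
```

Raising a percentage in `ROLLOUTS` widens that feature’s audience without touching the other flag and without adding an environment.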

Start running canary tests with feature management

Feature flags let you perform canary releases in a single production environment. This makes canary testing a lot easier, especially if you’re using a feature management platform like LaunchDarkly, which allows you to use feature flags on a large scale with a high degree of sophistication.

Canary testing will reduce the risk of releasing software, increase flexibility and confidence, and allow you to roll out features faster.

Request a demo of LaunchDarkly’s feature management platform today!