As codebases grow more complex and more engineers get involved, releases become harder to manage. It becomes more difficult to coordinate deployments with intended release dates while ensuring that your application stays online and available to users as new features are tested and rolled out.
This is where feature flags come in.
A feature flag is a piece of code that allows developers to deploy a feature in a disabled state. The functionality is exposed to the end user only when the feature flag is enabled, which lets software teams decouple deployments from feature releases. Feature flags also let software teams roll out new functionality to a subset of users, perform rollouts gradually over time, and much more. Developers and site reliability engineering (SRE) teams can both take advantage of feature flags, but because their responsibilities and focus within the application are different, they use them in slightly different ways.
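As a rough illustration of the idea, here is a minimal sketch in Python. The `FlagStore` class, the `new_checkout` flag name, and the `render_checkout` function are all hypothetical and exist only for this example; a real system would typically read flag values from a configuration service or database rather than from memory.

```python
from dataclasses import dataclass, field


@dataclass
class FlagStore:
    """Hypothetical in-memory flag store used only for illustration."""

    flags: dict[str, bool] = field(default_factory=dict)

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so newly deployed code stays hidden.
        return self.flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self.flags[name] = enabled


def render_checkout(flags: FlagStore) -> str:
    # Both code paths are deployed; the flag decides which one users see.
    if flags.is_enabled("new_checkout"):
        return "new checkout flow"
    return "legacy checkout flow"


flags = FlagStore()
print(render_checkout(flags))    # feature is deployed but not yet released
flags.set("new_checkout", True)  # release it without a new deployment
print(render_checkout(flags))
```

Defaulting unknown flags to off is one possible design choice here: it means a missing or misspelled flag hides the feature rather than exposing it.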
In this article, you'll take a look at what feature flags are and how they can benefit site reliability engineers, specifically, in their quest to keep an application up and running.
Why feature flags?
As an application grows in complexity, maintaining application availability while delivering new features and bug fixes becomes more complex, as well. When an organization is small, all of these tasks are often the responsibility of a single team of developers. However, as the organization grows, bug fixes and the development of new features are often left in the hands of the developers, while keeping the application up and running becomes the responsibility of the SRE team.
While developers use feature flags primarily to enable new functionality after it's been deployed, SRE teams can use feature flags in an entirely different way. Because the SRE team is concerned with the performance and availability of an application, they can use feature flags for things like testing if disabling a feature affects the performance of the application, or for reducing the risk of infrastructure problems related to deploying new code by giving users access to new features in smaller groups.
Just as important as having these abilities is that they can do all of this independently. There's no need to rely on the developers to trigger deployments, or to interfere with the development and deployment of new code or any bug fixes.
SRE-specific use cases for feature flags
As previously mentioned, SRE teams have some very different use cases for feature flags than developer teams. While developers use feature flags to control the release of features that they've built, SRE teams use feature flags to experiment and ensure that their application is running in the most performant and efficient manner possible.
Measuring performance impact
One of the most important roles of the SRE team is to make sure the performance of the application or applications they are responsible for remains stable, or even improves over time. While there are many tools that allow them to manage application performance overall, it's important to know when to look for performance deviations that may relate to specific new features. However, if the development team is deploying in parallel and releasing many new features at once, the SRE team might not be able to easily tie a specific performance deviation to a specific deployment.
Example: When previously deployed features are behind feature flags, the SRE team can easily enable and disable a particular feature if they suspect that it is having an impact on the performance of the application. With the ability to toggle a particular feature on and off and measure the difference in application performance in their preferred tool, they can more quickly investigate any performance issues that arise and locate any potentially problematic code that has been deployed to the codebase.
On the other hand, if disabling a feature requires a deployment, that introduces the complexity of coordinating with the development team, fitting the desired test into the pipeline of other features and bug fixes that are slated to be deployed, and going through the same process to add the feature back in if it's found not to be contributing to the problem. This lengthens the time needed for the testing process, makes the application more unstable for the duration of testing, and introduces the potential for more undesirable behavior to make its way into the codebase as a result of this undoing and redoing work.
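The toggle-and-measure workflow described in this section could look something like the following sketch. Everything here is an assumption made for illustration: the `recommendations` flag, the `handle_request` function, and the `time.sleep` call that stands in for the suspect feature's extra work. In practice the measurement would come from your monitoring tool rather than from a timing loop.

```python
import time
from statistics import mean

# Hypothetical flag guarding the feature under suspicion.
flags = {"recommendations": True}


def handle_request() -> str:
    page = "product page"
    if flags["recommendations"]:
        time.sleep(0.002)  # stand-in for the feature's extra work
        page += " + recommendations"
    return page


def mean_latency(runs: int = 20) -> float:
    """Average wall-clock seconds per request over a number of runs."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        handle_request()
        samples.append(time.perf_counter() - start)
    return mean(samples)


# Measure with the feature on, flip the flag, and measure again.
flags["recommendations"] = True
with_feature = mean_latency()
flags["recommendations"] = False
without_feature = mean_latency()

print(f"flag on:  {with_feature * 1000:.2f} ms")
print(f"flag off: {without_feature * 1000:.2f} ms")
```

No deployment happens between the two measurements; the only thing that changes is the flag value.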
Temporarily disabling functionality for system stability
Feature flags can also be used to increase a site's stability by disabling features that negatively affect performance during times of high load.
Almost everyone has heard of or experienced one of their favorite websites going down because a product or piece on the site went viral, a major ad campaign drove a lot of traffic to the site, or the site ran into other scaling issues. Sometimes specific features contribute more than others to server load during these high-traffic periods and would be better disabled until the spike passes.
If the feature or features that need to be disabled aren’t behind a feature flag, disabling the feature to try to lighten the load on the server means deploying a rollback, which will then have to be reversed when the feature needs to be re-enabled. However, if this feature is behind a feature flag, turning the feature off temporarily would be as straightforward as flipping the feature flag.
Example: If you've ever been on a popular forum like Reddit, you may have noticed that sometimes certain threads, or even the entire site, are in "read-only" mode, which means that commenting isn't possible. This is an example of temporarily disabling a specific piece of functionality to help with performance concerns.
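A read-only mode like this could be sketched as a flag that guards only the write path. The `comments_read_only` flag, the `ReadOnlyError` exception, and the two functions below are hypothetical names chosen for this example, not a description of how any particular site implements it.

```python
class ReadOnlyError(Exception):
    """Raised when a write is attempted while read-only mode is on."""


# Hypothetical flag an SRE could flip during a traffic spike.
flags = {"comments_read_only": False}
comments: list[str] = []


def post_comment(text: str) -> None:
    # Writes are the expensive operation, so only they are guarded.
    if flags["comments_read_only"]:
        raise ReadOnlyError("Commenting is temporarily disabled")
    comments.append(text)


def list_comments() -> list[str]:
    # Reads keep working regardless of the flag.
    return list(comments)


post_comment("First!")
flags["comments_read_only"] = True  # load spike: shed the write traffic
try:
    post_comment("Second!")
except ReadOnlyError as err:
    print(f"rejected: {err}")
print(list_comments())
flags["comments_read_only"] = False  # spike has passed: restore commenting
```

Because only the write path checks the flag, users can still browse existing content while the site recovers.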
Gradual rollouts
Another powerful use case for feature flags is to enable the gradual rollout of features. When you're introducing new functionality that affects a large part of the existing application, involves changes that may influence performance or stability, or is simply tied to a major business objective, it can be important that the feature isn't released to every user at once. In this case, feature flags can be very helpful in facilitating a gradual rollout.
Depending on how you want to roll out a new feature, feature flags can be used to only enable the new feature for specific users, enable the feature for a specific percentage of the userbase, or anything in between. Especially for features where the SRE team is concerned about the rate of adoption or the effect that a new feature might have on the underlying infrastructure, a phased rollout using feature flags allows them to adjust as they see the data start to come in. This may mean making adjustments to the platform, or working with the developers to implement changes to a feature before it's made available to all users.
Example: One place that many people have seen this type of gradual rollout is with Slack. If you're part of multiple Slack teams, you may notice that notifications of new features show up earlier for some teams than others, because Slack rolls out new features gradually, rather than all at once. This is an important way to make sure potentially performance-impacting bugs haven't accidentally been deployed to production. Sometimes performance issues with new code only show up when a large number of users get access to a feature and begin to use it. Catching issues like this before the feature is rolled out to the entire userbase allows you to address them before they create a potentially catastrophic failure of the application.
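One common way to implement a percentage-based rollout is to hash each user into a stable bucket and enable the feature for buckets below the current percentage. The sketch below assumes that approach; the `bucket` and `is_enabled` functions, the `new_search` flag, and the allowlist parameter are hypothetical, and this is not a description of how Slack or any specific flagging product works.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a user to a stable bucket in the range 0-99 for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(
    flag: str,
    user_id: str,
    percent: int,
    allowlist: frozenset[str] = frozenset(),
) -> bool:
    # Specific users (for example, internal testers) can always be included.
    if user_id in allowlist:
        return True
    # Everyone else is included once the rollout reaches their bucket.
    return bucket(flag, user_id) < percent


users = [f"user-{i}" for i in range(1000)]
for percent in (5, 25, 100):
    enabled = sum(is_enabled("new_search", u, percent) for u in users)
    print(f"{percent}% rollout: {enabled} of {len(users)} users enabled")
```

Hashing the flag name together with the user ID keeps each user's bucket fixed for a given flag, so raising the percentage only adds users and never removes anyone who already has the feature.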
Observing metrics impact
Just as feature flags can be used to measure the performance impact of new features or bug fixes, they can also support monitoring any other important metrics that the SRE team needs to track. Whether you're trying to monitor usage, load on your infrastructure, loading time for a specific portion of the application, or any of the other metrics that you’re tracking, measuring these metrics before and after a feature flag is flipped can give you a very precise look into how the functionality enabled by the feature flag affects these metrics.
Example: Suppose your SRE team has been receiving reports, or seeing in your monitoring, that the portion of your application where a user views information about their account has been performing slowly for the last few weeks. This is probably something you would want to look into. If your team has put most of the features on this page behind feature flags, you can disable specific features and see whether the performance of this page improves before turning them back on. Once disabling a specific feature or combination of features produces a significant performance improvement, you know where you need to look for the underlying issue. Without feature flags, this sort of testing could involve tedious back-and-forth discussion between the SRE and development teams. Feature flags allow each team to move faster and accomplish their work more independently.
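A before-and-after comparison like the one in this example can be sketched as follows. The `compare_around_flip` function and the sample data are hypothetical and made up for illustration; in a real setup the samples would be exported from your monitoring system, and the flip time would come from the flagging tool's change log.

```python
from statistics import mean


def compare_around_flip(
    samples: list[tuple[float, float]], flipped_at: float
) -> dict[str, float]:
    """Compare a metric before and after a flag flip.

    Each sample is a (timestamp_seconds, value) pair.
    """
    before = [value for ts, value in samples if ts < flipped_at]
    after = [value for ts, value in samples if ts >= flipped_at]
    if not before or not after:
        raise ValueError("need samples on both sides of the flip")
    avg_before, avg_after = mean(before), mean(after)
    return {
        "before": avg_before,
        "after": avg_after,
        "change_pct": (avg_after - avg_before) / avg_before * 100,
    }


# Illustrative account-page load times in ms; the flag is turned off at t=180.
page_load_ms = [
    (0.0, 800.0), (60.0, 820.0), (120.0, 780.0),
    (180.0, 410.0), (240.0, 390.0), (300.0, 400.0),
]
result = compare_around_flip(page_load_ms, flipped_at=180.0)
print(result)
```

A simple comparison of averages like this only suggests where to look; other changes that happen around the same time as the flip can also move the metric.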
Feature flags empower the SRE team to do their job more effectively, monitor the metrics they need to monitor, and ensure their work to optimize and maintain the performance and stability of the application doesn't interfere with the developers' new features and bug fixes.
Wrapping up
While feature flags are often thought of as a tool for developers that allows them to develop incrementally, SRE teams can use feature flags as well. SRE teams can leverage feature flags to help them debug variances in important metrics, perform gradual rollouts of new features to ensure a smooth experience for users, and help them observe changes to other metrics important to their application.
As the overall goal of any SRE team is to make their application more performant and stable using whatever tools they need, feature flags can be a powerful tool in that toolbox.