Feature Management in Extraordinary Circumstances
How feature flags can help software teams respond to a rapidly changing work, social, and business environment.
At LaunchDarkly, we spent the first couple of weeks of California's Covid-19 lockdown making sure that we would continue to provide the level of service and reliability you've come to expect. We know that we are part of the critical path and infrastructure for a lot of companies that can't afford any surprises right now. We feel confident that we are as prepared as possible for this new business environment. For more information, see https://launchdarkly.com/blog/launchdarkly-response-to-covid-19-pandemic/.
Recently, we've noticed some interesting stories in our support queue and feedback channels. Some of our customers, instead of slowing down their release cadence or dialing back experimentation, are actually using feature flags more heavily than before, to deliver essential services, target geographic areas with different restrictions, or unlock firewalls for specific information.
Others are using feature flags to quickly hide functionality that is no longer relevant or is potentially even insensitive to surface during a crisis. We wanted to highlight some of the ways that feature flags can help your company better respond to the extremely rapid changes we're all experiencing.
Banners and custom sites
Most websites you visit right now have a new banner that notifies you about that company's reaction to Covid-19.
These banners are a fast and efficient way to inform your customers about changes to your policies, hours, and way of doing business.
However, there are times when you want to deliver different information depending on the customer's location. For example, a pharmacy chain may be working with a delivery service, but only in places where that delivery service already exists. If you're in a rural or exurban area, you may not have a delivery option available. And if you advertise the delivery option at every store, when in fact this option excludes rural areas, your rural customers are going to have a frustrating experience.
We have seen one of our food-delivery customers make heavy use of feature flags and location targeting to change which options are available from their store locator. When a store can't offer take-out, they add it to a group named DELIVERY_ONLY. If a store is in that group, none of the ads, promotions, or take-out messages appear on the page, since they are all feature-flagged.
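As a rough illustration of that group-based approach, here is a minimal sketch in plain Python. It does not use the LaunchDarkly SDK; the store IDs, the function names, and the in-memory set standing in for the DELIVERY_ONLY group are all assumptions made for the example.

```python
# Hypothetical stand-in for the DELIVERY_ONLY group. In a real setup this
# membership would be managed in the flag service, not hard-coded.
DELIVERY_ONLY = frozenset({"store-114", "store-207"})


def show_takeout_content(store_id, delivery_only=DELIVERY_ONLY):
    """Return True when take-out ads and messages should render for a store."""
    return store_id not in delivery_only


def render_store_page(store_id):
    """Build the list of page sections for a store (names are illustrative)."""
    sections = ["hours", "delivery"]
    # Every take-out element sits behind the same check, so adding a store
    # to the group hides all of them at once.
    if show_takeout_content(store_id):
        sections.append("takeout_promotions")
    return sections
```

Because the page only asks whether the store is in the group, moving a store in or out of the group changes what renders without touching the page code.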
Safer code pushes
A developer who wishes to remain anonymous wrote in to tell us about how she accidentally pushed broken code to production. Her kids distracted her as she was fixing it, and she deployed instead of testing. Because it was flagged, the developer just flipped the flag to OFF, instead of having to redeploy or re-push any code. The mistake still existed in the code, but it wasn't executing anymore, so it didn't matter that it was broken.
Giving distributed teams an easy way to disable code that isn't working the way it should increases safety at a time when we can't all be in the same place. Even if a feature appears to be working when it first rolls out, it may encounter problems later, and anyone on the team can turn it off. Diagnosis and correction can happen offline, instead of requiring someone to wrangle code in real time or while on call. Flag control can be assigned to a group of people instead of an individual, which gives you more flexibility around time zones, interruptions, and unavoidable absences.
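The kill-switch pattern described here can be sketched in a few lines. This is a simplified illustration, not the LaunchDarkly SDK: the `flags` dictionary, the flag key `new-checkout`, and both checkout functions are hypothetical, and the deliberately broken function stands in for the code that was pushed by mistake.

```python
# Hypothetical in-memory flag store; a real one would be updated remotely.
flags = {"new-checkout": True}


def legacy_checkout(total):
    """Known-good code path."""
    return round(total, 2)


def new_checkout(total):
    """Stands in for the broken code that reached production."""
    raise RuntimeError("bug in new checkout")


def checkout(total):
    # The broken code is still deployed, but it only runs while the flag is on.
    if flags.get("new-checkout", False):
        return new_checkout(total)
    return legacy_checkout(total)
```

Setting `flags["new-checkout"] = False` sends every request back through `legacy_checkout` without a redeploy, which leaves time to fix `new_checkout` offline.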
Locations and availability
In the U.S., different states and counties have different stay-at-home and quarantine orders in place. If you're trying to comply with all local regulations, which are changing on an almost hourly basis, having the ability to target parts of your website and application by geographic location is valuable.
For example, health officials in the San Francisco Bay Area are asking people to stay in their neighborhoods as much as possible. If you're a health and fitness app, you could turn on or try out an exercise-route suggester that only covers the square mile centered on the user's home. On the other hand, users in rural areas may not need that restriction, because a ‘neighborhood’ could span dozens of miles.
Your navigation software may need to know which roads have been closed to become pedestrian paths, but you're going to want to turn them back into navigable roads as soon as the municipality does. That could be a feature controlled by a flag.
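One way to picture location targeting is as a lookup from a user's region to a flag variation, with a default for everyone else. The sketch below is an assumption-laden illustration in plain Python: the region keys, the variation names, and the `region` attribute on the user are invented for the example and are not taken from any real SDK.

```python
# Hypothetical targeting rules: region key -> variation of the route suggester.
REGION_RULES = {
    "us-ca-san-francisco": "neighborhood-only",
    "us-ca-alameda": "neighborhood-only",
}

# Users whose region has no rule (for example, rural areas) get the default.
DEFAULT_VARIATION = "unrestricted"


def route_suggester_variation(user):
    """Pick the variation for a user dict with an optional 'region' key."""
    return REGION_RULES.get(user.get("region"), DEFAULT_VARIATION)
```

When a local order changes, only the rules table needs to change. The same shape would work for the road-closure case, with a rule per municipality that is removed once the roads reopen.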
Load-shedding and reduced service
You've probably noticed some load and latency problems as organizations have added so many more people to online tools, especially online classrooms. Worldwide usage patterns have changed very quickly and very drastically, and some problems are inevitable. Especially at peak times, it can be difficult for students to log in and access their online learning content.
Part of the problem is that as each user comes online, they need to load all the elements of the page and the video. Once it's loaded, maintaining the stream is much easier. The natural human instinct is to hit Refresh if it seems like the page is stuck or loading slowly. That only extends the problem, because the server then starts sending the whole heavy package all over again.
Feature flags can help with this scenario in a couple of ways. Neither option is an ideal experience, but they are both better than absolute failures or server problems that can be triggered by excessive load.
First, you can use load-time triggered flags to strip down what you are delivering. For example, if your analytics indicate that it's taking over three seconds to load a page, you could have flags that turn off scripts, images, or auto-play elements. That way you're still delivering part of the page and allowing everyone to get online and settled into their connection. Once load times drop to an acceptable level again, you can use a percentage rollout to gradually restore the heavy elements that had been causing problems.
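A minimal sketch of that first option might look like the following. It assumes a measured page load time is available from analytics, uses the three-second threshold from the example above, and restores heavy elements with a simple hash-based percentage rollout. The function names and the flag key `heavy-elements` are hypothetical, and the bucketing scheme is a generic one rather than any particular vendor's algorithm.

```python
import hashlib

SLOW_PAGE_SECONDS = 3.0  # threshold taken from the example in the text


def bucket(user_key, flag_key):
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode()).hexdigest()
    return int(digest, 16) % 100


def serve_heavy_elements(user_key, load_seconds, rollout_percent):
    """Decide whether to send scripts, images, and auto-play elements."""
    # While pages are slow, strip the heavy elements for everyone.
    if load_seconds > SLOW_PAGE_SECONDS:
        return False
    # Once load times recover, bring them back for a growing share of users.
    return bucket(user_key, "heavy-elements") < rollout_percent
```

Because the bucket is derived from the user key, a user who gets the heavy elements at 25% keeps them as the rollout grows to 50% and beyond, rather than flickering between experiences.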
Second, you can use feature flags to do inbound load-shedding. Once your server is at 95% capacity, all incoming requests are sent to a different page with a request to try again in a given period of time. Although it's still a disappointing user experience, it's much better and clearer than watching an endless loading icon.
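Inbound load-shedding can be sketched in a similar way. The example below assumes the server knows its active and maximum connection counts, treats the 95% figure above as the threshold, and responds with an HTTP 503 and a `Retry-After` header as one possible form of the "try again later" page. The function name, parameters, and response shape are illustrative assumptions.

```python
CAPACITY_THRESHOLD = 0.95  # the 95% figure from the text


def route_request(active, max_connections, shedding_enabled=True,
                  retry_after_seconds=60):
    """Return a simplified response dict for one incoming request."""
    utilization = active / max_connections
    # The flag (shedding_enabled) lets the team switch shedding on or off
    # without redeploying.
    if shedding_enabled and utilization >= CAPACITY_THRESHOLD:
        return {
            "status": 503,
            "headers": {"Retry-After": str(retry_after_seconds)},
            "body": "We're at capacity. Please try again shortly.",
        }
    return {"status": 200, "headers": {}, "body": "full page"}
```

For example, `route_request(960, 1000)` returns the 503 response with `Retry-After: 60`, while `route_request(500, 1000)` serves the full page.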
As humans, our first priority is helping other humans.
As companies, our first priority is continuing to fulfill our promises—to our customers, our users, our owners, our investors, our governments. The best way we can do that is to continue to work to make software that is reliable, resilient, and useful.