
Making Releases Boring in the Enterprise

John Feminella, ThoughtWorks

Enterprises often suffer from a curse of accumulated risk. The applications they're building are large and complex, and thus need a lot of risk management process, which means releases take longer, which makes them riskier, and we're back where we started. What can we do to break out of this deadly, self-reinforcing feedback loop? In this talk, John will explore how combining key ideas from Progressive Delivery and observability leads to better, more resilient releases for teams of all sizes. In particular, John will cover how combining intelligent traffic shadowing with metrics-based canary phased rollouts can get you back to a happier place for releasing your software. Releases should be boring—let's make them stay that way.


John Feminella

John Feminella is an avid technologist, occasional public speaker, and curiosity advocate. He serves as a technical advisor to ThoughtWorks, where he works on helping enterprises transform the way they write, operate, and deploy software. John lives in Charlottesville, VA and likes meta-jokes, milkshakes, and referring to himself in the third person in speaker bios.

Dawn: 

 This is the first of four weekly Nano Series sessions leading up to Trajectory LIVE on August 26th and 27th. First, some housekeeping stuff. All participants are required to follow the code of conduct. It's been posted into the chat for you to review. Please use #trajectorynano when sharing content on social media today. If you have any questions during the talk, please post them into the chat. We'll be taking questions after John's talk.

 Last but not least, thank you to Rollbar for sponsoring today's talk. Rollbar automates error monitoring and triaging, so code can be deployed more often and with more confidence. Rollbar provides the safety net so teams can catch errors before their users do. Hi, I'm Dawn Parzych, a Developer Advocate with LaunchDarkly. I'm happy to have you join us today for the build edition of this Nano Series. At LaunchDarkly, we have identified four key pillars of feature management: build, operate, learn, and empower.

 Feature management is designed to span multiple teams and use cases, all of which are contained in these four pillars. Teams often start in build and gradually work their way through the other pillars, while other teams jump right in and start by using more than one pillar. Today, we're going to focus on the build pillar and hear from John Feminella, a Technical Advisor at ThoughtWorks. Build is sometimes called release management. This covers everything from ideation and design to deployment of a feature. This is the process of controlling and determining who sees a feature, and when.

 Giving select groups of users access to a release helps you fine-tune a feature before releasing it to everybody. These flags are typically either short-lived, temporary flags or long-term, permanent flags. Feature flags also enable you to do things like testing in production. While your test environment may be robust, chances are it's not an exact replica of your production environment. Being able to see how third-party components or microservices interact with new features can help eliminate surprises upon release. Flags can help you deploy a feature and see how it acts in production. Other ways that you can use flags during the build process are through targeted rollouts and canary launches. There are risks when it comes to releasing a feature. One way to reduce those risks is to use a canary launch. Canary launches are also known as percentage rollouts. You roll features out to a small number of users to assess the reaction of the overall system. Teams use this to measure the reaction from real users, the canaries, and they look for early indicators of success or danger.
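As a rough illustration of how that kind of percentage rollout decision is often made, here's a minimal sketch in Python. The flag name, user key, and bucketing scheme are all hypothetical, not any particular vendor's implementation; the idea is simply that hashing the user key keeps each user consistently in or out of the canary.

```python
import hashlib

def in_canary(user_key: str, flag_name: str, rollout_percent: float) -> bool:
    """Deterministically bucket a user into a percentage rollout.

    Hashing the flag name together with the user key keeps the decision
    stable across requests, so the same user always sees the same
    variation while the rollout percentage stays unchanged.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_key}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10_000          # buckets 0..9999
    return bucket < rollout_percent * 100          # e.g. 1.0% -> buckets 0..99

# Example: roll the hypothetical "new-checkout" feature out to 1% of users.
if in_canary(user_key="user-42", flag_name="new-checkout", rollout_percent=1.0):
    pass  # serve the new experience
else:
    pass  # serve the existing experience
```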

Start with a small percentage of users and gradually scale that up to 100% as confidence in the feature grows. If the feature is not good, you can roll it back or turn it off with the feature flag. Using canary launches is just one way to make a release boring in the enterprise. I'm happy to introduce to you today, John Feminella. John is an avid technologist, occasional public speaker, and curiosity advocate. He serves as a technical advisor to ThoughtWorks, where he works on helping enterprises transform the way they write, operate, and deploy software. John lives in Charlottesville, Virginia and likes meta-jokes, milkshakes, and referring to himself in the third person in speaker bios. John, the stage is yours.

 John:

 The conventional wisdom around software releases is that releasing more often is objectively better. But in many enterprises, the releases of software products to production environments happen far apart, weeks or months after the software has actually been written. In this conventional thinking, those long gaps lead to greater risk around releases, primarily because there is greater uncertainty about the accumulated differences between subordinate environments and production.

If you're releasing to an environment that looks really different from production, you're not learning all that much about whether it's going to work in production. You haven't mitigated risk that much. Following the conventional thinking, if releasing more often is better, shouldn't we want to do it as often as possible? Not exactly. Simply trying to go faster without better controls will come at the cost of quality, like trying to drive too fast on a winding road during a thunderstorm.

What can we do to make releases routine and perfunctory, so that we can ship to everyone with the least risk possible? In this talk, we'll offer some ideas for how to do just that in an enterprise setting. Hi, my name is John Feminella. I spend a lot of time helping enterprises and technology executives deliver portfolios of software better. If you haven't worked with a large company before, it might be surprising to learn just how much happiness, energy, and money are wasted through staggering amounts of inefficiency. We can do better, and if a firm doesn't do better, its competitors might soon teach them the consequences of making that mistake.

First, let's talk about some of the risks that might occur depending on the approach to and strategy around releases. It will help to think about risk if we first imagine what ideal software delivery might look like and then observe what kind of deviations we see in any given firm's approach to software delivery. Let's think about what it means to do software delivery well at a firm. Ultimately, we're trying to serve our customers. Those customers want to pay for outcomes that improve something about how they work or how they live or, if they're a business themselves, how they help their own customers.

The job of your firm is to provide those outcomes that the customers want. How you do that will be very different for different businesses. For example, if you're a restaurant, the kinds of outcomes your customers want are probably great dining experiences. Whereas if you're an airline, back when people were still actually flying on planes, you need to get your customers from A to B safely, expediently, and comfortably. That means that, like it or not, any company that delivers its outcomes through software must effectively contain a software delivery organization. Even if the airline or hospital or insurance company doesn't think of itself as being a software company, it must nevertheless be good at being a software delivery organization, if that's the key way that its customers get the outcomes that they care about.

That's a tough pill to swallow for some firms, because it's a pretty big shift in mindset and product thinking, but it's true. The first experiences customers have now take place largely on the web or on a mobile device, without ever setting foot in a store. If the first experience they have with your software sucks, you're probably not going to get a second chance. If we need to be good at delivering software, what does good delivery look like? One abstraction I've found helpful is to think of software delivery as an assembly line that various people and teams interact with. On one end, you've got the product owner who wants to deliver outcomes to customers. On the other end, you have the customers who want to pay for those outcomes. In the middle between them, you have the teams, the product engineering and development teams, who are responsible for delivering a software product to that customer.

What needs to be true about our delivery pipeline for those product engineering teams to be effective at delivering software while minimizing risk? I think we need three ingredients. First, we want to minimize the time spent in the pipeline to keep it as short as possible, so that we can deliver the outcomes that users want in a timely way. If it takes us a long time to ship software, our ability to respond to feedback from the customer is going to diminish rapidly. And we probably won't be able to react quickly enough to changing conditions in either the macroeconomic environment or the competitive landscape.

Second, we want to be able to repeat the pipeline as much as we can in a given time. We want to maximize the frequency, the number of times that the pipeline is traversed in a given time. If there are long lead times into each new iteration of the pipeline, that's going to preclude us from satisfying our first objective. Third, we want to deliver the right product and the right outcome to our customers. If we build something that doesn't address their needs or it doesn't deliver the outcomes that they needed, they're probably not going to be happy. But if the feedback loop to product teams is poor, intermittent or low in quality, we risk doing just that. 

If you have the ability to ship software to users every time you're deploying, you're doing continuous delivery. If you also minimize pipeline time and maximize frequency and user happiness, then you're doing continuous delivery well. But most enterprises don't do this well. In doing so, they create three types of delivery risks. First, simply trying to go faster is more akin to rushing than accelerating. Defect rates will spike, things will slip through the cracks, and tests won't be written that should have been. It also ignores the signals that teams are providing about their own velocity, diminishing whatever autonomy they have and creating secondary cultural knock-on effects.

Second, ramping up the frequency can lead to teams clobbering their own work, releasing to environments or moving through pipeline stages that aren't ready for them yet. This is especially problematic when the pipeline in an enterprise environment has lots of stages you have to pass through before you reach production. Third, enterprises usually don't have great customer feedback loops to the engineering team. It's almost unheard of in many enterprise environments that a customer would ever get to interact with a product owner or an engineer. The team probably won't even get direct feedback of any kind.

That leads to a disastrous game of telephone, where the feedback channel isn't very useful for informing what the team should work on next to help customers the most. Even worse, all three of these problems are exacerbated by the fact that when releases happen, they happen to everyone at the same time. When we deploy our software to production, we're deploying it to everybody. Is it possible to do better and limit the blast radius for releases? I think it is. If we can focus on not trying to do this for all users, then we can move away from this all-or-nothing gamble, where the odds are not in the team's favor.

One approach that's working well for many firms is called progressive delivery. To understand progressive delivery, it might first help to talk about how it compares to its ancestors, continuous delivery and continuous integration. A typical pipeline might look something like this. First, we kick things off with some committed code. Whenever somebody makes a new commit to a branch, we'll trigger the pipeline. Then we'll produce an artifact and run the tests that we need. Then we'll record that artifact and canonicalize it as the result of that build. Then we can choose whether that deployment should or shouldn't be delivered to our environment.

In this example, we'll assume there's only one environment, which we'll call production. But you can imagine doing this several times for any number of other lower environments like staging or QA. Once it's delivered to production, everyone can see the work that we've done. Everyone is the target of that release. There is no way to slice and dice who gets to see which parts, because we deployed a single thing and we released it to everyone. You'd say that this pipeline is doing continuous integration if the step between deployment and delivery is a manual one. If someone has to push a button to deliver the deployment to a particular environment, then the creation of the artifacts is automated but the delivery is not. A human still has to press the button to make a call about whether or not to go.

That's continuous integration. Because we're always able to produce a new working artifact that represents our code. But it's not continuous delivery because the environment might not be ready for our new artifact. If we're doing continuous delivery, then at any moment, the software can be delivered by pressing that button. It doesn't have to go to some target environment. But it can, if that's the right thing to do at that point in time. In fact a human might not be in the loop at all. For many products, it might make sense to get new things into the hands of customers as quickly as possible. But there is one important thing to note here.

In both continuous delivery and continuous integration, we're still releasing one thing to everybody. They all have the same experience, regardless of what kind of user they are. If this release is bad, we might wind up impacting everybody. But we don't want to be overly conservative. Is there a way to manage the risk of releases without impacting our delivery pipeline? Progressive delivery offers one answer to this. Instead of releasing to everyone and instead of making the functionality that users see gated on releases, let's have them gated on something else instead. Like the level of risk a particular user is willing to tolerate or the impact of their desired outcomes if we get it wrong. 

That happens in two ways, user segmentation and feature delegation. Let's talk about each of those. User segmentation means that you don't think of your users as a homogeneous group of people that receive the same functionality in a given release. Instead, you slice them up into some subset of meaningful groups that you'd like to release to in a phased way. The way in which you choose to do this, this slicing, will depend a lot on the products you're building, the risk of a particular release and a number of other factors that will be specific to your firm and your overall product strategy. But in general, there's a few guidelines that we can apply to everyone. 

First, the best way to slice follows natural divisions in user groups that probably already exist. For example, if you have folks who've opted into some beta, that's a very natural target group that might receive features earlier. Or if you have various pay tiers and a free tier, that's another division you might consider leveraging. If you don't have natural divisions that map well to this segmentation, you might instead consider slicing by impact. Which user groups would be most affected by this release? For example, if you're about to roll out a change to a banking platform's authentication systems, maybe you prioritize a sample of folks who log in very frequently to get the new flow, so that you'll quickly get evidence about whether it's faster for them to use or not.

You might also consider slicing by which groups are most likely to give you feedback. Keep in mind that feedback might not be direct; it might come in the form of indirect signals that you obtain through observability, like spending less time on a page to accomplish a task that used to take them longer, or purchasing more widgets than they used to. Once we have those segments, we need to choose some reasonable ordering. In general, we're going to want to place user segments which have the highest value for understanding how to proceed with the release towards the top of the list. In the example I'm showing here, we're valuing rapid feedback from lower-risk user segments.

We might choose to release first to our internal developers here, then to public power users, and finally to the broader casual user base. You can think of this approach like releasing to an increasingly larger set of users. You start with some trusted core that's likely to give you rapid, honest feedback and where you have incentives that align with that feedback. From there, you expand in a phased way until you cover the entire user base. The other side of this is feature delegation. Who gets to control when the next gate opens for the next set of users? The people who perform the technical work of making a release happen in a pipeline should generally be independent of any particular product. It doesn't scale well for each team to have separate and structurally distinct operations folks.

But at the same time, it also doesn't make sense for platform engineering teams or pipeline engineering teams to know about the details of a given release. That's not really their job, and it's not something they should be focused on. That control belongs closer to the team. Getting that balance of control right is at the heart of feature delegation. Who turns on the switch for a given feature? Generally, we want control of that switch to be placed with the specific teams who have the product-facing responsibility of doing the work. Product owners are the ones who have the accountability for the success of their product. In general, they should be the ones that are in control of that button. Autonomous teams should always be empowered to own their own success in this way.
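To make those two ideas a bit more concrete, here's a minimal sketch in Python of a phased rollout; the segment names, feature name, and `PhasedRelease` class are hypothetical, not taken from any particular tool. Segments are released in a fixed order, and advancing to the next one is an explicit decision that belongs to the product team rather than to the pipeline.

```python
from dataclasses import dataclass, field

# Segments ordered by who should see the release first: a trusted core that
# gives rapid, honest feedback, expanding out to the full user base.
ROLLOUT_ORDER = ["internal-developers", "public-power-users", "casual-users"]

@dataclass
class PhasedRelease:
    feature: str
    enabled_segments: set = field(default_factory=set)

    def is_enabled_for(self, user_segment: str) -> bool:
        return user_segment in self.enabled_segments

    def advance(self) -> str:
        """Open the gate for the next segment in the rollout order.

        In practice this call sits behind a button or API owned by the
        product owner, not by the platform or pipeline team.
        """
        for segment in ROLLOUT_ORDER:
            if segment not in self.enabled_segments:
                self.enabled_segments.add(segment)
                return segment
        return "fully released"

release = PhasedRelease(feature="new-search")
release.advance()                              # internal developers see it first
release.is_enabled_for("public-power-users")   # still False until the owner advances again
```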

There are a lot of different technical approaches we can take to make these ideas reality in our own systems. While it would be tough to talk about a comprehensive strategy around this, because that's going to be really specific to every firm and their engineering culture, I do want to talk about a couple of building blocks that I think are important to consider. First, feature toggles, which you might sometimes hear called feature flags, let teams refer to and modify system behavior without having to change code. They fall into various usage categories. It's important to take that categorization into account when implementing and managing toggles, because toggles introduce a lot of complexity.

In general, we're going to want to constrain the number of toggles in our system. We want to make sure that a feature is eventually released to everyone. We don't want toggles to stick around forever. It is possible to go too far with feature toggles or feature flags, and they need to be used judiciously. Like in this example, where an errant feature flag in a production environment that wasn't set correctly led to a half-billion-dollar trading loss for a company called Knight Capital in 2012. In general, I think you want to treat feature toggles like they're branches in your code repository. There shouldn't be too many of them at the same time. If there are, that's a hint that too much is going on at once.
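One practical lesson from incidents like that is to keep the old behavior as the default and fail closed if a flag can't be evaluated. Here's a minimal sketch of that guard in application code, using a hypothetical flag name and an in-memory flag store rather than any specific vendor's SDK.

```python
def flag_is_on(flag_store: dict, flag_name: str) -> bool:
    """Look up a toggle, failing closed to the existing behavior.

    If the flag is missing or the store misbehaves, fall back to False so an
    unconfigured environment never runs the new code path by accident.
    """
    try:
        return bool(flag_store.get(flag_name, False))
    except Exception:
        return False

flags = {"new-pricing-engine": True}

if flag_is_on(flags, "new-pricing-engine"):
    pass  # new code path, still guarded until the toggle is removed
else:
    pass  # existing, well-understood behavior
```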

If the number is more than a handful of toggles, you'll also want to consider some approach to managing them in a cohesive way, particularly if you adopt feature toggles as a strategy not just in one application, but as a thing that you do across your entire portfolio. The other technique I want to talk about is called traffic shadowing. This is a deployment pattern where production traffic is asynchronously copied to a non-production service for testing. Shadowing is a close cousin to a couple of other deployment strategies you may have heard of, canary releases and blue-green deployments. Here's how it works.

Let's say we've got this setup: a client, a proxy at service.company.com, and an API at service-api.company.com. If we introduce an additional copy of the service, the blue box on top, one way to validate that everything's going okay is to deploy the new release as the shadow service and then duplicate traffic to it, to see how our updated release behaves. Typically, for our production environment, we're only going to duplicate a slice of this traffic, let's say 1%, and then we throw the results away without returning any response. We're not trying to respond back to the proxy; we're just trying to understand how the new release behaves when it receives the traffic we're getting in our real production environment.
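As a rough sketch of that duplication step, assuming a hypothetical shadow hostname and Python's `requests` library as the HTTP client: the proxy serves the real request normally, and for roughly 1% of requests it also fires a copy at the shadow service on a background thread and ignores whatever comes back.

```python
import random
import threading

import requests  # assumed available; any HTTP client would work

SHADOW_URL = "https://service-api-shadow.company.com"  # hypothetical shadow deployment
SHADOW_SAMPLE_RATE = 0.01                              # duplicate ~1% of traffic

def shadow_request(method: str, path: str, body: bytes, headers: dict) -> None:
    """Fire-and-forget copy of a production request to the shadow service."""
    try:
        requests.request(method, SHADOW_URL + path, data=body,
                         headers=headers, timeout=2)
    except requests.RequestException:
        pass  # the shadow copy must never affect the real request

def handle(method: str, path: str, body: bytes, headers: dict):
    # The real request goes to production as usual; its response is returned.
    response = requests.request(method, "https://service-api.company.com" + path,
                                data=body, headers=headers, timeout=5)
    # Asynchronously duplicate a slice of traffic; the shadow response is discarded.
    if random.random() < SHADOW_SAMPLE_RATE:
        threading.Thread(target=shadow_request,
                         args=(method, path, body, headers), daemon=True).start()
    return response
```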

We might see whether we generate any HTTP 500s, or whether we crash some other downstream service that wasn't expecting that traffic, that sort of thing. Shadowing traffic like this has a few benefits over simply doing blue-green and canary testing. First, there's zero production impact. Since traffic is duplicated, any bugs in services that are processing shadow data don't have any impact on production. Second, we're testing production services. Since there's no production impact, shadowing lets us test those persistent services in a way that's independent of whatever may already exist. You might configure your test service to store data in some other, separate test database and then shadow traffic to it to make that kind of testing possible.

 With canary deployments and blue-green deployments, you'd need more machinery to get this kind of testing. Here, you can test the actual behavior of a service in a production environment with production traffic. Unlike, say, doing the same thing in staging, you're really using real production traffic without impacting the production service. That lets you do a canary rollout that compares on the basis of production traffic, so it's a way to do user segmentation in a really low-effort way. You don't have to actually segment your users; you can just segment the traffic.

 Those two techniques, taken together, are essential for figuring out how you might pursue a progressive delivery strategy. They can help you get closer to doing effective user segmentation and effective feature delegation. That's all I had to say about progressive delivery today. If I can leave you with one thought, it's this: not every user is equal in a release. To the extent that we can differentiate between them and target the right users for each release, we're going to mitigate our overall risk without compromising our delivery efforts. If we can make our releases lower risk, then we can make them routine and perfunctory, just like we want. In an enterprise environment, that can make all the difference between a burned-out team that's pushing through yet another arbitrary deadline and an engineering team that's happily delivering on its commitments to every user. I'm John Feminella. Thanks for listening to my talk.

 Dawn:

 We're now going to dive into some Q&A. If you have questions, drop them into the chat and I'll be here chatting and sharing your questions with John for the next 10 minutes. Let's kick things off with: you talked about progressive delivery, so what prerequisites does a company have to complete before starting progressive delivery? Or could they jump right in without doing continuous integration or continuous delivery first?

John:

Yeah, no, absolutely not. I would say that it's a fairly advanced strategy and you need to have your house in order before you decide how you're going to release progressively to people. Before you can do that, you need to be able to be doing releases effectively to begin with. If you're not at a point where your releases are reliably getting built, if you don't have continuous integration, that's certainly step one. If you don't have a pipeline, that might be a good litmus test to say, okay, you're definitely not ready for anything remotely resembling progressive delivery.

I would say a good litmus test for the fact that you are ready is, if you have continuous delivery but you're still releasing to everyone, and you're seeing some impact to certain segments of users that might be mitigated by taking a more progressive delivery approach, that's probably a good time to start thinking about it. Another good time to start thinking about it is once you have more of a portfolio of applications. If this is something that would be useful to adopt at the portfolio level, taking progressive delivery to one particular part of the overall release strategy of that portfolio might be a good way to dip your toe in the water. Once you have the machinery in place to do that, you can expand it to everything that you deploy.

Dawn:

Great. A good follow-up to that is, how do you get the buy-in to make these changes and to do progressive delivery at your organization?

John:

Yeah, that's a great question. First of all, it's not easy, especially in an enterprise, which was the focus area for this talk. In an enterprise organization, there are going to be a lot of different stakeholders, a lot of different interests in how releases happen and who gets to do them and how they should be delivered to users or not. There will certainly always be that, to some degree, in every company. But I think an important aspect of progressive delivery in the enterprise and getting buy-in from those stakeholders is helping them understand what value you're going to bring by getting feedback from users faster. I think a lot of times the enterprise's default modality for releases is: do a big bang, release to a large slice of our users, if not all of them, and then cross our fingers that it's all going to go okay.

If you're a little bit more sophisticated, you might decide to do that based on some kind of user segmentation. But even then, I think the releases still tend to be pretty big bang. For example, I'm familiar with a couple of banks that will roll out, let's say, changes to how they store user profiles or updates to authentication schemes to different customer bases within the overall lines of business for the bank. They might roll it out to, say, cards first, then to retail banking, and then to different verticals within the business. Again, that still feels like a pretty big-bang release, because it's just like you released to all users whose names start with the letters A through F and then G through J, that sort of thing.

I think that getting to a point where the enterprise is comfortable doing progressive delivery really means having a conversation about: what would be the benefits of having feedback from users sooner? What would we be able to do that we can't do today, or can't do as well today, without having that feedback? In particular, one of the things that I think plagues enterprises, like we talked about, is longer release cycles: longer times, not just during a sprint, but longer times between releases, longer time spent in that pipeline between ideation and deploying to production.

If you can shorten that time by having faster feedback cycles from users, then you lower the risk considerably that you're going to build something that nobody wants, or that nobody really sees what you've built until relatively late in the game. You're bringing the first point at which users interact with or see your product and can give you feedback on it much earlier in the life cycle. I think that's an attractive proposition for a lot of enterprises, no matter their size, and the way you're going to win them over is to convince your boss or your manager that having that feedback earlier and sooner would be good.

 Dawn:

 Again, a follow-up to that is: part of making your case relies on analytics. What role might analytics play in tracking actual application usage, and therefore testing and, in turn, application development?

John:

What role might analytics play? That's a great question. Like I said during the talk, the natural divisions probably exist in the user base already. It probably won't be a surprise to learn that there are different kinds of users in a user base. I think it's actually comparatively rare that you'd treat users as a homogeneous thing. This probably most easily manifests in any pay tier versus free tier that you might have. Or if there's something like the first users that were on your system, the order in which people signed up, those sorts of things will lend themselves relatively naturally to divisions.

Where analytics comes into user segmentation is in helping you understand to what extent those natural divisions should be relied on for segmentation. If, for example, you have a free tier and a pay tier, but it turns out that the pay-tier users don't actually behave any differently than the free-tier users, that would be a bad basis for segmentation. You'd want to find some distinction between them, and maybe it isn't the free versus pay tier. Maybe it's some other interesting dimension, like perhaps how many times they log in per month, or how many airline tickets they book per month if it's an airline, or how many times they dine out if it's a food delivery service, that sort of thing.

 Whatever the metric or dimension you're slicing on is, analytics can provide richer dimensionality in terms of the library of things you can choose from for segmentation. If you don't do those sorts of analytics, you may not have the right kind of visibility. If you don't have natural divisions, then it's going to be pretty much a prerequisite to have some understanding of the user base, and analytics will help you get there. If you do have natural divisions, then analytics can help you validate whether those natural divisions are the right ones to be slicing on.
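As a rough sketch of what that validation might look like, with made-up data: compute the behavioral metric you care about for each candidate segment and check whether the segments actually differ before you build a phased rollout around them.

```python
from statistics import mean

# Hypothetical per-user metric, e.g. logins per month, grouped by candidate segment.
logins_per_month = {
    "free-tier": [2, 3, 1, 4, 2, 3],
    "paid-tier": [2, 4, 2, 3, 3, 2],
}

averages = {segment: mean(values) for segment, values in logins_per_month.items()}
spread = max(averages.values()) - min(averages.values())

# If the segments barely differ on the behavior you care about, they are a weak
# basis for a phased rollout; look for another dimension to slice on instead.
print(averages, "difference:", round(spread, 2))
```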

 Dawn:

 Okay. When we're talking about user segmentation for releases, is that driven by building access controls into the application at user-segment granularity, or are there tools that help you drive and figure out this type of release and how to do the segmentation?

 John:

 If I'm understanding that question correctly, it's: how should we actually gate the segments of users? What should we do to control who gets to see what? First, you're going to identify what those segments are, right? Then, how you're gating them usually isn't an access control thing. It's more like, we understand that this user is of type X and the feature is only available to users of types X and Y, or whatever. Really, you're exposing the feature to whatever set of users, based on those segments, that the product owner has decided it's available to. The gating happens through deciding what the segments are and then enabling or not enabling that feature per segment. How that happens could be, and most commonly is, through feature toggles.

Dawn:

Great. For users, you've mentioned the possibility of releasing first to internal users. Do you suggest creating dedicated test accounts in production so that test engineers could be those internal users?

John:

Oh, that's a good question. Should we create dedicated test accounts in production so that the testing engineers could be the internal users? The intent of releasing first to internal users is really more about cases where those people would be using the product anyway. If you're just faking it by having test accounts in production, that's not really the same thing. It should be more like, if you are on, say, LinkedIn and you are an employee at LinkedIn or Microsoft, and you're using LinkedIn every day, the product owner of some particular feature might choose to release to that segment of users, the internal users, first.

When I say internal users, I don't mean contriving a set of employees who are forced to smoke test the products. I mean the case where your product is one that people use in the company, because it's, say, an employee productivity tool of some sort, for example; then that's the case where you'd release to internal users. Just contriving a test account and then releasing to anybody with a test account is more like smoke testing than it is about user segmentation, because you're not getting genuine feedback per se from a user, you're just having someone validate some use cases. I'm not saying that's not valuable. But that's not really the point of doing user segmentation in that way.

Dawn:

Okay. We're going to go back to feature toggles, and there are multiple parts to this. I'm going to make sure you get it all in there. You mentioned having too many feature toggles or feature flags in flight is a bad thing. What would you consider to be too many?

John:

Yeah.

Dawn:

What would be an ideal timeframe for removing them to eliminate the technical debt? Then, should you avoid releasing new features alongside technical cleanup?

John:

Okay. That question seems to conflate, if I'm hearing it right, technical debt with feature flags. I don't think feature flags are inherently a form of technical debt. Sorry, let me answer the questions in the order they were posed. The first one was, how many is too many? That answer is going to be really specific to the team and the product in question. I mean, I would expect that more complex products might have more features in flight that they may want to gather data on and to test. It's just about, what's your capacity for doing that well?

I would expect a smaller team not to have 20 features in flight that they're all simultaneously testing in production with users. There's a balance there, and it's based on how much your team can actually handle working on at the same time. A good order of magnitude, I think, is roughly that you probably should not have more feature toggles than about the number of git branches you feel like you could comfortably have if you're doing trunk-based development, for example. If you have 500 branches in your git repository, most people would say that's probably too many to be working on simultaneously.

If you similarly have 500 feature flags, each of which is independently being activated or not, that creates a lot of different permutations in the user experience that's probably going to be difficult for a small team to manage. That's the first part of that question. I think the second part was, what's the right timeframe for removing those? I think you stop using a feature flag when it's available to everyone. Once it's available to everyone, there's no point in having it anymore, because you're no longer trying to segment. The exception would be if you want to continue to maintain a certain level of operational control over the feature, not for user segmentation purposes but for operational purposes; I consider that somewhat different.

Then it's more like operational control over whether that feature is active or not, and you could disable it if it later turns out to be problematic during some outage or something like that. That's in a different category for me. But if you're using it for user segmentation, that's where the git-branch approximation stands. If you have 50 different things that you're trialing to different subsets of users, that's probably too many, for example. If you have one or two, that's probably fine. If you have five or 10, that's probably going to depend on how comfortable your team is with doing that many.

I think the last part was, should you avoid releasing new features alongside cleaning these things up? I think, again, that's a very specific question for each team. Everybody is going to have their own comfort level with the balance of how much to invest in paying down technical debt, and what they're trading off by accruing technical debt or by introducing these additional feature flags. It's tough for me to answer that question in a general way. But I would certainly say, again, if all you do is just keep adding new feature flags and new user segments, and you're not actually taking any away at any point, that's something of a red flag, so to speak.

 Dawn:

Yeah. As always, "it depends" is the answer to every question ever.

John:

I know that's not very satisfying to hear. But yeah.  

Dawn:

But it's true. Again, this might feel like an "it depends" as well. What would you say are some best practices to ensure that a release doesn't break the app, even when the code is behind a feature flag?

John:

What are some best practices to avoid breaking the app even when the code is behind a feature flag? Well, I think that's, again, tough to answer in a general way. There are any number of best practices that limit the blast radius of how a new feature does or doesn't interact with the rest of the system. For example, modularity of the code and ensuring that something can be deployed independently of something else; those would all be operational things you could do to make sure that whatever you just released isn't going to blow something else up, or that it can be scaled up and down independently, that sort of thing. The goal of the feature flag for user segmentation purposes is to limit who sees what and to target the feedback you're getting in a phased way, rather than trying to release it to everyone at the same time.

The risk there is upsetting too many users or getting it wrong from a product perspective. There's a whole separate category of things that you would or wouldn't do around the operational stability of a release and making sure that you're deploying something in a way that's sensible and so on. That gets more into things like canary releases or blue-green deployments. You want to make sure that the code just actually works when you deploy it to the production environment. At the point where you're doing feature flags, you probably have validation that the app will run and start in its container, for example, and can receive web requests. But maybe the experience that the user has isn't a good one when they're doing that.

There are different concerns. The things you would care about to avoid breaking a release entirely are different than the things you would care about for putting something behind a feature flag. To go back to a previous question, how do you know if you're ready or not? If you're not at the point where you can feel comfortable that a release deployed to production will work at all, then you definitely shouldn't be going to feature flags. You need to be sure that the release is going to actually work before you decide to gate individual features of that release behind a feature flag. 

Dawn:

 Great. Thanks. Okay. If there is a regulatory requirement that doesn't allow releasing a product to users without compliance approval, so getting feedback from users is not in scope, where does progressive delivery fit into such a scenario?

John:

Yeah. Great question. Right. Of course, a lot of enterprises have a risk tolerance that maybe requires a sign-off from compliance or a sign-off from specific kinds of people within the organization. What I've found is that it's worth it to make sure you understand what the actual compliance step is, because I think that oftentimes the way that enterprises have built their pipelines is somewhat calcified. Because we've been doing it this way for a while, well, of course it must continue that way for the next five years or the next 10 years or forever. Often, the compliance requirement is not, per se, that you can't release this to users at all without going through these seven gates. It's more like, when you do things of this kind, when you have these kinds of impacts, we want to make sure that you've talked to us or gone through that kind of compliance.

What enterprises are trying to do when they introduce these compliance steps is to reduce risk, right? What they're doing is introducing a human validation to say, "We wish to increase our confidence that this is going to work and reduce our risk that this is going to be out of regulatory compliance." Viewed through that lens, I think it requires a good mindset from the compliance folks, but I think it opens the door to having a discussion about, "Okay, what could we do to get feedback from users in a way that would satisfy the compliance obligations?"

I don't think people working in a compliance division are deliberately trying to be obstructionist with releases; they're just trying to make sure that the risk of going to production with a particular release is minimized, because that's their job: to help minimize and manage that risk for the organization. To that end, I think it's worth having a conversation with the compliance folks, at the very least, and saying, "Look, we're interested in getting feedback from our users. Progressive delivery is one way of doing that. But if you won't accept that, what could we do to work together to have that kind of feedback loop in place?" Maybe it doesn't come directly to the developers, or maybe it does. Maybe they have direct access to the feedback portal where our users submit tickets and questions and things like that. That might be another way that they can get that kind of feedback loop in a more regulatorily constrained environment.

Dawn:

 Okay. Switching gears a little bit, how can we apply feature flagging and these practices to a microservices architecture?

John:

How can you apply feature flags to a microservices architecture? I don't think there's actually any fundamental difference there. There are some questions, perhaps, about where you manage the flags. If you're doing this at the portfolio level, where it's not confined to a specific service but rather to a graph of them, or even bigger, like the whole product portfolio, then like I said, you're probably going to want some kind of management interface or some kind of management approach to looking at all the feature flags for everything across your organization.

Sometimes that takes the form of, for example, a configuration repository that people manage and change. Sometimes it's a SaaS management tool, like LaunchDarkly, for example, that lets you take a look at that across a product portfolio. But whatever it is, the architectural underpinnings are less of a factor than just understanding where all the feature flags are and who owns them and who manages them. Some special considerations that might occur in a microservices environment are really around the complexity of the feature flags.

If you're turning things on and off on individual services, and those services collectively participate in some larger, more interesting request that a user cares about, then you're going to want to make sure that the user experience maps well to the feature flags that you're turning on or off. What that means is, for example, if the overall feature crosses several microservice boundaries, then you're going to want to make sure that when you turn things on or off on a feature flag basis, you're doing so at the grouping of whatever experience users are trying to have.

If you're just flipping individual switches off and on, on a per-microservice basis, without considering the larger user context, you're going to wind up, I imagine, in situations that don't make any sense. Like having three services that need to participate together to tell a user how many dollars are in their bank account, and you turn the feature flag off for one of them but not the other two. Wait, what does that mean? That probably doesn't lead to a state that makes any sense. You probably want to have those be switched on or off at some global or higher level, rather than on a per-microservice or per-service basis.
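As a minimal sketch of that idea, with hypothetical service and flag names: the flag is keyed to the user-facing capability, and every service that participates in that capability evaluates the same flag the same way, rather than each microservice flipping its own independent switch.

```python
# One flag per user-facing capability, shared by every service involved in it,
# rather than one flag per microservice. Names here are hypothetical.
CAPABILITY_FLAGS = {
    "show-account-balance": True,  # accounts, ledger, and fx services all consult this
}

def capability_enabled(capability: str) -> bool:
    """Every participating microservice asks the same question the same way."""
    return CAPABILITY_FLAGS.get(capability, False)

# In the accounts service, the ledger service, and the fx service alike:
if capability_enabled("show-account-balance"):
    pass  # each service does its part of the new balance experience
else:
    pass  # each service serves the old behavior consistently
```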

Dawn:

Yeah. Some shameless self-promotion here at LaunchDarkly: we do have a blog post on feature flagging in a microservices architecture. We dropped the link to that post into the chat. If you're looking for more information on that aspect of feature flags and microservices, check it out. We're wrapping up now; we've got time for one more question. In that time, can you leave our audience with one piece of advice on building with feature flags in the enterprise environment and making those releases more boring?

John:

Yeah. The number one thing that I think people should maybe think more about, that isn't necessarily in the head space of folks in the enterprise, is that user feedback is really important. If you're in an environment where you have been operating or building or releasing a product in what I think is really the complete absence of direct user feedback, it might be worth thinking about how access to that kind of feedback could be useful for your team. How would it inform the decisions you'd make as a product owner, as a team member, or as a technical lead on that team? And what information would be most beneficial?

I think, oftentimes, progressive delivery, or an approach like that which shortens the feedback cycles and allows you to release more quickly to individual segments of folks, is one way of doing that. Regardless of how you get there, whether it's progressive delivery or something else, I would encourage enterprise teams to think especially hard about what it would mean and how they might get closer, faster, better access to feedback from individual users.

Dawn:

Great. John, thank you once again for joining us today. This has been so informative; I've really enjoyed it. I want to thank everyone else who took the time out of their schedules to attend. Last but not least, we also have to thank Rollbar for sponsoring today's Nano Series Talk. Join us again this time next week, where we'll address the operate pillar of feature management. Rich Manalang will be joined by Michael McKay from IBM.