(upbeat music) - Hey, welcome to Trajectory Live. I'm Edith Harbaugh, and we have two really esteemed guests here today to talk about how feature flags are transforming modern software development. So do you two want to introduce yourselves? Steve, you wanna go first?
Hi, I'm Steve Constantin. I'm a senior principal engineer at Express Scripts.
Hello everyone. My name is Mick Currey. The title is Enterprise Architecture for Cloud and Cybersecurity, but all that means is I help people go to the cloud.
I'm really happy you both are here today to talk to us about software development and feature flags. So I know you both have a lot of experience on this. So tell me about, when was the first time you were introduced to feature flags? And when did you realize how much feature flags could power beyond just simple on off switches? Mick, you wanna go first?
Sure, I can take that first. It was a while ago, and I won't tell you it was a perfect experience, you know. It was on one of those high-performing agile teams and we were really lucky: after every release, we had a week to do whatever we wanted. And so during one of those weeks we said, "Hey, why not implement feature flags?" And so we attempted it, and it wasn't stellar. It wasn't perfect. And to just set the stage, we worked in weekly iterations, and we would normally do two to three things each iteration, not one thing. So we thought, "How hard can this be?" (Edith and Mick laughing)
How hard can it be?
Yeah. So we learned the lesson, and it wasn't quite as easy as we thought. And so that was my first experience. It wasn't a perfect experience.
What did you like about them and why are they harder than you thought?
When you first look at anything and you get a POC, it's usually extremely simple, right? You can get the POC up and running immediately. But then you get all the what-ifs, you know, and all of the conditions, and all of the things you're audited on. (Steven clears throat) And all of those extra, you know, bits of data that you have to then code in. And so to actually use it in production, there ended up being a lot more to it than we thought, more than just turning things on and off. So it's a live and learn.
How about yourself, Steve? What was your first experience like?
So it's been a long time, you know. The first iteration of, I wouldn't even call them feature flags. I'd call them "Oh, no" flags, right? So we would have... the functionality we were rolling out into production, we'd have it behind flags, in case when we rolled something out, something catastrophic occurred, and we could turn them off through changing the flag and recycling the services and things like that. Not seamless, but something that didn't cause us to have to do a new release. So, rudimentary, not a dark launch. You launch it just like you launch it and then you have this, "Oh, no, we've broken something. Let's turn this off so we don't disrupt our business." So that's really the first introduction. That's been a long time. I've been doing something like that off and on for, well, I don't want to admit how long, a long time. In more recent years, we've really had, you know, expressed... we've been doing a lot of experimentation, testing with our applications. And that's really where we started to see the value of being able to granularly control what the consumer sees, what the end user sees. So we've really started putting more effort into using feature flags around our experimentation.
Yeah. What led you to decide to invest more in it, to go beyond just the simple on/off, or to call it, like you said, the "Oh no" button? What was changing where you wanted to get deeper with it?
So, there was a lot to it. I mean, again, the experimentation has driven us to wanna do that because we're seeing a lot of value in the experimentation that we're performing. On top of that though, we had many teams managing feature flags in various ways, and it's all random, everybody has their own custom implementation. So we were seeing a lot of wasted time and effort on people implementing their own things with varying levels of success. So along with wanting to do more experimentation and have a better, more granular way to manage that, we really wanted to get away from a lot of different teams managing things on their own, and have a centralized platform for us to manage these feature flags and to have a better view of what is set when and where.
Yeah. I wrote an article for InfoQ about, like, mistakes you can make with feature flagging. And a big one is just different teams that don't even know that other teams have feature flags, which can cause a lot of trouble.
Absolutely, absolutely. Yeah. Or varied feature flags to try and control the exact same thing throughout different levels of the stack. Right.
So if it's not all switched at the exact same time, chaos happens.
Yeah. I compare it to a fuse box in different rooms where it's like, "Well, is it on or off? Is it on or is it off?" Right. "I don't know."
Yeah. I guess you want two separate switches that have to be flipped at the same time when you're launching nuclear weapons, but for features in production, you really don't want that.
Yeah. (Edith laughs) How about yourself, Mick? I know you've had some similar things happening at Fidelity with different teams working on different projects. Do you want to share some of your experience?
Well, as you know, in large companies there's everything under the sun. So we have people doing advanced techniques and we have people doing the things that happened in the past. And so we have homegrown systems, and think of the old days of development. You know, you code on something, and every six months you release to production. In that situation, I'd suggest that using feature flags is for risk mitigation. It's a different purpose than for some of the other teams I'll talk about next. So in those situations, maybe that homegrown system is okay for those people, for what they were doing in the past. In that, if it was updated once a month, you weren't doing weekly releases, then maybe that worked. But now transition. (laughs) Transition to the cloud, transition to agile teams, transition to people wanting things yesterday, and they need things to go out very quickly. So when you bounce these new paradigms up against the old model, obviously there's a huge contradiction. And not to mention, with some of the old systems being homegrown, there's a lot of care and feeding. And it takes a lot of work to maintain them and keep them going. And I'm not gonna tell you it's the easiest thing to do. So, you go along the curve and you start rationalizing things and you have all these things converge. And so you start to think, "Is there a better way to do it?" And we have some business units that are already marching along this path, doing great, using tools to be successful. And then we have other business units that are just starting down the path. Like one of the teams that I'm working with, it's a brand new team, an agile team, but they're doing weekly releases, and that's new to them. And for that team, the old ways definitely don't work. So there are new things that they're having to think of and new approaches they're having to take to make things more seamless and to mitigate the risk.
Think about it also from another point of view. The old way, you had VMs, physical hardware out there. How many teams do you know, with say 50, 60 VMs out there, that were doing blue-green and had 100 total? Half of them did and half of them up, hardly anybody did it. I do know folks that did it. (laughing) But they are the rare folks, whereas when you have something like feature flags, you can mitigate the risk without having to spend all that extra money. So that's why something like this, I see it as working in both the old paradigm and the new. It has a different purpose though, in my mind. The old way, it's risk mitigation; the new way, it helps you be more agile, so that you can go at things faster.
I like to think that you can have both.
Oh, definitely can have both.
But when I work with the two different sets of people, they think in different ways. So the one set of folks, they're gonna be thinking, how do I mitigate risk? Highly risk averse. They don't care about going fast. (laughing) And then the other group, they mostly care about going fast, and then you put controls in place to make sure everything's safe. So that way you have automated safety. So it's different ways of approaching it, and people have totally different viewpoints in those two groups.
Yeah. The thing I like best about feature management, though, is that you can do both of those. You can have less risk and go faster.
Steve, what's driving the initiatives at Express Scripts? Is it reducing risk or moving faster?
Yes. (Mick and Edith laughing) It's both. So there's a lot that we have going. So first off there's our experimentation side and then there's our DevOps side. So on the experimentation side, I mean, we're looking to implement more and more experiments because very minor things that change in a UI or in a mobile application can have a great impact on how people use the application. So having those feature flags and easily controlling those feature flags out there, that makes our job of doing that much more... It's much simpler. Now from a DevOps side, you know, we want to encourage releasing early and often. And having all of your new features behind feature toggles, behind feature flags, is going to enable you to deploy more and more often. I mean, we're starting to move toward the DORA metrics for DevOps. That's how we're judging our teams, and how well they're performing is based on those metrics. And one of the things I always like to say is, whatever metrics you put in front of engineers, they're gonna try to game the system. (Edith laughing)
Whatever you do, they're going to try to game the system. They always do, they always have, they always will. But if those metrics that they're trying to game, when they game them, actually push them in the direction of what you want, that's what makes it perfect. So, like with frequency of deployment, honestly, I'd be thrilled if engineering teams were deploying their applications or their software every hour on the hour, even if there are no changes to it. Because what that does is prove that they are an agile group that can do these things with little to no risk, often. So then suddenly they're adding code to that, and it's all behind feature toggles. So they're deploying things and nothing changes until they decide that they want to make those features available for the users out there. 'Cause that's one thing that I keep trying to stress with, you know, changing the wording. 'Cause everybody talks about deployments, or, you know, code releases. I am definitely starting to have people say we have code deployments, and then we have feature releases. I'm still trying to get that through. It takes time to change that mindset, but we're working on it and we're hoping to keep moving that forward. But with that, as teams start adopting feature flags and putting all of their new applications, all their new functionality, behind feature toggles, and being able to manage those feature releases much more granularly, we're starting to give them this carrot of, well, you can start to have touchless deployments now, and your code gets merged into production and an hour later, or it gets merged into your main branch and an hour later it's in production. So that's really enticing our teams to want to move in this direction.
Yeah. I was an engineering manager, that's why I'm so passionate about this. And I used to call those push and pray releases, you know when you were like... (laughing)
Like, my theory is that releases should be completely boring. You know, it should not be introduced like boring and just, oh.
Exactly. So that's, if you're repeating them constantly every hour on the hour, even if no code has changed, that is amazing. That would be an amazing thing. I'd love to see teams do that. All of our teams do that. We're a ways away from that, but we're pushing in that direction,
Mick, I do think you have a framework around the four stages of DevOps maturity. Do you want to talk a little bit more about that one?
Well, as an architect, I tend to think in terms of frameworks, because it's a nice communication vehicle for development teams. You can look at a maturity level and you can see where you're at on it. And then you can see the little steps that you have to take to get to the next level. So that's why I like it as a communication model. And for agile, DevOps, you know, DevSecOps, I kind of put all that together in the model. In any model, in my view, the bottom level needs to be either at or below where people are at today. With that principle, they need to be able to see themselves on the model, so that they can say, yeah, that's real life. So the beginning stage is usually something that is pure chaos, you know. People aren't really doing anything. And then the next level is that they've decided that they need to do something. They may not be doing it correctly, but at least they're trying. They're trying to do some automation, they're trying to get something started, they're starting to standardize. And then you get into the level where they're really doing some good things and your things are standardized. You do have consistent automation, you have your self-service, you have all the beautiful things that we love. And then you get to the highest level. I tend to deviate from a lot of folks and I call it self-healing, where people are at an understanding point of view, where things naturally progress. And if you think of the old agile teams with the old mantra, you're at that stage where everything is in a constant loop of improvement, and so things are constantly getting better. And part of that is from the system, and part of that is from the people and the culture. So it's just a huge shift in how everything is synergistically working together. And so with that approach, you can go with multiple different teams, multiple different business units, everybody can see where they're at, and then people can learn.
So like, say you're on the second rung, but you want to be on the third rung. Then you can talk to teams that are already doing that, you can go into Git, and you can take things that they've already coded, and it can accelerate, you know, getting there. So it's just a simplistic approach, but the simpler the better, to help people adopt it and move forward.
And where do you see feature management fitting into those stages, from pure chaos to self-healing?
It's at one of the top layers because for us, it's working through, you know, being a true agile team, being a true DevOps team, you know, working through having an automated self-service pipeline, so it's end to end and people can get those up, not in, you know, two weeks, but in two minutes. So that things are just lightning fast and easily instantiated. And then we keep on adding on top of each layer, so that there's more benefit and more benefit and more benefit. And I know you've heard this a million times, it's all the shift left. You know, how can you automate things for folks, to make it easier on the dev teams? Our true goal is to do even more than that, where you take the common software patterns that people are coding every day, you bundle that up into a hunk of code, put it in Git, and then that hunk of code has all the security, it has all the audit, it has all the governance, everything's built in. So when they check that out, all they have to do is put in the business logic.
Yeah. And then that just ties into your pipeline.
So obviously the only teams doing that (laughing) are at the top, but that's where we're trying to get everybody at.
Yeah. I go back to what Steve just said about separating out releases from features.
I love the way he worded that. That's what we're all trying to get to. But as he stated it's a culture change, it's not necessarily a simple thing, you know. Just because the folks here are believers, it's a different way of thinking. And so we have to change the culture in order to make that the way that people think.
Yes. Steve, did you have to do a lot of culture change to get people to start using feature flags? Or was it something that people were pretty quick to see the benefit of?
Oh, you're assuming that I'm done with that culture change. It's a constant battle. I mean, it's something we have to go through all the time because, still, they don't necessarily understand why they should be doing this. Like, "You're making my code more complex. Why should I do this?" And so it's taking time. It takes education, it takes a little bit of hand-holding sometimes with teams, but we're getting there, we're getting more and more adoption out there. So it's keep pushing that and, again, I think proving to them, showing them the value of why you would do these things. You know, some teams still have to release their code at night, right? Because they haven't gone through the due diligence, they're not mature enough organizations to where they can release their code any time. I think it's pretty easy to make that argument to them, saying, "Hey, if you start doing this, and if you start ensuring that your code is properly tested with automation before it goes into production, and when you're deploying to production things don't actually change until you decide, if you do that, you're going to be able to start releasing any time of day. You don't have to be up at two in the morning to do an off-hours release." So, you know, we still have a lot of teams that have to do that, right? We have other teams that are getting more mature and bringing it earlier into the day. But then we also have to prove it to our support organizations, say our release management and our compliance people. They're very concerned about all this stuff, so you also have to get that mentality shift to them. And we have to have a proven track record of good, solid releases that don't interrupt anything, so we can start doing them any time of day. So it's a constant struggle. I mean, well, struggle. It's a constant education that we're doing. And hopefully, in a shorter amount of time, we're gonna start getting that to be the norm versus being the exception.
It's more and more adoption that we're getting across the board, but again, most of our adoption right now is from the experimentation point of view. That's where we're really getting a ton of adoption with feature flags, during the experimentation.
Yeah. I found those 2:00 AM releases, very stressful because if something went wrong, it was 2:00 AM, and usually the person you wanted was asleep or somewhere else.
Exactly. Or then, it's from 2:00 AM, and at 4:00 AM you're still going through testing, and then they're like, okay, this isn't working, you've got to back out, and you have only until like 5:30 in the morning to back things out and have the previous system up and running. So I was on plenty of those calls. I don't like them, not at all.
(laughing) It's like it's not a time or a place, when you can sometimes make good decisions.
Mick, how are you doing culture change and what are some strategies you have?
Well, I'm never gonna tell you we're doing it perfectly. You know, I'm always learning from other people. And so we traditionally attack it from two different directions. So we'll start at the top, making sure there's buy-in at the top, and then at the bottom for grassroots. And so that way we're trying to converge and meet in the middle. And usually at the top level, you know, that's usually easier, honestly, because you're doing things for very specific reasons that can usually be turned into dollars and cents and a story. And it makes sense to get buy-in. But for the bottom level, normally we have to show a greater efficiency. You know, it has to be easier, there has to be a reason for the development teams to do it. You know, what's in it for me? What makes it better for my development team, that type of thing. So that's the key, where we partner with different teams in order to find success stories, then promote those teams that did the great thing. So they get the visibility, and then other teams want to, you know, copy that to also do that good thing. So definitely grassroots, but trying to help the people that are forward thinking, and then promote what they do, and make it very apparent, across a wide audience, the great things that they did.
So that's how we're trying to approach it, and always open to learn new ways that are better.
Yeah. To go back to what Steve was saying, you have to shift the culture from rewarding the people putting out the disastrous releases like, oh my gosh, we were up all night saving the release, and shift that to, we got to sleep soundly because everything was working fine.
Yep. And we had a different approach to that, Steven. In a past team, we didn't like staying up at night for installs. (Edith laughing) And we were a small team, and between all the folks on the team we were releasing every week. So it was just not fun. So we just automated it all (laughs) instead: automated deployments, automated approvals, automated smoke tests. So that way people were at home, you know, just looking for an email if something didn't go how it was expected to go, if something failed in a smoke test. Now I'm not gonna tell you all the teams do that. That was, you know, a great thing, but that's definitely the goal. And now for the folks in the cloud, you know, 'cause that's what I specialize in, (laughs) we have a lot better controls in the cloud. There's a lot more structure. People can't touch production, so that kind of requires automation in the first place. So we have a huge benefit there where, because of the streamlining, because of the controls we have in place, it allows a more reproducible process, so that what you end up with is usually far ahead of anything in the old days. So I like tweaking it that way. So it gives us a higher probability of success.
I have a quick question for Mick about how you see blue-green releases and feature management fitting together, 'cause you touched on this earlier.
I see a progression in it. The natural progression that I've seen for teams is, blue-green is really simple, it's out of the box, all these frameworks they're using have it built in, so it's a natural way to start. They see these great benefits, you know, you do a rollout, oh, the smoke test found something, you know, oh okay, no big deal. I just flip back. But then these teams start becoming more agile. They start wanting to go faster and then they don't wanna take that hit. They want to target very small things going out. Then it becomes a little bit more difficult for those teams. I'm not saying it can't be done with blue-green, there are definitely positive things to it. But that's when feature flags, in my opinion, make it an easier approach, because it's so targeted about what you're trying to do. It does the same thing for risk mitigation, so you get the same benefits there, but because it's so targeted, the team can actually go faster in what they're doing. So I view it as, teams progressing up that agile ladder, the DevOps ladder, eventually will reach a point where they'll bottleneck. And you know how it works with all agile teams. Eventually you get to a point where there are external influences that become your bottleneck. So then you figure out how to remove that bottleneck. Eventually this becomes a bottleneck. So they look in their toolbox, and then they add in a new tool that will remove the barrier.
Yep. Well, we have a few minutes left. I wanted to ask you both, what has been a surprising benefit of feature flags, feature management or something that you were surprised that you could do with a feature flag?
That's an interesting question. I don't know that I've been surprised by anything, necessarily. You know, I mean, again, I keep talking about experimentation. I was one of those people where, when somebody talked to me about A/B testing and experimentation, I was like, oh, makes complete sense, why don't we do that for everything? Right? So I don't know that feature flagging and feature toggles have necessarily surprised me at all. I think it's just, when you really lay it out there logically, it just makes sense, right? That's where I really stand on it.
Cool, I think that, there are opportunities to get rid of all the work that we were doing before supporting the homegrown systems. So I don't view it as something new. It's something that's easier, something that is easier for the developers to use.
Yes. Less work for a centralized team to maintain. So it's gonna be better in the long-term picture because we're not having the technical debt of an old system, and it's easier for us to do things in a more agile manner. So we want those same old benefits. Of course, we can get them with the new ways, and then we want the added benefits of the agile, you know, teams. So I don't see it as things that are new, but I see it as, you know, enhancing what we already have and making it easier. Now, we know that there are capabilities we haven't tapped into. So when we get there, then I'll probably have a different answer for it.
What is the tipping point for you from going from a homegrown solution to not homegrown?
It's honestly, it's the cloud journey, and we have thousands of applications, and the company is going to the cloud. So it's kind of easy if you think about it. The people that are going to the cloud want to do things new ways, faster, and, you know, just be more efficient. Those are the teams that are doing the new targeted things. And then the teams that are still in the old space, you know, there's work there, but there's less work on that side. And when you are supporting thousands and thousands of people and thousands of applications, we're okay with that. You know, we're okay if it's just the folks, you know, over here, going to the cloud.
Steve, anything for you about a tipping point from homegrown to not roll your own anymore?
Well, the homegrown solutions didn't have the flexibility that we needed. Their targeting was very structured and only built for specific use cases, and we needed more flexibility in how we were targeting the enabling or disabling of the individual feature flags. So that's really one of the things that you know... cause we reviewed our homegrown solutions to see, would these solutions be adequate for us? Would they do the job for us? Do we need to go out and purchase a product for this? And they just weren't there right? First off, the usability of the systems were not great. They were maintained as the teams needed new features, they weren't really looked at as products, right? They weren't looked at as products to support this thing. They were looked at as something I need to create in order to get this individual task done, that's it. And then once it supported that task, I didn't necessarily go back and look at it. So we really wanted something that was going to be out there maintained as a product. And so new features just appear for us, right? So new functionality exists for us that we may or may not take advantage of. It may not be perfect for us, or it may not meet our use cases, but at least have that ability to do that out there. And that's really what drove us down the path of looking at a commercial product to start managing our feature flags.
That's good to hear. Well we just have a minute left, any final winners on the less risk or more innovation vote?
Both? (Steven laughing) Well thank you both so much for your time today. I really appreciate hearing your stories. Thank you for sharing them and then thanks for being with us today.
You're welcome. It's a pleasure.
Thank you. (upbeat music)