This July the Test in Production meetup was back at the Heavybit Clubhouse in San Francisco. Adam Zimman, LaunchDarkly’s VP of Platform, spoke about progressive delivery and how it can be used to improve user experiences.
“In continuous delivery, they talked a little bit about the notion of percentage roll-outs, but it wasn’t something that they really kind of emphasized. What we’re talking about is really encouraging an emphasis on release progression, this idea of progressively increasing the number of users that are able to see or are impacted by changes to your service.” – Adam Zimman, VP of Platform at LaunchDarkly
Watch to learn more about progressive delivery, its relationship to continuous delivery, and how teams are using these processes today. If you’re interested in joining us at a future Meetup, you can sign up here.
Adam Zimman: I’m here to make a case for progressive delivery. If you haven’t heard of progressive delivery, don’t worry. I will be talking about what that is in the course of the presentation, but the case that I’m going to make oddly enough involves my daughter and my mother-in-law, so stay tuned for that. Just to introduce myself a little bit, I’m currently the VP of Platform at LaunchDarkly. Prior to that I spent about 10 and a half years at VMware, a little under five years at EMC, was also at GitHub for a while, and had been an advisor to a number of companies in the dev tools space in Startuplandia. My background is in physics and visual art and my early childhood career was as a fire juggler, so I know a little bit about testing in production, because faking fire is actually difficult.
All right, so to start things off, I have a chart that … who’s seen this chart before? It floated around Twitter, Facebook about a year ago, a year and a half ago, when the report came out. This was a report that was done by BlackRock, which is a consulting agency that was looking at technology adoption trends. I find this graph fascinating, and let me tell you what I take away from this a little bit, and it kind of sets the stage for the problem or the challenge that I’m going to be discussing tonight.
The thing that I find most fascinating about this chart is that if you look at this from the perspective of birth year, so when an individual is born, if an individual was born basically anytime before about 1960 or so, the technologies that came about in their lifetime, they had the ability to wait not only years, but usually decades, before it hit mainstream adoption. So the kind of ingrained behavior that they had was this notion of, “New technology is going to take time and I have time to adapt,” right?
Whereas if you look at folks that were born after about 1992, 93, basically in their lifetime, every technology that’s been introduced, they have had at most a year before it has reached mainstream adoption. If you look at folks who were born after 2000, in their lifetime, every technology that’s been introduced or technology that’s reached mainstream adoption, it’s happened in a time span that’s measured not only in months but potentially in weeks, before it reaches greater than 60% adoption. Think about the scale and the change of that, right? It starts to make you think about this kind of notion of learned behavior in what we encounter.
Now, I’ve started to refer to this as the daughter-mother-in-law challenge. So let me introduce you to … this is my daughter. She is currently 12, this was taken when she was a little bit younger. This is her replacing the screen on her first iPod touch. It cracked and I said, “I’m not buying a new one, but I’ll help you replace the screen that’s on it.” She was like, “Okay.” So this is a kid who has grown up in this … she was born in 2003, she has grown up in a world where the iPhone has always existed. The Internet is always on. She can take any application and she’ll figure it out in about 3.2 seconds. She’ll have discovered all of the advanced functionality. She’ll have figured out how to customize it to meet her own needs and desires, and if it doesn’t work for her, she’ll kick it to the curb because she knows that there’s another app for that. Anything she wants to do, there’s an app for that.
Now, on the left-hand side of the screen is actually the desktop of my mother-in-law. Now, my mother-in-law’s local, so I am tech support for both of these individuals. You’ll notice that I had to blot out some of the Post-it Notes that are stuck to my mother-in-law’s screen. Those are her passwords. Now, I want to be perfectly clear, because my mother-in-law is by no means technology averse. She’s actually been a statistical programmer for UCSF for the past 20 years. She works in SAS on a daily basis, writing SAS programs. So she is a developer. However, she grew up in a world without computers. For her, when she wants to learn a new task or learn a new workflow, her default behavior is to actually write that workflow down longhand on a sheet of paper that she will then refer back to every time she does that workflow.
Now, if we, as application vendors, change the color of a button or move it from one side of the screen to the other, we literally ruin her week. We have interrupted her productivity for a week, until she can actually figure out where that change occurred and how to accommodate for it in her instruction set. So this is a very different mentality. This is something where this is the difference between someone who is readily capable of adapting to change … you could change the workflow while my daughter’s in the middle of it and she’d just roll with it. She’d be like, “Okay, yeah, sure.” You do that to my mother-in-law and she hates your application. She wants her applications to be … I refer to this as … she wants her applications to be like this hammer she has. In the junk drawer in her kitchen … yes, she has a junk drawer in her kitchen … she has a hammer. This hammer belonged to her father. She knows every time she needs a hammer, she can go to that drawer, she picks it up, she knows that it looks the same, it feels the same, it’s going to do the same thing. She wants her applications to be as predictable as that.
Now, she does recognize that new functionality, new features are great and sometimes provide her with the ability to complete tasks more quickly, or complete new tasks. What she wants from those applications though is the ability to have the cadence of the change be predictable and be something that, ideally, she can even opt into or control.
So the interesting part about this kind of duality of persona that I’ve been thinking about and pondering about for the past few years is how do we as application builders and SaaS or service providers accommodate for these two personas moving forward, knowing that within the next three to five years, these two personas are actually going to be sitting at desks side by side in the same companies, being asked to use the same tools, being asked to complete the same tasks? How do we make both of them have a positive user experience?
A little bit of a quick history, now that I’ve set the stage of what this challenge is and how we got here and why this is something that we need to kind of think through together. This is a simplistic graph showing the change in release velocity over time. This is the pace at which we have, as a software industry, innovated, or the pace at which we have created software development life cycles. In the beginning there was the notion of using Waterfall. This was mainly due to the fact that software was so tightly coupled to hardware. Hardware has a very distinct kind of coupling to a Waterfall methodology where you really need to think things through, build, design, and then you do your fabrication and it’s really hard to make changes after that, right?
Then, hey, look, software came along, yay, this is great. We’re going to be able to make changes more quickly. But we stuck with that old development model. Now, fortunately we did figure out that we could move more quickly and people said, “Oh look, let’s build this new model called Agile or Scrum and we’ll actually start to iterate more quickly on our software. We’ll start to be able to move more quickly.” And the first kind of major transition in the enterprise space was that of Salesforce, right? Salesforce came along and completely disrupted the market for two reasons. One, that they were completely based as a cloud service. You could not deploy Salesforce on-prem, right? They took a very strong stance on that. Now they’ve since backed away and they have some big customers, but that was their stance. “You will take our updates when we push them out. We will give you a predictable cadence, it’ll be quarterly, and we will update faster than any of our competitors,” who were still on 18-to-24-month cadence schedules.
Next, the big change that came about in an industry perspective was really kind of highlighted by Facebook, and that was this notion of, oh yeah, we’re going to just ship every minute. We’re going to actually have your first day as an engineer on the job, you are pushing to production. They said, “The thing that we want to do is move fast and break things.” Now, the subtext to that was of course that they needed to have really stable infrastructure underneath and they needed to have the appropriate dev tooling to be able to make sure that that fast movement and breaking of things didn’t actually take down the service, or at least not take down the entire service for everyone. So the ultimate objective in this was faster time to market, but with less risk. You wanted to be able to maintain the ability to reduce your risk.
So this was the kind of advent of continuous delivery. This notion of, “I want to move fast. I want to make sure that I build this machine not only from the perspective of my service, but my infrastructure and my developer tool chain and my developer pipeline. I want to make sure that I can move as fast as possible and I get predictable outcomes every single time.”
Now, the challenge, of course, is that when some people tried to do this, it had problems, right? If you don’t get the tooling right, if you don’t get the workflow right, you can have catastrophic failure that just cascades across your entire infrastructure. So you start to think about, well, how do I actually address this? That was something that continuous delivery tried to do with continuous integration testing, making sure that you were having strong test suites, things like that, that were constantly running, making sure that you were being very thoughtful about how those integrations were taking place throughout your entire code base. But that wasn’t really all of it. There were some underpinnings or some subtexts that weren’t necessarily brought forward. That’s why so many companies got scared about this notion of continuous delivery.
The number one kind of context that was missed so frequently was the notion of separation of deployment and release. Companies that do continuous delivery well fully understand that they want to deploy whenever they want, but that is not the same as actually releasing to customers. Running in production is not the same as exposing everyone on your production service to everything that’s there.
So how do we do this? The industry best practice for this is feature flags. This is the kind of most common way that developers have successfully been able to manage this. Ultimately, a feature flag is a control point in your code. The amazing part about a feature flag is it gives you the ability to actually augment the behavior during runtime, so you don’t need to reinitialize your application. You don’t need to reboot your service or anything like that. It is literally something that you can change on the fly and those changes will ultimately take place as soon as you make them.
Another way of saying this is this is a fancy IF statement, right? If this thing meets this certain criteria, then I will do this new thing. If that criteria is not met, I will do the old thing. So this gave you the ability to put code into production that had a code path and a workflow that was known-good, and a secondary code path that you defined in your same exact code base that was new, that you could try out, but you could try out explicitly for a specific audience. Ultimately, control points in your code. Right? Really cool. Really simple.
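As an illustration of the "fancy IF statement" described above, here is a minimal sketch in Python. It is not LaunchDarkly's SDK or any real library's API: `FlagStore`, `is_enabled`, the flag key `new-checkout`, and the two checkout functions are all hypothetical names invented for this example. The sketch assumes an in-memory store, whereas a real system would typically receive flag changes from a remote service.

```python
from dataclasses import dataclass, field


@dataclass
class FlagStore:
    """Hypothetical in-memory flag store (an assumption for this sketch)."""
    # flag key -> set of user ids that should get the new code path
    targets: dict[str, set[str]] = field(default_factory=dict)

    def is_enabled(self, flag_key: str, user_id: str) -> bool:
        # Unknown flags default to off, so the known-good path is the fallback.
        return user_id in self.targets.get(flag_key, set())


def old_checkout(user_id: str) -> str:
    return f"old checkout for {user_id}"


def new_checkout(user_id: str) -> str:
    return f"new checkout for {user_id}"


def checkout(flags: FlagStore, user_id: str) -> str:
    # The flag is the control point: one IF choosing between two code paths.
    if flags.is_enabled("new-checkout", user_id):
        return new_checkout(user_id)
    return old_checkout(user_id)


flags = FlagStore()
print(checkout(flags, "alice"))  # flag off: known-good path

# Changing the flag at runtime affects the next call; no restart is involved.
flags.targets["new-checkout"] = {"alice"}
print(checkout(flags, "alice"))  # alice is in the target audience
print(checkout(flags, "bob"))    # bob still gets the known-good path
```

Both code paths ship in the same deployment; only the flag state decides which audience sees the new one.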
Now, this was something that I started thinking about a lot a few years ago, started thinking about even more as I joined LaunchDarkly a couple of years ago, and started talking about with other folks in the industry. One of the individuals that I was talking a lot with was a man by the name of James Governor. He’s a friend who’s an analyst at RedMonk. He and I started thinking about, you know, how is this something that so many folks that, when they first think about continuous delivery, their immediate reaction is fear or anxiety that, you know, that’s great for a Facebook, who doesn’t have enterprise customers that have the requirement of five nines uptime or don’t have other types of requirements on the security or validation of their service?
We realized that a big part of it was that so much of the tooling that the companies that were doing continuous delivery used wasn’t something that was brought to the forefront. So we needed a new term and to highlight some of those new tools to be able to make it so that others could be more successful with this new type of software development model.
The term we came up with was progressive delivery. Now, from a perspective of progressive delivery, this is not something that is taking continuous delivery and throwing it away and saying, no, this is all net new. No, not at all. This is taking advantage of industry best practices of software development. We are iterating. We are making some slight iterations to continuous delivery to make it a better model for more companies.
The two key concepts that we are highlighting in progressive delivery that really were not there in continuous delivery were, one, the notion of a release progression. In continuous delivery, they talked a little bit about the notion of percentage roll-outs, but it wasn’t something that they really kind of emphasized. What we’re talking about is really encouraging an emphasis on release progression, this idea of progressively increasing the number of users that are able to see or are impacted by changes to your service. This is the core tenet of test in production.
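One common way to implement a percentage roll-out is to hash each user into a stable bucket and compare that bucket to the current percentage. The sketch below assumes that approach; the talk does not say how any particular product does it. The function names `bucket` and `in_rollout` and the flag key `new-search` are hypothetical.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    # Take the first 8 hex digits (32 bits) and scale them to a percentage.
    return int(digest[:8], 16) / 0x100000000 * 100


def in_rollout(flag_key: str, user_id: str, percentage: float) -> bool:
    """True if this user falls inside the current roll-out percentage."""
    return bucket(flag_key, user_id) < percentage


users = [f"user-{i}" for i in range(1000)]
for pct in (1, 10, 50, 100):
    exposed = sum(in_rollout("new-search", u, pct) for u in users)
    print(f"{pct:>3}% roll-out -> {exposed} of {len(users)} users exposed")
```

Because a user's bucket never changes, anyone exposed at 10% stays exposed at 50%, so the audience only grows as the percentage is raised. Including the flag key in the hash keeps the same users from always being first for every feature.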
The second aspect is the notion of delegation. Delegation is something that’s really interesting when you start to think about also the other changes that are occurring in our workplaces or the professional industry as a whole, and that is that more and more companies are dependent upon software for larger aspects of their business. More and more roles within those companies require their users, the workers, to actually interact with software. With that, we need to start thinking about how do we start to have a delegation model that makes sense and that is actually well articulated to allow the individual that’s closest to the outcome to be able to impart the change?
If you think about it from the context of developers, you want your developers or your engineers to be able to focus on building things. If you have to have a large percentage of their time spent on being able to impart changes around who can access what in the product, then you’re doing a disservice to not only your customers, because they have to wait for the developer to have cycles to do it, but to your developers, because you’re actually taking away from the time that they would use to build new things.
With that, let me talk a little bit about the kind of progressive delivery development life cycle. It’s fairly simple. We split it up into three segments: concept, launch, and control. I’ll go through those quickly so that you can kind of get an idea of what each one of those phases is for.
So the concept phase, this is really kind of the idea of starting from the very beginning. This is what, in previous software methodologies, was oftentimes the design phase or design-build phase. Thinking about how do you get the requirements? The definition of requirements, the constraints on the systems. Ultimately, what we want to get people to do is think things through. The key distinction with the progressive delivery model is that we want you to think through not only what you’re building, but how you’re going to deliver it. How are you going to actually incrementally deliver this to individuals? It’s really important to start thinking about that from the beginning because ultimately that’s going to decide where you put the control points in your code and how you structure those.
The next phase is the launch phase, and this is something that a lot of the most successful organizations that are already doing continuous delivery really have nailed. They’ll do things like canary releases. They’ll do blue-green deployments. They’ll do beta testing. They’ll actually do cohort segmentation so that they can release to a broader and broader audience. Ultimately, this is the notion of controlling your blast radius. How do you make sure that you are limiting the people that are impacted by the changes that you’re making to your service?
And of course, you know, we’ve taken the opportunity to create this meetup space because we’ve realized that instead of being kind of an Internet meme of … that developers used when they indicated they didn’t have time to actually test their code, we’ve turned this into … we’re hearing an increasing number of large scale enterprises building out TIP, or test in production, initiatives, where they’re saying, “We’ve recognized we can no longer depend on the outcomes of testing and staging. That’s not enough. We may still have a staging or a pre-prod environment, but we need the ability to be able to put real workloads, new code, into production and understand how it’s going to interact with our service as a whole and the Internet as a whole.” Very few organizations are capable of replicating the Internet on a developer’s laptop and so they need to think about how can they actually do this safely and what mechanisms can they use to be able to get the outcomes they need?
The final point is control. The control aspect is an interesting one, and I say this is interesting because as someone who’s worked on the product side for a long time, I have never worked on a product that has been done. Features and products, there’s always room for improvement. Whether it’s, “Oh, there’s a use case I didn’t account for. Oh, there’s a bug that I needed to fix.” There’s always that notion of, “I can make it better. My customers are trying to do more with it. I have new ideas, new things to add to it.” So you want to be thinking about your products or your services as something that are constantly evolving, something that you’re constantly iterating on. If you have the ability to make changes to some of the aspects of those features or products on the fly, you’re significantly reducing the amount of effort and resources needed on the engineering side to be able to change things that are already in place.
Examples of this are things like circuit breakers, that would give you the ability to shut off or turn on new features or new products for certain areas of your infrastructure. That could be geolocation, that could be various API rate limits that you put between your services and your stack. You may want to be able to create feedback loops. This is kind of common in the context of being able to do beta testing or experimentation or A/B tests, and being able to make sure that you can actually test things out safely with your users and be able to get real-time results and understand how they’re interacting with your products.
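A rough sketch of the circuit-breaker idea, scoped by region, might look like the following. Everything here is an assumption made for illustration: the `KillSwitch` class, the region names, and the `recommendations` feature are hypothetical and do not come from the talk or from any real product.

```python
from dataclasses import dataclass, field


@dataclass
class KillSwitch:
    """Hypothetical operational flag: on by default, can be shut off per region."""
    enabled: bool = True
    disabled_regions: set[str] = field(default_factory=set)

    def allows(self, region: str) -> bool:
        return self.enabled and region not in self.disabled_regions


def recommendations(switch: KillSwitch, region: str) -> list[str]:
    if not switch.allows(region):
        # Feature is switched off here: fall back to returning nothing.
        return []
    return ["item-1", "item-2"]


switch = KillSwitch()
print(recommendations(switch, "eu-west"))  # feature on everywhere

switch.disabled_regions.add("eu-west")     # shut it off for one region only
print(recommendations(switch, "eu-west"))
print(recommendations(switch, "us-east"))

switch.enabled = False                     # global kill switch
print(recommendations(switch, "us-east"))
```

The fallback branch is what makes the switch safe to flip: the service keeps answering, just without the feature.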
Now I want to circle back around. I’ve talked a lot about this notion of the kind of release progression to this idea of a progressive delegation. Something that we see a lot in our customers is, as they start to be more thoughtful about how they’re building their software and putting these control points in place, they realize that they have these known-good switches, these control points that they can hand off to other members of the organization. Whether that’s something where a developer feels more comfortable handing things off to an operations team to be able to provide them with some sort of safety valve or kill switch for a new feature that’s going out the door, for a product team that wants to be able to do some real beta testing, or for a sales or customer success organization that doesn’t want to have to wait to open a ticket for an engineer or developer to be able to update a feature that’s available for a new customer. Instead, they want to be able to have the ability to control that themselves, and they’re taking on that task so that the developer can focus on actually building things. So there’s a lot of desire that we’ve seen and that we continue to see grow from our customers. It’s something that we think is extraordinarily empowering for those users.
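To make the delegation idea concrete, here is a small sketch in which each flag carries a list of roles that are allowed to change it. This is only one possible model, and it is an assumption: the `DelegatedFlag` class, the role names, and the `new-billing` flag key are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class DelegatedFlag:
    key: str
    enabled: bool = False
    # Roles allowed to flip this flag; the role names are illustrative only.
    editors: set[str] = field(default_factory=set)

    def set_enabled(self, role: str, value: bool) -> None:
        if role not in self.editors:
            raise PermissionError(f"role {role!r} may not change flag {self.key!r}")
        self.enabled = value


kill_switch = DelegatedFlag(
    "new-billing", enabled=True, editors={"developer", "operations"}
)

# Operations can pull the safety valve without waiting on a developer.
kill_switch.set_enabled("operations", False)
print(kill_switch.enabled)

try:
    kill_switch.set_enabled("sales", True)  # this flag was not delegated to sales
except PermissionError as err:
    print(err)
```

The developer decides once, in code, where the control point is; who may operate it afterward becomes a matter of configuration.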
With that, I want to be able to think about how we as a community are continuing to build software that meets the needs of these two personas that I have defined. On the right-hand side, I’ve got my daughter who, when she first saw the Rubik’s Cube, she was like, “This is awesome, this is cool.” Then she found a YouTube video with a Mindstorms robot that would solve it for her. So she went and built that instead of actually learning how to solve the cube manually. I’m okay with that. I’m okay with that.
Then, just as a reminder, on the left-hand side, this is my mother-in-law’s flowers, some of them, she loves to garden, and this is a reminder that the kind of pace at which she changes her garden is the pace at which she wants her software to change. So pretty much seasonally. We want to be able to figure out ways in which we can accommodate for that duality of user. Thank you very much, appreciate your time this morning.