AWS re:Invent 2022 Recap: Building and Operating at Scale with Feature Management

In late November 2022, AWS held its annual re:Invent conference. As usual, LaunchDarkly was there. (Alex Hardman provided a quick recap of our involvement here.) One of the sessions featured our own John Kodumal, CTO and co-founder of LaunchDarkly.

Titled "Building and Operating at Scale with Feature Management," the session emphasized that feature management isn't just about showing or hiding things from end users; it's actually a way of evolving your software stack safely.

"Most of the time, we're not just writing brand new code from scratch; we're actually evolving existing code," said John. "That's the natural state of software in a scaling system. And feature flags are one of the mission-critical techniques that you can use to do that safely, efficiently, and without risk."

We encourage you to watch the event in its entirety here when you have the time, but for now, here are some of the key highlights from John’s session.

After giving the audience a primer on feature management, the history of LaunchDarkly, and our current state (did you know we support 20+ trillion feature flag evaluations per day and 16 million requests per second across our 4,000 customers?), John offered three real-life examples of using feature management to enable app modernization.

Application modernization: a database story

Application modernization is a great and necessary thing, of course. Traditionally, it does have some challenges, such as:

business disruption
inability to control end-user experiences
costly outages
unknown cost and performance impacts

John told the audience how LaunchDarkly used feature management to migrate our primary data store away from MongoDB to CockroachDB—a migration that had the potential to be incredibly disruptive. However, managed downtime was out of the question.

He then explained in detail how feature management with feature flags enabled LaunchDarkly to make the migration seamless by breaking initiatives into manageable milestones, migrating gradually with precise control, reversing changes instantly, and quickly validating components.
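To make the idea of a gradual, reversible migration concrete, here is a minimal Python sketch. It is not the implementation John described; the dual-write approach, the class and function names, and the hash-based bucketing are all assumptions made for illustration, and plain dictionaries stand in for the two databases.

```python
import hashlib
from typing import Optional


def rollout_bucket(flag_key: str, user_key: str) -> float:
    """Map a (flag, user) pair to a stable number in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode()).hexdigest()
    return int(digest[:8], 16) % 10000 / 100.0


class MigratingStore:
    """Hypothetical wrapper: writes go to both stores, reads shift gradually."""

    def __init__(self, old_store: dict, new_store: dict, read_new_percent: float):
        self.old_store = old_store
        self.new_store = new_store
        # Percentage of users whose reads come from the new store.
        # Setting this back to 0 reverses the change without a redeploy.
        self.read_new_percent = read_new_percent
        self.mismatches = []  # keys where the two stores disagreed

    def write(self, key: str, value: str) -> None:
        # Dual write keeps both stores current during the migration.
        self.old_store[key] = value
        self.new_store[key] = value

    def read(self, key: str, user_key: str) -> Optional[str]:
        old_value = self.old_store.get(key)
        new_value = self.new_store.get(key)
        if old_value != new_value:
            self.mismatches.append(key)  # validation signal for the new store
        bucket = rollout_bucket("read-from-new-store", user_key)
        return new_value if bucket < self.read_new_percent else old_value
```

In this sketch, raising `read_new_percent` in small steps corresponds to the milestones, setting it to 0 is the instant reversal, and the `mismatches` list is a crude stand-in for validating the new component against the old one.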

John noted that many advanced users of LaunchDarkly, such as TrueCar, have used it for the same use case.

"TrueCar is a company that has been moving from bare metal on-premise to cloud-hosted on AWS," John said. "And we were a critical part of their migration strategy. They had a slightly different architecture—they have a huge set of affiliate networks all with different websites. And so they migrated them piecemeal, piece-by-piece, over to AWS, and they actually used LaunchDarkly within Lambda@Edge functions.

"They integrated LaunchDarkly's node SDK into their Lambda@Edge functions to essentially build a rules engine within LaunchDarkly and use feature flags to control which customers got migrated over to the Cloud hosted version of those sites versus the on-premise version of those sites. They saw an incredible increase in deployment cadence from deploying once per week to 20 x per day by using LaunchDarkly, which is pretty impactful."

Using feature management to meet compliance requirements

As John noted, compliance isn’t the most exciting topic, but it’s become an essential one in today’s new environment of data privacy and retention laws—even if it often slows development.

Still, it presents a range of challenges: the need to hire experts on governance and compliance laws, tighter controls as compliance domains keep being added, and conflicts with third-party vendors. Fortunately, feature management can address these issues by managing customized deployments from a single codebase and by serving specific or one-off requirements with less overhead for development teams.

John brought up HIPAA compliance as a prime example. For LaunchDarkly, HIPAA compliance imposes some demands we’d rather not enforce universally. John explained how we use a CDN provider called Fastly, which has a HIPAA compliance mode:

"If you flip the HIPAA bit on specific requests (which is like a specific header), it signals to Fastly that they can't cache that data in non-volatile storage," he said. "So, they have to use memory-based caches, and that's great. We can go through HIPAA compliance with our CDN provider. The downside of this is that because you can't use non-volatile storage, it actually impacts cache hit rates negatively, and they actually charge you more for this.

"We didn't want to just turn this header on for all of our customers universally because we'd be paying more to our CDN provider—and there'd actually be a performance hit. So, we used a feature flag. And so when we onboard customers that have HIPAA compliance requirements, we sign a BAA, a Business Associates Agreement with them, and we actually add them to a feature-flagged segment that enables the right bits in Fastly. And so for just that small subset of customers that have HIPAA compliance requirements, we can actually be HIPAA compliant with our CDN provider, and for everybody else, we get all the benefits of higher cache hit rates and lower CDN costs."

Cost and performance optimization

Adopting new technologies is great, especially when it saves you money. But sometimes the road to get there is more than just troublesome. For instance, you don’t want to jeopardize your current performance and uptime, and you still need to stage the rollout to conform to customer needs. Plus, different workloads have different risks.

Feature management provides an ideal solution, enabling you to use targeting to stage the rollout, measure the impact of the release, and roll back rapidly if needed.

John provided an example of when LaunchDarkly migrated some of our services from traditional x86 architectures to AWS Graviton2. As with almost any architecture migration, we wanted to ensure that it didn’t change the performance or the semantic behavior of those services. We achieved this with a gradual rollout controlled by LaunchDarkly feature flags.

"We wanted to stage the rollouts, and we didn't want to jeopardize performance," John said. "And feature flags were kind of critical to this as well because they provided us the ability to measure the impact, whittled down by which architecture we were serving requests on. So, we could actually seamlessly route traffic between Graviton and traditional architectures in real time without redeploys. We could target specific systems—non-mission critical systems first—and set up monitoring to automatically roll back the shift over to the new architecture, should we detect anomalies. The impact of this was fairly substantial... it ended up being around $91,000 a year, which is no small amount of change."

John then noted that, as with many of our own internal success stories with feature management, this one was later repeated by one of our customers:

"Honeycomb actually went through the same migration and utilized LaunchDarkly to do the migration," he says. "And they found that even though Graviton was 5% slower for them, it was 20% less per millisecond. So, it was a 17% cost savings, and the performance impact was not material to their customers or their specific use cases, which ended up being a win for them overall."

As noted, this is just a summary of some of the key points of John’s presentation. We left out most of the technical details, which you’ll definitely want to see in action, so be sure to watch the full session when you have the time.