Powering a Viral Network Rollout with Feature Flags

In February, Matt Nolf, a software engineer at Curve, spoke at the Test in Production Meetup in London. Curve is a London-based fintech startup that allows people to consolidate all of their credit cards in the Curve app and then use the app to manage payments. Matt explained that his team had recently built a peer-to-peer payments feature in which Curve users could send and receive money.

They wanted to do a phased rollout of the feature rather than a “big-bang” release. This entailed creating different user segments for each phase of the launch. Unfortunately, the social dependencies of the feature presented a challenge—to work properly, it required the participation of a sender and a recipient. If a sender were to transfer money to a user who lacked access to the feature, then the sender’s money would get stuck in limbo.

Matt’s team couldn’t possibly predict which users would receive money from feature-enabled senders. As such, they struggled to create reliable user segments. Thankfully, they found a solution.

Watch Matt’s full talk to learn how Curve used LaunchDarkly feature flags to execute a successful phased rollout of the peer-to-peer payments feature.

FULL TRANSCRIPT

Matt Nolf:

Cool. I think this is working. Hi everyone. Thanks for coming and listening to what I’m going to talk about. I hope you’re all doing great tonight. I’m going to talk about a viral network rollout that we did at Curve. One of our features that we released at the end of last year. A bit about why we decided to do something like that and what we learned from it as well.

For those of you that aren’t familiar with Curve, we’re a fintech here in London and we’re building an over-the-top banking platform. You can add all of your cards to our app and then we’ll give you one of our cards, and then when you come and spend with it, then we’ll charge that card, so it’s all your cards in one, we’re trying to rebuild your wallet. And there’s other cool things like cash back and go back in time to move transactions around and stuff like that. And we’re building it with some pretty cool stuff.

We’re using Go to build our microservices architecture, deploying to Kubernetes with CI/CD, we recently adopted gRPC so that we can have service-to-service talking, public APIs with [inaudible 00:00:58], and we’ve started to do feature flagging with LaunchDarkly. So if any of that’s exciting to you, as was mentioned before, we are hiring so please come to talk to myself or any of the Curvers that are here tonight.

To explain what we were doing and why we ended up doing something like this, I really want to give a bit of context around what the feature was that we were building, just so you can understand why we were thinking about feature flagging in this way and why we wanted to solve it with something like this. The feature was called Curve Send and essentially allowed our Curve users to send money to each other through the platform.

So it was kind of facilitating peer-to-peer payments. The really nice thing about it is the apps—it was any account to any account. So when you upload your cards to our platform, you can send your money to your friends through any particular card. To give you an example, if I want to send my friend Joe 10 pounds, I’ll go on the app, I’ll choose Joe, 10 pounds, and then I select which card I want to send from, and then Curve will then go and charge that card that amount. Then Joe will get a notification saying, “Hey, Matt sent you 10 pounds. Can you come and tell us which card you want to accept it onto please?” They get it on the card that they want and it goes to the right account.

We thought this was really cool. We thought it was a feature that we’re going to enjoy using ourselves and our customers would like it too. But we wanted to get it into the hands of customers and understand how they would use it and how they would find it. When we were thinking about rolling out and how we were going to do that, we realized that this feature is intrinsically social. There are two people involved to make this thing work. There’s the sender that’s going to send the money to someone and there’s a recipient on the other side that’s got to choose where it’s going to go. Without one of those two people you really can’t get this to work at all, you need both.

So we need to think very carefully when we want to give it to someone. We want to enable it for a set of users. Who’re we going to give it to and who else we need to give it to so that we can facilitate those things. Cause it wasn’t good enough just to enable sender if they go on the app and they choose, “I can’t send to anyone because no one else has that feature”. Similarly, we can’t send it to someone else if they don’t have access to the money. If they don’t have access to the feature, they can’t accept the payments. So that money is stuck in limbo. We really needed to make sure that both the sender was enabled and the recipient was enabled so that we can facilitate both the sending and the accepting.

How would you do that? How would you have someone that’s got the app, got access, and someone that doesn’t have access? How do we then give access to that person at the right time? And this was something that we realized was going to be a bit of a challenge if we wanted to do that.

Before, we were kind of doing beta groups: we were giving access to a bunch of people early that have a prebuild or something, or a version of the app that allows them to go and test things before everyone else. And they’ll give you some feedback on what works well, what doesn’t, bugs that we might find, or pain points in the experience that we might want to improve over time. This is really nice because users really get to have a contribution towards what you’re building, and you know they can contribute and give you feedback and really help the direction to make sure that we’re building something that we like.

But it is still a focus group of people. It’s users that have accepted the risk that things might not be exactly what we want, it might not be that seamless experience that we promised for everyone and things might go wrong, things might break and you might have to contact support or there’ll have to be some manual intervention there. The feedback is very good, but as I say, it’s a focused understanding. And when we want to understand how everyone else used the feature, this is a narrow view.

It came time where we needed to think, “how are we going to get this to customers? How are we going to release it?” And we’ve talked about how releasing for scale and giving it to all of our users. As that user base grows, how do we make sure that we’re building something and deploying it in a way that is scalable and going to give it to the right people.

We thought, “it’s going to be a fair amount of work maybe to figure out how we’re going to do that thing. Maybe we should just do a big bang and once this is ready, we’ll just give it to everyone at the same time.” Users will be curious, so if we give something to users we can’t expect them not to use it because then we have to expect that everyone’s going to try it out. We need to be very deliberate with enabling it. And we can’t just take it away from someone as well. If the feature disappears, people are going to ask questions and be unhappy, so we can’t really do that very easily. That needs to be a very deliberate action. Do we want to think about building something where we can roll it out in phases, and iterations, and build on that, and get more and more users trying over time so that we don’t have something where it’s a big bang and we have to tell people, “ah, sorry, we’ve changed a few things here, you might have to hold back for a few minutes, a few hours, a few days.”

We wanted to say, “let’s do something where we can roll out to users over time, but we need to enable the right people at the right time as well.” The sender and the recipients. So to sum up the requirements of what we’re thinking here, we still want to enable users in batches, so I want to enable 10% of users at a time to understand how that 10% interacts with a feature. What are the pain points? But we need fine-grained control over the users as well, so not just a 10% segment, but I also want to know that this user is going to get access to that feature at the right time because he’s received money, so he has to have access. And also we wanted to learn for the future, so this was something that we were kind of new to. We’ve just started using LaunchDarkly, how are we going to get the most out of this tool? How are we going to really build something where we can have a good feature flagging experience for users where they can understand exactly who has access to what? They can enable things by themselves and disable things. How do we build something into the core of our system where we can support this kind of functionality in the future?

So we’ve tried a number of feature flagging techniques that would maybe give us access to this, but it was a journey to understand where we’d get to, how we’d get there. And lots of things we tried just didn’t quite work out for us because we were doing something which was somewhat different to what other people have tried before.

So we looked at how can we do an on/off? How can we just get that working first so we can turn a feature on and off for people? This is nice because it gives you a lot of safety, gives you safety to turn your thing off, turn any feature off as fast as you turned it on. And that’s nice because you have safety and you know what you’re doing and everything’s very understood and deliberate, and if we can see things aren’t turning out, we can go backwards just as fast. But it is all or nothing, it’s a binary, so we can’t do the fine-grained control things that we wanted. It’s a controlled all or nothing. It’s a controlled big bang, because you’re giving access, but you have to remove the access to everyone at the same time.

So sadly we had to do more here, we knew this wasn’t going to do it for us. So we moved on to a percentage-based rollout. Let’s see how we can roll out in waves as we wanted to do. We want to give access to 10%, to 20%. And the nice thing as well is that you can discriminate on attributes like device version, device locale, app-to-app versions, things like that. And we were wondering maybe we can discriminate on users that have received money. But we still found that this wasn’t giving us enough control. That individual number, that percentile, that wasn’t enough for us to understand who has access to our feature, who has access at the right time, and to make sure that that is definitely happening. So it’s more controlled, definitely. It’s better than an on/off situation like this, but it only solves for half the problem. It solves for the sender, we can make sure they have access, but there’s no guarantee that the recipient will be in that percentage that we’ve enabled for. And because of that, that money could be stuck in limbo. So it wasn’t good enough again.
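
To make the sender/recipient gap concrete, here is a minimal Go sketch of a percentage rollout. It is an illustration only: the FNV hash, the `bucket` and `inRollout` helpers, the `curve-send` flag key, and the user IDs are all assumptions made for this example, and this is not LaunchDarkly's actual bucketing algorithm. Each user lands in a stable bucket independently of everyone else, so nothing ties a recipient's bucket to the sender's.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a user to a stable value in [0, 100) by hashing the flag key
// and user ID together, so the same user gets the same bucket on every check.
// (Illustrative only; not LaunchDarkly's real algorithm.)
func bucket(flagKey, userID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(flagKey + ":" + userID))
	return h.Sum32() % 100
}

// inRollout reports whether the user falls inside the enabled percentage.
func inRollout(flagKey, userID string, percent uint32) bool {
	return bucket(flagKey, userID) < percent
}

func main() {
	const flag = "curve-send" // hypothetical flag key
	sender, recipient := "matt", "joe"

	// The two checks are independent: the sender being in the 10% says
	// nothing about whether the recipient is.
	fmt.Println("sender enabled:   ", inRollout(flag, sender, 10))
	fmt.Println("recipient enabled:", inRollout(flag, recipient, 10))
}
```

Raising the percentage only changes the odds that both sides are enabled; it never guarantees it, which is the "half the problem" described above.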

So we ended up starting to try and be a little bit clever. So we have segments of users, which is just a collection, just a list of users. Let’s see if we can bundle all our users that receive money into a group and then let’s continue doing a phased rollout, so 10%, 20%. And all those users that also have received money, we’ll put them in a segment and they get access.

And then we thought, we’re doing more and more of these percentages, so if you want to go from 20% to 50% or something like that, then maybe we need to have more than one segment. Maybe we need to be able to understand at what point they are enabled. Maybe that segment isn’t going to support the number of users that will get enabled in that wave. But we found that we were ending up with something a little bit more like this and things were getting complicated and our rule for evaluation of if someone has access to a feature was multi-stage and there’s lots of different areas. And to understand if a user has access was getting more difficult as well. There was lots of different places we had to look. Were they in this place, within this place. And it just felt like the whole thing was getting a little bit too complicated and we were going the wrong way. And again, it wasn’t going to scale, this was something that we really wanted to understand how we can do more with this in the future. And this felt like it was very specific to this problem and it wasn’t going to allow us to really make these learnings and take it forward.

So frustratingly, we started to feel a bit like this, that we were really trying our hardest to push something in, but it just wasn’t going to fit. And the more we tried, the more it would resist and the further away we got from where we wanted to go. Although this was a frustrating point for us, it allowed us to take a step back and try to understand what we were actually trying to do.

The way we were looking at it maybe wasn’t the right way. So as I said, we were starting to think about this wrong and we were starting to capture the problem in the wrong way. And maybe the problem statement that we’d defined, where we want to enable specific users to have access to our feature, wasn’t the right thing to do. So we started to think what it means to be enabled, why should a user have access to our feature? What it is is you’ve either been lucky, you’ve hit the lottery and you’ve been in a percentage rollout, or your friend has it and he sent you some money and now you also have access. And that’s the way it was. There was no specifics around if you were this user or if you’ve received money in this time period. It was nothing like that. It was very simple and it was very clear that the user has nothing to do with it at all. It’s all about the rule, it’s all about the attributes that define the user and that rule whether they should have access or not.

So when we flipped it on its head and we started to think about it this way, we started to understand that this problem really had nothing to do with building complex segments and rules to capture which users had which attributes and which behavior. It was all about the rule that should define and be the source of truth for who should have access. So that’s what we ended up with. We turned it on its head and we decided to define an attribute that decided if the user should have access or not.

So pretty much what it was was just an integer. User is not enabled if they’ve had nothing to do with the feature before and this pretty much just equates to nothing, there’s nothing gained, nothing lost for the user. They’ll be none the wiser. If they’ve received money, if they receive money at any point, then we’ll bump that score to two. And then at that point their score line, that’s a little bit hard to say, but at that point they have access to that feature. And again, if they’re released in a wave, then we bump it again and they’re three. We only have two levels here, really, that are important. It’s still a binary, it’s enabled or not enabled, but it allows us to be super clear. And there’s no complicated things going on with users and specific users grouping them together, it was very much about a single case.

And this was really nice because it meant that our rules for evaluation of a feature flag were very simple as well. It was just, you’ve got a score, is it greater than one? If it is, cool, you can use that. If not, then sorry you don’t have access. And we didn’t do anything more here, but you could do more with if it’s level two then you can receive money, but you can’t send it to someone else. If it’s three, you can do all three. We didn’t go for anything as complicated as that because we didn’t need to. We wanted to enable some sort of viral rollout where once someone’s accepted as part of that ecosystem of that feature, they can go and try it out and see what happens next and see if more people are using it.
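
The score rule described here is small enough to sketch in a few lines of Go. The type and function names (`SendScore`, `HasAccess`, `Bump`) are hypothetical and not Curve's actual code. The three values follow the talk's one, two, three tiers. The detail that a bump never lowers a score is an assumption, added because the talk says access can't simply be taken away from someone.

```go
package main

import "fmt"

// SendScore is the single integer attribute the flag rule looks at.
type SendScore int

const (
	NotEnabled    SendScore = 1 // has had nothing to do with the feature
	ReceivedMoney SendScore = 2 // another user sent them money
	InWave        SendScore = 3 // enabled as part of a rollout wave
)

// HasAccess is the whole evaluation rule: any score greater than one.
func HasAccess(s SendScore) bool {
	return s > NotEnabled
}

// Bump raises a score to the target but never lowers it (an assumption of
// this sketch), so a wave-enabled user who later receives money stays at 3.
func Bump(current, target SendScore) SendScore {
	if target > current {
		return target
	}
	return current
}

func main() {
	joe := NotEnabled
	fmt.Println("before receiving:", HasAccess(joe))

	joe = Bump(joe, ReceivedMoney) // Matt sends Joe 10 pounds
	fmt.Println("after receiving: ", HasAccess(joe))
}
```

Because the rule reads one attribute rather than a list of users, the question "why does this user have access?" has a single place to look: their score.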

So that’s what we did and it worked really nice because our evaluation became very clear then and we weren’t grouping users together and having complex segments. So to sum up how viral did we get? How much did people start using it because of this? And we had some interesting, interesting results, but did it solve the problem? And the answer is sort of. When you start doing things like this, you’re taking some of the heavy lifting that you have to do away from LaunchDarkly. It will give you percentages and segments and stuff like that. It allows you to not have to worry about those things. But when you want to define rules that are based on data and these attributes of users, you have to do some of these things. You have to take on some of this responsibility.

So making the data serviceable for the clients that want to do the evaluation for your exchanges or something, you have to make that available. So you have to find some way that the apps can go and make a request to find out what’s my Curve Send score. And that’s not a massive change, but it’s still something you have to do. And then notifying users as well that their score has changed is something that the apps aren’t going to do. The apps aren’t going to notice that that score has changed because why would they? They’re not constantly checking unless they are. But the way we did it was with those new app notifications. So when a user receives a payment, the app knows that that Curve Send has come through and they can then go and update that Curve Send score, go and do a fetch and figure that out.
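
The refresh-on-notification idea can be sketched as a client that caches its score and refetches only when a payment notification arrives. This is written in Go for consistency with the other sketches, although the real clients are mobile apps. `Client`, `ScoreFetcher`, and `OnPaymentNotification` are hypothetical names, and the in-memory map stands in for a backend endpoint the talk does not describe.

```go
package main

import "fmt"

// ScoreFetcher stands in for the request an app makes to ask the backend
// for the user's current score.
type ScoreFetcher func(userID string) int

// Client caches the last score it fetched; it does not poll.
type Client struct {
	userID string
	fetch  ScoreFetcher
	score  int
}

// NewClient fetches the score once, for example at app start.
func NewClient(userID string, fetch ScoreFetcher) *Client {
	return &Client{userID: userID, fetch: fetch, score: fetch(userID)}
}

// OnPaymentNotification is called when a "you received money" notification
// arrives; it is the only point where the cached score is refreshed.
func (c *Client) OnPaymentNotification() {
	c.score = c.fetch(c.userID)
}

// CanUseSend applies the same "greater than one" rule to the cached score.
func (c *Client) CanUseSend() bool {
	return c.score > 1
}

func main() {
	backend := map[string]int{"joe": 1} // pretend server-side store
	joe := NewClient("joe", func(id string) int { return backend[id] })

	backend["joe"] = 2 // server bumps Joe's score when money is sent to him
	fmt.Println("before notification:", joe.CanUseSend()) // cache is stale
	joe.OnPaymentNotification()
	fmt.Println("after notification: ", joe.CanUseSend())
}
```

Until the notification arrives the cached value is stale, which is the gap the notification closes without the app having to check constantly.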

You can do more with it and you could stream data or subscribe to changes in these values, but we didn’t see it was going to change the result of this. With more data that might be something we’re interested in doing. And finally, because we weren’t doing a percentage rollout somewhere else, we have to figure out how we are going to do these batches now, because that’s not handled by a slider somewhere with our feature flagging software. It was fairly novel, but it’s still something that you have to consider and you have to figure out how you’re going to do that. And you might prefer to enable it for some people first and some people later, so you get more control there as well.

But to talk about how viral we went. These are just some generic graphs to show some of the patterns. There was definitely some virality there, but not a huge amount. What we found was that there would be a spike in people enabling and using the feature, but it would tail off pretty quickly and people wouldn’t use it to enable other people. So we kind of had to keep kicking the tires and enabling more and more people. But we realized pretty quickly that we have to keep doing this. And the virality wasn’t quite there, it wasn’t quite what we expected ’cause they were sending to the same people and they didn’t have a huge network of people they wanted to send money to.

But it was interesting nonetheless, it allowed us to understand some of the pain points that users had throughout the experience and allowed us a bit of wiggle room, a bit of confidence to make some of the changes, and to make some differences to pivot in some areas. So that’s what we did. We pivoted in a couple of areas, we changed a couple of things. And at that point, having done this, we’ve gained enough confidence in seeing how some people had been using it already and knowing how our changes will affect things that we can continue and we can roll out to everyone.

So having understood that it wasn’t as viral as we wanted, was it worth doing something like this? Was it worth investing the time in building something where we can propagate and network out the feature and allow people to enable others? And the answer is yes. And there’s a few key reasons that I want to touch on, which really are the key learnings for me in enabling something like this that although it didn’t target particularly enabling it for those users, there are some learnings going forward.

So when you’re defining your rules for enabling users to access features in your system, you really need to think about why access should be given. What does the user need to do to enable themselves? What events do they need to have done? What attributes do they need to have? Not what user they are and grouping them by users. So in LaunchDarkly terms, you might say you’re building for rules, not for segments. Don’t group users together just because they have similar attributes. Just develop your rules so that they pick up on those attributes.

And secondly, the more you put in, the more data you can throw in at this, the more you get out. It seems fairly obvious, but something that definitely was true for us. We had our tier system of one, two, and three. And that was really nice but we didn’t do most of it. We didn’t enable more and more control based on how much you’ve interacted with it. It was a binary, but that’s something you can definitely do. And the other thing about it is you do have to consider what data you want to expose to something like this. It has to be very deliberate. And that’s something that you should definitely define at the start of the process: what data should contribute to being able to access a feature.

And finally, doing something like this really gave us confidence to iterate and to move faster. And to make deploying and iterating on our feature a non-event. So when we’re making our changes, we understood exactly who was going to have access. They have already seen it, we can understand more about the users that have access to that feature at that time. And making deployments and iterations a non-event really gives you confidence to move faster and to do more.

So that’s it for me. I would welcome any questions you have about what we did and how we might take it further.

Speaker 2:

Hi Matt. What other Curve functionality do you think we’re going to be using this for as well?

Matt:

So it’s an interesting one. I kind of see a world where you have a control panel and you can put everything in some kind of panel way. You can limit things very, very specifically, very granularly. You have to be careful with each flag so you know that you’re not keeping them around forever when they’re not useful, and something becomes a core feature that you’re not going to ever turn off. Why would you keep something behind a feature flag there? So you need to understand what still is a feature flag and what’s just an integral part that, you know, is tech debt by keeping it around. But for me, I think you want to have as much control as possible about who has access to what. And then you can do things like give developers access to do certain things in the app or on your system to get more information and you can control that in a nice easy way.

Speaker 3:

So if your feature had gone viral, would there be a risk of having too much load suddenly?

Matt:

Sorry, could you say that again?

Speaker 3:

Would you have a risk of having too much load too suddenly?

Matt:

Too much load?

Speaker 3:

Sorry, the advantage of rolling out by percentages is that you can see how things are handling 10%, 20% and whether or not you can comfortably go up to 100%. But if your thing had gone viral, you could have gone from 10 to 100% instantly. [crosstalk 00:19:58]

Matt:

That’s absolutely right. It was something that we were monitoring very closely and it was a risk that we were willing to take. If this was as successful as perhaps we would have hoped then that’s great and it shows people liking it, but leaves things for abuse and stuff like that. The way we handled it was just by monitoring it and we always had a kill switch to turn off enabling other users so it would cut that virality factor going out. It comes down to measuring and understanding exactly how it’s being used. It’s something that needs to be tracked.
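
A kill switch of the kind mentioned here can be sketched as a single boolean that gates the "bump the recipient" step. This is a hypothetical Go illustration: the `Rollout` type and its methods are invented for the example, and rejecting a send to a non-enabled recipient while the switch is off is an assumption, since the talk does not say what happened to such sends.

```go
package main

import "fmt"

// Rollout tracks scores and a kill switch for viral propagation.
type Rollout struct {
	scores             map[string]int
	PropagationEnabled bool // the kill switch: false stops enabling new users
}

func NewRollout() *Rollout {
	return &Rollout{scores: map[string]int{}, PropagationEnabled: true}
}

// HasAccess treats users with no stored score as score 1 (not enabled).
func (r *Rollout) HasAccess(user string) bool {
	s, ok := r.scores[user]
	return ok && s > 1
}

// EnableInWave gives a user access directly, as in a rollout batch.
func (r *Rollout) EnableInWave(user string) {
	r.scores[user] = 3
}

// Send enables the recipient if needed. With the switch off, it refuses
// rather than leaving money in limbo (an assumption of this sketch).
func (r *Rollout) Send(from, to string) error {
	if !r.HasAccess(from) {
		return fmt.Errorf("%s does not have access", from)
	}
	if !r.HasAccess(to) {
		if !r.PropagationEnabled {
			return fmt.Errorf("cannot enable %s: propagation is switched off", to)
		}
		r.scores[to] = 2
	}
	return nil
}

func main() {
	r := NewRollout()
	r.EnableInWave("matt")
	fmt.Println(r.Send("matt", "joe"), r.HasAccess("joe"))

	r.PropagationEnabled = false // flip the kill switch
	fmt.Println(r.Send("joe", "amy"))
}
```

Users who already have access keep it and can still send to each other; the switch only stops the set of enabled users from growing.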

Speaker 4:

So that works well for a credit card app where you have limited interactions, but do you think it would work for a games company or something where there’s a huge graph of users and the virality would explode?

Matt:

Yeah, it’s a good question. As I say, you need to understand the velocity of how that virality is going and to understand your product and to have expectations around how much it’s going to be used. It’s definitely important to align those expectations beforehand. If you think it’s going to explode, then maybe this isn’t the best way to do it and you would do it in a more sensible way. By that, I mean that you have more control than the user. This is something that was cool to see and it was cool to understand how people might spread that access to that feature. And it also solved the problem for us of having access for both the sender and receiver so they can accept the payments. So for us, it was almost crucial to allow us to test for a certain amount of users because if we didn’t, that money might be stuck in limbo. It needs to be taken with a pinch of salt that it doesn’t fit all use cases.

Speaker 5:

Hi Matt. Were there cases where you found that people were sending money to people who didn’t actually have the feature turned on? And what percentage was that?

Matt:

So it was something that we realized would be a case before we released. So when we did the beta groups, we would enable specific people and we would see that there wasn’t a whole lot of interaction because they didn’t have other people in the beta group. So what we learned from that was that we had to make sure that these constraints existed if we wanted to test it in production and give it to people that weren’t expecting to see it. So that was how we would figure that out before we ended up in that situation.

Speaker 6:

Any more questions? Thank you, Matt.

Matt:

You’re welcome. Thank you.