Trajectory

Experiment Culture: From Tradition to Data-Driven

Arjan Franzen, Maxeda DIY

Traditional brick-and-mortar retail has been struggling with e-commerce for some time now. Our organisation is a successful traditional retail company that has been around for more than 40 years. At the same time, we see a number of challenges coming from the "online world". We started our new strategy more than 2 years ago and changed almost all critical things in our organisation: architecture, people and process. We went from traditional VMs to cloud native, from scrum-but to LeSS and from supplier/project-driven to a multi-devcenter organisation. We added LaunchDarkly to our toolset in 2017. This started a transformation of the developer workflow that took some time to materialise: from long-lived feature branches to trunk-based development/GitHub flow. After many small-batch deliveries and tests in production, we switched over the last parts in autumn 2018. By this time we've become familiar with a process that is data-driven, and a significant percentage of our features are tested using A/B tests. We use LaunchDarkly for toggling these features.

Arjan Franzen

Arjan has been active in the software industry since 2000. He's had the opportunity to develop software for several large enterprises and governmental organizations. Ever since he became familiar with Agile, Scrum and Lean, he has applied these methods in his professional environment. Arjan wants to enable organizations to reduce the risks of software releases to an absolute minimum. In addition, by applying principles of Lean (and Lean Startup), it must be possible for companies to measure effectiveness and value changes in the software: "Build the right thing".

Heidi Waterhouse: So I'm super excited because we have someone who is a customer who has come in from Amsterdam to talk to us about experiment culture and how Maxeda DIY moved from traditional structures to data driven in a remarkably short period of time. I think this is a great story of organizational transformation and figuring out what works and what doesn't and discarding what doesn't work as quickly as possible so you can move on to things that are really actually useful.

So this is your last talk. After this, we're all going to go get lunch and then we'll meet you back in whichever ... I have to go look at the schedule again because evidently I can't keep it in my head, but this is the last talk before lunch. We'll figure out what you're doing after lunch. Arjan has said that he has built in lots of time for questions, so be thinking about your questions so that we can go ahead and get this asked right away. Thank you.

Arjan Franzen: Thank you very much. So my name is Arjan Franzen. I'm from the Netherlands and I will try to first introduce to you a bit of the company that I'm from. I'm from Maxeda DIY, which I would be surprised if anybody here has ever heard of. It's not even well known in the Netherlands, although the brands that we support are very well known. So, we are a company in north-western Europe, based in Belgium and the Netherlands.

We have an annual revenue of around 1.4 billion euros and 5,000 employees, not all of them in tech. Obviously store employees mostly, and 300 stores. You'll see a map of the Netherlands on the right, completely covered in stores. That's, well, almost 200, and almost 100 stores in Belgium you see on the left. So these are the two brands that Maxeda DIY supports. Praxis, which is the Dutch brand. Its headquarters are in Amsterdam. And it's a household name in the Netherlands alone.

I think that brand recognition is upwards of 90%. So, it's also home of my department, the IT online. So, we support the praxis.nl and brico.be websites. The first store, as you can see on the left, was in 1978. Also the year that I was born, but by now, we have 186 stores, which in a small country like the Netherlands is quite a few. Five years ago, we moved to, I think something similar happened here also, the maker movement.

This is the company slogan, for the makers, to try to help people when they are redecorating their house. So because young people nowadays in, well, north-western Europe do less and less in their house when they need to redecorate. And so we try to help them out. Also, the same story about Belgium, started a bit earlier still. Headquarters there are based in Brussels, which is, well, interesting for a lot of reasons, but mainly the language aspects of it.

French and Dutch, because Belgium is a country divided with two languages. And that's also for us in the online department a fun story. There are 159 stores, both in Belgium and in Luxembourg. And here you see a beautiful picture of what we tried to sell you in the '70s or '80s, a fantastic blue sink. Yeah, stuff didn't look that pretty way back when, you know. Okay. Anyway, so what do we offer online? We offer online this thing, which is a mark recipe as we call it.

So it's instructions on how to do simple things in your garden, like clean your garden furniture. But also how to create a garden chair or all sorts of these things. And mostly we try to sell you stuff online, which we do. Like laminate flooring is very big for Praxis and Brico. And this is the website. This is eCommerce, and we do this either as click and collect, which means that you buy something online and you pay for it, but then you go to the store and pick it up.

Which is, from a retail perspective, very handy because that allows the store to try to sell you some more. Store home delivery, which literally means that somebody from the store picks something from an aisle, puts it in a box, and sends it to your home. And drop shipment, which is stuff that goes from the supplier directly to your house. So, that's the introduction of what we do. What Maxeda DIY does. I guess the closest comparison to an American company would be The Home Depot. So there you go.

Then two years ago, all was not so fine. This is a bit of corporate jargon I said in her talk. Correct. But this is essentially what our executive leadership team demanded of us. We needed to improve the Net Promoter Score, which is a fancy term for how happy the people online are when they use your site. It has a relation with the second one, which is, well, the site speed really wasn't that fast. It was an aging website.

Conversion. So yeah, we were losing out to the competition there, which matters for the revenue because, well, everything's going online, so we need to up the revenue online. And to me as development lead, most interesting was the development velocity. Some features took forever to build. There really wasn't any speed in development. So this is the challenge. I'll first dive a bit deeper into what we were using. I think this is the most fun part.

On the left you see an impatient customer. She's waiting for either the Brico or the Praxis site, it doesn't matter, but the sites were essentially split up into two. We have been running in the AWS cloud for quite some time now. However, we did it a bit, call it VMware style. We had some virtual servers in the cloud and we used to run Hybris on them. So quick show of hands. Is anybody familiar with SAP Hybris? Have you ever heard of it? One, two, good. It can be very useful.

We're still using it. Now we're happy Hybris users. But like the woman in the picture of way back when, that was our monolith. We did some horrible things to it, which in turn, well, did horrible things to us. More on that later. So, the site was essentially split up into two: the eCommerce platform and the community platform. So, remember the recipes, the mark recipes for your garden furniture, that's what we call the community platform.

It runs on Beanstalk, which is pretty cool. It's getting less common done our way, but we use it. But the problem starts at the right side of the slide, because we had two suppliers. These two suppliers were very project oriented. And they didn't talk to each other very well because they were different suppliers. And that project orientation meant that, well, things just didn't improve; they didn't work together. And that had some dire consequences.

The monolith decided to break down tremendously on the 14th of July, from a Friday to Saturday. And that means that we suffered, in total, 18 hours of downtime. It does have a happy end, luckily, but it didn't feel like that at the time. So, 18 hours of no sales on an eCommerce website. And we normally have, like, an emergency procedure. So if all else fails, just pull the servers down and gradually bring them back up.

This procedure failed on us continuously, because as soon as we started up the machines, the database essentially overheated. And then you think, this is the cloud, right? So you stop the database, the RDS, and then go to the dropdown, if you do it manually; we were all so stressed out, no infrastructure as code for us. So you go to the dropdown and you select the most powerful database server and you pay $6,000 a month for one database instance. But hey, we're in panic. That also didn't work.

Then even with cloud, you're out of luck in emergency mode. So luckily, somebody figured out what job it was, because it turned out to be one job that was failing. He removed that and things went relatively back to normal. But what this whole incident showed us was that we were not on the right path. The monolith that we had, the Hybris, it was the old Hybris, Hybris 5, really should go. And so we set off on a plan to prevent this from ever occurring again.

And the plan has three parts: a process part, an architecture part, and a culture part. So first, the process. We decided it would be best, rather than having all these suppliers and having them managed in a supplier-customer type fashion, to have four development centers: one in Amsterdam, one in Brussels, one in Kiev, and one in Odessa; the last two are in Ukraine. Go from many backlogs and project plans to one backlog.

Have centralized means of delivery because currently all the teams had their own way of going to production. And we thought if we are going to do a platform, let's make sure that we also centralize how you go to production and how you test. Then we introduced T-shaped teams. Is everybody familiar with that term? A quick show of hands, T-shape teams. Okay. So, the T-shaped teams essentially means that while you are an expert at something like testing or developing or front end development, or back end development, it's highly encouraged of you to pick up tasks that are outside your circle of comfort.

It could also be a culture change, but it's here in process because it's essentially how you form teams, how you say, "Well, you're going to work together," or, "What are the best competencies in a team to have?" So T-shaped teams were not common. Like what you see in many organizations is that you separate development from testing teams. None of that. This is the exact opposite. And then last but not least, we introduced LeSS.

Again, is anybody familiar with that scaled agile framework? Two. Okay. Essentially what that boils down to is rather than ... when you introduce Scrum, I trust everybody is familiar with that, you typically do that with a team. And then another team uses Scrum and then another team uses Scrum. And so you have many teams using Scrum. And what the sales pitch of LeSS is: let's do Scrum with multiple teams as opposed to every team doing Scrum on their own.

Because, and this is actually a core thing of cooperation, collaboration: if all the teams are on their own doing Scrum, the cooperation between the teams breaks down very quickly. And we noticed that; remember the supplier situation, they were very Scrum, but they didn't work together very well. So we decided to use LeSS, and compared to SAFe, LeSS is much less prescriptive. Which allowed us to think for ourselves for once rather than read the book and do what's in the recipe.

Size of the team, by the way: we're 35 engineers big, 20 engineers in Ukraine, 12 in the Netherlands, and three in Belgium. So, that's part one of the plan. Part two is the architecture. As you see, well, now we essentially said, "Let's not have these separate bits and pieces of cloud, but integrate them into one." And you see the customer is immediately very happy because we decided to merge everything.

Also, we started using containerized microservices, which allowed us to kill the monolith, or at least make sure that the very big piece of software that is Hybris was maintainable now. And in addition to that, we started using, like, the ELK stack and all of these, well, modern and necessary tools that you need when running microservices. But we tried running them on our own for a while, so just provisioning VMs, et cetera.

But we found out that, like LaunchDarkly, you can much better just buy these as a service. So we do exactly that for application performance data, which goes into the SaaS cloud, for infrastructure metrics, and for telemetry, which is a fancy DevOps word for the functional bits that teams themselves want to know about their software, like failed logins or revenue, for instance.

All of that stuff is managed in Cloud SaaS platforms.

And logging information, obviously. So we used to run our own ELK, but we stopped doing that midway through this transformation, this digital transformation. So in addition to that, we use Docker for the microservices and ECS for making sure that these microservices are actually run in a cluster. And rather than doing everything in a single SQL database, we now have, well, a more diverse set of persistence layers.

At the bottom you see a team because that's also very important to us. Make sure that there is not a very big separation between parts of the infrastructure but also the people who maintain it. And last but not least, the plan included a bit of culture. There you go. So, rather than being very oriented to your supplier, your company, your tribe, so to speak, be much more aligned with the goal of the company, goal of the department, goal of your teams.

And this really helped. This took quite some time to get sorted because, yeah, when you work for a nearshore company, that is your employer. So, in the end, we got all the feature teams and the non-feature teams to be aligned with the goals of the department. Furthermore, feature teams essentially are the nice antidote to project thinking. We needed to transition from project thinking, like batch up all of the features that you wish into a big batch, start working on it, and deliver that entire batch into production, and do it more on a feature-by-feature basis.

And it comes in very handy because that allows us, for a single feature, to find an audience for that feature. And this meant that we could finally get rid of these horrible project plans and initiation documents. That was very fun. Continuous delivery. Some of the talks also referenced that as, well, the next step as opposed to CI. But first, to get that working, you need CI, but more on branching and merging later.

So continuous delivery was the goal and the plan, and then experimentation. So now that we have the feature, and it's being managed as one or as a group rather than as a batch and a project, we could start to experiment. We could now change the website and see what the effects are if I have a feature toggled on, turned on or turned off. And we found interesting things in that sense, which allows us to finally prove the effectiveness of the feature.

And I think a lot of talks today have been about that. Everybody can have a good idea, but if you are able to show the data of how the feature is performing, that's what you really should be aiming for. So, that's culture. So as we all know from change management, you have a plan and you press a button and everything is awesome immediately. No. So, it took us some time to get there, and we took one and a half, two years to accomplish this.

So, first we decided to do what everyone does. Well, not to be too negative, but that's the easy part, right? Top-down process change. Right guys, we're going to work in a different way now. Everybody's on the training. And well, that's how you start. Also, this is what we did and we had some success with that. We were able to do some work standardization, which essentially meant that the people from different teams started to work together on how they approached certain problems, how they went to production, how they delivered.

And what came out of that was, well, standardized procedures rather than you do it your way, I do it my way and we never talk to each other. That was the old situation. Also, we had more discussions not about the components. There's also something from the LeSS framework: go beyond component owners, try to remove component owners, because those component owners, they cling to their code very heavily. However, this is not how the organization makes its money.

So if you can have a discussion and responsibility in terms of the functional areas that the various teams have for your organization, the component model is the next step to that. So first have the functional discussion. Guys, what are we doing here? How does this organization make its money? How do we functionally divide up all hash over? And so, as LeSS has a lot of Lean stuff in it, we became more familiar with Lean.

So, either you did Scrum, but if you hated it or you thought that process was really not applicable for your team, you were free to choose Kanban instead. But then you get WIP limits. And you don't have retrospectives, but you have Kaizen, which is sort of the same but slightly different. And then, I guess still one of the more controversial processes in Agile, is going to trunk-based development or GitHub flow, while the old monolith obviously used git-flow as a branching and merging strategy.

The new microservices emerged and they didn't make all of those mistakes, because the components were still very small. So they used either trunk-based development or GitHub flow. And this also allowed the developers who had been with us for quite some time to get familiar with this different way of working, because they really clung to the old way of working, which was git-flow. So a lot of branching, a lot of merging, integration testing, release branching: this is, to put it in Lean terms, all waste.

If your branch is not in production, preferably behind a toggle if it's not finished, you can consider it waste. But these words, especially in the beginning, were not accepted by the teams. I was wrong, they said, I didn't know how to code. Well, okay, you can think that. But it turns out that, I'm not saying that I am right, but there is some truth to having you identify all your work that is on branches, before you have it merged to master and in production, and deem it waste.

So you should by all means have a continuous delivery process that allows you to get these branches out into production and get them integrated as soon as possible. Also, because of the microservices, so, remember the monolith, the Humpty Dumpty thing: we decided to not make that mistake again. So, we're essentially binning all that code. We installed a new version of Hybris as a backend process.

On top of that, we have a layer of microservices. So that allowed us, from an architecture point of view, to say, "If you want to modify Hybris significantly, don't do that. Either you build a plugin, you do what's possible. But if it's really substantial, you create microservices and manage your process from outside of Hybris." This allowed us to have an out-of-the-box installation of Hybris and have all the added value in microservices, essentially containing it.

So it now never has the ability to grow into a monolith. Then this fits in nicely with the strangler pattern. This is from continuous delivery. And essentially that means that you run the old system and the new system side by side, and don't consider this waste, because the new system, the microservices, can integrate with the old system, but you should be moving everything to smaller, newer services. At the same time, you diminish the value of the old monolith until you can't even see it anymore and you can safely turn it off. We're not there yet.
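As a rough illustration of the strangler pattern described above, the sketch below routes each request path either to a new service or to the legacy system. The class name, path prefixes and backend names are all hypothetical and are not taken from the Maxeda setup; it only shows the idea of moving functionality over one slice at a time.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StranglerRouter:
    """Sends a request to a new service once its path prefix has been migrated,
    and to the legacy system otherwise. All names here are illustrative."""

    legacy_backend: str
    migrated: Dict[str, str] = field(default_factory=dict)  # path prefix -> backend

    def migrate(self, prefix: str, backend: str) -> None:
        """Hand one slice of functionality over to a new service."""
        self.migrated[prefix] = backend

    def backend_for(self, path: str) -> str:
        # Longest matching prefix wins, so a sub-path can move on its own.
        matches = [prefix for prefix in self.migrated if path.startswith(prefix)]
        if not matches:
            return self.legacy_backend
        return self.migrated[max(matches, key=len)]


router = StranglerRouter(legacy_backend="legacy-monolith")
router.migrate("/search", "search-service")
router.migrate("/checkout/delivery", "delivery-service")

for path in ("/search/fans", "/checkout/delivery/slots", "/checkout/payment"):
    print(path, "->", router.backend_for(path))
```

Every call to `migrate` shrinks the set of paths that still fall through to the legacy backend, which is the "diminishing" of the old system; once nothing falls through, it can be switched off.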

The old version of our Hybris still runs some of the backend processes. The other thing is that people in my development organization were very unfamiliar with Google Analytics. That was something marketing did. So don't bother us with that. But then the discussion: how effective is the feature that you just developed? Yeah, I don't know. But that also changed. So now people are more into linking these tools, linking the A/B tests into Google Analytics to see the effect in production on real people, or on people actually on our website.

So during the transition, we also implemented, obviously, our first feature toggle in LaunchDarkly, and the first one was a redirect for testing. So, we had bits and pieces of the new platform ready, and LaunchDarkly was there to essentially select certain people and move them onto the new platform so that we could essentially test their behavior on the new platform, but also see if the platform would hold. And the results were very good.
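A redirect toggle of this kind can be pictured as a percentage rollout. The sketch below is not LaunchDarkly's API or its bucketing algorithm; it is a minimal, self-contained illustration that assumes a hypothetical flag key and uses a SHA-256 hash so that the same visitor always lands on the same platform.

```python
import hashlib


def bucket(user_key: str, flag_key: str) -> float:
    """Map a user to a stable number in [0, 1) for one flag (illustrative scheme)."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def platform_for(user_key: str, rollout_fraction: float,
                 flag_key: str = "new-platform-redirect") -> str:
    """Send roughly `rollout_fraction` of users to the new platform."""
    return "new" if bucket(user_key, flag_key) < rollout_fraction else "old"


# Raising the fraction only ever adds users to the new platform; nobody who
# was already redirected at 10% is sent back to the old one at 50%.
users = [f"visitor-{i}" for i in range(10_000)]
share = sum(platform_for(u, 0.10) == "new" for u in users) / len(users)
print(f"share on new platform: {share:.3f}")
```

Because the bucket depends only on the user and the flag, the rollout can be widened step by step while watching whether the new platform holds.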

But this is that overview. Getting familiar with LaunchDarkly, or introducing it, was also very interesting, because the first suggestion from one of the teams was: okay, so now I want to have some time to develop a new component. And in that component, I will make sure that there are all these feature toggles, and these feature toggles then route either to an old version of the service or a new version of the service. So, that took some time and some consulting by the people already on the side of CI and LaunchDarkly to essentially get the point across that this was not the way that you should be using feature toggles.

You should integrate that already, because essentially what that proposal meant was to have long-lived branches again, because you have literally two versions of the same component running in production side by side for a prolonged period of time. This is not integration. This is forking, essentially. Luckily, this was in the beginning and we managed to solve that very quickly. And as the implementation goes, we now have many features that we test in production, for instance the delivery date selection, to observe the behavior of users when that feature is toggled on.
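To make the contrast concrete: instead of routing between two deployed versions of a component, the toggle sits inside one integrated code path. The sketch below uses a hypothetical in-memory toggle store and a hypothetical per-country rule; the toggle key and checkout steps are assumptions for illustration, not the actual implementation.

```python
from typing import Dict, FrozenSet, List


class ToggleStore:
    """Hypothetical in-memory toggle store; a real setup would ask the toggle service."""

    def __init__(self) -> None:
        self._rules: Dict[str, FrozenSet[str]] = {}

    def enable_for(self, key: str, countries: List[str]) -> None:
        self._rules[key] = frozenset(countries)

    def is_on(self, key: str, country: str) -> bool:
        # Unknown toggles default to off, so unfinished work stays hidden.
        return country in self._rules.get(key, frozenset())


def checkout_steps(country: str, toggles: ToggleStore) -> List[str]:
    """One component, one code path in trunk; the toggle only gates a step."""
    steps = ["cart", "address"]
    if toggles.is_on("delivery-date-selection", country):
        steps.append("delivery-date")
    steps.append("payment")
    return steps


toggles = ToggleStore()
toggles.enable_for("delivery-date-selection", ["NL"])
print(checkout_steps("NL", toggles))
print(checkout_steps("BE", toggles))
```

Both branches of the `if` live in the same deployed component, so the code is integrated continuously and the toggle can be removed once the feature is settled.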

The ... organizations, Belgian organization doesn't have their logistics in order for that feature, so we can easily switch it off for now for them. So A/B testing and feature management: one of the important things, and something that we're still working on, is the ability to link the tools. You need to be able to link, for instance, LaunchDarkly to other tools like Google Analytics. And process-wise, you need to change the discussion that you have with stakeholders, rather than a stakeholder coming in saying, "I want this and it must be developed like so."

You need to go and discuss what's the hypothesis that's behind it and what's the variance. And these words sound alien to stakeholders when you first mention them, but this was also part of the process of getting this implemented. Operations toggles, I'd like to finally shed some light on. This allows the system administrators or the operations or the DevOps. We have a very bad name for the team, by the way.

It's called system team, and everybody who's familiar with SAFe knows that there's a system team there, but we don't use SAFe. So I don't know why, that's just it. And so these people, our system team engineers, once the bad stuff happens in production, are able to switch off parts of the site when needed. So, everything is almost done. We're about to go to production in July of 2018, and we had a heat wave. These are very rare in the Netherlands.

It's like it's always cold and rainy, but not that day. We were in the middle of a heat wave. And so, people all over the Netherlands were searching for air conditioners and fans because apparently all of them broke. We didn't have any, but that meant that the ... and we decided to switch because of the system being available. And we must have it by that Tuesday to switch on a Tuesday morning.

However, because of the heat wave, everybody on that Tuesday morning was heavily searching our website for any leftover fans or ACs. But that meant that, for only that morning, it wasn't pretty on the site. Well, people everywhere were looking for something cool, or jumped into a canal as you see from the picture. We switched that morning and the platform has been stable since. Only that morning we suffered some challenges, so to say.

But because of the testing of the platform, because of the feature toggles and being able to direct certain groups of people onto your new platform, we were able to essentially de-risk the delivery of this new platform. So the results. By the end of 2018, the return rates dropped. Some suppliers had up to four percent return rates. That was not so good. But, well, these numbers are something that the people in ELT really hoped would turn out well. And they did. Revenue up 35%, and conversion rate, 20% uplift, and mobile conversion even 90%.

That's not cheating, because the old site wasn't meant to run in a mobile browser and the new one was. But anyway, it shows that with the mobile conversion, the UX people really did a great job there. Site speed was obviously much better, hence the conversion rate. So cheers all round; however, are we done yet? Absolutely not. 2019 and the rest, well, the following years, we will attempt to become more multi-cloud because, well, Google Analytics is our standard for spying on you when you browse the website.

But there's much more that you should do to integrate it with feature development. And so to do that, we need more tools of the Google Cloud Platform and more tools of AWS, and link them together. Because currently, they are, like the two suppliers that I mentioned in the first couple of slides, managed very separately. So, the T-shaped teams idea is going on a sort of a roadshow in the organization. We're going to have cross-organizational teams.

So the rest of the organization wants to work the way that we did. So, they are open to having these cross-organizational teams with Brico and Praxis. And that means that if we can link the raw data from, for instance, LaunchDarkly into A/B testing and see how we can link that to Google Analytics, I guess, or no, I know, that we will be having much more insight into the actual value of features that we develop on a daily basis. That's it. I will gladly take any of your questions.
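One way to picture linking raw toggle data to outcomes is sketched below. It assumes two hypothetical exports: a list of (user, variation) exposure records from the toggle tool and a set of users who converted according to the analytics tool. It joins them and compares the two variations with a standard two-proportion z-test; the data shapes and names are assumptions, not an actual export format of LaunchDarkly or Google Analytics.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def conversion_by_variation(
    exposures: Iterable[Tuple[str, str]], converted_users: Set[str]
) -> Dict[str, Tuple[int, int]]:
    """Return {variation: (conversions, visitors)}; a user's first exposure wins."""
    assigned: Dict[str, str] = {}
    for user_id, variation in exposures:
        assigned.setdefault(user_id, variation)
    visitors = Counter(assigned.values())
    conversions = Counter(v for u, v in assigned.items() if u in converted_users)
    return {v: (conversions[v], visitors[v]) for v in visitors}


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-score for the hypothesis that variation B converts differently from A."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    return (conv_b / n_b - conv_a / n_a) / std_err


exposures = [("u1", "control"), ("u2", "treatment"), ("u3", "treatment"), ("u1", "treatment")]
print(conversion_by_variation(exposures, converted_users={"u2"}))
print(round(two_proportion_z(100, 1000, 130, 1000), 2))
```

An absolute z-score above roughly 1.96 corresponds to the usual 5% significance threshold for a two-sided test, which gives stakeholders a concrete answer to whether the hypothesis behind a feature held up.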

We started with and we still use New Relic for that. And we're also looking at, well, a regional company from San Francisco, SignalFx, for the APM products. Logging, well, we ran our own ELK stack until, yeah, I broke it and fixed it. But I broke it, and now we use Logz.io for that. So, that's our SaaS cloud profile. The use case is essentially the integration of the various products. If you look at the data services, the data-oriented services that Google Cloud Platform offers, when you think about exporting from GA and then having, well, all these other services integrate very nicely into it.

AWS can also offer you parts of that, not all parts, but the integration and the operation you need to take more into account when running on AWS. So we thought, let's just keep what we have. And essentially, build them towards each other rather than trying to get the capabilities of GA and bolting that onto AWS. But you don't need to do it. You can do it in a different way as well. That's correct. This one?

Audience Member: Yeah.

Arjan: Yeah. Traditionally they are, yeah. So to repeat your question, I don't know if everybody heard it. The different tools that you use for either the business stakeholders versus the more technical stakeholders. Is that a summary of your question?

Audience Member: Yeah. And the overlapping.

Arjan: And where they overlap.

Audience Member: [inaudible].

Arjan: Yeah. For instance, when we use Logz.io, which allows us to dashboard, it's, well, very challenging to get business stakeholders aligned on logging information and deriving any business value from that. But conversely, well, Google Analytics, which is, well, the standard for marketing, has very little value for the technical operation.

So what we do is: where the origin is, that's also where essentially you build the dashboard, but there isn't like a single solution for that. So essentially, it depends. If it's more business oriented, you will be able to find it either in Google Analytics or Google Data Studio. If it's more, well, logging oriented or technically oriented, you'll either find it in tools like New Relic or in Logz.io. Correct.

Essentially, the progress of development. So, we had the new site, as ELT consistently referred to it: is the new site ready? Being able to demo that on production using the feature toggles really gave ELT some peace of mind. But also kept our, well, peace of mind. Yeah. Thank you very much then. Oh, sorry.

Audience Member: [inaudible].

Arjan: Sorry, from Scrum to LeSS, you want to know ...

Audience Member: [inaudible].

Arjan: Yeah, sure. I'll be happy to talk with you after the talk also about our experience. But, yeah, there's obviously course material and some very good presentations by the people who thought of the framework, Craig Larman and Bas Vodde, if I'm not mistaken. But yeah, sure. I'll happily speak to you about that. If that's all. Thank you very much.