Recap: Your Questions for Gene Kim Answered

In our recent webinar with Gene Kim (CTO, researcher, author, and founder of IT Revolution), we covered a lot of ground.

With the conversation centered on getting into the mindset of a high-performing team, Gene and our Director of Developer Relations, Cody De Arkland, touched on topics like which metrics to measure, the role of architecture in software, the impact of company culture on performance, and ways to reduce deployment anxiety.

Things shifted a bit for the audience's Q&A session, exploring topics like the future of development, how authoring a book compares to writing software, and why it matters who you hang out with.

In this article, you'll find the full Q&A session, and you can watch the entire webinar now.

Cody:

At a high level, what is it like writing a book when you actually dive into this type of a project? How is it like writing software? I haven't written a book so I think a lot about the coding project I built. How similar is that? What's it like when you sit down and say... Do you pick out chunks of the time? At a high level, what is it like?

Gene:

On a good month, I'll spend half the time writing, half the time hanging out with the best in the game. Of course, this is certainly one of those things like DevOps Enterprise Summit. Then maybe 20% of the time coding. That's maybe one part of it. That means probably three-to-four hours in an office, actually, mostly at a Starbucks. Except for during the pandemic, most of the words I've written in the last 12 years were written in one of two Starbucks here in this area. The goal is 800 words a day. Two crappy pages per day. Jerry Seinfeld, when he was interviewed by Tim Ferriss, said, "Comedy is a game of tonnage." It is primarily an act of writing, so I thought that it was super interesting that even Jerry Seinfeld has two places of work: his desk, where he does the writing, and the club, where he does the performing. He finds out which is good and which is garbage, which is most of it. I love that.

The second thing I would note is that many people ask me, "Boy, when I look at the way you write, it's like the ultimate waterfall project." I was like, "Yeah, yeah." Not everyone works that way. The guy who wrote Outliers. Who wrote Outliers? One of my favorite books, Outliers. Malcolm Gladwell.

I listened to an interview with him, and he said he hates books because it takes three years. You never get to see the end of it until like three years later, if you're lucky. He identifies as a writer. He loves writing articles for the New Yorker because he can actually see the article weeks after he writes it.

There's been a lot of authors who release a chapter at a time and that is just not the way I've found that it works. I have to write a lot. I look at the end and what I see is a big, steamy pile of garbage that says nothing, means nothing and takes a lot of rewriting to have anything salvageable from it. Those are usually the low points that are part of the journey...

Cody:

There's a funny analogy here to software deployment in the concept of larger, slower releases versus smaller, frequent releases.

Gene:

For whatever reason, I choose a large batch release because I have found you have to write about 140,000 words, look at it with disgust, and try to find out what are the screams in the sea of whispers.

Cody:

Books are the monolith and blogs are the microservice.

Gene:

That's right.

Cody:

Richard H. from the chat asked, "Do you see inter-team dependencies and coordination as evidence of friction or are they learning opportunities for the organizations?"

Gene:

Man, that's a great question. I would say the answer is it depends. I can say what it depends upon. I'll tell you the bad mode. If you have two teams that have to work together and your average dependency per feature is 3.5 teams, every feature you have to coordinate with an average of three other teams, that's really bad. That means you have to schedule together, coordinate together, write together, deploy together. There's a whole bunch of bad things that go along with that. That's almost invariably bad.

What makes it worse is that in order to actually make contact with the team, you have to go up, let's say, three levels and then go down three levels for every interaction. That's also really bad, especially if there's a cross-dependency with databases or whatever, then you have to go up maybe six levels or eight levels and then go down. It means that there's no sanctioned interface between those two teams. No one's actually said, "This is the way you communicate with each other." Contrast that with actually having a DBA embedded inside of your team. Now you have no cross-team dependencies. I think that is what creates these great working relationships.

In fact, in the book, Team of Teams, there's a word for that. They call it the "liaison officer." Within the special forces community, U.S. Navy SEALs, they would embed their best officers into those other units so they have a clean, clear path of communication. They say, "The more important the message, at a sufficient point, you send a messenger." One last point, the need for this is so much greater now than it was, say, 50 years ago. In a hospital 70 years ago, you basically had doctors and nurses. These days, you have scores of specialties. You have radiologists, you have nocturnists, you have all these different functional specialties. It's the same in technology. How many specialties do we have now? We have container experts, logging experts, feature flagging experts. It means that we have to have a lot more cross-communicating concerns, a lot more disciplines that we ideally want to put into cross-functional teams. Does that resonate with you?

Cody:

Yeah, it really does. To me, when communication between teams is forced for the sake of communication between teams, it's not a good place for it to start. When it's created to reduce friction, when the mechanism is made... I love your approach of embedding someone on a team and having them be the communicator, the liaison officer. The best performing teams that I've seen take that approach. The person who was on the DBA team but is the more forward looking, but understands the process. To your earlier point, the APIs to communicate with their other DBAs. The person who understands Oracle RAC the best and can fix their group fast, as opposed to Cody having to navigate the ether of tickets submitted. Maybe it gets to the junior DBA, maybe it doesn't.

Gene:

My favorite part in the Unicorn Project is the ticket machine where she opens up a ticket. It gets closed. She opens up another ticket and it's a different person. All she wants to do is talk to the same person again. Exactly to your point.

Cody:

Totally, it's interesting to think through. Henry P. had a great question here: "How are the qualities of high-performing teams similar to well constructed software functions, objects, and endpoints?" I think where Henry's going with this is how much do individuals and teams function like really well-written software and API interactions? So like, low coupling, single points of responsibility, single interaction points, interface integration, conversion… what are the similarities there?

Gene:

I feel like I'm not smart enough yet to really answer that question. Here's something kind of spooky. I've been looking at the book, Design Rules, by Dr. Carliss Baldwin at the Harvard Business School. I mentioned Dr. Steven Spear. It turns out that she was a critical influence on him as he was writing that famous "Decoding the DNA of the Toyota Production System" paper. Turns out she's also a huge influence on another person I'm a huge fan of, which is Dr. Mik Kersten, who is the founder/CEO of Tasktop. He wrote the Project to Product book. It's curious how this one person has influenced two people that I admire even though they're almost two generations apart. They both have PhDs, one in software engineering, one in manufacturing.

Who is Carliss Baldwin? She wrote a famous book in 1999 called Design Rules. Essentially, it's about modularity in systems. Forgive my little tangent here. Let me just set a timer and give myself three minutes. One of the case studies in her book is the IBM System/360. It was this multi-billion dollar bet where IBM said, "Hardware's scaling because of Moore's Law. We want to build five mainframes that can target every niche from low to high, and we want to have one common operating system that's compatible across them." One of the things that they did was create hard partitions between peripherals like memory, drives, CPU, keyboards, whatever, and they tried to do the same thing with the software. For the first time, that allowed these different component teams to work independently, to evolve independently. It was so modular that you could even swap out components.

She said the option value it created, the economic value it created was… it allowed thousands of engineers to work independently along these modular boundaries. It also allowed all the engineers to leave IBM and create disk drive companies, moderating companies, memory companies. Within 20 years, the market capitalization of those companies was even larger than IBM itself. As she put it, "It created so much option value that it blew the entire industry apart."

I love this because it reminds me of the Amazon API story. How bad was it when no one could work independently? You could only do tens of releases a year back in 2004, but creating a hard interface between teams is something that allowed teams to work independently, evolve independently, experiment independently. Maybe experiment is the key word there. You can make many more bets. I don't know exactly how that relates to high performing teams, but I know it does. I don't know to what extent we can say it's isomorphic, but I think it's going to be more on the high end than the low end. Your thoughts, Cody?

Cody:

Nothing to add. I think you hit it pretty spot on. This is one of those questions that can be taken in so many different directions that it's simple or it's complicated. I think the high performing teams do feel like really well written software functions, but you still end up importing functions from different helpers and different platforms to drive communications. Look at how much the middleware space... If you want to get super deep into engineering, look at the middleware space. The amount of entangled middleware that exists in large applications. I think it depends. And to your point, there's a lot to explore on this question.

Gene:

Maybe the most important point is you hide the implementation from the interface. The more you can do that, you create option value. Yeah, exactly.
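Gene's point about hiding the implementation behind the interface can be sketched in a few lines of Python. This is only an illustration, not something shown in the webinar: the `TicketStore` interface, both store classes, and the `reopen` function are hypothetical names invented for the example. The caller is written once against the interface, so the team that owns the store can swap implementations without anyone else coordinating with them.

```python
from typing import Protocol


class TicketStore(Protocol):
    """The interface other teams depend on (hypothetical, for illustration)."""

    def save(self, ticket_id: str, body: str) -> None: ...
    def load(self, ticket_id: str) -> str: ...


class InMemoryTicketStore:
    """One implementation: keeps the current body of each ticket in a dict."""

    def __init__(self) -> None:
        self._tickets: dict[str, str] = {}

    def save(self, ticket_id: str, body: str) -> None:
        self._tickets[ticket_id] = body

    def load(self, ticket_id: str) -> str:
        return self._tickets[ticket_id]


class AppendOnlyTicketStore:
    """A different implementation: keeps every revision and returns the latest."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []

    def save(self, ticket_id: str, body: str) -> None:
        self._log.append((ticket_id, body))

    def load(self, ticket_id: str) -> str:
        # Walk the log backwards so the most recent revision wins.
        for stored_id, body in reversed(self._log):
            if stored_id == ticket_id:
                return body
        raise KeyError(ticket_id)


def reopen(store: TicketStore, ticket_id: str) -> str:
    """Caller code: depends only on the interface, never on a concrete store."""
    body = store.load(ticket_id) + " (reopened)"
    store.save(ticket_id, body)
    return body
```

Because `reopen` never sees how tickets are stored, either class can be passed in. That freedom to change one side without touching the other is the option value Gene describes.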

Cody:

Good point. Something in my head that connected well, but it brings up an interesting point, thinking about change approvals and as we build out these software platforms, these environments, I still have some strong opinions about this space. What's your take on how the change process, release management, all the stuff plays in with these high performing engineering teams?

Gene:

That is a great question. One of the zingers, I think it was 2017, it was around change approval boards. What we found was the more organizations have to rely on approvals from distant change approval boards, the worse performance was, on like every dimension. That was pretty awesome and one of the holdovers from ITIL. I spent more than a decade in the ITIL world. My first book, in 2004, was called The Visible Ops Handbook, about how to implement ITIL in four practical steps. I have a tremendous amount of empathy and understanding of what they were trying to achieve. That was in a different age. I think that comes from a place of, "What do you do if you have very strong, powerful change approval boards?"

I will point you to this amazing presentation given by Gus Paul at Morgan Stanley at the DevOps Enterprise Summit Virtual Europe. He told this amazing story how back before 2007, he worked side by side as a developer on the trading floor competing for every microsecond they could shave off to win in the marketplace on behalf of clients. After the 2007 Financial Crisis, it became a financially significant institution and then came the rules. It's like the "Lego Movie," "First came authority, then soon nothing could change at all." It was this incredible story about what he did about it. He and another group of mavericks said, "Our goal is to liberate the 8,000 developers at Morgan Stanley." They tried to automate the change approval process by creating some rules that say, "Hey, the smaller your change size, the more you're using things like feature flagging, the more we're going to allow you to take the fast path and focus manual approvals on the riskiest."

The funniest part of the story is there was a group of human change approvers who thought they could do a better job than machines. I thought I had heard every flavor of that story where it's like, "I can do schema changes better than machines. I can do deployments better than machines." I've never heard the, "I can look at a change ticket and know whether it's going to work better than a machine." They back tested the feature with like 40,000 real production changes and did a real live pilot test. They found that the rate of incidents was 0% for the change assisted review versus 0.6% for the humans. By doing that, they were able to unleash the full potential of developers. Demand of change is better.

Sorry, one quick thing. One of the few objections I heard was, "You're just gamifying the system. You're letting developers gamify the system." He's like, "That's the point. We know it drives up risk. We want to discourage that."
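The rule Gene describes (smaller changes that sit behind feature flags take the fast path, and manual approval is saved for the riskiest changes) could look roughly like the Python sketch below. It is not Morgan Stanley's actual system. The `Change` fields, the 50-line threshold, and the scoring scheme are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """A proposed production change (hypothetical fields)."""

    lines_changed: int
    behind_feature_flag: bool


# Hypothetical threshold, chosen only for illustration.
SMALL_CHANGE_MAX_LINES = 50


def risk_score(change: Change) -> int:
    """Each rule that the change violates adds one point of risk."""
    score = 0
    if change.lines_changed > SMALL_CHANGE_MAX_LINES:
        score += 1  # large batches are riskier
    if not change.behind_feature_flag:
        score += 1  # cannot be switched off at run time
    return score


def route(change: Change) -> str:
    """Send zero-risk-point changes down the fast path, the rest to humans."""
    return "fast-path" if risk_score(change) == 0 else "manual-review"


if __name__ == "__main__":
    print(route(Change(lines_changed=12, behind_feature_flag=True)))
    print(route(Change(lines_changed=900, behind_feature_flag=True)))
```

Publishing the rules is what makes the "gamifying" objection work in the system's favor: a developer who wants the fast path has to ship smaller, flag-guarded changes, which is the behavior the rules are meant to encourage.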

Cody:

Totally, I'm going to hit you with a fun question this time. We have a ton of questions in here, so we're not going to be able to get through all of them. I'm going to give you a fun one. One of the questions that came through was, "On your bookshelf behind you, what is your favorite of those books and why?"

Gene:

Certainly, one of them is The High-Velocity Edge by Dr. Steven Spear who has inspired me so much over the years. The second one is Transforming Nokia by Risto Siilasmaa. He was the founder of F-Secure. You might scoff at him saying, "Here's the guy who was partially responsible for the decimation of 93% of the value of Nokia," but oh my gosh. It's one of the best books I've read in 10 years. I had shared, in my presentation earlier this year, that my favorite line is when he said when he learned in 2010 that the build times for the Nokia mobile phones was two days. It felt like being hit in the head with a sledgehammer because he knew that if any engineer required two days to know whether a change worked or would have to be redone, then all their hopes, dreams, and aspirations that resided was an illusion.

A quick little trivia point, I talked with him a month ago and got to ask him about that. He laughed and said, "No, it wasn't the build times that took two days. It was the compile times that took two days." The build times, with the drivers and so forth, was two weeks because he had to get all the modules from the different R&D centers around the globe.

Cody:

Wow.

Gene:

Yeah, pretty wild.

Cody:

I'm going to intentionally stay quiet on this next question until you finish answering. One of the questions that came through from Les Carter, "What's your advice from things you've seen in organizations around how to decouple deployments from releases when a client's not doing CI/CD? If they're out of the automated CI/CD space and they're still building manually, so to speak. They're not running traditional automated CI/CD and they can only release at specific times, how would you see people breaking apart deploy from release in that case?"

Gene:

Holy cow. That's interesting. I'd love to hear what you have to say about this, but my first instinct is I'm not sure the first two are necessarily related. I think if you're not in a place where you can do deployments frequently, it becomes even more important that you can actually turn off functionality at run time. I would say as your inability to deploy frequently increases, the more you need something like a run time change where you're just going to have to change how they behave and appear to the customer at run time. Cody, how's that intuition?

Cody:

Of course, you want this automated world where you're automating the way your code's built and shipped, and you control the release side of it. The more I experience the way people use the tool and the space, not from a LaunchDarkly perspective but the space in general, I truly believe that release is one, it's not coupled to the CI/CD process. There are certainly traits and behaviors.

If you're a really advanced CI/CD organization, you're going to get more mileage out of something like a release management/feature management platform, but they are separate. If you're not doing CI/CD, it becomes incredibly important because you're losing such a huge edge on the automation of your code base and your control of the deployment. Having something that lets you operate at scale... because remember, CI/CD gives you scale. It gives you the ability to not have people sitting at a keyboard manually building and manually shipping. If you're doing those now manually, you've got to have something to build your scale around releasing. To me, it's more important when you don't have a CI/CD space. I totally align with what you said.

Gene:

Good, so more ammo for you.
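The idea both speakers land on, changing behavior at run time when you cannot deploy often, can be shown with a deliberately tiny Python sketch. It is not LaunchDarkly's SDK or API. The `RuntimeFlags` class, the JSON flag file, and the `new-checkout` flag name are hypothetical stand-ins for whatever flag service a team actually uses.

```python
import json
from pathlib import Path


class RuntimeFlags:
    """Re-reads a JSON flag file on every check, so edits apply without a redeploy."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def is_enabled(self, name: str, default: bool = False) -> bool:
        try:
            flags = json.loads(self._path.read_text())
        except (OSError, json.JSONDecodeError):
            return default  # fail safe if the file is missing or malformed
        if not isinstance(flags, dict):
            return default  # unexpected file shape: fall back to the default
        return bool(flags.get(name, default))


def checkout_page(flags: RuntimeFlags) -> str:
    # Both code paths shipped in the same (possibly manual) deployment;
    # the flag alone decides which one customers see.
    if flags.is_enabled("new-checkout"):
        return "new checkout"
    return "old checkout"
```

With this shape, a team that can only deploy in a fixed window still ships the new code dark, then turns it on or off later by editing the flag value rather than by building and deploying again.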

Cody:

Right, sounds good. Another question from Quinn, "What weak signals do you have about possible trends, tooling, methodology, anything that seems obscure now but could be where we're going in the next five to ten years?" I think this is a forward-looking question, looking at the trends that we're seeing today around things like tooling methodology are kind of distant now, but in five to ten years are going to be a really big deal for us?

Gene:

One of the things that shows up in The DevOps Handbook, obviously in a slightly different way, was the notion of postmortems. There was an incredible story told by [inaudible 00:50:52] who spent many years at Google and is now at Stack Overflow. He said one of his favorite things...

Gene:

Oh no, it's Randy Shoup, who ran the Google App Engine team, and he was chief architect at eBay. He said everyone loves stories, so when something goes wrong, when there's a customer-impacting outage at Google, everybody was waiting for the [explanation.] Write down, "Here's what's happened." Everyone's got to pounce on it because they want to know what happened, and we're just wired to hear stories. He said something really interesting. I heard this about ten years ago that just had a huge impression on me. As you do good postmortems, PIRs, and you start making a safer system, you eventually run out of customer-impacting incidents to do PIRs on. What do you do then? He said you just increase the tolerance or decrease the tolerance. Instead of doing PIRs just for customer-impacting incidents, he called it, "Let's do them on team-impacting incidents."

Any place where you have a near miss, maybe the point where it took longer than expected, it might have been scary, it might not have been, but let's do a PIR on that. Of the seven safeguards that were there to prevent a customer-impacting incident, six of them failed. Well, how do we make sure that we never get that close to the edge? When I hear weak signals of failure, I love those because when something bad happens, there's a school of thought called the Swiss cheese theory. You have a whole bunch of things that went wrong and any one of them is not single-handedly responsible for the incident. It takes a confluence of many things that went wrong, like seven things going wrong. Anytime you can dissect this weak signal failure, that becomes very worthwhile and is how we create safer conditions. To me, that's the most exciting aspect of weak failure signals. I hope that answered the question.

Cody:

Something tangential to that question, I think that was half the question and the other half was around, "Is there anything that we're seeing now as early indicators of things that are going to be really relevant for us in the next five to ten years as we look at software?" I think five years is such a long ways away, so that makes the question a little bit more challenging. As an example, I think in the microservices conversation we had, we've generally accepted that people have made microservices as a destination. They don't always look at the impact of microservices, right? Three years ago, it would have been signaling. Everybody's moving to this architecture and now they're finding they've created these complex beasts three years later. Do you see anything like that in your research now? It might be early in the books...

Gene:

Oh no, that's...

Cody:

Indicators like that... that are like, "Wow, I want to watch that in the next three to four years and see how it plays out in the enterprise."

Gene:

That's definitely one. I was talking to Mike Nygard, who wrote the Release It! book. He's one of the smartest people I know. We were talking about the exact same topic. Man, most people created monsters, but congratulations on creating domain-driven designs that have separated. Now, you've created this complexity coordination challenge. There's a coordination cost every time that you do this, so does it really make sense to have 3,000 different microservices. The intuition is you can get a lot of the same benefits just by having hard partitions and module boundaries. Deploying together is just fine, so I think that's one.

The other one I would mention is, holy cow, is it sure hard to write an app with a UI. How many apps have I wanted to write in the last two years where I just thought about it and I'm like, "I just give up. It's not worth it because you have to write two apps." You have to write a front end app. You have to write a back end. And that doesn't even get the JavaScript to the browser. It's so absurd. When I think about writing programs with UIs, I think about the 1990s where you could write a simple Win32 application with a menu bar, a dialog box, buttons. That was super easy and these days, it's so freaking difficult. I think there are a lot of people looking at this problem. The answer's not another JavaScript framework.

Cody:

You just said it. The framework issue. The solution today is to create a new framework. "UI's hard, I'll just create another JavaScript framework."

Gene:

I think it'll sound more like, "I want to write one program, not two." For me, there's a lot of interesting projects that try to make it easier to write an app for simple UIs, and I'm so looking forward to seeing what they come up with because it's freaking exhausting.

Cody:

Yeah. We are cruising up to the end. I wanted to pay you a compliment. I love culturally how much you refer to people as mentors, even someone who's done so much as you have in the community. You still have people who I can tell you look up to and you respect. There are so many of us that look up to the things you've taught us, and I think it's such a good signal you're so humble and you still reference these people who taught you a ton and are still teaching you a ton. I just wanted to pay that compliment to you and say I think that culturally, that is such a cool example to set for those of us who are still younger in our careers in the science.

Gene:

Can I just maybe add one more thing to that? I love that saying that says, "You're only as good as the top five people you hang out with." I think it's so important to find people better than you because if you're hanging out with people who are worse than you, then the average doesn't work in your favor. Find the people that you would like to learn something from. The trick is to ask, "What can I do to make it worth their time? How can I be helpful to them?" Suddenly you'll find relationships that are productive, fulfilling, and often last a lifetime.

If you haven't yet, check out the full conversation with Gene here.