Pain-Driven Development: Why Greedy Algorithms Are Bad for Engineering Orgs

I recently wrote about the importance of understanding decision impact and why it’s important for building an empathetic engineering culture. I presented the distinction between pain displacement and pain deferral, and this was something I wanted to expand on a bit.

When you distill it down, I think what’s at the heart of a lot of engineering orgs is this idea of “pain-driven development.” When a company grows to a certain size, it develops limbs, and each of these limbs has its own pain receptors. This is when empathy becomes important because it becomes harder and less natural. These limbs of course are teams or, more generally speaking, silos. Teams have a natural tendency to operate in a way that minimizes the amount of pain they feel.

It’s time for some game theory: pain is a zero-sum game. By always following the path of least resistance, we end up displacing pain instead of feeling it. This is literally just instinct. In other words, by making locally optimal choices, we run the risk of losing out on a globally optimal solution. Sometimes this is an explicit business decision, but many times it’s not.

Tech debt is a common example of when pain displacement is a deliberate business decision. It’s pain deferral—there’s pain we need to feel, but we can choose to feel it later and in the meantime provide incremental value to the business. This is usually a team choosing to apply a bandaid and coming back to fix it later. “We have this large batch job that has a five-minute timeout, and we’re sporadically seeing this timeout getting hit. Why don’t we just bump up the timeout to 10 minutes?” This is a bandaid, and a particularly poor one at that because, by Parkinson’s law, as soon as you bump up the timeout to 10 minutes, you’ll start seeing 11-minute jobs, and we’ll be having the same discussion over again. I see the exact same types of discussions happening with resource provisioning: “we’re hitting memory limits—can we just provision our instances with more RAM?” “We’re pegging CPU. Obviously we just need more cores.” Throwing hardware at the problem is the path of least resistance for the developers. They have a deliverable in front of them, they have a lot of pressure to ship, this is how they do it. It’s a greedy algorithm. It minimizes pain.

Where things become really problematic is when the pain displacement involves multiple teams. This is why understanding decision impact is so key. Pain displacement doesn’t just involve engineering teams, it also involves customers and other stakeholders in the organization. This is something I see quite a bit: displacing pain away from customers onto various teams within the org by setting unrealistic expectations up front.

For example, we build a product MVP and run it on a single, high-memory instance, and we don’t actually write data out to disk to keep it fast. We then put this product in front of sales folks, marketing, or even customers and say “hey, look at this cool thing we built.” Then the customers say “wow, this is great! I don’t feel any pain at all using this!” That’s because the pain has been moved elsewhere.

This MVP isn’t fault tolerant because it’s running on a single machine. This MVP isn’t horizontally scalable because we keep all the state in memory on one instance. This MVP isn’t safe because the data isn’t durably stored to disk. The problem is we weren’t testing at scale, so we never felt any pain until it was too late. So we start working backward to address these issues after the fact. We need to run multiple instances so we can have failover. But wait, now we need stateful request routing to maintain our performance expectations. Does our infrastructure support that? We need a mechanism to split and merge units of work that plays nicely with our autoscaling system to give us a better scale story, avoid hot instances, and reduce excess capacity. But wait, how long will that take to build? We need to attach persistent disks so we can durably store data and keep things fast. But wait, does our cluster provisioning allow for that? Does that even meet our compliance requirements?

The only way you reach this point is by making local decisions without thinking about the trade-offs involved or the fact that what you’ve actually done is simply displaced the pain.

If someone doesn’t feel pain, they have a harder time developing a sense of empathy. For instance, the goal of any good operations team is to effectively put itself out of a job by empowering developers to self-service through tooling and automation. One example of this is infrastructure as code, so an ops team adds a process requiring developers to provision their own infrastructure using CloudFormation scripts. For the ops folks, this is a boon—now they no longer have to labor through countless UIs and AWS consoles to provision databases, queues, and the like for each environment. Developers, on the other hand, were never exposed to that pain, so to them, writing CloudFormation scripts is a new hoop to jump through—setting up infrastructure is ops’ job! They might feel pain now, but they don’t necessarily see the immediate payoff.

A coworker of mine recently posed an interesting question: why do product teams often overlook the need for tools required to support their product in production until after they’ve deployed to production? And while the answer he posits is good, and one I very much agree with—solving a problem and solving the problem of solving problems are two very different problems—my answer is this: pain-driven development. In this case, you’re deferring the pain by hooking up debuggers or SSHing into the box and poking about instead of relying on instrumentation which is what we’re limited to in the field. As long as you’re cognizant of this and know that at some point you will have to feel some pain, it can be okay. But if you’re just displacing pain thinking it’s actually disappearing, you’ll be in for a rude awakening. Remember, it’s a zero-sum game.

I’m looking at this through an infrastructure or operations lens, but this applies everywhere and it cuts both ways. Understanding the why behind something rather than just the how is critical to building empathy. It’s being able to look at a problem through someone else’s perspective and applying that to your own work. Changing your perspective is a powerful way to deepen your relationships. Pain-driven development is intoxicating because it allows us to move fast. It’s a greedy algorithm, but it provides a poor global approximation for large engineering organizations. Thinking holistically is important.

Decision Impact

I think a critical part of building an empathetic engineering culture is understanding decision impact. This is a blindspot that I see happening a lot: a deliberate effort to understand the effects caused by a decision. How does adopting X affect operations? Does our dev tooling support this? Is this architecture supported by our current infrastructure? What are the compliance or security implications of this? Will this scale in production? A particular decision might save you time, but does it create work or slow others down? Are we just displacing pain somewhere else?

What’s needed is a broad understanding of the net effects. Pain displacement is an indication that we’re not thinking beyond the path of least resistance. The problem is if we lack a certain empathy, we aren’t aware the pain displacement is occurring in the first place. It’s important we widen our vision beyond the deliverable in front of us. We have to think holistically—like a systems person—and think deeply about the interactions between decisions. Part of this is having an organizational awareness.

Tech debt is the one exception to this because it’s pain displacement we feel ourselves—it’s pain deferral. This is usually a decision we can make ourselves, but when we’re dealing with pain displacement involving multiple teams, that’s when problems start happening. And that’s where empathy becomes critical because software engineering is more about collaboration than code and shit has this natural tendency to roll downhill.

The first sentence of The Five Dysfunctions of a Team captures this idea really well: “Not finance. Not strategy. Not technology. It is teamwork that remains the ultimate competitive advantage, both because it is so powerful and so rare.” The powerful part is obvious, but the bit about rarity is interesting when we think about teams holistically. The cause, I think, is deeply rooted in the silos or fiefdoms that naturally form around teams. The difficulty comes as an organization scales. What I see happening frequently are goals that diverge or conflict. The fix is rallying teams around a shared cause—a single, compelling vision. Likewise, it’s thinking holistically and having empathy. Understanding decision impact and pain displacement is one step to developing that empathy. This is what unlocks the rarity part of teamwork.

Shit Rolls Downhill

Building software of significant complexity is tough because a lot of pieces have to come together and a lot of teams have to work in concert to be successful. It can be extraordinarily difficult to get everyone on the same page and moving in tandem toward a common goal. Product development is largely an exercise in trust (or perhaps more accurately, hiring), but even if you have the “right” people—people you can trust and depend on to get things done—you’re only halfway there.

Trust is an important quality to screen for, difficult though it may be. However, a person’s trustworthiness or dependability doesn’t really tell you much about that person as an engineer. The engineering culture is something that must be cultivated. Etsy’s CTO, John Allspaw, said it best in a recent interview:

Post-mortem debriefings every day are littered with the artifacts of people insisting, the second before an outage, that “I don’t have to care about that.”

If “abstracting away” is nothing for you but a euphemism for “Not my job,” “I don’t care about that,” or “I’m not interested in that,” I think Etsy might not be the place for you. Because when things break, when things don’t behave the way they’re expected to, you can’t hold up your arms and say “Not my problem.” That’s what I could call “covering your ass” engineering, and it may work at other companies, but it doesn’t work here.

Allspaw calls this the distinction between hiring software developers and software engineers. This perception often results in heated debate, but I couldn’t agree with it more. There is a very real distinction to be made. Abstraction is not about boundaries of concern, it’s about boundaries of focus. Engineers need to have an intimate understanding of this.

Engineering, as a discipline and as an activity, is multi-disciplinary. It’s just messy. And that’s actually the best part of engineering. It’s not about everyone knowing everything. It’s about paying attention to the shared, mutual understanding.

But engineering is more than just technical aptitude and a willingness to “dig in” to the guts of something. It’s about having an acute awareness of the delicate structure upon which software is built. More succinctly, it’s about having empathy. It’s recognizing the fact that shit rolls downhill.

Shit Rolls Downhill

For things to work, the entire structure has to hold, and no one point is any more or less important than the others. It almost always starts off with good intentions at the top, but the shit starts to compound and accelerate as it rolls effortlessly and with abandon toward the bottom. There are a few aspects to this I want to explore.

Understand the Relationships

This isn’t to say that folks near the top are less susceptible to shit. Everyone has to shovel it, but the way it manifests is different depending on where you find yourself on the hill. The key point is that the people above you are effectively your customers, either directly or indirectly, and if you’re toward the top, maybe literally.

And, as all customers do, they make demands. This is a very normal thing and is to be expected. Some of these demands are reasonable, others not so much. Again, this is normal, but what do we make of these demands?

There are some interesting insights we can take from The Innovator’s Dilemma (which, by the way, is an essential read for anyone looking to build, run, or otherwise contribute to a successful business), which are especially relevant toward the top of the hill. Mainly, we should not merely take the customer’s word as gospel. When it comes to products, feature requests, and “the way things should be done,” the customer tends to have a very narrow and predisposed view. I find the following passage to be particularly poignant:

Indeed, the power and influence of leading customers is a major reason why companies’ product development trajectories overshoot demands of mainstream markets.

Essentially, too much emphasis can be placed on the current or perceived needs of the customer, resulting in a failure to meet their unstated or future needs (or if we’re talking about internal customers, the current or future needs of the business). Furthermore, we can spend too much time focusing on the customer’s needs—often perceived needs—culminating in a paralysis to ship. This is very anti-continuous-delivery. Get things out fast, see where they land, and make appropriate adjustments on the fly.

Giving in to customer demands is a judgement game, but depending on the demand, it can have profound impact on the people further down the hill. Thus, these decisions should be made accordingly and in a way that involves a cross section of the hill. If someone near the top is calling all the shots, things are not going to work out, and in all likelihood, someone else is going to end up getting covered in shit.

An interesting corollary is the relationship between leadership and engineers. Even a single, seemingly innocuous question asked in passing by a senior manager can change the entire course of a development team. In fact, the manager was just trying to gain information, but the team interpreted the question as a statement suggesting “this thing needs to be done.” It’s important to recognize this interaction for what it is.

Set Appropriate Expectations

In truth, the relationship between teams is not equivalent to the relationship between actual customers and the business. You may depend on another team in order to provide a certain feature or to build a certain product. If the business is lagging, the customer might take their money elsewhere. If the team you depend on is lagging, you might not have the same liberty. This leads to the dangerous “us versus them” trap teams fall in as an organization grows. The larger a company gets, the more fingers get pointed because “they’re no longer us, they’re them.” There are more teams, they are more isolated, and there are more dependencies. It doesn’t matter how great your culture is, changing human nature is hard. And when pressure builds from above, the finger-pointing only intensifies.

Therefore, it’s critical to align yourself with the teams you depend on. Likewise, align yourself with the teams that depend on you, don’t alienate them. In part, this means have a realistic sense of urgency, have realistic expectations, and plan accordingly. It’s not reasonable to submit a work item to another team and turn around and call it a blocker. Doing so means you failed to plan, but now to outside observers, it’s the other team which is the problem. As we prioritize the work precipitated by our customers, so do the rest of our teams. With few exceptions, you cannot expect a team to drop everything they’re doing to focus on your needs. This is the aforementioned “us versus them” mentality. Instead, align. Speak with the team you depend on, understand where your needs fit within their current priorities, and if it’s a risk, be willing to roll up your sleeves and help out. This is exactly what Allspaw was getting at when he described what a “software engineer” is.

Setting realistic expectations is vital. Just as products ship with bugs, so does everything else in the stack. Granted, some bugs are worse than others, but no amount of QA will fully prevent them from going to production. Bugs will only get worked out if the code actually gets used. You cannot wait until something is perfect before adopting it. You will wait forever. Remember that Agile is micro failure on a macro level. Adopt quickly, deploy quickly, fail quickly, adjust quickly. As Jay Kreps once said, “The only way to really know if a system design works in the real world is to build it, deploy it for real applications, and see where it falls short.”

While it’s important to set appropriate expectations downward, it’s also important to communicate upward. Ensure that the teams relying on you have the correct expectations. Establish what the team’s short-term and long-term goals are and make them publicly available. Enable those teams to plan accordingly, and empower them so that they can help out when needed. Provide adequate documentation such that another engineer can jump in at any time with minimal handoff.

Be Curious

This largely gets back to the quote by John Allspaw. The point is that we want to hire and develop software engineers, not programmers. Being an engineer should mean having an innate curiosity. Figure out what you don’t know and push beyond it.

Understand, at least on some level, the things that you depend on. Own everything. Similarly, if you built it and it’s running in production, it’s on you to support it. Throwing code over the wall is no longer acceptable. When there’s a problem with something you depend on, don’t just throw up your hands and say “not my problem.” Investigate it. If you’re certain it’s a problem in someone else’s system, bring it to them and help root cause it. Provide context. When did it start happening? What were the related events? What were the effects? Don’t just send an error message from the logs.

This is the engineering culture that gets you the rest of the way there. The people are important, especially early on, but it’s the core values and practices that will carry you. The Innovator’s Dilemma again provides further intuition:

In the start-up stages of an organization, much of what gets done is attributable to resources—people, in particular. The addition or departure of a few key people can profoundly influence its success. Over time, however, the locus of the organization’s capabilities shifts toward its processes and values. As people address recurrent tasks, processes become defined. And as the business model takes shape and it becomes clear which types of business need to be accorded highest priority, values coalesce. In fact, one reason that many soaring young companies flame out after an IPO based on a single hot product is that their initial success is grounded in resources—often the founding engineers—and they fail to develop processes that can create a sequence of hot products.

Summary

There will always be gravity. As such, shit will always roll downhill. It’s important to embrace this structure, to understand the relationships, and to set appropriate expectations. Equally important is fostering an engineering culture—a culture of curiosity, ownership, and mutual understanding. Having the right people is essential, but it’s only half the problem. The other half is instilling the right values and practices. Shit rolls downhill, but if you have the right people, values, and practices in place, that manure might just grow something amazing.

The Sharing Economy: A Race to the Bottom

Last year, Airbnb hosted more than four million guests around the world.1 A million rides were shared on Lyft just over a year after it launched in 20122. These data points alone seem impressive, but the growth of this phenomenon is staggering. The “sharing economy”—as it’s being called—enables just about anyone to become their own micro-entrepreneur. New companies like Uber, TaskRabbit, and Airbnb are popping up at a remarkable rate, and they’re disrupting traditional businesses in astonishing fashion. An entire conference dedicated to this new socio-economic system occurred just a few months ago, but the truth is the sharing economy is little more than marketing sleight of hand.

What Rhymes with Sharing?

A significant driving force purportedly behind the sharing economy is a social one—a notion of friendship, community, and trust. The rideshare service Lyft uses a tagline “your friend with a car.” Venture capitalist Scott Weiss of Andreessen Horowitz calls it “a real community—with both the drivers and riders being inherently social—making real friendships and saving money.” The two-day Share conference took place in May, organized by Natalie Foster, former New Media director to the Obama campaign. Foster claims that “we’re building a movement” with a guiding principle that access trumps ownership.

One of Airbnb’s founders, Nate Blecharczyk, suggests, “We couldn’t have existed ten years ago, before Facebook, because people weren’t really into sharing.” The paradoxical irony is that people have never been more disconnected and seemingly connected at the same time than any point in history. A Trulia survey3 last year indicated that almost half of all Americans don’t know their neighbors’ names. An Australian sociologist found relations in “a precarious balance” after investigating community responses to the 2011 flooding in Queensland, concluding that “we are less likely than ever to know” our neighbors.4 Yet, Foster and others assert the sharing movement is “recreating the virtues of small-town America [by] rejecting the idea that stuff makes us happier, that ownership is better than access, that we should all live in isolation.”

It’s Not Voodoo (Economics)

Shockingly, the reality of the sharing economy isn’t a sociological one, it’s an economic one. The selling point of Lyft isn’t the fist bump passengers are greeted with. Technology has reduced the barrier for the peer-to-peer exchange of goods and services, but such an exchange is hardly new. The sharing economy is simply a euphemism for micro-subletting developed by marketers to allow companies to insert themselves as transaction brokers. That’s not to conflate the ideas of “sharing” and “free,” but to perceive these companies as community-first, business-second would be disingenuous. Union Square Ventures partner Brad Burnham made this clear at Share, diverging from some of the self-congratulatory talk. “What we’re talking about is the natural tendency of capitalism to consistently find a more efficient way of delivering something,” he says. “It’s information technology lowering transaction costs and revealing assets that can be utilized.”

The concept of for-profit sharing, specifically as a business model, isn’t alarming. In fact, it’s the nature of capitalism. However, the sharing economy isn’t what it is because people want or need a lifestyle of access-over-ownership, it’s because, for some, it’s all there is. On one hand, it’s a supplementary source of income for people rich in assets. On the other hand, it’s a livelihood for those who aren’t.

The Bottom is a Long Way Down, Let’s Split a Cab

Uber and Lyft have been engaged in a savage ground war, both in pricing5 and business tactics. The same can be said of other such companies offering services for less under the guise of “community” and “sharing.” It’s a troubling race downward, but what’s more troubling is the reason many of these companies are able to disrupt incumbents so pervasively. Airbnb et al. bypass industry-specific taxation, insurance, and further regulations. They’ve felt it in fines and other legal difficulties. In some sense, “micro-entrepreneurs” are really just employees less a salary or wage, health insurance, paid-time off, and employer protection.

“We are enabling micro-entrepreneurs to build their own business, to set their own schedules, specify how much they want to get paid, say what they are good at, and then incorporate the work into their lifestyle,” says TaskRabbit founder Leah Busque.6 Doing laundry covered in cat diarrhea or breaking down boxes probably isn’t the American Dream for most, but it’s becoming increasingly indicative that income inequality is a driving undercurrent of the sharing economy.

A Bloomberg national poll7 conducted late last year revealed that nearly two-to-one Americans believe the U.S. no longer offers everyone an equal chance to get ahead. This is felt by many participating in the sharing economy. Burnham raises doubts about the long-term viability of companies like Airbnb, Uber, and Lyft, all of which have raised hundreds of millions of dollars in venture capital. His concern is that every dollar returned to investors is a dollar the users of the service don’t see, yet they created the value in the first place. “Those companies won’t be able to get out from under that structure,” Burnham says, suggesting that a new generation of “thin” share-economy companies will take their place. The tendency, he proposes, will be for competition to become “thinner and thinner to the point where you end up at decentralized autonomous corporation” along the lines of Bitcoin.8

The Future of Sharing

While “sharing economy” is a misnomer, the businesses that participate in it are disrupting markets. What’s unclear is how this will shakeout in the long term. The likely outcome is that this new model will become assimilated into existing models and embraced by incumbents. Airbnb, Uber, and company may continue to exist in some capacity, but they face challenge from leaner “skinny platforms” using more innovative funding strategies. It’s improbable these disruptive newcomers will remain unfettered from regulation. In a sharing economy with no floor, a race to the bottom is without end.

  1. https://www.airbnb.com/annual []
  2. http://techcrunch.com/2013/08/08/lyft-1m-dc []
  3. http://info.trulia.com/neighbor-survey-2013 []
  4. http://www.macleans.ca/society/the-end-of-neighbours []
  5. http://fortune.com/2014/05/28/in-price-wars-some-uber-and-lyft-drivers-feel-the-crunch []
  6. http://www.businessweek.com/articles/2012-09-13/my-life-as-a-taskrabbit []
  7. http://www.bloomberg.com/news/2013-12-11/americans-say-dream-fading-as-income-gap-hurts-chances.html []
  8. http://www.forbes.com/sites/jeffbercovici/2014/05/13/why-uber-and-airbnb-might-be-in-big-trouble []