An Elegant Puzzle: Systems of Engineering Management
2.6.1 What do you do?
The first step in succession planning is to figure out what you do. This seems like it should be easy, but I’ve found it surprisingly hard! There are the obvious things you do—one-on-ones, meetings, head count planning—but you’re probably filling in a hundred little holes that you don’t even think about.
The approach I’ve taken is to consider your work from several different angles:
Figure 2.10
Succession planning.
Take a look at your calendar and write down your role in meetings. This goes for explicit roles, like owning a meeting’s agenda, and also for more nuanced roles, like being the first person to champion others’ ideas, or the person who is diplomatic enough to raise difficult concerns.
Take a second pass on your calendar for non-meeting stuff, like interviewing and closing candidates.
Look back over the past six months for recurring processes, like roadmap planning, performance calibrations, or head count decisions, and document your role17 in each of those processes.
For each of the individuals you support, in which areas are your skills and actions most complementary to theirs? How do you help them? What do they rely on you for? Maybe it’s authorization, advice navigating the organization, or experience in the technical domain.
Audit inbound chats and emails for requests and questions coming your way.
If you keep a to-do list, look at the categories of the work you’ve completed over the past six months, as well as the stuff you’ve been wanting to do but keep putting off.
Think through the external relationships that have been important for you in your current role. What kinds of folks have been important, and who are the strategic partners that someone needs to know?
After exploring each of these avenues, you’ll have quite a long list of things. Test the list on a few folks whom you work closely with and see if you’ve missed anything. Congratulations, now you know what your job is!
2.6.2 Close the gaps
Take your list, and for each item try to identify the individuals who could readily take on that work. Good job, cross those out.
For items without someone who is ready today, identify a handful of individuals who could potentially take it over. (Depending on the size of your list, it may be helpful to cluster similar items into groups to reduce the toil of running this exercise.)
If you’re working at a well-established company, you may find that there aren’t too many gaps that couldn’t be readily filled by someone else. However, if you’re at a company going through hypergrowth,18 it’s common to find that everyone is already working in the most complex role of their career, and you’ll uncover gaps, gaping and cavernous.
Filter the gaps down to two lists:
The first should cover the easiest gaps to close. Maybe it’ll require a written document or a quick introduction. You should be able to close one of these in less than four hours.
The latter will be the riskiest gaps. These are the areas where you’re uniquely valuable to the company, where other folks are missing skills, and where getting the tasks done is truly important. You’d expect closing one of these gaps to require ongoing effort over several months.
Write up a plan to close all of the easy gaps and one or two of the riskiest gaps. Add it to your personal goals, and then, congrats, you’ve completed a round of succession planning!
This isn’t a one-time tool, but rather a great exercise to run once a year to identify things you could be delegating. This helps nurture an enduring organization, and also frees up time for you to continue growing into a larger role as well. You can even get a sense of how well you’re doing by taking a two- or three-week vacation and seeing what slips through the cracks.
Those items can be the start of next year’s list!
Chapter 3
Tools
Figure 3.1
System diagram for hiring and training new managers.
If you ask a manager about their proudest moments, they will probably tell you a story about helping someone grow. If you ask that same manager about their most challenging experience, they will probably talk about a layoff, a reorganization, a shift in company direction, or the time they weathered an economic downturn. In management, change is the catalyst of complexity.
The best changes often go unnoticed, moving from one moment of stability to another, with teams and organizations feeling stable at every step. The key tools for leading efficient change are systems thinking, metrics, and vision. When the steps of change are too wide, teams get destabilized, and gaps open within them. In those moments, managers create stability by becoming glue. We step in as product managers, program managers, recruiters, or salespeople to hold the bits together until an expert relieves us.
This chapter provides a box of tools for managing change, both from the abstract chair of guiding change and from the more visceral role of serving as glue during periods of transition.
3.1 Introduction to systems thinking
Many effective leaders I’ve worked with have the uncanny knack for working on leveraged1 problems. In some problem domains, the product management skill set2 is extraordinarily effective for identifying useful problems, but systems thinking is the most universally useful tool kit I’ve found.
If you really want a solid grasp on systems thinking fundamentals, you should read Thinking in Systems: A Primer3 by Donella H. Meadows, but I’ll do my best to describe some of the basics and to work through a recent scenario in which I found the systems thinking approach to be exceptionally useful.
3.1.1 Stocks and flows
The fundamental observation of systems thinking is that the links between events are often more subtle than they appear. We want to describe events causally—our managers are too busy because we’re trying to ship our current project—but few events occur in a vacuum.
Figure 3.2
System diagram for developer productivity.
Big changes appear to happen in a moment, but if you look closely underneath the big change, there is usually a slow accumulation of small changes. In this example, perhaps the managers are busy because no one hired and trained the managers required to support this year’s project deadlines. These accumulations are called stocks, and are the memory of changes over time. A stock might be the number of trained managers at your company.
Changes to stocks are called flows. These can be either inflows or outflows. Training a new manager is an inflow, and a trained manager who departs the company is an outflow. Diagrams in this chapter represent flows with solid dark lines.
The other relationship, represented in figure 3.1 by a dashed line, is an information link. This indicates that the value of a stock is a factor in the size of a flow. The link here shows that the time available for developing features depends on the number of trained managers.
Often, a stock outside of a diagram’s scope will be represented as a cloud, indicating that something complex happened there that we’re not currently exploring. It’s best practice to label every flow, and to keep in mind that every flow is a rate, whereas every stock is a quantity.
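As a rough illustration of the vocabulary above, here is a minimal Python sketch of one stock (trained managers) with one inflow and one outflow, plus an information link from that stock to the time available for feature work. The function names, the constant rates, and all of the numbers are assumptions made up for this example; they are not taken from the figure.

```python
def simulate_stock(initial, inflow_per_month, outflow_per_month, months):
    """Return the stock's value at the start and after each month."""
    stock = initial
    history = [stock]
    for _ in range(months):
        # Flows are rates (managers per month); the stock is a quantity (managers).
        stock = max(0, stock + inflow_per_month - outflow_per_month)
        history.append(stock)
    return history


def feature_time_fraction(trained_managers, managers_needed):
    """Information link: the value of a stock is a factor in the size of a flow."""
    return min(1.0, trained_managers / managers_needed)


# Hypothetical numbers: 10 trained managers, 2 trained and 1 departing per month.
trained = simulate_stock(initial=10, inflow_per_month=2, outflow_per_month=1, months=6)
print(trained)  # [10, 11, 12, 13, 14, 15, 16]
print(feature_time_fraction(trained[-1], managers_needed=20))  # 0.8
```

Real flows are rarely constant, so treat this as a way to practice the terms rather than as a forecast.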
3.1.2 Developer velocity
When I started thinking of an example of the usefulness of systems thinking, one came to mind immediately. Since reading Accelerate: The Science of Lean Software and DevOps, by Nicole Forsgren, Gene Kim, and Jez Humble,4 I’ve spent a lot of time pondering the authors’ definition of velocity.
They focus on four measures of developer velocity:
Delivery lead time is the time from the creation of code to its use in production.
Deployment frequency is how often you deploy code.
Change fail rate is how frequently changes fail.
Time to restore service is the time spent recovering from defects.
The book uses surveys from tens of thousands of organizations to assess each one’s overall productivity and show how that correlates to the organization’s performance on those four dimensions.
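To make the four measures concrete, here is a minimal Python sketch that computes them from a list of deploy records. The `Deploy` record, its field names, and the sample data are hypothetical, and the formulas are one plausible reading of the definitions above rather than the book's own methodology, which is based on surveys.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deploy:
    committed_at: datetime                  # when the code was created
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did this change fail in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def velocity_measures(deploys, period_days):
    """Compute the four measures over a non-empty list of deploys."""
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    restores = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "delivery_lead_time": sum(lead_times, timedelta()) / len(deploys),
        "deploys_per_day": len(deploys) / period_days,
        "change_fail_rate": len(failures) / len(deploys),
        "time_to_restore": sum(restores, timedelta()) / len(restores) if restores else None,
    }


# Hypothetical data: two deploys in one week, one of which failed.
day = datetime(2024, 1, 1)
deploys = [
    Deploy(committed_at=day, deployed_at=day + timedelta(hours=4)),
    Deploy(committed_at=day, deployed_at=day + timedelta(hours=8),
           failed=True, restored_at=day + timedelta(hours=9)),
]
measures = velocity_measures(deploys, period_days=7)
print(measures["delivery_lead_time"])  # 6:00:00
print(measures["change_fail_rate"])    # 0.5
```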
These dimensions kind of intuitively make sense as measures of productivity, but let’s see if we can model them into a system that we can use to reason about developer productivity:
Pull requests are converted into ready commits based on our code review rate.
Ready commits convert into deployed commits at deploy rate.
Deployed commits convert into incidents at defect rate.
Incidents are remediated into reverted commits at recovery rate.
Reverted commits are debugged into new pull requests at debug rate.
Linking these pieces together, we see a feedback loop, in which the system’s downstream behavior impacts its upstream behavior. With a sufficiently high defect rate or slow recovery rate, you could easily see a world where each deploy leaves you even further behind.
If your model is a good one, opportunities for improvement should be immediately obvious, which I believe is true in this case. However, to truly identify where to invest, you need to identify the true values of these stocks and flows! For example, if you don’t have a backlog of ready commits, then speeding up your deploy rate may not be valuable. Likewise, if your defect rate is very low, then reducing your recovery time will have little impact on the system.
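The loop described above can be sketched as a small discrete-time simulation. This is a simplified model under stated assumptions: every rate is constant, the defect rate is treated as a fraction of each step's deploys, and the function name, parameter names, and numbers are all hypothetical.

```python
def simulate_pipeline(steps, new_prs, review_rate, deploy_rate,
                      defect_fraction, recovery_rate, debug_rate):
    """Move work through the five stocks for a number of steps; return final stocks."""
    s = {"pull_requests": 0.0, "ready_commits": 0.0, "deployed_commits": 0.0,
         "incidents": 0.0, "reverted_commits": 0.0}
    for _ in range(steps):
        # Each flow is limited by its rate and by what is waiting in the upstream stock.
        reviewed = min(s["pull_requests"], review_rate)
        deployed = min(s["ready_commits"], deploy_rate)
        defective = deployed * defect_fraction
        recovered = min(s["incidents"], recovery_rate)
        debugged = min(s["reverted_commits"], debug_rate)

        # Debugged commits re-enter as pull requests, which closes the feedback loop.
        s["pull_requests"] += new_prs + debugged - reviewed
        s["ready_commits"] += reviewed - deployed
        s["deployed_commits"] += deployed - defective
        s["incidents"] += defective - recovered
        s["reverted_commits"] += recovered - debugged
    return s


# Hypothetical scenario: review, not deployment, is the bottleneck.
baseline = simulate_pipeline(steps=50, new_prs=5, review_rate=5, deploy_rate=10,
                             defect_fraction=0.1, recovery_rate=2, debug_rate=2)
faster_deploys = simulate_pipeline(steps=50, new_prs=5, review_rate=5, deploy_rate=20,
                                   defect_fraction=0.1, recovery_rate=2, debug_rate=2)
print(baseline["deployed_commits"] == faster_deploys["deployed_commits"])  # True
```

In this scenario there is never a backlog of ready commits, so doubling the deploy rate changes nothing, which matches the point about needing the real values of the stocks and flows before deciding where to invest.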
Creating an arena for quickly testing hypotheses about how things work, without having to do the underlying work beforehand, is the aspect of systems thinking that I appreciate most.
3.1.3 Model away
Once you start thinking about systems, you’ll find that it’s hard to stop. Pretty much any difficult problem is worth trying to represent as a system, and even without numbers plugged in I find them powerful thinking aids.
If you do want the full experience, there are relatively few tools out there to support you. Stella5 is the gold standard, but the price is quite steep, with a nonacademic license costing more than a new laptop. The best cheap alternative that I’ve found is Insight Maker,6 which has some UI quirks but features a donation-based payment model.
3.2 Product management: exploration, selection, validation
Most engineering organizations separate engineering and product leadership into distinct roles. This is usually ideal, not only because these roles benefit from distinct skills but also because they thrive on different perspectives and priorities. It’s quite hard to do both well at the same time.
I’ve met many product managers who are excellent operators, but few product managers who can operate at a high degree while also getting deep into their users’ needs. Likewise, I’ve worked with many engineering managers who ground their work in their users’ needs, but I’ve known few who can fix their attention on those users when things start getting rocky within their team.
Figure 3.3
Iterative process of product development.
Reality isn’t always accommodating of this ideal setup. Maybe your team’s product manager leaves or a new team is being formed,7 and you, as an engineering leader, need to cover both roles for a few months. This can be exciting, and, yes, this can be a time when “exciting” rhymes with “terrifying.”
Product management is a deep profession, and mastery requires years of practice, but I’ve developed a simple framework to use when I’ve found myself fulfilling product management8 responsibilities for a team. It’s not perfect, but hopefully it’ll be useful for you as well.
Product management is an iterative elimination tournament, with each round consisting of problem discovery, problem selection, and solution validation. Problem discovery is uncovering possible problems to work on, problem selection is filtering those problems down to a viable subset, and solution validation is ensuring that your approach to solving those problems works as cheaply as possible.
If you do a good job at all three phases, you win the luxury of doing it all again, this time with more complexity and scope. If you don’t do well, you end up forfeiting or being asked to leave the game.9
3.2.1 Problem discovery
The first phase of a planning cycle is exploring the different problems that you could pick to solve. It’s surprisingly common to skip this phase, but that, unsurprisingly, leads to inertia-driven local optimization. Taking the time to evaluate which problem to solve is one of the best predictors I’ve found of a team’s long-term performance.
The themes that I’ve found useful for populating the problem space are:
Users’ pain. What are the problems that your users experience? It’s useful to go broad via survey mechanisms, as well as to go deep by interviewing a smaller set of interesting individuals across different user segments.
Users’ purpose. What motivates your users to engage with your systems? How can you better enable users to accomplish their goals?
Benchmark. Look at how your company compares to competitors in the same and similar industries. Are there areas in which you are quite weak? Those are areas to consider investing in. Sometimes folks keep to a narrow lens when benchmarking, but I’ve found that you learn the most interesting things by considering both fairly similar and rather different companies.
Cohorts. What is hiding behind your clean distributions? Exploring your data for the cohorts hidden behind top-level analysis is an effective way to discover new kinds of users with surprising needs.
Competitive advantages. By understanding the areas you’re exceptionally strong in, you can identify opportunities that you’re better positioned to fill than other companies.
Competitive moats. Moats are a more extreme version of a competitive advantage. Moats represent a sustaining competitive advantage, which makes it possible for you to pursue offerings that others simply cannot. It’s useful to consider moats in three different ways:
What do your existing moats enable you to do today?
What are the potential moats you could build for the future?
What moats are your competitors luxuriating behind?
Compounding leverage. What are the composable blocks you could start building today that would compound into major product or technical leverage10 over time? I think of this category of work as finding ways to get the benefit at least twice. These are potentially tasks that initially don’t seem important enough to prioritize, but whose compounding value makes the work possible to prioritize.
A design example might be introducing to an application a new navigation scheme that better supports the expanded set of actions and modes you have today, and that will support future proliferation as well. (Bonus points if it manages to prevent future arguments about the positioning of new actions relative to existing ones!)
An infrastructure example might be moving a failing piece of technology to a new standard. This addresses a reliability issue, lowers maintenance costs, and also reduces the costs of future migrations.11
3.2.2 Problem selection
Once you’ve identified enough possible problems, the next challenge is to narrow down to a specific problem portfolio. Some of the aspects that I’ve found useful to consider during this phase are:
Surviving the round. Thinking back to the iterative elimination tournament, what do you need to do to survive the current round? This might be the revenue that the product will need to generate to avoid getting canceled, adoption, etc.
Surviving the next round. Where do you need to be when the next round starts in order to avoid getting eliminated then? There are a number of ways (many of them revolving around quality trade-offs) to reduce long-term throughput in favor of short-term velocity. (Conversely, winning leads to significantly more resources later, so that trade-off is appropriate sometimes!)
Winning rounds. It’s important to survive every round, but it’s also important to eventually win a round! What work would ensure that you’re trending toward winning a round?
Consider different time frames. When folks disagree about which problems to work on, I find that the conflict is most frequently rooted in different assumptions about the correct time frame to optimize for. What would you do if your company was going to run out of money in six months? What if there were no external factors forcing you to show results until two years out? Five years out?
Industry trends. Where do you think the industry is moving, and what work will position you to take advantage of those trends, or to at least avoid having to redo the work in the near future?
Return on investment. Personally, I think people often under-prioritize quick, easy wins. If you’re in the uncommon position of understanding both the impact and costs of doing small projects, then take time to try ordering problems by expected return on investment. At this phase, you’re unlikely to know the exact solution, so figuring out cost is tricky, but for categories of problems that you’ve seen before you can probably make a solid guess. (If you don’t personally have relevant experience, ask around.) Particularly in cases where wins are compounding, they may be surprisingly valuable over the medium and long term.
Experiments to learn. What could you learn now that would make problem selection in the future much easier?
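For the return-on-investment aspect above, ordering problems can be as simple as dividing a rough impact guess by a rough cost guess and sorting. The sketch below assumes you can put both guesses on consistent scales; the `Problem` record, the example names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    expected_impact: float  # rough guess, in whatever unit you use consistently
    estimated_cost: float   # rough guess, e.g. engineer-weeks; must be > 0


def order_by_roi(problems):
    """Sort problems from highest to lowest expected return on investment."""
    return sorted(problems, key=lambda p: p.expected_impact / p.estimated_cost, reverse=True)


# Hypothetical candidates and guesses, purely for illustration.
candidates = [
    Problem("rebuild navigation", expected_impact=40, estimated_cost=20),
    Problem("fix signup copy", expected_impact=6, estimated_cost=1),
    Problem("migrate datastore", expected_impact=30, estimated_cost=30),
]
ranked = [p.name for p in order_by_roi(candidates)]
print(ranked)  # ['fix signup copy', 'rebuild navigation', 'migrate datastore']
```

The small, cheap item lands on top here, which is the kind of quick win that is easy to under-prioritize. The ranking is only as good as the guesses behind it, so it is better used to start a conversation than to settle one.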
3.2.3 Solution validation
Once you’ve narrowed down the problem you want to solve, it’s easy to jump directly into execution, but that can make it easy to fall in love with a difficult approach. Instead, I’ve found that it’s well worth it to take the risk out of your approach with an explicit solution validation phase.