An Elegant Puzzle: Systems of Engineering Management
Teams and organizations have a very limited appetite for new process; try to roll out one change at a time, and don’t roll out the next change until the previous change has enthusiastic compliance.
Process needs to be adapted to its environment, and success comes from blending it with your particular context.
7.1.1 Line management
Around the time your team reaches three engineers, you’ll want to be running a sprint process. There are many successful ways to run sprints. Try a few and see what resonates for you.
The criteria I use to evaluate if a team’s sprint works well:
Team knows what they should be working on.
Team knows why their work is valuable.
Team can determine if their work is complete.
Team knows how to figure out what to work on next.
Stakeholders can learn what the team is working on.
Stakeholders can learn what the team plans to work on next.
Stakeholders know how to influence the team’s plans.
One of my coworkers, Davin Bogan,1 likes to say that “shipping is a habit,” and a well-run sprint both helps teams establish that habit and serves as a mechanism that creates visibility within a team that hasn’t quite gotten there yet. As a team’s direct manager, you can use this to ground concerns around individuals who might not be ramping up successfully, and, as you move into middle management, sprints are useful for debugging within your organization.
Figure 7.2
Org chart for a line manager.
Within your sprint process, your backlog is particularly important: it’s the context-rich interface that you’ll use to negotiate changes in direction and priority with your stakeholders. It’s always more interesting to discuss which of two things we should do next, rather than whether something is worth doing.
As your team gets larger and the number of stakeholders you’re working with grows, you’ll also want to develop a roadmap describing your high-level plans over the next three to twelve months. Planning does not inherently create value, so aim to keep your roadmap as short as possible while still allowing teams to coordinate.
Initially, the distinction between your backlog and your roadmap may be quite small: your backlog a bit more detailed, your roadmap looking a bit further into the future. The value in having both is that this lets you specialize the backlog to be more useful for your team and design the roadmap to be more useful for your stakeholders, rather than relying on one tool to satisfy both sets of constraints.
At this point, most teams will be tracking operational metrics, with a focus on tracking day-to-day user and system behavior. These metrics tend to be exclusively focused on helping the team operate, and in particular detect outages, regressions, and other interruptions.
7.1.2 Middle management
As you move into middle management, you’ll become responsible for two to five line managers. As a result, you’ll need to shift away from day-to-day execution to give your line managers room to make an impact (and in order to free yourself up to make a larger impact as well).
You’ll be spending more time on your roadmaps as:
You move from receiving asks from stakeholders to deeply understanding what is motivating those asks.
You invest in learning what other folks are working on in order to continuously validate that your teams’ efforts are valuable.
Figure 7.3
Org chart for a middle manager.
As you spend less time with your teams, you’ll want to start a weekly staff meeting with your managers. The best versions I’ve seen start with brief updates from each attendee, at most a couple of minutes per person, and then move into group discussions on shared topics. Topics might include running effective sprints, planning, career development, or whatever else proves useful. Done well, these discussions are the key learning forum for you and the managers you work with.
As your teams and the organization around you grow, you’ll start to see more and more cases of preventable misalignment: two teams working on similar projects because they’re unaware of each other, another team struggling because they don’t have a reliable email service when your team actually does have one. At that point, it’s time for each team to write a vision document: a concise statement of the team’s goals and the strategy for accomplishing those goals.
It’ll be extremely frustrating for some teams to write their vision documents, because it’ll force them to recognize and reconcile areas of distributed and unclear ownership. It’s worth the pain! Once your vision comes together, it becomes your roadmap’s North Star, and it will help you reconcile stakeholder asks with your longer-term product and technology strategies.
This is also when you’ll want to start skip-level one-on-ones to ensure that there are direct, open channels for feedback about your managers and your teams. Typically, if you’re learning something negative during a skip-level, you should have learned it somewhere else first, but rigorous processes have some redundancy. Nothing works consistently every time.
7.1.3 Managing an organization
As your organization starts to get even larger and you’re mostly managing middle managers, the playbook shifts again. Your staff meeting has changed in one of two ways:
The meeting has so many managers in it that they can’t even provide important updates. Plus, the discussions have become unwieldy, with a couple of folks dominating conversations.
Alternatively, your meeting now includes your middle managers, who themselves are likely missing some of the context about what their teams are working on or struggling with.
The mechanism I’ve found most helpful in this case is to ensure that every team has a clear set of directional metrics in an easily discoverable dashboard. The metrics should cover both the longer-term goals of the team (user adoption, revenue, return users, etc.) and the operational baselines necessary to know if the team is functioning well (on-call load, incidents, availability, cost, and so on). For each metric, the dashboard should make three things clear: the current value, the goal value, and the trend between them.
Figure 7.4
Org chart for a director.
Now your staff meetings can start with a quick metrics review to give a sense of whether there is somewhere you need to drill in, and, rather than peanut buttering your attention across every project, you can focus on the projects that are exceeding their goals or struggling.
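As a rough illustration of such a dashboard row, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Metric` class, the `status` labels, the sample metric names, and the choice to define "trend" as the change from the oldest to the newest recorded value. A real dashboard would likely use a more careful trend calculation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metric:
    """One hypothetical dashboard row: a metric, its goal, and recent history."""
    name: str
    goal: float
    history: List[float]           # oldest to newest; the last entry is the current value
    higher_is_better: bool = True  # False for metrics such as on-call pages

    @property
    def current(self) -> float:
        return self.history[-1]

    def gap(self) -> float:
        """Distance from the goal; zero or positive means the goal is met."""
        delta = self.current - self.goal
        return delta if self.higher_is_better else -delta

    def trend(self) -> float:
        """Change across the recorded history; positive means improving."""
        delta = self.history[-1] - self.history[0]
        return delta if self.higher_is_better else -delta

    def status(self) -> str:
        if self.gap() >= 0:
            return "exceeding"
        return "improving" if self.trend() > 0 else "struggling"


def needs_attention(metrics: List[Metric]) -> List[str]:
    """Names of metrics worth drilling into during the metrics review."""
    return [m.name for m in metrics if m.status() in ("exceeding", "struggling")]


# Sample data, invented for the example.
dashboard = [
    Metric("weekly_active_users", goal=1000, history=[800, 900, 1100]),
    Metric("availability_pct", goal=99.9, history=[99.5, 99.7, 99.8]),
    Metric("pages_per_oncall_shift", goal=5, history=[6, 8, 9], higher_is_better=False),
]

for m in dashboard:
    print(f"{m.name}: current={m.current} goal={m.goal} "
          f"trend={m.trend():+.1f} status={m.status()}")
```

In this sketch, `needs_attention(dashboard)` skips the metric that is short of its goal but moving toward it, and returns only the two that are either past their goal or moving away from it.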
The other mechanism I’ve found to be exceptionally useful at this point is team snippets. These come out every two to four weeks and give snapshots of each team’s sprints: what they’re doing, why they’re doing it, and what they’re planning to do next. These are valuable for you to retain a sense of what your teams are working on, but they are invaluable for decentralizing coordination and communication between teams in your organization, as you become increasingly ineffective in that role.
Along the way, remember that your old problems still exist; it’s just that other folks are dealing with them instead. As you roll out new processes to solve your personal pain points, you should be handing off processes to your managers, and keeping those practices intact and running. This will leave you with a tapestry of reinforcing processes, which support you and each layer of management that you support.
7.2 Books I’ve found very useful
Folks occasionally ask me to recommend books to help them in their professional career. I can usually think of a couple of recommendations in the moment, but I always feel as if I’m forgetting far more good books than I’m recommending. In the hope of providing a better answer going forward, I’ve written up some of the general purpose, leadership, and management books I’ve read.
Not all of these are classically great books—some are even a bit dull to read—but they’ve changed how I think in a meaningful way. They’re roughly sorted in order from those I found most valuable to least:
Thinking in Systems: A Primer
by Donella H. Meadows
For me, systems thinking has been the most effective universal tool for reasoning through complex problems, and this book is a readable, powerful introduction.
Don’t Think of an Elephant! Know Your Values and Frame the Debate
by George Lakoff
While written from a political perspective that some might find challenging, this book completely changed how I think about presenting ideas. You may be tempted to instead read Lakoff’s more academic writing, but I’d recommend reading this first as it’s much briefer and more readable.
Peopleware: Productive Projects and Teams
by Timothy Lister and Tom DeMarco
The book that has given generations of developers permission to speak on the challenges of space planning and open offices. Particularly powerful in grounding the discussion in data.
Slack: Getting Past Burnout, Busywork, and the Myth of Total Efficiency
by Tom DeMarco
Documents a compelling case for middle managers as the critical layer where organizational memory rests and learning occurs. A meditation on the gap between efficiency and effectiveness.
The Mythical Man-Month
by Frederick Brooks
The first professional book I ever read, this one opened my eyes to the wealth of software engineering literature waiting out there.
Good Strategy/Bad Strategy: The Difference and Why It Matters
by Richard Rumelt
This book gave me permission to acknowledge that many strategies I’ve seen professionally are not very good. Rumelt also offers a structured approach to writing good strategies.
The Goal: A Process of Ongoing Improvement
by Eliyahu M. Goldratt
Explores how constraint theory can be used to optimize process.
The Five Dysfunctions of a Team
by Patrick Lencioni
The Three Signs of a Miserable Job
by Patrick Lencioni
Another Lencioni book, this one explaining a three-point model for what makes jobs rewarding.
Finite and Infinite Games
by James P. Carse
Success in most life situations is about letting everyone continue to play, not about zero-sum outcomes. This seems pretty obvious, but for me it helped reset my sense of why I work.
INSPIRED: How to Create Tech Products Customers Love
by Marty Cagan
A thoughtful approach to product management.
The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail
by Clayton M. Christensen
A look at how being hyper-rational in the short run has led many great companies to failure. These days, I think about this constantly when doing strategic planning.
The E-Myth Revisited: Why Most Small Businesses Don’t Work and What to Do About It
by Michael E. Gerber
The idea that leadership is usually working “on” the business, not “in” the business. Work in the business to learn how it works, but then document the system and hand it off.
Fierce Conversations: Achieving Success at Work and in Life, One Conversation at a Time
by Susan Scott
How to say what you need to say. This is particularly powerful in giving structure to get past conflict aversion.
Becoming a Technical Leader: An Organic Problem-Solving Approach
by Gerald M. Weinberg
Permission to be a leader that builds on your strengths, not whatever model that folks think you should fit into.
Designing with the Mind in Mind
by Jeff Johnson
An introduction to usability and design, grounding both in how the brain works.
The Leadership Pipeline: How to Build the Leadership Powered Company
by Ram Charan, Steve Drotter, and Jim Noel
This book opened my eyes to just how thoughtful many companies are in intentionally growing new leadership.
The Manager’s Path: A Guide for Tech Leaders Navigating Growth and Change
by Camille Fournier
High Output Management
by Andy S. Grove
The First 90 Days: Proven Strategies for Getting Up to Speed Faster and Smarter, Updated and Expanded
by Michael D. Watkins
The Effective Executive: The Definitive Guide to Getting the Right Things Done
by Peter F. Drucker
Don’t Make Me Think: A Common Sense Approach to Web Usability
by Steve Krug
The Deadline: A Novel About Project Management
by Tom DeMarco
The Psychology of Computer Programming
by Gerald M. Weinberg
Adrenaline Junkies and Template Zombies: Understanding Patterns of Project Behavior
by Tom DeMarco, Peter Hruschka, Tim Lister, Steve McMenamin, Suzanne Robertson, and James Robertson
The Secrets of Consulting: A Guide to Giving and Getting Advice Successfully
by Gerald M. Weinberg
Death by Meeting
by Patrick Lencioni
The Advantage: Why Organizational Health Trumps Everything Else in Business
by Patrick Lencioni
Rise: 3 Practical Steps for Advancing Your Career, Standing Out as a Leader, and Liking Your Life
by Patty Azzarello
The Innovator’s Solution: Creating and Sustaining Successful Growth
by Clayton M. Christensen and Michael E. Raynor
The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win
by Gene Kim, Kevin Behr, and George Spafford
Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations
by Nicole Forsgren PhD, Jez Humble, and Gene Kim
7.3 Papers I’ve found very useful
I’ve long been a fan of hosting paper reading groups,2 where a group of folks sit down and talk about interesting technical papers. One of the first steps to do that is identifying some papers worth chatting about, and here is a list of some papers I’ve seen lead to excellent discussions!
“Dynamo: Amazon’s Highly Available Key-Value Store”
If you read only the abstract, you’d be forgiven for not being overly excited about the Dynamo paper:
This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an always-on experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
That said, this is in some senses “the” classic modern systems paper. It has happened more than once that an engineer I’ve met has only read a single systems paper in their career, and that paper was the Dynamo paper. This paper is a phenomenal introduction to eventual consistency, coordinating state across distributed storage, reconciling data as it diverges across replicas, and much more.
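To make "object versioning" and "reconciling data as it diverges" a little more concrete, here is a minimal Python sketch of vector clocks, the versioning scheme the Dynamo paper describes. The function names, the dictionary representation, and the shopping-cart values are assumptions made for this illustration and are not Dynamo's actual interface.

```python
from typing import Dict, List, Tuple

VectorClock = Dict[str, int]  # node id -> number of updates coordinated by that node


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if version `a` has seen every update recorded in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(versions: List[Tuple[VectorClock, str]]) -> List[Tuple[VectorClock, str]]:
    """Drop versions superseded by another version.

    Whatever remains are concurrent siblings, which the application
    has to merge itself (application-assisted conflict resolution).
    """
    survivors: List[Tuple[VectorClock, str]] = []
    for clock, value in versions:
        superseded = any(
            descends(other, clock) and other != clock for other, _ in versions
        )
        if not superseded and (clock, value) not in survivors:
            survivors.append((clock, value))
    return survivors


# Hypothetical scenario: one write through node A, then two replicas diverge.
base = increment({}, "A")         # {"A": 1}
via_b = increment(base, "B")      # {"A": 1, "B": 1}
via_c = increment(base, "C")      # {"A": 1, "C": 1}

siblings = reconcile([
    (base, "cart: book"),
    (via_b, "cart: book, pen"),
    (via_c, "cart: book, lamp"),
])
print(siblings)  # the base version is dropped; the two divergent carts remain
```

Neither of the two surviving clocks descends from the other, so the store cannot pick a winner on its own. The client would merge the two carts and write the result back with a clock that covers both.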
“Hints for Computer System Design”
Butler Lampson3 is a winner of the ACM Turing Award (among other awards), and worked at Xerox PARC. This paper concisely summarizes many of his ideas around systems design, and is a great read.
In his words:
Studying the design and implementation of a number of computers has led to some general hints for system design. They are described here and illustrated by many examples, ranging from hardware such as the Alto and the Dorado to application programs such as Bravo and Star.
This paper itself acknowledges that it doesn’t aim to break any new ground, but it’s a phenomenal overview.
“Big Ball of Mud”
A reaction against exuberant publications about grandiose design patterns, this paper labels the most frequent architectural pattern as the Big Ball of Mud, and explores why elegant initial designs rarely remain intact as a system goes from concept to solution.
From the abstract:
While much attention has been focused on high-level software architectural patterns, what is, in effect, the de-facto standard software architecture is seldom discussed. This paper examines this most frequently deployed of software architectures: the BIG BALL OF MUD. A BIG BALL OF MUD is a casually, even haphazardly, structured system. Its organization, if one can call it that, is dictated more by expediency than design. Yet, its enduring popularity cannot merely be indicative of a general disregard for architecture.
Although humor certainly infuses this paper, it’s also correct that software design is remarkably poor. Very few systems have a design phase and few of those resemble the initial design (and documentation is rarely updated to reflect later decisions), making this an important topic for consideration.
“The Google File System”
From the abstract:
The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.
In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real-world use.
Google has done something fairly remarkable in defining the technical themes in Silicon Valley and, at least debatably, across the entire technology industry. The company has been doing so for more than the last decade, and was only recently joined, to a lesser extent, by Facebook and Twitter as they reached significant scale. Google has defined these themes largely through noteworthy technical papers. The Google File System (GFS) paper is one of the early entries in that strategy, and is also remarkable as the paper that largely inspired the Hadoop Distributed File System (HDFS).