by Will Larson
Peer reviews are written by an individual’s peers, and are useful for recognizing mentorship and leadership contributions that might otherwise get missed. Structured properly, they are also useful for identifying problems that you’re missing out on, but peers are generally not comfortable providing negative feedback.
Upward reviews are used to ensure that managers’ performance includes the perspective of the individuals they manage directly. Format is similar to peer review.
Manager reviews are written by an individual’s manager, typically a synthesis of the self, peer, and upward reviews.
From these four sets of reviews, you can establish a provisional designation, which you can then use as an input to a calibration system. Calibrations are rounds of reviewing performance designations and reviews, with the aim of ensuring that ratings are consistent and fair across teams, organizations, and the company overall.
Figure 6.9
Diagram of a three-by-three grid, with one axis representing performance and another trajectory.
A standard calibration system will happen at each level of the organizational tree. It’s pretty challenging to strike the balance between avoiding calibration fatigue from many sessions and ensuring that the people doing the calibration are familiar with the work they are calibrating. Promotions are typically also considered during the calibration process.
Calibrations fall soundly in the unenviable category of things that are terrible but have no obvious replacement. Done poorly, they become bastions of bias and fierce politicization, but they’re pretty hard to do well even when everyone is well-meaning! Some rules that I’ve found useful for calibrations:
Adopt a shared quest for consistency. Try to frame calibration sessions as a community of coworkers working together toward the correct designations. Steer them away from anchoring on the designations they enter with, and toward shared inquiry. Doing this well requires a great deal of psychological safety among calibrators, which needs to be cultivated long before they enter the room.11
Read, don’t present. Many calibration systems depend heavily on whether managers are effective, concise presenters, which can become a larger factor in an individual’s designation than their own work. Don’t allow managers to pitch their candidates in the room, but instead have everyone read the manager review. This still depends on the manager’s preparation, but it reduces the pressure to perform in the calibration session itself.
Compare against the ladder, not against others. Comparing folks against each other tends to introduce false equivalencies without adding much clarity. Focus on the ladder instead.
Study the distribution, don’t enforce it. Historically, many companies fit designations to a fixed curve, often referred to as stack ranking.12 Stack ranking is a terrible solution, but here’s the problem it tries to address: it’s easy for the meaning of a given designation to skew as a company grows. Instead of fitting to a distribution, I find it useful to review the distributions across different organizations and to discuss why they appear to be deviating. Are the organizations performing at meaningfully different levels, or have the meanings skewed?
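To make the last rule concrete, here is a minimal sketch of one way to surface distributions worth discussing. It assumes a hypothetical three-value designation scale and an arbitrary ten-point threshold; the names `DESIGNATIONS`, `distribution`, and `deviations` are illustrative and do not come from any particular tool. The output is a prompt for conversation, not a verdict: it cannot tell you whether an organization is truly performing differently or whether the meaning of a designation has skewed.

```python
from collections import Counter

# Hypothetical designation scale; substitute whatever your company uses.
DESIGNATIONS = ["below", "meets", "exceeds"]


def distribution(designations):
    """Return each designation's share of the group as a fraction."""
    total = len(designations)
    if total == 0:
        return {d: 0.0 for d in DESIGNATIONS}
    counts = Counter(designations)
    return {d: counts[d] / total for d in DESIGNATIONS}


def deviations(by_org, threshold=0.10):
    """List (org, designation, org_share, company_share) tuples where an
    organization's share differs from the company-wide share by more than
    `threshold`. The threshold is an assumption, not a recommended value."""
    everyone = [d for group in by_org.values() for d in group]
    company = distribution(everyone)
    flagged = []
    for org, group in by_org.items():
        shares = distribution(group)
        for d in DESIGNATIONS:
            if abs(shares[d] - company[d]) > threshold:
                flagged.append((org, d, shares[d], company[d]))
    return flagged


# Made-up example data for two organizations.
by_org = {
    "infra": ["meets"] * 8 + ["exceeds"] * 2,
    "product": ["meets"] * 4 + ["exceeds"] * 6,
}
for org, designation, org_share, company_share in deviations(by_org):
    print(f"{org}: {designation} is {org_share:.0%} vs. {company_share:.0%} overall")
```

Nothing here adjusts anyone's designation. The script only points calibrators at the gaps they should talk through.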
Figure 6.10
Diagram showing performance cycles happening in Q2 and Q4 but not Q1 or Q3.
Somewhat unexpectedly, performance designations are usually not meant to be the primary mechanism for handling poor performance. Instead, feedback for weak performance should be delivered immediately. Waiting for performance designations to deal with performance issues is typically a sign of managerial avoidance. That said, designations do serve as an effective backstop for ensuring that these kinds of issues are being addressed.
6.5.3 Performance cycles
Once you have career ladders and performance designations defined, you need a process to ensure that designations are being periodically calculated in a consistent and fair fashion. That process is your performance cycle.
Most companies do either annual or biannual performance cycles, although it’s not unheard of to do them quarterly. The overhead of running a cycle tends to be fairly heavy, which leads companies to do them less frequently. Conversely, the feedback from the cycle tends to be very important, and it serves as a primary input into factors that individuals care about a great deal, like compensation, so there is also countervailing pressure to do them frequently.
The most important factor I’ve seen for effective performance cycles is forcing folks to practice. Providing well-structured timelines is very helpful, particularly if they’re concise, but there tend to be so many competing demands that people do the most minimal skimming they can get away with.
Having teams do a practice round, at least for new managers or after the cycle has been modified, is the only effective way I’ve found to get around this. You can often direct this practice as an opportunity for folks to get feedback on their self-reviews, ensuring that they find it useful even if they’re initially skeptical.
Finally, there is an interesting tension between improving the cycle as quickly as possible and allowing the cycle to stabilize so that people can get good at it. My sense is that you want to change the cycle at most every other time you run it. This lets everyone adapt fully, and it also gives you enough time to observe how well any changes work.
This is a small survey of some of the basics of designing performance management systems, and there is much, much more out there. It’s valuable to start from the common structures that most companies adopt, but don’t fall captive to them! Many of these systems are relatively recent inventions, and they take a particular, peculiar view of the ideal relationship between an employee and the company where they work.
If you’re looking for more, Laszlo Bock’s Work Rules!13 is a good read.
6.6 Career levels, designation momentum, level splits, etc.
The fundamentals of performance management systems can be a bit bland! What’s really interesting are the rough edges and unexpected emergent behaviors that come into play when you start to design and run these performance systems with lots of real people involved. These topics are particularly interesting to me because they are entirely unplanned yet they crop up consistently at pretty much every company as they scale.
Because these issues show up consistently, it’s possible to be prepared for them, rather than being caught by surprise. Since surprise is the cardinal sin of performance management, they often create a bunch of trouble for managers earlier in their career, and hopefully these notes will help you navigate these confusing waters!
Figure 6.11
The relationship between Performance designation and quantity of designations.
Designation momentum. This is the term for the natural tendency of a performance process to consistently produce the same evaluations for the same people despite changes in performance. If you are receiving good designations, this is an exciting phenomenon, because it means you are likely to continue receiving them. But I find that this is unexpectedly somewhat demotivating for high performers, who want consistent, direct feedback on their work so that they can keep improving. Those receiving poor designations, unsurprisingly, find this phenomenon quite frustrating, in particular because it makes it challenging for them to determine if it’s a lagging indicator of a previous issue or if they’re continuing to do poorly.
Many employees rely entirely on their manager to come up with a step-by-step path to high performance. That only works when designation momentum is taking you in a direction you’re happy with. If it’s not, you need to be the active participant in your success.
Propose a set of clear goals to your manager, and iterate together toward an explicit agreement on the expectations to hit the designation you’re aiming for. The goals need to be ambitious enough that your manager can successfully pitch the difficulty to their peers during calibration. If your manager is pushing back on your goals’ ambition, that is probably their way of saying that they think their peers will challenge their difficulty. That doesn’t mean your plan isn’t difficult enough (it may well be very appropriate), but it does mean you’ll have to work to help them explain why the goals are appropriate.
Designation momentum occurs for individuals, but it also happens for teams and organizations. For teams in this position, setting clear goals is a good start, but it’s also necessary to align with your peers and leadership about why your work is important. It’s your work as a leader to explain why your work is important in terms that the organization understands and appreciates. This is a good example of where the failure to do so will have long-running costs.
Tit for tat. Calibration systems without strong process and fair referees can degrade into tit-for-tat favor trading. It’s very rare to see active collusion during calibration; rather, the most common case occurs when folks are silent instead of raising concerns. This silence may seem benign, but it isn’t: it pushes all responsibility for consistent outcomes onto the individuals refereeing the calibration process.
Encouraging engagement requires the calibration referee to model the behavior, but more importantly, it depends on building psychological safety and trust across the folks who are calibrating together.
Level expansion. As a company ages, it will inevitably add levels to support career progression. This happens even if a company remains the same size, and it’s primarily driven by company age, not size. This is frequently driven by a small cohort of the highest-level executives.
If a company is experiencing particularly frequent level expansion, it is usually a sign that progression, compensation, or recognition has been overly tied to your level system, and you should identify mechanisms to reduce pressure on leveling. Training and education are useful here, as is getting more structured in assigning important projects.
The other scenario that typically leads to level expansion is one in which very senior executives from other, typically older, companies are hired. These people benefited from level expansion as that company aged, and it’s hard for them to walk away from that heady cocktail of status, compensation, and recognition.
Level drift. Because level expansion is typically driven more by the need for career progression than by the introduction of objectively distinct accomplishment, levels added at the top create downward pressure on existing levels. Expectations at a given level decrease over time.
This inflation feels uncomfortable because we often rely on scarcity to determine value, but it’s very uncommon for companies to adjust compensation in response to level drift. Thus, in practice, it is a rising tide that raises all boats. From a company perspective, it’s important to manage level drift explicitly, so that it’s possible to apply the shifts consistently.
Opening of the gates. The combination of level expansion and level drift leads to periodic bursts in which a cohort of individuals cross level boundaries together. This happens most frequently at the second-highest level, one or two cycles after a level expansion.
As a manager, you need to coordinate with your peers to ensure that you are opening the gate together in a consistent fashion. It’s easy to miss these moments, but if you do you may inadvertently eject individuals from their natural cohort of peers. You can usually fix this in a subsequent cycle, but you’ll have missed out on momentum. After each cycle, take an hour and try to guess when the gates might open next, and talk with your peers about it.
Career level. For every role, a given level will be established as the career level, and most individuals are not expected to progress beyond that level. Over time, this often leads to career level clustering, with the normal distribution centered on the career level, as opposed to the typical goal of the distribution centering at mid-level.
Time-at-level limits. Employees who haven’t yet reached career level are expected to progress toward the career level at a consistent pace. This is typically used as a backstop for situations in which performance management seems appropriate but is not occurring. My experience is that most companies have time-at-level limits, but that there are many other ways to accomplish the same goals; such limits are useful as part of an overall system, but they aren’t necessary in many configurations. The only bit I’ve found predictably important here is being consistent in how they are applied.
Level split. Over time, it is common for the career level to experience level drift, leading to increasingly distinct clusters of workers who reached career level at its highest expectations and those who have reached it recently. Given the greatly elevated expectations beyond the career level, upward mobility remains elusive. Many companies decide to perform a level split: separating the career level into two halves.
This allows the distinct cohorts to inhabit distinct levels, and it extends the runway of career progression for most employees. Less obviously, the split tends to solidify the moat guarding access to post-career levels. The extended moat doesn’t catch those right on the border. It’s easy to handle these folks properly, but the moat absolutely does slow the progression for the cohort who were about a year away from changing levels.
Figure 6.12
Distributions of levels across different functions.
Crisis designations. These are alternatively known as retention-driven designations. Sometimes companies find themselves in a difficult situation, in which they have key individuals or even key teams that they consider to be at risk, and one of the tools for addressing the situation is to recognize these individuals’ importance through elevated performance designations. These are intended as temporary, but they tend to reset expectations permanently in ways that sacrifice long-term usefulness of the performance system in order to manage through short-term difficulty. Sometimes stuff gets really hard, and if that’s the case, then use the tools at your disposal, but generally try to avoid doing this if possible.
There are, surely, hundreds more interesting topics when it comes to how performance systems work in practice as opposed to in design. Although these systems seem quite simple, I keep learning something new each time I go through a performance cycle, and I suspect that is a widely shared experience.
6.7 Creating specialized roles, like SRE or TPMs
People are sometimes surprised to learn that I started out working as a front-end engineer. I’d like to imagine it’s because I’m so terribly knowledgeable about infrastructure, but I suspect that it’s mostly grounded in my unconscionably poor design aesthetic. Something that has stuck with me from my front-end experience was feeling treated as a second-tier engineer: coworkers were unwilling to do any front-end work, but were careful to categorize it as trivial.14 The following decade has seen radical improvements in browser compatibility and JavaScript tooling, and today’s front-end engineers occupy an esteemed position in the hive mind’s subtle hierarchy of roles.
While nodes have swapped positions, the hierarchy of roles remains alive and well, which is at its clearest when someone proposes creating a job description15 or career ladder16 for a new role. Most recently, the question of whether to create a dedicated career ladder for site reliability engineers17 has been on my mind.
This particular question is dear to me, as I had the chance to design the initial iteration of Uber’s SRE role, and while I think that the design was reasonably good, there are also so many ways it could have gone more smoothly. Faced with the decision of whether to do it a second time, my first instinct was to freeze and think of the ways it didn’t work.
Grappling with the problem for some time, I remained conflicted, and decided to get more systematic around making this decision. I’ve written up the results of my musings here. Altogether, there are four interesting questions to dig into:
What are the pitfalls that these roles fall into?
If we do decide to create one, how do we set them up for success?
What are the benefits of specialized roles?
Putting it all together, when should you make a new role?
At the end, creating a new role will still be a difficult decision, but we’ll be armed with a framework to help make it.
6.7.1 Challenges
The major challenges I’ve encountered rolling out new roles are:
Class systems. Folks often look at new roles as less important, framing them as service roles to absorb work they’re not interested in. Sometimes roles are even explicitly designed this way, intended to reduce work for another role as opposed to having an empowering mission of their own.
Brittle organization. As you move away from generalized roles and toward specialists, an unexpected consequence is that your organization has far more single points of failure. Where everyone on a team was once able to perform all tasks fairly effectively, now if your project manager leaves, you’ll find that no one is able to fill the role very capably. This brittleness is particularly acute in organizations with frequent structural changes.18
Pattern matching. Designing a new role for your organization tends to involve dozens of important decisions in order to align it with your needs. Unfortunately, folks generally don’t take much time to appreciate these distinctions, and instead pattern match on how they’ve seen the role done elsewhere. This is a powerful force. Some meaningfully large percentage of people will both avoid taking any steps to learn how the role is intended to function—reading documentation, asking about the approach—and continue to express surprise that it doesn’t work exactly the way they saw at a previous company.
Task offloading. When a new role is created, the role’s designers have a very clear vision of how they want the new function to work. Many other individuals are not particularly concerned with how the creators want the function to work, and will view it as an opportunity to offload tasks that they find challenging, difficult, or uninteresting. This can lead to new roles being immediately underwater, which often feels like success to leaders attempting to grow the size of their organization. However, that can easily translate into an unlovable work experience for those performing the role.