The Pentium Chronicles: The People, Passion, and Politics Behind Intel's Landmark Chips (Practitioners)
THE PENTIUM
CHRONICLES
Press Operating Committee
Board Members
John Horch, Independent Consultant
Mark J. Christensen, Independent Consultant
Ted Lewis, Professor Computer Science, Naval Postgraduate School
Hal Berghel, Professor and Director, School of Computer Science, University of Nevada
Phillip Laplante, Associate Professor Software Engineering, Penn State University
Richard Thayer, Professor Emeritus, California State University, Sacramento
Linda Shafer, Professor Emeritus, University of Texas at Austin
James Conrad, Associate Professor UNC—Charlotte
Deborah Plummer, Manager—Authored books
IEEE Computer Society Executive Staff
David Hennage, Executive Director
Angela Burgess, Publisher
IEEE Computer Society Publications
The world-renowned IEEE Computer Society publishes, promotes, and distributes a wide variety of authoritative computer science and engineering texts. These books are available from most retail outlets. Visit the CS Store at http://computer.org/cspress for a list of products.
IEEE Computer Society / Wiley Partnership
The IEEE Computer Society and Wiley partnership allows the CS Press authored book program to produce a number of exciting new titles in areas of computer science and engineering with a special focus on software engineering. IEEE Computer Society members continue to receive a 15% discount on these titles when purchased through Wiley or at wiley.com/ieeecs.
To submit questions about the program or send proposals, please e-mail dplummer@computer.org or write to Books, IEEE Computer Society, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720-1314. Telephone +1-714-821-8380. Additional information regarding the Computer Society authored book program can also be accessed from our web site at http://computer.org/cspress.
THE PENTIUM
CHRONICLES
The People, Passion,
and Politics Behind
Intel’s Landmark Chips
Robert P. Colwell
CONTENTS
Foreword
Preface
1 INTRODUCTION
P6 Project Context
Betting on CISC
Proliferation Thinking
The Gauntlet
Developing Big Ideas
Defining Success and Failure
Senior Wisdom
Four Project Phases
The Business of Excellence
2 THE CONCEPT PHASE
Of Immediate Concern
Success Factors
Clear Goals
The Right People
P6 Senior Leadership
Setting the Leadership Tone
Managing Mechanics
Physical Context Matters
The Storage Room Solution
Beyond the Whiteboard
“Kooshing” the Talkers
A Data-Driven Culture
The Right Tool
The “What Would Happen If” Game
DFA Frontiers
Performance
Benchmark Selection
Avoiding Myopic Do-Loops
Floating-Point Chickens and Eggs
Legacy Code Performance
Intelligent Projections
All Design Teams Are Not Equal
Overpromise or Overdeliver?
Customer Visits
Loose Lips
Memorable Moments
Microsoft
Novell
Compaq
Insights from Input
Not-So-Secret Instructions
Help from the Software World
The Truth about Hardware and Software
Establishing the Design Team
Roles and Responsibilities
Presilicon Validation
Wizard Problem Solving
Making Microcode a Special Case
Cubicle Floorplanning
Architects, Engineers, and Schedules
Coding, Egos, Subterfuge
3 THE REFINEMENT PHASE
Of Immediate Concern
Success Factors
Handling the Nonquantifiable
Managing New Perspectives
Planning for Complexity
Behavioral Models
Managing a Changing POR
The Wrong Way to Plan
Engineering Change Orders
The Origin of Change
When, Where, Who
Communicating Change
Timely Resolution, No Pocket Vetoes
The ECO Czar
ECO Control and the Project POR
The Bridge from Architecture to Design
Focus Groups
Product Quality
Mismanaging Design Errors
Make an Example of the Offender
Hire Only Geniuses
Flog Validation
Avoid/Find/Survive
Design to Avoid Bugs
When Bugs Get In Anyway, Find Them Before Production
Identifying Bugs
Tracking Bugs
Managing Validation
Plan to Survive Bugs that Make It to Production
A Six-Step Plan for High Product Quality
The Design Review
How Not to Do a Review
When to Do a Review
Another One Rides the Bus
4 THE REALIZATION PHASE
Of Immediate Concern
Success Factors
Balanced Decision Making
Documentation and Communication
Capturing Unit Decisions
Integrating Architects and Design Engineers
Performance and Feature Tradeoffs
(Over-) Optimizing Performance
Perfect A; Mediocre B, C, and D
The Technical Purity Trap
The Unbreakable Computer
Performance-Monitoring Facilities
Counters and Triggers
Protecting the Family Jewels
Testability Hooks
Gratuitous Innovation Considered Harmful
Validation and Model Health
A Thankless Job
Choosing a Metric
Health and the Tapeout Target
Metric Doldrums
Coordinating with Other Projects
Performance Estimation
The Overshooting Scheme
Psychological Stuff
Simulator Wars
Project Management
Awards, Rewards, and Recognition
The Dark Side of Awards
Project Management by Grass Cutting
Marginal Return from Incremental Heads
Project Tracking
The Experiment
The Mystery of the Unchanging Curve
Flexibility Is Required of All
The Simplification Effort
5 THE PRODUCTION PHASE
Of Immediate Concern
Functional Correctness
Speed Paths
Chip Sets and Platforms
Success Factors
Prioritizing War Room Issues
Managing the Microcode Patch Space
Product Care and Feeding
Test Vectors
Performance Surprises
Feature Surprises
Executive Pedagogy and Shopping Carts
Managing to the Next Processor
The Windows NT Saga
Product Rollout
On Stage with Andy Grove
How Not to Give Magazine Interviews
Speech Training
6 THE PEOPLE FACTOR
Hiring and Firing
Rational Recruitment
Hiring and the Promotion List
Firing
Policy Wars
Corporate Disincentives
The Led Zeppelin Incident
Exiting the Exit Bag Check
Sailboats and Ditches
Orbiting the Bathrooms
Management by Objective
We Are So Rich, We Must Be Good
Burnout
7 INQUIRING MINDS LIKE YOURS
What was Intel thinking with that chip ID tag, which caused such a public uproar?
Was the P6 project affected by the Pentium’s floating point divider bug?
Why did Pentium have a flawed floating point divider, when its predecessor, the i486, did not?
How would you respond to the claim that the P6 is built on ideas stolen from Digital Equipment Corp.?
What did the P6 team think about Intel’s Itanium Processor Family?
Is Intel the sweatshop some people say it is?
How can I become the chief architect of a company such as Intel?
Why did you leave Intel?
And In Closing I’d Just Like To Say
Bibliography
Appendix
Out-of-Order, Superscalar Microarchitecture: A Primer
Plausibility Checking
Glossary
Index
FOREWORD
This book is simply a treat. Ever since Bob Colwell contacted me about writing the foreword for the book, I have not been able to stop reading the draft. There is something extremely seductive about this book but it took me a while to realize why that is so. Bob is the first person, to my knowledge, who has managed to tell the story of an ambitious engineering project in a way that is truly insightful and funny. With every page I turn, I run into crisp, insightful episodes that can make both engineers and executives smile. Having managed several large projects in a research environment, I find that the chapters take me back to so many stressful moments in my career and provide insights that I was not able to articulate nearly as clearly as Bob has. Bob’s insights are by no means limited to microprocessor development projects. They are fundamental to any ambitious undertaking that requires a large team of talented people to accomplish, be it movie making, a political campaign, a military operation, or, yes, an engineering project.
Please bear with me for a moment and allow me to disclose how I came to cross paths with Bob. My professional life began when I decided to work on a superscalar, out-of-order processor design called HPS for my PhD thesis at the University of California at Berkeley in 1984. I was a first-year graduate student who had just read papers on data flow machines, CDC6600, IBM 360/91, and CRAY-1. It was during a time when VLSI technology had just become capable of realizing interesting processor designs. There were several vibrant processor design projects going on in academia and industry. Many of us dreamed of changing the world with great ideas in processor design.
My office mate, Mike Shebanow, my thesis advisor, Yale Patt, and I started to kick around some ideas about using parallel processing techniques to drastically improve the performance of processors. In retrospect, we had an intuition that turned out to be correct: there is tremendous value in using parallel processing techniques to speed up traditional sequential programs. We felt that the parallel processing ideas back then were flawed in that they required programs to be developed as parallel programs. This is like requiring a driver to hand-coordinate the engine combustion, the transmission, the turning of the wheels, and the brake pads while driving down a busy street. Although there may be a small number of race car drivers who really want to explicitly control the activities under the hood, the mass market demands simplicity. Achieving massive speedup that requires explicit parallel programming efforts would not have nearly as much impact as achieving modest speedup for simple programming models. We settled on the concept of designing a dataflow machine whose sole responsibility was to execute instructions in traditional programs with as much parallelism as possible. We envisioned having a piece of hardware to translate traditional instructions into dataflow machine instructions, execute them in parallel, and expose their results in a sequential manner. All parallelism would be “under the hood.”
In the summer of 1984, Yale, Mike, and I received an invitation to go to the Eastern Research Lab of Digital Equipment Corporation to try to apply our ideas to VAX designs. We started at the conceptualization phase described by Bob. After a few sessions, we realized that we needed some way to quantify the amount of opportunity we could expect from real code. I ended up developing the equivalent of the data flow analyzer tool Bob describes in “The Right Tool” section in Chapter 2. Joel Emer and Doug Clark graciously allowed us to access the execution traces they had acquired from real VAX-11/780 workloads. I studied the 780 microcode so that I could write a decoder program that translated the VAX instructions into simple microoperations. The dataflow analysis tool was then used to analyze the amount of intrinsic parallelism available among these operations. From our experience with the dataflow analysis tool, we quickly realized that we would need to be able to overlap the execution of operations that are separated by conditional branches. We would also need to generate these operations and expose their results in a fast enough manner to keep the dataflow execution units busy. We managed to focus much of our work with the help of the analysis tool. We also experienced firsthand how much energy one can waste trying to convince one another without quantitative results.
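The kind of measurement described above can be illustrated with a minimal sketch. It is not the tool used at Digital or on the P6 project; the `MicroOp` record, the register names, and the unit latencies are assumptions made purely for illustration. The sketch assumes unlimited execution units, perfect branch prediction, and only true (read-after-write) dependences, and it reports the ratio of total work to the length of the longest dependence chain as a rough measure of intrinsic parallelism.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class MicroOp:
    """Hypothetical micro-operation record (illustrative, not a real trace format)."""
    dest: Optional[str]      # register written, or None if nothing is written
    srcs: Tuple[str, ...]    # registers read
    latency: int = 1         # assumed execution latency in cycles


def dataflow_schedule(trace: Sequence[MicroOp]) -> List[int]:
    """Return the earliest completion cycle of each micro-op.

    Only true dependences constrain the schedule: an op starts as soon as
    the most recent earlier writer of each of its sources has finished.
    """
    ready = {}   # register -> cycle at which its latest value is available
    finish = []
    for op in trace:
        start = max((ready.get(r, 0) for r in op.srcs), default=0)
        done = start + op.latency
        if op.dest is not None:
            ready[op.dest] = done
        finish.append(done)
    return finish


def intrinsic_parallelism(trace: Sequence[MicroOp]) -> float:
    """Total work divided by the critical-path length (0.0 for an empty trace)."""
    finish = dataflow_schedule(trace)
    critical_path = max(finish, default=0)
    total_work = sum(op.latency for op in trace)
    return total_work / critical_path if critical_path else 0.0


# Made-up four-op trace: three independent ops and one that needs two of them.
trace = [
    MicroOp("r1", ("r0",)),
    MicroOp("r2", ("r0",)),
    MicroOp("r3", ("r1", "r2")),
    MicroOp("r4", ("r0",)),
]
print(intrinsic_parallelism(trace))  # prints 2.0
```

In this made-up trace the four operations complete in two cycles, so the sketch reports an average of two operations per cycle. Adding limits on execution units or modeling branches would lower that figure.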
Now we are getting to the point of my long-winded story. Between 1984 and 1987, I worked passionately with Yale and my fellow students Mike Shebanow and Steve Melvin on a first design of HPS. I personally experienced the conceptual phase and the refinement phase as part of a four-person team. There were many questions that we tried to address with our limited ability to formulate a potential design and acquire simulation data. In retrospect, we left many issues not addressed; our project generated many more questions than we answered. It was a heroic pioneer-style exploration of designing a speculative out-of-order-execution processor. We presented the core ideas and discussed the design challenges in a pair of 1985 papers (see references [1] and [2] in the Bibliography). We showed a high-level schematic design of a conceptual out-of-order processor in which we described a vision for the branch prediction, speculative execution, instruction scheduling, and memory ordering mechanisms. Variations of these mechanisms have indeed appeared in the industry designs, including Intel’s P6. I used to feel that we deserved all of the intellectual credit for the out-of-order execution processors in the marketplace today. As I grew more experienced, I began to see that what we had actually still required a lot of intellectual work to finish. Bob’s chapters give great explanations of these intellectual efforts. I have never seen a book that gives such an insightful, comprehensive treatment of the intellectual challenges involved in managing a large-scale project to success.
When I presented my thesis work at major universities in 1987, I received numerous skeptical comments from many prominent computer architecture researchers. Many of them were working on much simpler design styles referred to as RISC [3]. The message that I consistently received was that the HPS style processor design was much too complicated. No one in his or her right mind would build a real machine based on the concept. Even if someone did, the complexity would result in such a slow clock that the processor would not be competitive. Yale and I argued forcefully about the feasibility of the model but had no hard evidence. While Bob and his team created a product that is a great business success, they also created hard evidence that our original vision was on the right track. It takes an extraordinary team to go against so many prominent naysayers to achieve a vision that is considered pie in the sky. Personally, I feel forever indebted to these great individuals for redemption of personal value.
In 1987, I decided to join the faculty of the University of Illinois, Urbana-Champaign. During the second semester of my teaching career, I taught an advanced graduate course in computer design. Naturally, much of the material I covered came from my thesis work, which I taught with great conviction. There was a student in my class named Andy Glew, whom I noticed from day one. At every lecture Andy would ask me extremely probing questions. Sometimes, I had to politely tell him to stop so that I could move on to the next topic. He was especially intrigued by the concept of register renaming and what I did in my PhD thesis. The structure that performs register renaming was one of the trickiest parts of my thesis work. We named the structure Register Alias Table (RAT) in our papers, partly due to the fact that it was such a pain to design. For those of you who are familiar with the P6 microarchitecture [34], the register renaming structure assumes the same name. This should not be a surprise considering the fact that Andy was the architect who did the initial work on the structure. It is also amazing how the real technology transfer always takes place in almost random student placements.
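For readers unfamiliar with register renaming, the general idea can be sketched in a few lines. This is a generic illustration, not the HPS or P6 structure, whose details are not given here; the class name, the register names, and the fixed pool of physical registers are all assumptions. Each instruction's sources are looked up in the current mapping, and its destination is given a fresh physical register, so later writes to the same architectural register no longer conflict with earlier readers.

```python
from typing import Dict, List, Sequence, Tuple


class RegisterAliasTable:
    """Hypothetical, simplified alias table: architectural -> physical register."""

    def __init__(self, arch_regs: Sequence[str], num_phys: int) -> None:
        # Initially each architectural register maps to its own physical register.
        self.mapping: Dict[str, int] = {r: i for i, r in enumerate(arch_regs)}
        # Remaining physical registers are free. This sketch never reclaims
        # them, which a real design has to do.
        self.free: List[int] = list(range(len(arch_regs), num_phys))

    def rename(self, dest: str, srcs: Sequence[str]) -> Tuple[int, Tuple[int, ...]]:
        """Rename one instruction; returns (physical dest, physical sources)."""
        # Sources are read BEFORE the destination is remapped, so an
        # instruction such as r1 = r1 + 1 reads the old r1.
        phys_srcs = tuple(self.mapping[s] for s in srcs)
        if not self.free:
            raise RuntimeError("no free physical registers")
        phys_dest = self.free.pop(0)
        self.mapping[dest] = phys_dest
        return phys_dest, phys_srcs


rat = RegisterAliasTable(["r0", "r1"], num_phys=6)
print(rat.rename("r1", ("r0",)))   # prints (2, (0,))
print(rat.rename("r1", ("r1",)))   # prints (3, (2,))
```

The second instruction both reads and writes `r1`, yet after renaming it reads physical register 2 and writes physical register 3, so the two writes to `r1` no longer collide.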
Please do not let me get away with convincing you that this is a book just for computer designers. The real soul of the book is about how to manage intellects in the real world to accomplish incredible goals. In an honest manner, Bob gives the ultimate principles of great project management: acquire the best minds, organize them so that each of them operates within their comfort zone, protect them from the forces of mediocrity, help them to take calculated risks, and motivate them to tie up the loose ends. Mediocrity is the equivalent of gravity in the world of creative projects. Whether you are in the business of making movies, running a presidential campaign, developing a new drug, shooting a spacecraft to Mars, or starting a new chain store, you will find the insights here to be of ultimate value. Buckle up and enjoy the ride.
February 2005
PREFACE
Microprocessor design, on the massive scale on which Intel does it, is like an epic ocean voyage of discovery. The captain and his officers select the crew carefully, so as to represent in depth all the requisite skills. They pick the types, sizes, and numbers of ships. They stock the ships with whatever provisions they think they will need, and then they launch into the unknown. With a seasoned crew, good weather, and a fair amount of luck, most of the voyage will go as expected. But luck is capricious, and the crew must inevitably weather the storms, when everything goes awry at once and they must use all their experience to keep the ships afloat.