The Pentium Chronicles: The People, Passion, and Politics Behind Intel's Landmark Chips (Practitioners)

by Robert P. Colwell

Do nothing while the meeting degenerates into an emotional minefield. The “mines” can be either a reviewee who believes the design is his baby and glowers at anyone who dares criticize it, or reviewers who see the review as an opportunity to inflict a little professional payback or humiliation. Either attitude can permanently damage relationships and will prevent any forward progress on improving the design. To defuse the mines, project management must make it clear that design reviews are important, and, if necessary, personally attend the review to keep emotions in check.

Do not worry if people become sensitive or zealous about the review process. There is a fine line between reviewing work and sticking your nose into other people’s business. A mature, professional team dedicated to designing the world’s best products will find this line and observe it scrupulously. Project managers must monitor the process and make sure no designers are being driven quietly crazy by anyone else, under the auspices of the design review culture. Oddly enough, in my experience both designers who won’t let anyone see their work and reviewers who are overly assiduous about checking others’ work are likely to be doing substandard work. Either behavior should be a warning flag to project leaders.

As a reviewee, feel free to ignore certain issues the review raises. Reviewees cannot be allowed to judge the validity of an issue. If it was important enough for a reviewer to bring up, it is important enough to be properly disposed of. Simply ignoring it is not an option. Neither is proof by assertion or simply glowering at the reviewer.

When to Do a Review. Some companies or design groups have a long list of reasons to avoid design reviews. “There’s not enough time in the schedule” tops the list and indicates a far deeper problem with project management. If a project does not have enough time to check that they are getting the product right, it will not have enough time to fix the product later (when they discover that they got the product wrong). Design reviews, properly done, facilitate vital communication among designers and between the design team and project leadership. If the project cannot make time for a review, it does not have enough time to do the product properly. And a project leader who does not see that is probably not doing his job properly either.

Design reviews do not have to be done every day, or every time the design changes. A good rule of thumb is to review every important subsystem in the design at least once when the design has matured enough to have worked out the essential elements but is still flexible enough to allow changes. And at least one joint design review should be held between the product team and any external interfaces required for that product’s success. For microprocessors, this means at least one joint design review between the CPU and the chip-set designers.

ANOTHER ONE RIDES THE BUS

Sometimes we CPU architects tend to take for granted any system elements outside the chip’s pins. We need to know bandwidths, latencies, and the capacities of the main memory and the chip’s frontside bus, but beyond those we are content. For P6, we were able to leave that external world in the extremely competent hands of our chief bus designer, Gurbir Singh, and his talented first lieutenant Nitin Sarangdhar. Gurbir had previously designed the bus for Intel’s i960 CPU, and the proposal he and the team came up with for P6’s bus seemed eminently reasonable, implementable, and competitive.

Some voices in the company wanted us to reuse the Pentium’s bus. They argued that changing the bus design had many unfortunate side effects, such as forcing the industry to change its tooling, routing, and motherboard placement rules for the chips. Chips that were compatible with the last generation’s sockets could be sold as system upgrades and had a previously established and tested set of chip-set components.

This was all true, but the trouble was that the Pentium (P5) bus simply was not up to the system performance levels we needed with the P6 design. The Pentium processor and all its predecessors had been in-order designs. Because instructions could not be reordered, if a load instruction missed in the cache and had to retrieve the missing information from memory across the bus, that load and all other instructions in the CPU had to wait until that access was satisfied. In the Pentium’s day, main memory was not very far away; it took only a few clock cycles to retrieve the needed cache line. That was a noticeable but not crippling loss of performance.

P6’s clock rate was much higher, and if the product road map were to unfold over the next several years the way we had envisioned, it would go much higher still. Main memory comprised dynamic RAMs (DRAMs), which were improving with time, but mostly in capacity, not in speed. This meant that the disparity between fast CPUs and relatively slow main memory was bad and about to get much worse. Making all instructions wait just because a particular load instruction missed the cache was likely to become onerous, especially for an intrinsically out-of-order engine that otherwise had all the capabilities required to avoid that cost.
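The arithmetic behind that widening gap can be sketched in a few lines. This is a rough illustration only: the latency and clock figures are hypothetical round numbers, not P5 or P6 specifications, and the function names are invented for the example. It assumes misses are independent and overlap perfectly whenever the core allows more than one to be outstanding.

```python
import math


def miss_penalty_cycles(dram_latency_ns: float, clock_mhz: float) -> int:
    """Core clock cycles that elapse while one memory access is outstanding.

    A fixed DRAM latency costs more cycles as the clock rate rises.
    """
    return math.ceil(dram_latency_ns * clock_mhz / 1000.0)


def stall_cycles(num_misses: int, penalty: int, max_outstanding: int = 1) -> int:
    """Cycles spent waiting on cache misses (idealized model).

    max_outstanding=1 models a core that blocks on every miss, so the
    penalties add up serially. Larger values model a core that can keep
    several independent misses in flight at once.
    """
    batches = math.ceil(num_misses / max_outstanding)
    return batches * penalty


if __name__ == "__main__":
    # Hypothetical numbers: the same 100 ns memory at two clock rates.
    for mhz in (66, 200):
        penalty = miss_penalty_cycles(100, mhz)
        blocking = stall_cycles(8, penalty)
        overlapped = stall_cycles(8, penalty, max_outstanding=4)
        print(mhz, "MHz:", penalty, "cycles per miss,",
              blocking, "blocking vs", overlapped, "overlapped")
```

The point of the sketch is only the trend: memory that does not get faster looks steadily slower from the CPU's side, and a design that can overlap misses pays that cost fewer times.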

To address this issue, Gurbir designed the P6 frontside bus to be transaction-oriented. Chips that connected to the bus were known as bus agents, and any bus communication would occur, by definition, between a bus agent pair, according to strict protocols and using the bus wires as little as possible. A CPU performing a cache line refill (as per the example I gave earlier) would arbitrate for bus ownership and send a message of the form “I request that whoever has data corresponding to system physical address A please send that data to me as soon as possible.” All bus agents would hear that request, and normally the Northbridge chip, responsible for controlling main memory, would respond. The request for the cache line was one bus transaction; at some later time, after memory had had enough time to look up the information, another transaction would be initiated (this time by the Northbridge) and the information would be presented on the bus for the initial requestor’s edification.

This transaction orientation is a standard feature of most modern microprocessor buses, despite its complexity and the implication that all bus agents must continuously monitor the bus and track the overall state, even for transactions that do not have direct relevance to them. But it is a good tradeoff between the expensive (wires, motherboard routing, and CPU package pins) and inexpensive (transistors on the CPUs).
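A toy model may make the request-then-response split easier to picture. Everything in it is an assumption made for illustration: the class names (`Transaction`, `Bus`, `Cpu`, `MemoryController`), the FIFO queue standing in for arbitration, and the three-slot memory latency. It is not the actual P6 bus protocol, only a sketch of the idea that a request and its data travel as separate transactions that every agent observes.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    kind: str                      # "request" or "response"
    sender: str
    address: int
    target: Optional[str] = None   # requester a response is meant for
    data: Optional[str] = None


class Agent:
    def __init__(self, name):
        self.name = name
        self.bus = None

    def tick(self):                # called once per bus slot
        pass

    def observe(self, txn):        # every agent sees every transaction
        pass


class Bus:
    def __init__(self):
        self.agents = []
        self.queue = deque()       # FIFO stands in for real arbitration
        self.log = []

    def attach(self, agent):
        self.agents.append(agent)
        agent.bus = self

    def post(self, txn):
        self.queue.append(txn)

    def step(self):
        """One bus slot: at most one transaction, broadcast to all agents."""
        for agent in self.agents:
            agent.tick()
        if self.queue:
            txn = self.queue.popleft()
            self.log.append((txn.kind, txn.sender, txn.address))
            for agent in self.agents:
                agent.observe(txn)


class Cpu(Agent):
    def __init__(self, name):
        super().__init__(name)
        self.lines = {}            # cache lines received so far

    def load(self, address):
        self.bus.post(Transaction("request", self.name, address))

    def observe(self, txn):
        if txn.kind == "response" and txn.target == self.name:
            self.lines[txn.address] = txn.data


class MemoryController(Agent):
    """Stand-in for the chip that controls main memory."""

    def __init__(self, name, contents, latency):
        super().__init__(name)
        self.contents = contents
        self.latency = latency     # bus slots needed to look up a line
        self.pending = []          # [slots_left, requester, address]

    def observe(self, txn):
        if txn.kind == "request" and txn.address in self.contents:
            self.pending.append([self.latency, txn.sender, txn.address])

    def tick(self):
        waiting = []
        for entry in self.pending:
            entry[0] -= 1
            if entry[0] <= 0:
                _, requester, address = entry
                self.bus.post(Transaction("response", self.name, address,
                                          target=requester,
                                          data=self.contents[address]))
            else:
                waiting.append(entry)
        self.pending = waiting


def run_demo():
    bus = Bus()
    cpu0, cpu1 = Cpu("cpu0"), Cpu("cpu1")
    mem = MemoryController("mem", {0x100: "line A", 0x200: "line B"}, latency=3)
    for agent in (cpu0, cpu1, mem):
        bus.attach(agent)
    cpu0.load(0x100)
    cpu1.load(0x200)
    for _ in range(5):
        bus.step()
    return bus, cpu0, cpu1


if __name__ == "__main__":
    for entry in run_demo()[0].log:
        print(entry)
```

In this sketch the second CPU's request goes out while memory is still looking up the first line, because the bus is not held between a request and its response. That is the property the transaction-oriented design is after.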

As a newly designated project manager on P6, my first mistake was in thinking that technical reasoning would be a satisfactory way to resolve any issues associated with the bus. After all, that is how we would attack any open issues inside the CPU’s design. But as it turned out, the rules were (and still are) different for buses because just as multiple bus agents share buses in a computer system, so multiple corporate entities share the bus design, and each has a vital interest in it.

Worse, the corporate group most responsible for the design of the Pentium bus, and the i486 bus before it, had not completely bought into the idea that P6 needed its own bus. They were just beginning to realize that a group of CPU designers in Oregon had had the audacity to go off and design an x86 CPU bus without their blessing, and were demonstrably unwilling to accept that fait accompli.

All I knew at the time was that contradictory, vociferous e-mails were raising the average temperature of my inbox to dangerous levels. To resolve the problem, I started by personally interviewing all the principals. Gurbir and the P6 bus implementers told me that they had already implemented a complete bus design in RTL and done substantial testing on it. Any changes to the basic design would seriously impact the project’s schedule and they did not think the proposed changes were worth making given that cost.

On the other side was Pete MacWilliams, an Intel Fellow who had been instrumental in all prior x86 buses, but who had found himself immersed in Pentium issues until only very recently. One of Pete’s cohorts, Steve Pawlowski, was also helping us by reviewing the bus specification from the chip-set point of view and offering ideas and criticisms. Pete asked us to consider some ideas on how to improve the P6 bus, and after looking at them, I agreed that most were in fact improvements. Had it been even a year earlier in the project, we would have adopted them without further ado. But schedule pressure was so high at this point in the project that this decision was no longer an easy one.

I decided to bring all the combatants into one room for a day to see if reasoned argument could let one side persuade the other. (Yes, I was that naive.)

The meeting opened with my standing at one end of a long table. On one side sat six P6 bus engineers and on the other sat Pete and a few of his coworkers. As I looked at the faces of those present, it suddenly dawned on me there was not going to be reasoned persuasion here. Instead, the day would end with my having to break the tie, and when I did, the “winning” half of the room would think I had simply acquiesced to the obvious, whereas the other half would think I was a complete idiot. After listening carefully to the arguments, I concluded that the improvements were not worth the cost to the schedule on the first product, but could be added to some later proliferation design. I thanked everyone for their help in resolving this and asked Pete to stay involved with the P6 system design because of his extensive experience.

That was not the end of it. I don’t think Pete was terribly unhappy that his suggestions had not been accepted, but apparently his boss was. A few weeks later, I found myself in a room of general managers and VPs. Pete’s boss started the meeting by stomping around the room, shouting about how his group had always had the charter for designing computer CPU buses, that such things had to be done with an eye toward the future, not just because it was convenient for one particular chip, and how on earth could we believe a group of microprocessor designers knew anything about buses?

At first I was bewildered. Who was this guy, and why did he seem so emotionally invested in my project? Was he faking his anger just to make a point? But the other attendees seemed to be taking him seriously. Then I got annoyed at how far from reality his viewpoint seemed to be. And then I got mad. Really mad. I fired back, “We had to design a new bus for P6 precisely because the buses you and your team had produced were so thoroughly outmoded and unusable for modern designs. We have laid a solid foundation for the next 10 years of microprocessors at this company, a trick your team has not succeeded at, and we did it with no help from you. Then, after we no longer need your help, you show up and propose to damage our schedule just to assuage your egos. Why don’t you just go away and let us design world-class products?”

This particular VP seemed to enjoy being addressed in this manner about as much as I had when he was yelling.

Looking back with 12 years of perfect hindsight, the man did have a point: it was, in fact, his group’s charter to guide the CPU buses for the corporation. I now appreciate much better than I did then that such a role is absolutely necessary. The root cause of the conflict was partly our naivete as a new x86 design group (not many people cared what bus was on the i960, but many people care about all aspects of x86 designs) and partly my own lack of understanding about how Intel was organized. On the other hand, I had valid points too; we really did put a solid bus into existence, a bus that is still being employed in various forms throughout the product line, 14 years after its conception. This is a testament to the strength of Gurbir Singh’s original vision and the cleverness of many designers and architects along the way. And there was probably a nontrivial component of the legendary architect’s ego involved. Anyone who wants to tell me that we botched some aspect of a design I am proud of had better start by acknowledging that context first. Otherwise, I will assume they are trying out for the Monday Morning Quarterbacking League and my interest level will drop precipitously.

Ultimately, the decision, which was made several levels higher than mine, was to compromise: Adopt the ideas that Pete felt would be the most necessary for good performance and early bus testing and refuse the ones that would have the worst impact on schedule. It’s a good thing that Pete is such a nice guy and great technologist, because the whole episode left a very sour taste in a lot of P6 designers’ mouths, including my own.

4

THE REALIZATION PHASE

It’s always questionable to try to do something too cleverly. -Albert Einstein

The concept phase produced about three feasible avenues down which the design project could travel successfully. The refinement phase investigated those avenues and identified the most promising. Now, the realization phase had to translate this winning idea into a product.

This phase can seem sudden and more than a little scary. Yesterday, the project was an abstraction: a collection of ideas, concepts, and new terms that a few people kicked around. Today, dozens of bright, experienced design engineers are taking these ideas at face value, studying them intently and internalizing them. Hundreds of people are being organized into small groups to work on respective subunits that seemed to be a lot less defined yesterday. People are creating T-shirts with drawings and caricatures based on technical terms you conjured out of thin air only a few weeks ago. Dozens of people are using terms that you invented and you wonder if they are using them in the way you intended.

At this time in the P6 project, I remember being overwhelmed by the many people who were taking our architecture ideas seriously. It is one thing to put a brave face on your uncertainty when convincing upper management that they should invest hundreds of millions of dollars in your concepts. It is quite another when you see your peers staking their careers on the idea that you and the rest of the concept team have come up with something that has enough integrity to be implementable, is aggressive enough to beat the competition, and is flexible enough to survive the surprises ahead. Either they have bought into your vision, or they have put their trust in you; both prospects induce humility.

Those who are key players in a startup know what it means to be fully and irretrievably committed to a technical vision and how grateful you are when others make similar sacrifices. Shortly after I joined Multiflow Computer, a VLIW startup, I was copying some documents related to building a house when the VP of engineering walked in. When he realized that I was that deeply committed to making the company a success, the look on his face communicated volumes. What he knew then was exactly how precarious the corporate finances were and how badly things could go for us if the worst happened. All I knew was that no startup had a hope of succeeding unless we first burned the lifeboats. My attitude was full speed ahead and do not look back. Samuel Johnson said, “Nothing will ever be attempted, if all possible objections must be first overcome.” This kind of mindset will serve you well in the realization phase, in which a multitude of objections will arise daily.

By its nature, the realization phase has many concurrent activities, most of which center on further development of the register-transfer logic (RTL) model, an executable description of exactly how the final microprocessor chip will behave under all circumstances. The project’s refinement phase began with the behavioral RTL coding and the realization phase will convert that to the final structural RTL form from which the circuit and layout efforts will work. The finished P6 RTL model had approximately 750,000 lines of source code, clearly a nontrivial software development.

OF IMMEDIATE CONCERN

To get a feeling for what the realization phase is like, consider what would be involved in building a new high-rise [24]. The concept phase of that project has only a few participants: the buyer, the financier, the builder, and the architect. The refinement phase has more, as choices are considered and made, various possibilities weighed, and ground is broken.

But the building project really gets into high gear in the realization phase, when the blueprints have been drawn, the steel skeleton is up and the building site is crawling with hard hats. The building crew can now work at full speed, because the plans and the infrastructure are in place. There are huge cranes to lift the pieces of the upper floors into position. There are temporary service elevators to bring workers to any floor. The army of supply trucks has been organized into an intricate choreography. And while some teams work at completing the building’s core structures, other teams put the building infrastructure in place, such as plumbing, electrical wiring, elevator service, and fire suppression equipment. Yet more teams work on finishing the interior according to the blueprints.

When you stand across the street from a project in the realization phase, you get a visceral sense of the concurrency in this phase. Small wonder that during this part of the project, more work gets done per unit time than at any other phase.

In a chip development, the concept phase comes up with some overall project alternatives, and the refinement phase narrows those down to two and then selects one. It in essence sets up the project scaffolding needed to support the concurrency in the realization phase. Because of this relationship, it is crucial that the realization phase not be undertaken until the preparation work is complete. If you wait a week longer than necessary to crank up your project into the full-out execution mode of the realization phase, you will at worst have delayed your project by a week. You can often make up that week by inspired management later. But if the project is allowed to begin the realization phase before the project direction has been firmly and confidently set and before all team members have internalized it, you will pay a much higher price than a one-week slip. Subtle errors will creep into interfaces, designers will make choices that may not be obviously wrong but still are seriously suboptimal, and everyone on the team will get the wrong subliminal impression that, overall, the project is further along than it really is. This impression can itself cause errors in judgment that validation or management has to notice later on. It is a much better idea to guide the project purposely and judiciously from refinement to realization.
