An examination of AMD's strategy raises several tough questions that I want to explore in a multi-part series.
Today, I will address the strategy around single die quadcores.
Barcelona:
Intel's approach of packaging two dual cores together is inefficient because the two dual core chips compete with each other, and with the expansion buses, for access to memory. The chips also block the bus while they communicate with each other. In this design the bottleneck is the shared bus: every new core added diminishes the bandwidth available to the other cores.
From the customer's point of view, going from dual cores to quadcores doubles the power consumption without giving anything near twice the performance, but it may still be cost-effective because most software licenses are charged by the number of processors, not the number of cores. For about the same cost in software licenses, the quadcores provide substantially more performance.
From the production point of view, there is an additional cost for the MCM (Multi Chip Module, a package with several chips), but there is no problem regarding yields or raw speeds because the dual cores put together are basically normal dual cores.
In AMD's DCA (Direct Connect Architecture), adding processors increases overall bandwidth because every processor drives its own set of memory banks, and each processor has "Cache Coherent Hypertransport" links to the other processors that allow them to share memory. This is why AMD had many choices for the design of its quadcores. It could have used a multi chip master/slave configuration in which only the "master" chip is actually connected to memory or to other processors, but it chose the "clean" option of single die quadcores.
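The contrast between the two memory designs can be sketched with a toy model. This is only an idealized illustration: the bandwidth figure is a made-up round number, the function names are hypothetical, and the model ignores contention, cache coherency traffic and the extra latency of reaching memory attached to another processor.

```python
# Toy model of peak memory bandwidth. All figures are hypothetical.

def shared_bus_per_core(bus_gb_s: float, cores: int) -> float:
    """Shared bus: every core splits one fixed pipe to memory."""
    return bus_gb_s / cores

def dca_total(per_processor_gb_s: float, processors: int) -> float:
    """Direct Connect: each processor brings its own memory banks."""
    return per_processor_gb_s * processors

HYPOTHETICAL_GB_S = 10.0  # made-up round number, not a measured figure

for n in (1, 2, 4):
    share = shared_bus_per_core(HYPOTHETICAL_GB_S, n)
    print(f"shared bus, {n} cores: {share:.1f} GB/s per core")

for n in (1, 2, 4):
    total = dca_total(HYPOTHETICAL_GB_S, n)
    print(f"DCA, {n} processors: {total:.1f} GB/s in total")
```

In the first loop the share per core shrinks as cores are added; in the second the total grows with every processor, which is the qualitative point of the paragraph above.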
Like most other AMD fans, I was very enthusiastic about AMD's designs because they are much superior to Intel's, but succeeding with a single die quadcore design means succeeding at three tough challenges:
- Contrary to the MCM approach, single die quadcores imply a yield and binsplit hit, a potentially devastating blow that can erase any performance or production cost advantage. For a quadcore to work at 3 GHz, all four individual cores must obviously be able to run at 3 GHz. Take a hypothetical example: say that current production techniques give 20% of dual cores good enough to run faster than 2.8 GHz; then, if the process behaves just as well with the quadcore design, merely 20% × 20% = 4% of the quadcores will be capable of running at that speed. I illustrated this argument of exponential worsening of yields in "65nm is just Intel Marketing", and will treat the subject in greater detail later on. For the time being, it is clear that this issue is a pervasive problem for single die quadcores.
- AMD deemed it impractical to do quadcores with the 90nm process, perhaps rightfully so: at 90nm there wasn't a transistor budget per die big enough for quadcores. Then, for the reason expressed above, not just the success or failure but the life and death of the single die quadcores hinged on an extraordinarily good 65nm process.
- Furthermore, integrating four cores may require a core redesign to include features like the shared L3 cache or more Hypertransport links.
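The binsplit arithmetic in the first challenge can be written out as a short sketch. It assumes, as the hypothetical example does, that every core reaches the target speed independently and with the same probability; real cores on one die are likely correlated, so treat this as an illustration of the argument and not as a yield model. The function names are made up for this example.

```python
import math

def per_core_fraction(dual_core_fraction: float) -> float:
    """Per-core probability implied by the fraction of good dual cores
    (assumes the two cores pass or fail independently)."""
    return math.sqrt(dual_core_fraction)

def die_fraction(per_core: float, cores: int) -> float:
    """Fraction of dice on which every one of `cores` cores reaches the speed."""
    return per_core ** cores

p = per_core_fraction(0.20)  # 20% of dual cores run faster than 2.8 GHz
print(f"single core:     {p:.1%}")                   # 44.7%
print(f"dual core die:   {die_fraction(p, 2):.1%}")  # 20.0%
print(f"quadcore die:    {die_fraction(p, 4):.1%}")  # 4.0%
```

With the MCM approach, by contrast, any two dual core dice that pass at the target speed can be packaged together, so the usable fraction of dice stays at the dual core figure.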
1.- There are objective reasons to think that APM is a far superior method of production (at least compared to Intel's "Copy Exactly!"), such as Sematech's award for the highest performing fab.
2.- Fab30 was producing absurd percentages beyond official capacity, meaning that everything production-related was very good at AMD.
3.- The partnership with IBM for the development of 65nm and 45nm inspired confidence about the 65nm transition goals.
4.- When the fastest dual core was the FX-60 at 2.6 GHz and the fastest single core the FX-57 at 2.8 GHz, the overclockability of 90nm Athlons made me think that the production process itself was centered above 2.5 GHz with a very tight variance (given the fantastic Fab30 overproduction, the overclockability and the top speeds).
Taken together, these things led me to believe that all was well regarding the production capabilities and schedules for the 65nm process and the yield and binsplit hits of quadcores.
And the new core design didn't seem a problem at all: we are speaking here of the "Grand Masters" who brought DCA, AMD64, AMD-V and the emphasis on instructions per clock rather than clock speed. The K8 design is already very old, so the designers should have had more than enough time to prepare a new one.
Does AMD have any chance of succeeding with this strategic choice?
Of these three items, the only one we can be sure of is that AMD is producing 65nm products, under the family name "Brisbane". But regarding the 65nm process that successful single die quadcores require, several red flags arise:
Despite the inherent advantages of 65nm and the acute urgency to have competitive products, all AMD has shown with Brisbane is:
- Mysterious and "bullshit-explained" extra L2 latencies, that is, slower processors clock for clock
- Unimpressive shrink of die area
- Slower processors, up to 2.7 GHz when the company desperately needs something over 3.0 GHz and while some 90nm dual core products (the FX-74 for the Quad FX platform) are at 3.0 GHz
- Small quantities, especially if we believe the company's statements that the crossover point in the transition to 65nm manufacturing and several other milestones have been reached.
In this first part, we have seen how AMD's management chose a strategy for its flagship products that required success at three major challenges. Today we don't have enough information to decide whether this was lunacy, overconfidence, or a calculated risk that went wrong; but since a failure with Barcelona ripples through AMD's prospects in many ways, other strategic decisions must be seen in the context of the Barcelona choices, which will be explored in successive parts.