Tuesday, April 18, 2006

Are you worried... -- READ AGAIN, MAJOR CHANGES


We have been discussing the Core µ-arch. here.

Let us focus on the complexity aspect and try to put it in perspective.

What has happened with Intel's attempts at complex products?

Let's set up the time frame starting with March 8th 2000, when Intel launched the Pentium III 1GHz (two days after AMD launched the Athlon 1GHz)

2000: Tried the 1.13 GHz Pentium III that wouldn't even compile the Linux kernel, great... an embarrassment which was recalled

2000: Let me introduce you to the illustrious lineage of the Netburst µ-architecture: Pentium 4s designed from the ground up to eventually run at 10GHz (they never went over 4GHz, and even that was pushing the envelope too far):

Willamette: tied to the showstopper Rambus memory (RDRAM), meaningless gigahertz, hot.

P4 Northwood: Introduced the infamous Hyperthreading, severely criticized by everyone and their dog: insecure, and actually a performance hindrance. Quoting Ibbotson's remarks in that article: "Intel had sold hyperthreading as something that gave performance gains to heavily threaded software. SQL Server is very thread-intensive, but it suffers. In fact, I've never seen performance improvement on server software with hyperthreading enabled. We recommend customers [to] disable it". Intel eventually decided to ditch the feature for good. If you are curious to know why it never worked: because it thrashes the caches.
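The cache-thrashing mechanism is easy to demonstrate with a toy model. This is my own illustration, not a model of any real Intel part: a tiny direct-mapped cache shared by two logical threads whose working sets alias onto the same lines. Interleaving the two access streams (hyperthreading-style) evicts each thread's lines before it can reuse them:

```python
# Toy direct-mapped cache (illustration only, not any specific CPU).
class DirectMappedCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # one tag per cache line
        self.misses = 0

    def access(self, addr):
        idx = addr % self.num_lines
        tag = addr // self.num_lines
        if self.tags[idx] != tag:       # line holds someone else's data
            self.misses += 1
            self.tags[idx] = tag

def run(streams, interleaved):
    cache = DirectMappedCache(64)
    if interleaved:
        # Hyperthreading-style: alternate accesses from both threads
        # through the one physical cache.
        for group in zip(*streams):
            for addr in group:
                cache.access(addr)
    else:
        # One thread at a time: each stream gets the cache to itself.
        for stream in streams:
            for addr in stream:
                cache.access(addr)
    return cache.misses

# Two working sets that alias onto the same 64 cache lines, each
# looped over ten times.
thread_a = list(range(0, 64)) * 10
thread_b = list(range(64, 128)) * 10

solo = run([thread_a, thread_b], interleaved=False)
shared = run([thread_a, thread_b], interleaved=True)
```

Run solo, each thread misses only on its first pass; interleaved, the threads keep evicting each other, so every single access misses. Real caches are set-associative and the effect is milder, but the direction is the same.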

From Wikipedia:
"Overclocking early stepping Northwood cores yielded a startling phenomenon. When VCore was increased past 1.7 V, the processor would slowly become more unstable over time, before dying and becoming totally unusable. This is believed to have been caused by the physical phenomenon known as Electromigration, where the internal pathways of the CPU become degraded over time due to excessive electron energy. This was also known as Sudden Northwood Death Syndrome."

Gallatin (Extreme Edition): The marketing answer to the Athlon 64's launch, better known as the "Emergency Edition"

Prescott: "Upon release, the Prescott turned out to generate approximately 40% more heat clock-for-clock than the Northwood, and almost every review of it was negative", Wikipedia again.

Cedar Mill: 65nm Prescott with better thermals.

Tejas: Promised for 2005, was cancelled.

Itanium: Sighs! Don't get me started! In the late '90s this was Intel's flagship product in every roadmap. When it became apparent that it was too complex and unsuitable for the consumer market, it became a high-end proposition only. It was supposed to embody Intel's plans for the 64-bit transition. When, after years (!) of delays, it finally came to market, it demonstrated:

Mediocre to very bad on-chip x86 performance; a general underwhelmer; it hasn't crossed the 100nm barrier (difficult to manufacture? out of preference at Intel?); difficult to improve.

The Itaniums are very complex µ-architectures. I studied the (then known as) IA-64 so much, and the reason I decided to forget all about it is that I finally reached the point of understanding that the µ-architecture was so complex, it was easy to see it was never going to be implemented well enough.
There is one issue I keep dwelling on: Intel's apparent inability to manufacture new processors. I am still baffled by the fact that Itaniums are not produced on at least a 90nm process, and by the slow conversion to 65nm...

Intel did try other things, such as getting into the 3D video accelerator game, well before nVidia dethroned 3dfx from the top, to colossal failure.

Dual Cores: The embarrassment of the dual dies, until Yonah. Clovertown: a twin dual-core? Are you joking? Should I be afraid of a product that intrinsically demonstrates an inability to do the right thing? [remember, a dual-die processor is, for all practical purposes, a dual processor except for the number of sockets]

EM64T: Intel's µ-architectures have two kinds of decoders: the simple decoders, and a kitchen sink for complex instructions called the complex decoder. I have explained that AMD64 defines a much larger register file that improves performance by itself. But on Intel processors, 64-bit code runs slower. There is every reason to think that the 64 bits are emulated in microcode, overwhelming the complex decoder. This is important, because it is very apparent that the 64 bits are not pervasive throughout the architecture. Intel hasn't managed to clone this AMD feature properly. It must be very difficult, because Yonah was initially promised to be 64-bit and it isn't. This led to yet another embarrassment, Sossaman, which HP decided to skip altogether. Sossaman is a server chip (is it?) tied to 32 bits (!!), with low power (both in electrical consumption and in performance capabilities).
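Why a larger register file helps "by itself" can be sketched with a toy spill count. This is my own simplification, not how any real register allocator works: AMD64 doubles the general-purpose registers from 8 to 16, so at any program point fewer of the simultaneously live values have to be kept in memory. The liveness profile below is hypothetical:

```python
# Toy spill model: values live beyond the register file must sit
# in memory (spill slots); more registers means fewer spills.
def count_spills(live_counts, num_regs):
    """Sum, over all program points, the values that do not fit
    in the register file."""
    return sum(max(0, live - num_regs) for live in live_counts)

# Hypothetical liveness profile of a hot loop: number of values
# simultaneously live at each instruction.
profile = [6, 9, 12, 14, 11, 7]

spills_ia32  = count_spills(profile, 8)   # 8 GPRs (32-bit x86)
spills_amd64 = count_spills(profile, 16)  # 16 GPRs (AMD64)
```

With 8 registers this profile forces 14 spill slots; with 16, none. Every avoided spill is a load/store pair the CPU never executes, which is why the wider register file pays off even before any other architectural change.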

Itanium was complicated on purpose to scare the competition away from it. But Intel overdid the complication and became enmeshed in it. In any case, just as a clear-thinking mind is able to conclude that Itanium is too complex to ever be implemented properly, a clear mind reaches the conclusion that it is very unlikely that Intel can truly cope with Conroe's complexity.

Let me speak about marginal benefits: each successive improvement is harder to achieve and less significant, until the benefits you can obtain are no longer worth the effort. Intel decided that the Pentium III was a dead end way back in the late '90s; that's why they took the Pentium 4 route.

But since the Pentium 4 demonstrated very early on its utter inadequacy for low power consumption, Intel gave the Pentium III one last chance to make the "mobile shift", and the Pentium M was born. When Intel eventually came to the full realization that the Pentium 4 was utterly inadequate all around, not just in power consumption, they awoke to the fact that they had nothing in store but the Pentium Pro's great-grandchild, the Pentium M, with which to keep producing something.

But since this architecture was already terminally mature by 1999, there was no other recourse but to apply the brute-force approach to it: a gigantic cache, widened datapaths, deepened buffers, a proliferation of execution units, some more transistors to turn off all those numerous execution units that are going to be idle most of the time because it is so hard to keep them well fed, and also some more dauntingly complex logic to push reordering hard, to try to squeeze out a few more drops of parallelism to feed the execution units.

This petty monster is huge, and thus also demands brute force in other aspects: having so many transistors makes it much harder to produce, and all of these complexities lead very naturally to nasty complications.

Thus, the bottom line is that Intel would have to execute perfectly a long chain of challenging activities to mass-market these very costly beasts, which in any case won't be able to sell at a premium because AMD will remain very competitive. So, in the best of scenarios, if Intel succeeds, it may have products that regain the performance crown fleetingly and from which it won't be able to profit much.

Where is the money going to come from to keep up the profligate spending on marketing, on building "Copy Exactly!" inflexible fabs, and on sustaining the Wall Street patronage of dividends and buy-backs?

There is a point when marketing can't provide the products the market requires, because those products need solid engineering to be made. That's why in this whole 7-year period, Intel has been able to come up with only two worthy products: the Pentium (III) M, and the Corpse Duo, designed by the same team that designed the Corpse µ-arch (Conroe). But come on, politics at Intel must be so intense that it had to be in the outer confines of the company, in Israel, where they found the design team that will lay out the future of the company. And these boys really, paraphrasing Sharikou, seem to be apprentices compared to the truly awesome Grand Masters assembled at AMD, the creators of DCA, HTT, AMD64, Pacifica, Presidio, innovations in the Alpha µ-arch., DDR, SOI, sSOI, APM...

The fundamental question is this: do you really believe Intel will succeed with the Core µ-architecture's complexity, contravening important design principles, given this track record of failures, while at the same time AMD stands still with a µ-architecture so well designed and with so much room to improve?

I stick with Dr. Ruiz and Mr. Meyer's management, with the Grand Masters designing products, and the German Produktionsleiter operating the factories. In all three dimensions they are going to keep succeeding; I can bear mediocrity at marketing. And I respect Mr. Richard's work, but he has a tough job working for a company so focused on production and engineering that it really doesn't indulge in marketing displays.


Eddie said...

This article was first advertised here

Anonymous said...

Are you worried about Conroe? You better be.

Probably by tomorrow Chicagrafo will pick up on Slashdot an article from The Register, reporting that the French site x86-Secret is reporting that AMD is working on technology to make two (or more) cores appear as one, due to appear in the K10 (we are still in K8, by the way).

Since Chicagrafo will probably go on a rant about the grand masters and such, without checking the reference, he will miss this wonderful quote (BabelFished):

"Conscious that K8 architecture could not compete with the next high-speed motorboat of INTEL, all its hopes are for the moment based on a new "revolutionary" technology (it is our opinion, not it his) on which AMD works in this moment for after-K8. This technology is in fact a kind of anti-HT: There or HyperThreading sought to emulate two virtual processors with a physical processor, it is a question for AMD of emulating a single virtual processor with two (or several) physical processors."

Here is the link to the full article:

The babelfishing of the article is left as an exercise for the reader...

And while you are at it, please check the performance comparisons for Core Duo:


As a second exercise for the reader, try to spot other mistakes in this particular rant (Hint: There could be other reasons for Itanium NOT to be manufactured on 65nm... and there are many more flaws in the analysis)

Anonymous said...

Intel is coming out with a similar technology, called Mitosis.

Anonymous said...

Well written and researched, but you shoot yourself in the foot by calling Core "Corpse". It detracts from the seriousness. Also, quoting Sharikou may work against you; I know big AMD fans who consider Sharikou too biased to take seriously. Try to read your own article from the mindset of someone undecided between Intel and AMD. You might in fact want to throw in some counterarguments, to make your case feel stronger.