Chicagrafo: December 2007

Monday, December 17, 2007

This blog is first in Google search!

I just noticed that Google is giving high search rankings to posts in this blog.

I am particularly proud of one search query for which this blog shows up first:

w-h-y
a-m-d
d-o-i-n-g
s-o
b-a-d

I put in the hyphens to obfuscate the words: I don't want Google to point to this article if you enter that search query.

The link points to "Does AMD know what it's doing?" written in June, a good link indeed.

The celebration of the second anniversary is coming, and I am happy that things are "clicking" for the blog: the market has vindicated the author, I am finding good material to work with, the quantity has increased, and the articles have gained much more acceptance. The hike in Google ranking is also a reason to celebrate for you, the reader: it means that a truly independent medium for expressing opinions, not conditioned, mediated or controlled by Wall Street, remains open for your passive participation (or, if you are commenting, active!) while it rises in significance.

I hope to live up to the heightened responsibility.

Note: I found out about this particular search query because I use a service that keeps tabs on the queries of visitors who arrive from a search.

Thursday, December 13, 2007

Analysts Day

Today AMD will be holding its annual Financial Analysts Meeting. To put the meeting into perspective, I want to review what AMD's management has been forecasting against what has actually happened. I can understand AMD's optimism a year ago; after all, the period of great successes had barely finished. You can check the review of 2006 in Bob Rivet's presentation to get the feeling. The problem is that they applied the model of the great successes to a period in which it didn't apply. As I explained in "important about K10", AMD is in "full lying" mode: it cannot tell the truth because the truth is just too horrible, so it is important to be able to detect the lies in today's meeting. To help in that regard, below I summarize other statements from the company.

Fortunately, Roborat64 already summarized the last meeting for us in an article about this subject [ formatting changed ]:

[H]ere is what AMD projected for 2007:
K10 quad-core ramp: 2H’07; actual result: pushed out possible mid Q1'08
Barcelona performance: 40% better; actual result: ~40% worse (non-compliant SPEC benchmarks)
CAPEX: $2.5B; actual result: 2007 estimate will be at $1.7B (Fab38 delayed)
Revenue (long term target): ~$7.6B; actual result: $6.02B (average analyst estimates)
Gross Margins: 50+/-2%; actual result: 35% (last 3 qtrs)
2007 growth: 10% above industry (16%); actual result: -455% [ sic ]
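
The size of each miss in the list above can be put on a common scale. The following is only a rough sketch of mine, not part of Roborat64's article; it uses just the figures quoted above and leaves out the items that are not plain numbers (the ramp date, the performance claim, and the growth figure marked [ sic ]).

```python
# Projected vs. actual figures as quoted in the list above (USD billions).
projections = {
    "CAPEX ($B)": (2.5, 1.7),
    "Revenue ($B)": (7.6, 6.02),
}


def relative_miss(projected, actual):
    """Return the shortfall as a fraction of the projection (negative = miss)."""
    return (actual - projected) / projected


misses = {name: relative_miss(p, a) for name, (p, a) in projections.items()}

# Gross margin is already a percentage, so its miss is in percentage points.
# The midpoint of the "50+/-2%" projection is used here.
gross_margin_miss_points = 35 - 50

for name, miss in misses.items():
    print(f"{name}: {miss:+.1%}")
print(f"Gross margin: {gross_margin_miss_points:+d} points")
```

On these numbers CAPEX came in about a third below projection, revenue about a fifth below, and gross margin fifteen points below the midpoint of the projected range.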

I wish to mention a few more things:

In Mr. Seyer's presentation, slide 16/52, the quadcore was projected to have 40% superior performance, accelerated virtualization, and 60% improved power efficiency. Well, the workloads that the TLB bug affects most are those related to virtualization, which can be as much as 50% slower. Regarding the power efficiency, Jason Mick @ Daily Tech wrote a blog post demonstrating the lies about K10 power consumption [ thanks to the Intel vs AMD blog for the link ]:

To put [the datum that K10 consumes 137 Watts] in perspective, a 3.16 GHz Xeon X5460 from Intel squeaks in at a still weighty 120 W. While AMD failed to disclose in the white paper on what frequencies its selected processors operate, it is almost surely 3.0 GHz or lower, as 3.0 GHz is the highest speed K10 processor currently demonstrated. The best case scenario is that a 166 MHz slower AMD processor consumes 17 more watts [my emphasis]
[...]
However, if the samples tested were lower than 3.0 GHz, obviously the picture becomes far worse. And since AMD's 2008 roadmap states that its 2.4 GHz processors are rated at 125 Watts TDP, this is almost certainly the case. Architecture and design advantages aside, K10 is a chip that is almost a gigahertz slower but with a significantly higher power consumption rating.

So much for the often repeated superior power efficiency of the "native" quadcore...
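
The comparison in the quote can be worked through in a few lines. This is a minimal sketch using only the figures quoted above; "watts per GHz" is my own crude proxy (it ignores instructions per clock and the difference between rated and measured power), not a metric used by Daily Tech or AMD.

```python
# Figures taken from the Daily Tech quote above.
XEON_GHZ, XEON_WATTS = 3.16, 120.0        # Intel Xeon X5460
K10_BEST_GHZ, K10_WATTS = 3.0, 137.0      # best case assumed in the quote
K10_ROADMAP_GHZ, K10_ROADMAP_TDP = 2.4, 125.0  # 2008 roadmap part cited in the quote


def watts_per_ghz(watts, ghz):
    """Crude efficiency proxy: rated watts divided by clock speed."""
    return watts / ghz


extra_watts = K10_WATTS - XEON_WATTS
xeon_wpg = watts_per_ghz(XEON_WATTS, XEON_GHZ)
k10_best_wpg = watts_per_ghz(K10_WATTS, K10_BEST_GHZ)
k10_roadmap_wpg = watts_per_ghz(K10_ROADMAP_TDP, K10_ROADMAP_GHZ)

print(f"Extra watts in the best case: {extra_watts:.0f} W")
print(f"Xeon X5460:       {xeon_wpg:.1f} W/GHz")
print(f"K10 best case:    {k10_best_wpg:.1f} W/GHz")
print(f"K10 2.4 GHz part: {k10_roadmap_wpg:.1f} W/GHz")
```

Even in the best case, this crude measure puts K10 at roughly 20% more watts per GHz than the Xeon, and the 2.4 GHz roadmap part looks worse still.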

We cannot forget that last year the company was speaking of fabulous forecasts on December 14, and less than a month afterward it had to issue a miss warning because the fourth quarter was much worse than anticipated.

In Q4 2006 CC (transcript), Mr Meyer's presentation said things like "We need to improve our financial performance relative to what we delivered in Q4. We will do so by delivering improved products, lowering our manufacturing costs, increasing our operating efficiencies across all disciplines, and continuing to grow share". Interestingly enough, the employee count hasn't gone down, which suggests that perhaps there weren't synergies between AMD and ATI, at least from the Human Resources perspective. So much for all Mr. Meyer said. Dr. Ruiz: "I am incredibly optimistic and excited by the future of this company, more than I have been in the seven years that I have been with this company". Very well, let me know the next time you feel optimistic; I will gamble against your optimism. Mr. Rivet reiterated the Analysts Day guidance when it was already very clear that the guidance was pure fiction. Also, Mr. Meyer felt "very bullish" that as soon as the "native quadcore" was introduced, it would recapture the performance lead...

I wrote "Catastrophe" regarding the tragic comedy of Q1 2007 CC (transcript), so, I will not reiterate it here.

Then came Q2 (transcript). Meyer: "Our Fab 36 conversion to 65-nanometers is complete, with yields exceeding expectations and we now turn all our attention to 45-nanometer" [!!], "We are on a path to bring our gross margins and operating expenses back into a reasonable balance and improve our cash flow". Then, in the Q&A: Meyer: "First of all, we’re very happy with our 65-nanometer yields across all products, including Barcelona, so no issue there. The fact of the matter, Barcelona, while being an absolutely great product, is complicated and it’s taking a little bit more design work than we anticipated getting the final rim in place", Dr. Ruiz: "as Dirk mentioned, on 65-nanometer have been phenomenal, been outstanding". Rivet: "We would really like Q4 to be break-even, or to be specific, not just the month of December but definitely Q4". Then, there is one of the all-time greatest pieces of bullshit, by Henri Richard, that I hope someday to write an entire article about:

I don’t know of any IT manager that ever asked what was the nanometer in this processor and I don’t know of any student walking into a store and really wondering what the die size of a processor, let alone in some cases what’s the frequency? What they do is they look at more and more what is this machine going to do for me? How does this look? Is it a fashion statement? Is it responding to my needs?

In Q3 (transcript): Chris Danely, JP Morgan, asks: "When do you guys expect to start shipping either at 2.4GHz or 2.5GHz Barcelona?" Meyer: "The plans that we have haven't changed from what we talked about around the timeframe of the Barcelona launch, which is to ship the 2.5GHz product in the middle of this quarter", and, answering another question: "Based on the input we're getting from our customers and end users, there is a lot of demand for Barcelona, I tell you. We're just seeing people licking their chops and ready to get their hands on the product".

I am saving the bullshit of "asset light" for the end: Asset Light is empty talk because very probably AMD is forced by contractual obligations to manufacture in Dresden, otherwise the government wouldn't have pitched in. The x86 license caps the number of processors that can be outsourced. If AMD cuts its production scale, then it will suffer worsened economies of scale. AMD's inviability stems from being forced to sell a quantity of products to remain in a given scale, but since the products are mediocre, the only way the market can absorb the quantity is through very steep discounts that induce severe losses.

Since ATI imploded inside AMD, sooner or later the company had to adjust the "Goodwill". Currently, AMD's net tangible assets are negative, which in a way means that the company is worthless. This subject will be covered soon, but since today is the analysts day, I wanted to advise you to interpret today's statements in terms of the viability of the company. Today most investors and analysts are not really thinking about an eventual bankruptcy, but as I have been explaining, the crisis may be more severe than it seems at first sight, so the question of viability will arise later, and today is the day to prepare for that. I am very surprised by AMD's recent stock price crash, because in reality there is no news: all the latest stuff about the K10 bug, delays, slowness, lackluster performance, etc., merely confirms things that were already very plausible possibilities. Think about what may happen if today you do your due diligence and determine that AMD is inviable, and then, in 9 months' time, the market begins to seriously question AMD's viability. You would make a bundle!

Wednesday, December 12, 2007

The important thing about K10

I have been reading articles like "Has Intel Crushed AMD?" by Jon Fortt in Fortune's BigTech blog, the Mario Rivas interview [he is the Computing Product Group Executive Vice-President] by Damon Poeter in ChannelWeb, as well as numerous other discussions on message boards about AMD, and I grew increasingly frustrated at how people are losing sight of the truly important things about K10.

There is the perception that if AMD fixes K10's bugs and slow clocks and manages to produce it in quantity, then AMD may continue to consolidate its status as a duopoly player and remain a force in the industry that must be taken into account. I think that all this optimism, very unfortunately, is unfounded.

Let us suppose AMD had launched K10 processors at 3.0 GHz, for both servers and desktops, around June of this year, and without any bugs of importance. AMD would still be headed down, only not so fast. That is my point. Why? Because the K10 design itself proved to be a dud at so many levels that it is exhausting just to mention all of them. I think that the important thing about K10 is that it proved inferior in IPC (instructions per clock) to Intel's existing double duals. This is a fact in the context of two extremely alarming things: Intel's double duals are handicapped by the front side bus (the dies can't communicate with each other directly, and every new core or processor on the same bus diminishes the effective memory bandwidth per core) and by external memory controller delays. Still, despite the handicaps, the double duals beat any K10 quad fair and square in same-clock comparisons in the vast majority of workloads.

This will get much worse, because Intel is already enjoying the advantages of 45nm, and over the horizon loom the new advances in transistors, the high-k dielectric and the metal gates (* see note at the foot), as well as their QuickPath implementation of P2P that does away with the handicaps. I have demonstrated that there is no need to do something as good as AMD's DCA/HyperTransport, because for the vast majority of applications they were used only minimally (just to save a bit of money on the external memory controller, and to reduce the points of failure, helping speed to market), so the industry has every reason to expect a much more competitive Intel in the short, medium and longer terms.

Does K10 have room for future improvement? I was very wrong regarding Core: I honestly thought that the P6 line didn't have room for improvement, but Intel proved me wrong. I will try again to formulate predictions, though:

I don't think the cache hierarchy in K10 works. The independent L2 caches, half of the total cache space, are inefficient. The L3 level is too small compared to the L2 level to be justified (according to my simulations, for the latency steps on cache level of typical architectures, the sizes should be at least four times bigger than the previous level. These are numbers that I use in high performance optimizations where I try to adapt my software so that the "working sets" maximize the cache hierarchy performance. Also, I have said many times that the L1/L2 hierarchy of K8 behaves more like a "one and a half levels", that's why it is so size-efficient), thus, unless AMD changes this radically, I see K10 underperforming in memory-intensive applications. While the whole L3 is of dubious merit, it still occupies a significant fraction of the processor area, and consumes a significant fraction of its power... Some might say that I am saying this in hindsight, but in reality it is just that the actual performance numbers of K10 have given me confidence to go public with reservations I had from the beginning. I don't find the lackluster performance problem of K10 in any of the important advancements of this architecture (ask Ron, "Cove3" in the InvestorVillage message board for a complete list), but it has to be something, and I think the cache hierarchy may be a partial answer.
I don't think the migration towards quadcores will happen fast, not anything close to the migration from single core to dual core. A second core really adds usable computing power for normal Windows usage, as valuable as 70% of the first core, but the third core adds computing power that is hard to use, so it is only 35% as valuable. The fourth core is even less valuable. That's why I am so interested in three-cores, I can really think of ways to use a third core, but the fourth is still too far. This has to do with software engineering and the principle of combinatorial complexity. From the design perspective, the problem with these facts, is that while the single-die principle of K10 is oriented towards maximizing the efficiency of the four cores, it does so at very steep bin-split, yield and complexity penalties. Intel's existing double duals have the priorities reversed: inefficient multicore performance but with quick to market times, ease of manufacture and capable of top clock speeds. By the time this situation reverses, Intel will already be in the market with single-die designs, so, I am afraid K10 won't ever have the chance to be the adequate design for its time, at least from the perspective of multi-cores.
The problems we have seen in K10 are not accidental; I fear they are fundamental. The architecture is single-die/four-core, thus complex, and thus it takes time to develop, is error prone, is difficult to produce, and is hard to run at top clock speeds.
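
The diminishing value of extra cores described above can be put in numbers. This is a minimal sketch: the 70% and 35% figures come from the text, while the value of the fourth core is my own assumption (the text only says it is "even less valuable"), obtained by letting the halving continue.

```python
# Marginal value of each core relative to the first one.
# 1.0, 0.70 and 0.35 come from the text above.
# 0.175 for the fourth core is an ASSUMPTION: the halving is simply continued.
MARGINAL_VALUE = [1.0, 0.70, 0.35, 0.175]


def usable_power(cores):
    """Usable computing power, in 'first-core equivalents', for 1 to 4 cores."""
    if not 1 <= cores <= len(MARGINAL_VALUE):
        raise ValueError("this sketch only models 1 to 4 cores")
    return sum(MARGINAL_VALUE[:cores])


def efficiency(cores):
    """Usable power per core; 1.0 would mean perfect scaling."""
    return usable_power(cores) / cores


for n in range(1, 5):
    print(f"{n} core(s): {usable_power(n):.3f} usable, {efficiency(n):.0%} per core")
```

Under these assumptions a quadcore delivers a little over two cores' worth of usable power for normal desktop use, which is why a design that pays steep yield and complexity penalties to optimize the fourth core looks like a poor trade.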

I hope to have explained in sufficient detail why I think this is not a circumstantial crisis in AMD's processor business, but a structural crisis that will get worse.

The words of Rivas are very contradictory: he implies that the total performance of the processor really doesn't matter to the enthusiast, which is a lie by itself; and yet the architecture he sells, optimized for multicore performance, is as enthusiast-directed as it gets. He minimizes the performance penalty of the BIOS fix for the TLB bug, contradicting the Tech Report benchmarking (in an article by Cyril Kowaliski that "Chico" asked me to read in his latest comment, Tech Report pounds on Rivas for this), and of course, the interview is literally brimming with promises of improvements that I don't see how to justify. Rivas is the same AMD official who acknowledged in March that the single-die quadcore had been a mistake (Ashlee Vance @ "The Register"), and digging a little bit more, Rivas, in an interview exactly one year ago, promised a place in heaven regarding Fusion (EETimes, Junko Yoshida), when the company was still trying to justify the ATI acquisition. Read the contradictions of Rivas; they will lead you to conclude that AMD is in full lying mode, presumably because the officials cannot tell the truth, that is, the news is going to become much worse.

I never agreed with the Opteron/K10 comparison. It is true that both are monumental challenges, but that's about all their similarity. Opteron was revolutionary in ways that the industry was prepared to embrace, like the P2P connectivity and the emphasis on paving the way for single-die dual cores; and it was conservative and evolutionary on things the market wasn't willing to change: a true upgrade path to 64 bits for the x86 instruction set architecture, AMD64, while Intel was at the apex of its attempt to consolidate the Itanium ISA. K8 wasn't "marketing driven engineering"; that's why it insisted on the technically superior approach of slow clocks with highly optimized execution rather than the marketing gigahertz of idle instructions represented by Netburst. Today, K10 tries to "innovate" in what is not necessary, like the single-die quadcore, the third cache level, etc., rather than innovating in things the industry is desperate for, revolutionary coprocessors for the consumer market, for example. On the other hand, today the risks associated with the K10 challenge are not at all mitigated by Intel's insistence on an incorrect approach, as in the times of the K8 challenge; quite the contrary, the risks are heightened by Intel's practical and effective approach. Finally, at the time of the K8 challenge AMD enjoyed the momentum of the superior product design, the Athlon, while today AMD suffers the negative momentum of having the inferior design (thus calling for a more practical approach).

My advice is to be suspicious of the theory that AMD just had a bad streak of problems and mistakes. At least regarding K10, it is very clear that AMD exposed itself to great suffering, and now that the gamble has failed, the real pain is about to begin.

(*) AMD, unsurprisingly, is downplaying the silicon process race. Of course, it is so far behind already, and getting ever further behind, that it has to resort to denying the negative; but this subject is better left for another article.

Monday, December 10, 2007

Scott Wasson @ Tech Report: a mistake

We have been talking about the "Tech Report" coverage of the K10 TLB error.

Scott Wasson published an article where he explains that he made a mistake:

I wrote more than once in our coverage of the erratum that AMD had initially suggested the problem didn't affect lower clock speeds of the Phenom. Turns out that's not the case. Here is the text of my notes, verbatim:
TLB problem w/virtualization
2.4 will have the complete fix
Have to enable something in the BIOS for the 2.2 and 2.3
Can degrade perf a little bit
[...]
I think I may have read this incorrect information online somewhere

I also followed internet sources that said that AMD explained that the Phenom 2.4 GHz had a bug and that's why it was retired from the market, but that the same bug wasn't present in the slower versions. The Inquirer may have been the culprit:

This problem was found during speed-binning the B2 revision processors, and this was the cause for the Phenom FX 3.0 GHz delay. It turns out that some CPUs running at 2.4 GHz or above in some benchmarking combinations, while all four cores are running at 100% load, can cause a system freeze.
[...]
9500 (2.2 GHz) and 9600 (2.3 GHz) parts are unaffected by the errata [ my emphasis ] Some 9500/9600 parts may even be overclocked to 2.6, 2.8, 2.9, 3.0 GHz and they will have no problems whatsoever, while some will have this error.

I thought it was important to correct this at the "new-post" level rather than merely through an update to old articles or a comment. This emphasizes the importance of having the sources properly linked, so that you can trace back the origin of your assertions.

Thursday, December 06, 2007

Impact of BIOS patch of TLB Errata 298 measured

Phleanom(TM) logo
Scott Wasson, whom we have quoted before, wrote an article for TechReport whose conclusions state that the performance hit of the BIOS patch for erratum 298 is as severe as 20% on average. Even after taking the memory performance tests out of the benchmark mix, the performance hit is still more than 13%. This confirms the initial assessment of a 20% performance penalty, and that AMD once again tried to mislead the public by diminishing the importance of the bug.

The 'net abounds with reports of how the Phenoms are slower than plain old K8s in certain workloads. Since most consumer applications are very low-threaded, Phenom doesn't really have many chances to outcompete its K8 dual core brothers by throwing more cores at the workloads, but now that the BIOS patch castrates their memory performance, they look truly horrible. In some of the very long comparison tables of Wasson's article, the Phenoms are last in performance, by large margins. The BIOS patch severely affects the only competitive edge that K10 has over Intel products, so the comparison against Intel processors has turned hopeless.

Who are the suckers who are buying these Phenoms?

I wish I could leave it at that. But I can't. It turns out that at the height of the crisis, AMD officially came out to say that it is shipping the hundreds of thousands of K10 processors it guided for in the last quarterly report conference call [ Mark Hachman @ ExtremeTech reports that AMD personnel emailed statements with that information ]. On top of the desperate and unethical behavior I describe in "terrible news", I can't fathom how stupid this company may be:

We know that AMD is quite simply not selling all the K10s it was supposed to sell [ we know that they are performing "application screening" before actual shipment of Barcelonas, they never launched the expected 2.6 GHz Phenom, they had to retire the 2.4 GHz, and IBM never launched the systems to the public so it couldn't certify its benchmark ], so the statement that shipments are tracking in accordance with previous guidance must be an outright lie; but that is not the worst thing about this statement: AMD actually believes that people will interpret the information that it is selling hundreds of thousands of severely defective processors as good news...

Wasson says that since the performance of Phenom is so mediocre, its only redeeming quality may be the cheap price, so, some average consumers may be interested in it, but

I doubt whether the average sort of consumer is likely to purchase a system with a quad-core processor. One wonders where that leaves AMD and the PC makers currently shipping Phenom-based PCs. I'm not sure a recall is in order, but a discount certainly might be. And folks need to know what they're getting into when purchasing a Phenom 9500 or 9600-based computer".
[...]

[A] credible source indicated to us that at least some of the few high-volume customers who are still accepting Barcelona Opterons with the erratum are receiving "substantial" discounts for taking the chips [...] I doubt AMD would have shipped Phenom processors in this state were it not feeling intense financial pressure.

AMD's other major concern here should be for its reputation [ my emphasis ]. The company really pulled a no-no by representing Phenom performance to the press (and thus to consumers) without fully explaining the TLB erratum and its performance ramifications at the time of the product's introduction.

It is even worse: Wasson forgets something he mentioned that I already quoted. AMD also misled the public by telling early reviewers that since the external bus of Phenoms was going to be 2.0 GHz, they should set the external bus to that speed for their reviews, while in fact the external bus of the actually launched Phenoms is 1.8 GHz.

I think all of this deserves REPUDIATION.

For a slight touch of comic relief, follow this link.

gfor said of Dave Orton

gfor left a comment in "A-TItanic comments" that I want to share with all the audience:

You negative comments regarding Orton are misplaced. As the CEO of ATI, his first and foremost responsibility was to ATI shareholders, and he took excellent care of them.

1) He managed to get $5.4 billion dollars for a company that, had it not been sold, was hading for ~$2B market cap by April 07 based on dismal profits and R600 fiasco.

2) He knew full-well what a disaster AMD-ATI would be and did his shareholders enormous favor by demanding cold hard cash. The fact that the outside people most familiar with AMD's finances (ATI management team) did not want to touch their stock with a 10-pole should have set off red flags all over. "We will create a dominant company... yeah, and we don't want to be paid in it's stock".

Orton's ability to get AMD to overpay by the factor of two for his company, and pay the bulk of the sum in cash (which they could sure have used now) is nothing short of a genius. The man was looking out for ATI shareholders and he took great care of them.

According to the Wikipedia entry on David E. Orton, he "enthusiastically supported ATI acquisition by AMD and was one of the main forces behind it". The history of this business deal is well understood, so I won't write a contemporary history treatise, but I think that Dr. Ruiz and the rest of the managerial team that approved this catastrophic acquisition did it in good faith; they were scammed by Wall Street and the ATI personnel into paying twice what ATI was worth in an unnecessary acquisition. They are that naive, and they have such an inferiority complex that anyone who says "you could do X that Intel cannot" will get their attention, even if X is the most stupid thing in the world. [ This inferiority complex also manifests itself in the sickening submissive attitude towards Microsoft ].

I wrote about this and the broader subject of big mergers in "Big Merger=Bad Merger": The deals receive excellent financial media press, because Wall Street and their Investment Bank branches stand to gain hundreds of millions of dollars in commissions and other fees, so they have every incentive to use their considerable leverage on the financial media to give the big mergers good propaganda. Also, the managers of both companies get richer, a phenomenon explained in links in that article.

This is like departing tenants throwing a party: there is lots of excitement, people come and have a good time, but by the morning the owner is faced with a mountain of trash, vomit on the floors, the toilets clogged, the garden all trampled and all sorts of weird stuff. The owner is, naturally, the shareholder of the acquiring company.

Wednesday, December 05, 2007

Erratum 298

The error in K10 that has generated this flurry of controversy is called Erratum 298. I will explain what it is about below, but I first want to put this problem in what I think is its proper context.

In "terrible news" I spoke about AMD launching Phenom knowing about the existence of this bug. Because of the technical characteristics of the bug and of the Linux patch that works around it, I think that the BIOS patch can also work around the problem; therefore, this is not a problem that warrants a product recall. Nevertheless, the performance hit of patching a system through the BIOS may be very significant: AMD claims around 10% and independent testers claim around 20%, but it seems that if the Operating System can be patched too, the hit is only 1%. In practical terms, this bug and the patch are as if AMD had launched processors 10% slower.
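
To make the "as if the processors were slower" comparison concrete, the penalties can be translated into equivalent clock speeds. This is a rough sketch that assumes performance scales linearly with clock speed, which is my simplification and not something AMD or the testers claim; the clock speeds are those of the Phenom 9500 and 9600.

```python
# Performance penalties mentioned above, as fractions.
PENALTIES = {
    "BIOS patch, AMD's claim": 0.10,
    "BIOS patch, independent testers": 0.20,
    "OS-level patch": 0.01,
}

# Launched Phenom models and their clock speeds in GHz.
PHENOM_CLOCKS_GHZ = {"Phenom 9500": 2.2, "Phenom 9600": 2.3}


def equivalent_clock(ghz, penalty):
    """Clock of a bug-free part with the same performance.

    ASSUMPTION: performance is proportional to clock speed.
    """
    return ghz * (1.0 - penalty)


for model, ghz in PHENOM_CLOCKS_GHZ.items():
    for label, penalty in PENALTIES.items():
        print(f"{model} ({ghz} GHz), {label}: ~{equivalent_clock(ghz, penalty):.2f} GHz")
```

Under that assumption, a BIOS-patched 2.3 GHz part performs like a bug-free part somewhere between about 1.84 GHz and 2.07 GHz, depending on whose penalty figure is used.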

According to "Daily Tech"'s Kristopher Kubicki, AMD halted shipping of K10 pending "application screening", that is, AMD is checking whether the applications of a customer would likely trip the bug or not before shipping. It seems that the bug may only occur when the operating system needs to set the "Accessed" or "Dirty" bits of the page table entry [ I found this article for the people interested in learning about Paging, the meaning of the accessed and dirty bits is explained there ]; like I mentioned in "terrible news", some workloads like supercomputing may not trip the bug, the reason seems to be that supercomputing doesn't do very sophisticated virtual memory management, at least not as complex as virtualization, so the simultaneous conditions required to trigger data corruption or system crash may not occur.
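
For readers who have not met these bits before, here is a small model of them. In the x86 page table entry format, bit 5 is the "Accessed" flag and bit 6 is the "Dirty" flag. The sketch below only illustrates what the bits mean and when they are set during normal operation; it does not attempt to reproduce the conditions of Erratum 298, and the function names are mine.

```python
# Flag positions in an x86 page table entry (PTE).
PTE_PRESENT = 1 << 0    # the page is mapped
PTE_WRITABLE = 1 << 1   # writes are allowed
PTE_ACCESSED = 1 << 5   # set when the page is read or written
PTE_DIRTY = 1 << 6      # set when the page is written


def touch(pte, is_write):
    """Model the bookkeeping done on a successful access to a mapped page."""
    pte |= PTE_ACCESSED
    if is_write:
        pte |= PTE_DIRTY
    return pte


def describe(pte):
    """Decode the flags this sketch cares about."""
    return {
        "present": bool(pte & PTE_PRESENT),
        "writable": bool(pte & PTE_WRITABLE),
        "accessed": bool(pte & PTE_ACCESSED),
        "dirty": bool(pte & PTE_DIRTY),
    }


fresh = PTE_PRESENT | PTE_WRITABLE
after_read = touch(fresh, is_write=False)
after_write = touch(after_read, is_write=True)
print(describe(fresh))
print(describe(after_read))
print(describe(after_write))
```

The operating system reads these flags to decide which pages are in use and which must be written back to disk before being evicted, so a heavier virtual memory workload means more updates of this kind.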

This means that the flow of Barcelona processors to the market is slower than anticipated, and some customers who chose AMD because of the specific advantages of AMD processors for workloads like virtualization are not receiving any product at all. In the case of "consumers", it seems that the company will give them the choice of disabling the patch and having a buggy system, or taking the 10% to 20% performance hit.

Now that we come to the choice of disabling key functionality of the L3: Kubicki also quotes AMD saying that some tri-cores will have the L3 disabled. This makes sense, so I guess it may be interpreted as good news. Let me explain why:

Caches have been sort of a "loose cannon" in the world of µarchitectures, for instance:

The original 266MHz Celerons without any cache were so slow that the Pentium MMX 233 was noticeably faster,
then Intel overcorrected and launched the cheap and very overclocking-friendly Celeron 300A, which became famous because its half-size, full core speed L2 cache made it faster than the much more expensive Pentium IIs with double-size, off-die, half core speed L2 caches, especially when overclocked to use a 100MHz memory bus rather than 66MHz (I owned a Celeron 300A for years; it ran at 450MHz with a 100MHz bus without a hitch and outperformed Pentium II and Katmai Pentium III processors of the same speed),
the problem that killed the hyperthreading feature of top of the line Netburst processors was cache contention, despite the large sizes of Netburst caches (they were that sensitive to cache misses),
one of the reasons for the superiority of AMD's Durons (in their price/value space) was their supersized L1 caches,
and one of the great reasons why AMD's K8 could compete with Intel processors of FOUR times the total amount of L2 cache was the very efficient "exclusive" architecture of its L1/L2 caches (here exclusive means that the data in L1 is not "repeated" in L2)

so, I can understand that the L3 cache in K10 could have seemed a good idea in the design stages, but the test in real life conditions demonstrated that the extra memory latency and higher manufacturing costs weren't really compensated by how much it helped performance. Still, AMD spent lots of money, opportunity costs, time to market, and risk exposure to bugs to develop this feature in K10, which ultimately proved to be of dubious value. This highlights, once again, that AMD shouldn't have skipped the intermediate steps between K8 and the "triple challenge", or that the "business exploration" is very important.
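
The advantage of the "exclusive" arrangement mentioned in the list above can be shown with a toy calculation. This is a deliberately simplified model of mine: it assumes that in an inclusive hierarchy every line in L1 is also held in L2, and the sizes used are illustrative values, not a claim about any specific processor.

```python
def effective_capacity_kb(l1_kb, l2_kb, exclusive):
    """Amount of distinct data an L1+L2 pair can hold, in KB.

    Exclusive: nothing is duplicated, so the capacities add up.
    Inclusive (simplified): L1 contents are duplicated in L2, so only L2 counts.
    """
    if exclusive:
        return l1_kb + l2_kb
    return max(l1_kb, l2_kb)


# Illustrative sizes only (ASSUMPTION): a large L1 paired with a modest L2.
L1_KB, L2_KB = 128, 512

exclusive_kb = effective_capacity_kb(L1_KB, L2_KB, exclusive=True)
inclusive_kb = effective_capacity_kb(L1_KB, L2_KB, exclusive=False)
gain = exclusive_kb / inclusive_kb - 1.0

print(f"exclusive: {exclusive_kb} KB, inclusive: {inclusive_kb} KB, gain: {gain:.0%}")
```

The larger the L1 is relative to the L2, the more an exclusive design gains, which fits the point above about getting more out of a smaller total amount of cache.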

Another positive lesson about the Erratum 298 is how much more responsive the Open Source software is when compared to proprietary offerings. Linux already has a patch that emulates the "Accessed" and "Dirty" bits of page descriptors, so, the performance penalty gets reduced to much more numerous page fault exceptions; on the other hand, Microsoft isn't even bothering to patch around the K10 problem; it is true that the patch performs nothing short of "major surgery" in memory subsystem of the Linux kernel, but while AMD can actually make a patch for Linux, I guess that it is unthinkable for Microsoft something as radical. For the same reason, I expect the Open Source virtualization projects Xen and VirtualBox to be much more agile than, let's say, VMWare, to tend a helping hand to AMD to still allow early K10 to run virtualization without an extreme performance hit.
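
The general idea of emulating the "Accessed" and "Dirty" bits in software, and why it costs extra page faults, can be sketched as follows. This is a toy model of the technique as I understand it, not the code of the actual Linux patch: the class and its fields are hypothetical, and a real kernel would do this on page table entries inside its fault handler.

```python
class EmulatedPage:
    """Toy page whose accessed/dirty flags are maintained by software faults."""

    def __init__(self):
        self.mapped = False     # hardware view: any access faults while False
        self.writable = False   # hardware view: writes fault while False
        self.accessed = False   # software-maintained "Accessed" flag
        self.dirty = False      # software-maintained "Dirty" flag
        self.faults = 0         # number of page faults taken so far

    def read(self):
        if not self.mapped:
            self._fault(is_write=False)

    def write(self):
        if not self.mapped or not self.writable:
            self._fault(is_write=True)

    def _fault(self, is_write):
        """Software fault handler: record the access, then open up the mapping."""
        self.faults += 1
        self.accessed = True
        self.mapped = True
        if is_write:
            self.dirty = True
            self.writable = True


page = EmulatedPage()
page.read()    # first touch faults and sets "accessed"
page.read()    # no fault: already mapped
page.write()   # first write faults and sets "dirty"
page.write()   # no fault: already writable
print(page.faults, page.accessed, page.dirty)
```

Each page costs at most two extra faults until the operating system clears the flags again, so the overhead depends on how often pages are touched for the first time rather than on every memory access.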

I received an anonymous comment that pointed to "andikleen"'s comment that leads to the code in x86-64.org of the patch and the explanation. [ Thanks to whoever posted the comment, but please, leave a name, there is no need to sign in to anything, just overwrite "anonymous" with a name of your choosing and that'll do ]. Cyril Kowaliski @ TechReport also comments on the bug and the Linux patch finishing with a very important thing, the apparent contradiction that AMD says that few customers will be affected by this problem, but at the same time it strongly advises Phenom motherboard manufacturers to enable the BIOS fix that zaps at least 10% performance without giving the option to disable the fix. By now the whole world knows that the bug is severe, I honestly don't understand what is AMD trying to do by insisting on minimizing it...

According to the Kubicki article we have been talking about, AMD will continue to ship defective processors until the next stepping, B3, of both Phenom and Barcelona launches in March... although the "2.6 GHz Phenom model 9900 is not affected", so, presumably, the Phenom 9900 would be the first B3 K10.

There is more K10-related news: AMD is re-emphasizing the 65nm K8, "Brisbane" [ DailyTech ]. "Of course!" is what I say. There never actually was any need for AMD to forget about K8: the world is barely moving to the dual core wave, and AMD should have focused on improving its dual core offerings rather than on the foolish "triple challenge" adventure that led to processors that are slower than what is acceptable, hotter, and buggier too. K8, on the other hand, still has untapped potential. Unfortunately, since AMD has such a bad 65nm process, it just can't go for the 3.0 GHz and 3.2 GHz speeds; the processors currently manufactured at those speeds are all 90nm and will be discontinued.

In any case, AMD will have to ride for three more months, or longer, on the back of the architecture it has been slighting for over a year now, K8...

Tuesday, December 04, 2007

A-TItanic

This article is dedicated to Henri Richard, and to a lesser extent, to Dave Orton, because they have been the top rats that got a fortune from AMD and have abandoned the ATItanic(1) ship before facing the consequences of their decisions.

Mr. Orton was the CEO of ATI who managed to sell that nest of lice to AMD at a 20% market premium. Mr. Orton didn't allow much time for the evidently catastrophic situation AMD was in to hurt him: he resigned in July of this year.

The inertia that Mr. Orton's ATI carried over to AMD was the disappointing and power-hungry 2900 series. It is also interesting to note that although AMD acquired ATI primarily to guarantee support for its processor initiatives, once inside AMD, ATI hasn't done even trivial things like supporting Quad FX. Now that the company has announced that it is killing Quad FX, we can be sure that there will never be any ATI chipsets for it; the existing nVidia chipset will be the only one...

[ Image: Henri Richard holding a pair of Quad FX processors ]

Speaking about Quad FX, that foolish platform (2) must have been the invention of someone at AMD, very probably Mr. Henri Richard, Senior Vice President of Marketing. Since the company terminates this initiative only now that Mr. Richard is no longer at AMD, it makes you wonder whether it was his project.

There are other decisions. Most managers in the upper echelons must have pushed for the disastrous "triple challenge" (3). Someone, very probably from marketing, thought one year ago, in the last quarter in which AMD had strong demand, that it was a good idea to give preference to OEMs, and probably to Dell specifically, rather than to the historic business in the Channel, beginning the string of net losses of about $4 per share. Someone must have thought that ATI, a company with few tangible assets beyond its technical and marketing skill, was fairly valued. I don't think that production-minded people like Mr. Meyer or Dr. Ruiz would have taken the decision to acquire ATI without assessments from people in marketing. And finally, someone must have decided that launching defective products was a lesser evil than postponing the launch of any K10 altogether.

Mr. Richard departed AMD within two weeks of the very late (paper) launch of Barcelona, the first K10-based processor. What exactly happened with K10?

The company is not able to produce them even in small quantities, not even enough for IBM to certify its benchmark.
It is not able to produce processors of even mediocre speeds, nothing close to what it desperately needs, and even slower than the already reduced estimates.
The company knowingly launched defective Server and Desktop K10.
It only promises quantities or debugged products for late Q1.

But all of the above is not really important. I heard/read Dr. Ruiz on at least two occasions saying that Barcelona wasn't going to materially affect the financial numbers of the company this year, that the production this year was for "design wins"; the most important thing we now know about K10 is that it sucks, big time.

I find the timing of Mr. Richard's departure very well chosen: he left just before the shit hit the fan.

Guys, I have to thank you. Even though you fooled me several times, in the end your departure allowed me to read through your bullshit and thus you indirectly helped me to hold on to my very bearish portfolio bias on AMD with conviction. That has paid off handsomely.

(1) A fellow, "cass", posted a comment here in which he refers to AMD/ATI as ATItanic.
(2) Quad FX is a very foolish platform without coprocessors: a regular "desktop" computer finds very few uses for more than two cores; the economy/speed of unbuffered DDR2 memory at 800 MHz instead of the registered DDR2 at 667 MHz of Opterons does not compensate for the platform's premium cost nor for the lack of options; Quad FX, without consumer-oriented coprocessors, always was an expensive platform for cheap memory, an oxymoron.
(3) Triple challenge refers to developing a new architecture, K10, on an immature process and with the complexity, yield and bin split problems of single die quadcores; all at the same time, and unnecessarily.

More terrible news about Phenom and K10

[ UPDATED 12/5/12:04 CST ]

More lies come from AMD, and the performance of their products is significantly worse than in early reviews:

Giant in a comment at Roborat64's blog mentioned something important:

TechReport reports that their original Phenom benchmarks were done incorrectly and that the actual performance is worse than reported; you may find it in the penultimate paragraph:

We don't yet have a BIOS with the [ L3 Cache ] workaround to test, but we've already discovered that our Phenom review overstates the performance of the 2.3GHz Phenom. We tested at a 2.3GHz core clock with a 2.0GHz north bridge clock, because AMD told us those speeds were representative of the Phenom 9600. Our production samples of the Phenom 9500 and 9600, however, have north bridge clocks of 1.8GHz. Because the L3 cache runs at the speed of the north bridge, this clock plays a noteworthy role in overall Phenom performance. We've already confirmed lower scores in some benchmarks.

It is reasonable to assume that other sites may have made the same mistake, so the already bad Phenom reviews are actually worse...

But that is not the important thing I want to talk about. It happens that TechReport almost confirms that AMD lied about the reason why it couldn't launch Phenoms at 2.4 GHz and faster: the problem supposedly only affected parts at 2.4 GHz and over, but it turns out that it pervades all the current K10 incarnations, from Barcelona to Phenom. This is what Scott Wasson at TechReport said about this subject yesterday: "Apparently contradicting prior AMD statements on the matter, [Michael Saucier, Desktop product Marketing Manager at AMD,] flatly denied any relationship between the TLB erratum and chip clock frequencies".

Not just this: the bug first showed up in Barcelona (which eventually led to a drastic cut in the supply of the defective product, now shipped only to those whose usage patterns, such as supercomputing, are not likely to trip the bug, as opposed to virtualization workloads, which are likely to trip it), so AMD should have expected Phenom to have the same problem. But instead of postponing the launch of Phenom, the company went ahead and launched a defective series of processors.

Scott Wasson connects the dots and mentions this:

[T]he presence of the TLB erratum may explain the odd behavior of AMD's PR team during the lead-up to the Phenom launch, as I described in my recent blog post. The decision to use 2.6GHz parts and to require the press to test in a controlled environment makes more sense in this context

It turns out that the BIOS patch that prevents the problem, which also includes microcode updates, turns off functionality of the L3 cache, with an official performance impact of 10%, or 20% according to early independent reviews. Let's use the official 10%: if we simply reduce clock speeds by 10%, the products AMD launched are effectively no faster than 2.1 GHz... But this patch is not available today for the majority of 790FX platforms!
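The clock-equivalent arithmetic above can be sketched in a couple of lines. This assumes, as the paragraph does, that performance scales linearly with clock speed, which is a simplification; the function name is mine, and the 2.3 GHz figure is the fastest launched Phenom that TechReport tested.

```python
def equivalent_clock_ghz(clock_ghz, penalty):
    """Clock that would give the same performance, assuming linear scaling."""
    return clock_ghz * (1.0 - penalty)


NOMINAL_GHZ = 2.3  # the 2.3GHz Phenom from the TechReport quote

scenarios = {
    "official 10%": 0.10,
    "early independent reviews 20%": 0.20,
}

results = {
    label: equivalent_clock_ghz(NOMINAL_GHZ, penalty)
    for label, penalty in scenarios.items()
}

for label, ghz in results.items():
    print(f"{label}: {NOMINAL_GHZ} GHz behaves like about {ghz:.2f} GHz")
```

With the official figure the 2.3 GHz part lands at about 2.07 GHz, which is the "not faster than 2.1 GHz" mentioned above; with the 20% from the early reviews it would be closer to 1.84 GHz.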

There are also rumors that AMD will launch triple cores without L3 cache. This would confirm my assessment that the L3 cache provides dubious performance advantages and, as you may see, that it is another point of failure in the development of the architecture.

In summary:

K10 is buggy, in accordance with the predictions regarding the "triple challenge" of developing a new architecture on an immature process while managing the complexities of single die quadcores [ I wrote an old article about why I expected that the sheer complexity of "Core" was going to be too much for Intel, but it turned out that it was the "triple challenge" complexity that was too much for AMD, just like "Intel's 65nm is Marketing" applies much better to AMD ].
AMD lied about why it couldn't launch Phenoms at 2.4 GHz and over (2.6 GHz was promised a long time ago); this is more of the same bullshit as saying that the L2 latencies in Brisbane were higher supposedly to allow for larger caches in the future (the caches didn't increase in the 12 months after the launch of Brisbane, by the way).
AMD knowingly launched defective products.
The performance reviews of Phenoms must be revised downward significantly once the actual bugfixes are available, which may take a while!
AMD influenced reviewers to make a mistake (to set the north bridge clock to 2.0 GHz) that would show Phenom in a more positive light.
AMD tried to hide the problem at the launching of Spider.

I felt the need to update this post because I didn't speak about the implications:
Traditionally this is the most important season for businesses like AMD, but the products in the market may even be recalled, and it will take some time, on the order of months, for AMD to be able to correct the problems. We are talking of late Q1, the worst business season...

I just wrote in "A-TItanic" that Dr. Ruiz said several times that K10 will not affect the finances of the company this year, that early production was for "design wins". But if something is very evident, it is that even bug-free K10s stink, so who is going to wait more months to buy such mediocre products? What motherboard designer is going to bother with K10 features like HyperTransport 3 and such? What is AMD going to do with the K10 products it already manufactured? What about Penryn? This is all too much to ask of the venerable K8 architecture.

After all, it seems that K10 will affect the finances of AMD this year: very negatively.

Saturday, December 01, 2007

Another margin call!

The margin maintenance rules of eTrade are getting annoying: I have an investment system for AMD with which, believe it or not, every time I make lots of money, I get a margin call!

It has to do with the margin rules. In eTrade, as in many other brokerages, the shares you own may be used as collateral for the margin loan; that is, the guarantee you give of being able to pay your margin balance is the shares themselves. Options, however, cannot be used as collateral, so if for some reason you gain lots of money on options while you lose comparatively less on shares, you may lose "margin equity" and get a "margin call". This has happened to me so often that I lost track of the number of times.

My investment system is not very easy to describe, so be prepared to read this several times: I write short-term at-the-money covered calls, meaning that I sell calls with strike prices close to the current stock price, expiring in a few weeks and backed by shares. In reality, though, I use the covered calls as a hedge and as cash flow to pay for the real investment: quantities of very long-term, far-out-of-the-money puts. Since I gain either way, with the shares moving up or down (or even more when they move sideways; yeah, I am like the casino: "The House Always Wins"), I go full tilt with margin purchases, and when I say "full tilt" I mean that I really buy everything I can buy on margin [ note: there are a number of adjustments I make that may imply acquiring calls or other complicated plays bearish or bullish on the market, but the bottom line is the written covered calls with shares overprotected ].
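For those who prefer to see the structure rather than read it several times, here is a rough sketch of the three legs of such a position: shares, written at-the-money calls, and a larger quantity of far-out-of-the-money puts. All the numbers are hypothetical placeholders, not my actual book, and the model is deliberately crude: it values everything at a single expiry, so it ignores that the calls are rewritten every few weeks and that the long-term puts keep their time value.

```python
def position_pnl(price, entry_price, shares,
                 call_strike, call_premium,
                 put_strike, put_premium, puts):
    """Profit or loss at expiry of shares + short calls + long puts.

    Single-period simplification: every option is valued at intrinsic value.
    """
    stock = (price - entry_price) * shares
    # Written calls: keep the premium, pay out anything above the strike.
    short_calls = (call_premium - max(price - call_strike, 0.0)) * shares
    # Bought puts: pay the premium, collect anything below the strike.
    long_puts = (max(put_strike - price, 0.0) - put_premium) * puts
    return stock + short_calls + long_puts


# Hypothetical example: 100 shares bought at $10, calls written at the money,
# and three times as many $7 puts as shares ("overprotected").
example = dict(entry_price=10.0, shares=100,
               call_strike=10.0, call_premium=0.50,
               put_strike=7.0, put_premium=0.10, puts=300)

for price in (5.0, 10.0, 12.0):
    print(f"stock at ${price:.2f}: P&L = {position_pnl(price, **example):+.2f}")
```

The point of the sketch is only to show which leg pays in which scenario: the call premium when the stock goes nowhere, the capped share gains when it rises, and the puts when it crashes.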

Anyway, when AMD crashes, as happened recently, those puts really appreciate, much more than what I lose on the shares (the gains on the written calls are not significant in this case); but since the puts do not count towards margin equity, I get a margin call. Typically it forces me to sell a chunk of long-term puts, which pisses me off 'cos I have to make sure that the order gets executed on the day of the margin call, and so I get hit with the bid/ask spread, which may be huge in highly volatile environments...
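The mechanics of getting a margin call while making money can be sketched as follows. The 30% maintenance requirement, the loan and the put values are assumptions chosen for illustration; they are not eTrade's actual rules nor my actual account. The only thing taken from the text is that shares count as collateral and options do not.

```python
def margin_status(share_price, shares, loan, put_value, maintenance=0.30):
    """Toy margin check: puts add to net worth but not to margin equity.

    `maintenance` is a hypothetical requirement, not a broker's real figure.
    """
    stock_value = share_price * shares
    margin_equity = stock_value - loan        # options are excluded here
    required = maintenance * stock_value
    return {
        "margin_equity": margin_equity,
        "required": required,
        "net_worth": margin_equity + put_value,
        "margin_call": margin_equity < required,
    }


# Hypothetical account: 1000 shares financed with a $7,500 margin loan.
before = margin_status(share_price=13.50, shares=1000, loan=7500.0,
                       put_value=500.0)
after = margin_status(share_price=10.00, shares=1000, loan=7500.0,
                      put_value=5000.0)

for label, status in (("before the crash", before), ("after the crash", after)):
    print(label, status)
```

In this made-up example the net worth goes up after the crash because of the puts, yet the margin equity falls below the requirement, so the call arrives precisely when the position is winning.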

Fortunately, this time I really wanted to reduce my AMD positions, because half of the downward movement I expected to happen in 2007 already happened, so the next half is not so clear: now that AMD is at $10, I am not as sure that it will go below $7 as I was, when it was at $13.50, that it would go below $10. I could, in principle, "neutralize" the position so as not to speculate on whether AMD will appreciate or not, with the intention of just "milking" the written calls at over 1% per month of total gains, including margin interest and all; but I think I can put my money to better use. All in all, I sold 1/4 of my AMD positions and I am now 1/2 as bearish as I was before.

Chicagrafo