AMD Hints At Hyper-Threading In 2012

April 23, 2009

Ahead on many fronts, AMD is still playing catch-up to Intel on hyper-threading.

Some of you have asked why AMD has not adopted hyper-threading to increase the performance of its chips in highly threaded environments such as virtualization.

Recently a TechPulse360 reader commented on why hyper-threading made sense:

  1. Hyperthreading gives tremendous boost in certain applications. 10-15% is towards the lower end of the scale;
  2. All modern CPU architectures (except AMD) support multiple threads per core. Look at POWER from IBM, T1/T2/Rock from Sun Micro, Nehalem/Atom/Larrabee from Intel …;
  3. A proper implementation of simultaneous multithreading (SMT, and what Intel calls Hyper-Threading) requires few additional resources. There is no such thing as a “Normal Pipeline” and a “Hyperthreaded pipeline” – all the functional units that form the bulk of the pipeline are unchanged, certain resources are shared, while certain other resources (like ISA registers) are duplicated.

This week an AMD engineer confided to me that not having hyper-threading available made Opteron look slower than Intel’s low-end chips. The engineer also said that people at AMD have now admitted that not having hyper-threading was the wrong technical choice.

So here’s what Pat Patla, AMD’s server boss, had to say when I asked him about hyper-threading during my visit to AMD’s Sunnyvale, Calif., headquarters this week:

“If you look at our future roadmap and what we’re showing for addressing the threaded market, we believe it is best addressed at full core count this time. And you saw our 2010 time frame when we are talking about 12 cores per CPU and in 2011 with 16 cores per CPU. So we think we are pretty well covered in the 48 to 64 threads environment for the next couple years and we’ll see what 2012 and 2013 brings.”
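Patla's 48-to-64-thread range lines up with a four-socket server built from the 12-core and 16-core parts. The four-socket count is an assumption on my part (he did not state it), and the function name below is hypothetical. A minimal Python sketch of the arithmetic:

```python
def total_threads(sockets: int, cores_per_socket: int, threads_per_core: int = 1) -> int:
    """Hardware threads exposed by one server."""
    return sockets * cores_per_socket * threads_per_core


# Assumption: a four-socket server, one thread per core (no hyper-threading).
SOCKETS = 4

print(total_threads(SOCKETS, 12))  # 2010 roadmap part, 12 cores per CPU -> 48
print(total_threads(SOCKETS, 16))  # 2011 roadmap part, 16 cores per CPU -> 64
```

Setting `threads_per_core=2` shows what two-way SMT would do to the same socket and core counts.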

It sounds to me like AMD’s 2012 chips are going to have hyper-threading!

Here’s a video excerpt of Pat Patla’s answer to the hyper-threading question:


AMD Plans To Quadruple Opteron Performance In 2-Years With 16 “Bulldozer” Cores, 32-Nm

April 23, 2009

AMD expects to more than quadruple the performance of its current quad-core chips in the next 2 years!

AMD is on a roll this week.

After announcing the June availability of its native six-core server chip “Istanbul”, AMD unveiled the roadmap of its future processors all the way up to 2011.

In two years, the Sunnyvale, Calif., chipmaker expects to launch its 16-core “Interlagos” server chip, based on the “Bulldozer” core (AMD’s new implementation of the x86 micro-architecture) and built on a 32-nm process. Intel’s first 32-nm chips are expected by the end of this year.

With Interlagos, AMD plans to quadruple today’s server chip performance.

Here’s a video excerpt of AMD’s server boss, Pat Patla, talking about Bulldozer:


AMD Counters Intel’s “Disingenuous” Server Claims Over Xeon 5500 (Nehalem) Performance, Price

April 6, 2009
An Intel Nehalem server system costs almost twice as much as an equivalent (2 memory channels) AMD Opteron server


AMD finally strikes back at Intel’s competitive server claims.

It took AMD several days to respond to some of the claims that Intel made when it launched its next-generation server chip, the Xeon 5500 (Nehalem), last week.

And reading through the counter-claims, Intel’s Nehalem looks more like a jumbo jet than a supersonic jet fighter!

Here’s a summary of my conversations with AMD’s server chief Pat Patla and manager John Fruehe.

So is the slowest Nehalem chip really faster than the fastest Opteron chip, including AMD’s Istanbul server chip coming out at the end of the year?

How can that be? The slowest Nehalem is a dual-core chip. How can their dual-core chip be faster than our quad-core? They [Intel] just say those things without anything backing up their statement. The only benchmarks Intel published are on their top-end parts. Nothing on the lower end. Intel has done a great job in marketing. I don’t necessarily agree that they have done a great job in driving value for the customers.

But Intel has added the super fast QuickPath Interconnect (QPI) with Nehalem?

QPI is actually a copy of HyperTransport. As a matter of fact, they could have licensed HyperTransport, which is an open standard, but instead they decided to go with their own proprietary infrastructure.

Now, Intel will rave about the performance of QPI but it’s only if you buy their top end parts. If you buy their mid-range parts, the QPI speed drops down, and if you buy their lower end parts the QPI speeds drop down even more.

Meaning that if you have an application that relies on high I/O and high memory throughput but doesn’t need a lot of compute power, like a Web server, a file server or network infrastructure - which are the real backbone of today’s data centers - you would have to buy the fastest Nehalem processor to get the fastest QPI! Instead, we offer the same HyperTransport speed on all of our Opteron chips.

And hyperthreading?

Real men use real cores. We’ve got real cores across our products. Hyperthreading is basically designed to act like a core except that it only gives a 10 to 15 percent performance bump for real application workloads. That’s because hyperthreading requires the core logic to maintain 2 pipelines: its normal pipeline and its hyperthreaded pipeline. That is management overhead that doesn’t give you a clear throughput gain.

You’re saying that Nehalem chips are overpriced. Why?

Yes. A Dell server with the Nehalem 2.93 GHz chip is 104 percent more expensive (~$6,100) than the same configured server equipped with a Shanghai processor at 2.7 GHz (~$3,000). At this price, I sure hope they are faster. So if you’re in a tough economy and you’re trying to stretch your budget dollars as far as you can, you’re probably not going to buy half as many Nehalem servers; you’re going to buy the more cost-effective Opteron servers.

It’s somewhat disingenuous to lay out all the benchmarks and say “we’ve got a better platform” and completely ignore the pricing aspect of it.
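For reference, the premium can be checked against the two rounded prices quoted above. With those approximate figures it comes out at roughly 103 percent, in line with the 104 percent cited. The sketch below is only illustrative: the function names and the $61,000 budget are hypothetical.

```python
def price_premium_percent(price: float, baseline: float) -> float:
    """How much more `price` costs than `baseline`, in percent."""
    return (price - baseline) / baseline * 100.0


def servers_for_budget(budget: int, unit_price: int) -> int:
    """Whole servers that fit in a fixed budget."""
    return budget // unit_price


# Approximate prices quoted in the interview.
NEHALEM_SERVER = 6100
OPTERON_SERVER = 3000

premium = price_premium_percent(NEHALEM_SERVER, OPTERON_SERVER)
print(f"Premium: {premium:.1f}%")

# Hypothetical budget, chosen only to illustrate the "half as many" remark.
BUDGET = 61000
print(servers_for_budget(BUDGET, NEHALEM_SERVER), "Nehalem servers")
print(servers_for_budget(BUDGET, OPTERON_SERVER), "Opteron servers")
```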

Why are Intel-based servers more expensive than AMD’s?

  1. First off, the Nehalem chip itself is more expensive than the Opteron chip;
  2. Then, they use DDR3 memory which is more expensive, draws more power and has higher latency. So DDR3 is not a good choice for 2009. But in 2010, the tables will turn on DDR3 with lower prices, lower latency and lower power;
  3. The Nehalem servers have 3 channels of memory, versus 2 for the Opteron. So where we would put 2 DIMMs, they would put 3 DIMMs in, which makes it 50 percent more expensive in DIMMs and it’s going to consume 50 percent more power from the memory perspective;
  4. Because of the size of the socket and because of the 3 memory channels, Intel needs to have more layers on the board, plus special VRMs, etc… making the whole infrastructure more expensive to build.

What about Intel’s claim that a customer can consolidate 9 single-core servers onto a single Nehalem server?

We also support all the virtualization platforms (VMware, Microsoft Hyper-V, Xen…) which let one dual-socket server support on average 5 to 10 virtual machines. So what Intel is really talking about is virtualization and we do that as well! There’s no reason that you could not support the work of 10 single-core servers on an Opteron. They are making that sound as something unique that only Intel can do, but they’re not the only platform that runs virtualization.

Intel also claims that in some cases, Nehalem servers have an ROI of only 8 months!

Again, it’s disingenuous to talk about ROI to the IT world as a hardware vendor. Because people look at a complete solution: it’s hardware, software, lifecycle management, licensing, power, security… And if you look at any TCO models – which is what you’d use to do an ROI analysis – it will say that acquisition costs (hardware and software) are about 25 percent. And the software is a lot more expensive than the hardware. So if your hardware is about 10 percent of the cost of the total solution, how are they coming up with an ROI of 8 months? I’m sure they are doing the math thinking “if you’re buying the server today and you unplug 10 single-core servers, the amount of power that you’d save would pay off this server.”

And Nehalem servers being a cash machine after 8 months?

Maybe after 8 months, it starts to print off enough money to pay for the 104 percent price premium that you pay at the beginning! You could consolidate 10 servers onto an Opteron platform using virtualization and unplug them. And because we cost half as much, our ROI should technically be 4 months, shouldn’t it? If they can do it in 8 months, and we cost half as much as they do, we should do it in 4 months, right? But I wouldn’t make that statement to customers because I’ll be laughed out of their office, because it’s not how they measure ROI.
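The back-of-the-envelope math behind that "4 months" remark is a simple payback period: upfront cost divided by monthly savings. The sketch below assumes both platforms save the same amount per month once the old servers are unplugged, which the interview does not establish. It reuses the approximate $6,100 Nehalem price, and the function name is hypothetical.

```python
def payback_months(upfront_cost: float, monthly_savings: float) -> float:
    """Months until cumulative savings cover the purchase price."""
    return upfront_cost / monthly_savings


# Assumptions: Intel's 8-month claim holds at the ~$6,100 price,
# and the Opteron server costs half as much, as Patla argues.
NEHALEM_COST = 6100.0
OPTERON_COST = NEHALEM_COST / 2

# Monthly savings implied by an 8-month payback on the Nehalem server.
monthly_savings = NEHALEM_COST / 8

print(payback_months(NEHALEM_COST, monthly_savings))  # 8.0
print(payback_months(OPTERON_COST, monthly_savings))  # 4.0
```

As Patla says himself, this is not how customers actually measure ROI; the sketch only reproduces the simplified reasoning.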

Is Nehalem really all that bad?

Intel has done a lot of great work to bring down the idle power, which is great on a desktop but less of an issue in servers. So, while Nehalem has a very low idle power, in a data center you have to set all of your parameters around the highest amount of power the platform can draw. And by design Nehalem servers draw more power than Opteron servers. Which means that you can fit fewer of them in a data center than AMD servers.


Virtualization Is #1 Driver For AMD Push in Server Chipset Business

September 30, 2008

Virtualization was one of the main topics discussed this morning at AMD’s “Shanghai” media briefing in San Francisco.

First up was the key issue of moving virtual machines between AMD and Intel servers using VMware’s VMotion administration tool. If that isn’t possible, companies will have to choose very early on between AMD and Intel hardware platforms.

“It’s possible but under certain conditions [AMD promised to "get back to us" with more details. Comments, Margareth?]. But there are also issues in moving virtual machines from Intel to Intel servers,” said AMD’s server and workstation division general manager, Patrick Patla.

AMD’s executive also confirmed that virtualization and virtualized I/O were the #1 driver for the company’s push into the server chipset business, a move that effectively puts it in competition with partners like Broadcom and Nvidia, which had so far supplied the chipsets for AMD’s server processors.

“Virtualization is now in our DNA and when we do silicon design or we’re thinking of enhancements we always think about what we can do for virtualization … We need it [the chipset] to make sure that it’s done right, that we bring the feature [virtualized I/O] to market when we think it needs to be there,” explains Patla.


AMD To Ship “Shanghai” Server Chip 3 Months Early. A 35% Speed Bump To Current Opteron Generation

September 30, 2008
AMD "Shanghai" media briefing in San Francisco


In a briefing this morning in San Francisco, Patrick Patla, the new general manager of AMD’s server and workstation division, confirmed that the Silicon Valley company is already shipping the latest iteration of its Opteron server processor dubbed “Shanghai”. The “standard” Shanghai chip will be in OEM machines by year end, but the faster and more power efficient ones will ship in the first quarter of next year.

“It’s going to be 35% more power efficient than the current generation. It has more memory cache and a faster communications bus [HyperTransport version 3],” explained Patla.

The fact that AMD is shipping Shanghai earlier also means that there will only be 6 months at most between the current generation of Opteron and the next one. That tells you how screwed up the Barcelona launch was, leaving AMD less than 6 months to recoup its investment versus the usual year or so.
