AMD Core Counts and Bulldozer: Preparing for an APU World
by Anand Lal Shimpi on November 30, 2009 12:00 AM EST
Posted in: CPUs
The New Way to Count Cores
Henceforth AMD is referring to the number of integer cores on a processor when it counts cores. So a quad-core Zambezi is made up of four integer cores, or two Bulldozer modules. An eight-core would be four Bulldozer modules.
A hypothetical quad-core Bulldozer. Presumably the L3 cache would be shared by both modules.
A hypothetical eight-core Bulldozer. Presumably the L3 cache would be shared by all four modules.
It's a distinct shift from AMD's (and Intel's) current method of counting cores. A quad-core Phenom II X4 is literally four Phenom II cores on a single die; if you disabled three, you would be left with a single-core Phenom II. The same can't be said about a quad-core Bulldozer. The smallest functional block there is a module, which is two cores according to AMD.
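To make the counting rule concrete, here is a minimal sketch in Python. The only fact taken from AMD is that one module holds two integer cores; the function and constant names are made up for illustration.

```python
# Integer cores per Bulldozer module, per AMD's description above.
INTEGER_CORES_PER_MODULE = 2


def marketed_cores(modules: int) -> int:
    """Core count AMD would advertise for a given number of modules."""
    if modules < 1:
        raise ValueError("need at least one module")
    return modules * INTEGER_CORES_PER_MODULE


def modules_needed(cores: int) -> int:
    """Modules behind an advertised core count.

    Raises ValueError for counts that cannot be built from whole modules,
    since the module is the smallest functional block.
    """
    if cores < INTEGER_CORES_PER_MODULE or cores % INTEGER_CORES_PER_MODULE:
        raise ValueError("core count must be a positive multiple of two")
    return cores // INTEGER_CORES_PER_MODULE
```

Under this rule a "quad-core" part is `modules_needed(4)` modules, and a request for a single core fails, which mirrors the point that you can't strip a Bulldozer down to one core the way you could a Phenom II.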
Better than Hyper Threading?
Intel doesn't take, at least today, quite as aggressive a step towards multithreading. Nehalem uses SMT to send two threads to a single core, resulting in as much as a 30% increase in performance.
The added die area to enable HT on Nehalem is very small, far less than 5%.
AMD claims that the performance benefit from the second integer core on a single Bulldozer module is up to 80% on threaded code. That's more than what AMD could get through something like Hyper Threading but, as we've recently found out, the impact on die size is not negligible. It really boils down to the sorts of workloads AMD will be running on Bulldozer. If they are indeed mostly integer, then the performance per die area will be quite good and the tradeoff worth it. Part of the integer/FP balance does depend on how quickly the world embraces computing on the GPU, however...
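The performance-per-die-area argument can be sketched with a toy metric: relative throughput divided by relative area, with a plain single-threaded core as the 1.0 baseline. The 30%, 5% and 80% figures are the best-case numbers quoted above; the metric itself, the function names, and the die-area values swept for the second integer core are assumptions for illustration, since the article gives no area figure for it.

```python
def throughput_per_area(speedup: float, extra_area: float) -> float:
    """Relative throughput divided by relative die area (baseline core = 1.0)."""
    if extra_area <= -1.0:
        raise ValueError("extra_area must be greater than -1.0")
    return (1.0 + speedup) / (1.0 + extra_area)


def break_even_area(speedup: float, target_ratio: float) -> float:
    """Extra area at which a design with `speedup` just matches `target_ratio`."""
    if target_ratio <= 0.0:
        raise ValueError("target_ratio must be positive")
    return (1.0 + speedup) / target_ratio - 1.0


# Best-case "up to" figures quoted in the article.
HT_SPEEDUP = 0.30           # Hyper Threading on Nehalem
HT_EXTRA_AREA = 0.05        # article says "far less than 5%", so an upper bound
SECOND_CORE_SPEEDUP = 0.80  # AMD's claim for the second integer core

ht_ratio = throughput_per_area(HT_SPEEDUP, HT_EXTRA_AREA)
second_core_break_even = break_even_area(SECOND_CORE_SPEEDUP, ht_ratio)

if __name__ == "__main__":
    print(f"Hyper Threading, throughput per area: {ht_ratio:.3f}")
    # Hypothetical area costs for the second integer core (not from AMD).
    for area in (0.05, 0.25, 0.50):
        ratio = throughput_per_area(SECOND_CORE_SPEEDUP, area)
        print(f"second core at +{area:.0%} area: {ratio:.3f}")
    print(f"break-even extra area: {second_core_break_even:.1%}")
```

With these inputs the second integer core comes out ahead of Hyper Threading on this metric as long as it costs less than roughly 45% extra area. Because Hyper Threading's real cost is below the 5% used here, the true break-even point would be somewhat lower, and both speedups are ceilings rather than typical results.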
According to AMD's roadmaps, Zambezi will use either 4 or 8 Bulldozer cores (that's 2 or 4 modules). The quad-core Zambezi should have roughly 10 - 35% better integer performance than a similarly clocked quad-core Phenom II. An eight-core Zambezi will be a threaded monster.
No GPU, for Now
The first APU from AMD will be Llano, but it will be based on existing Phenom II cores. The move to a new manufacturing process combined with the first monolithic CPU/GPU is enough to do at once; there's no need to toss in a brand new microarchitecture at the same time.
AMD did add that eventually, in a matter of 3 - 5 years, most floating point workloads would be moved off of the CPU and onto the GPU. At that point you could even argue against including any sort of FP logic on the "CPU" at all. It's clear that AMD's design direction with Bulldozer is to prepare for that future.
In recent history AMD's architectural decisions have predicted, earlier than Intel, where the microprocessor industry was headed. The K8 embraced 64-bit computing, a move that Intel eventually echoed some years later. Phenom was first to migrate to the 3 level cache hierarchy that we have today, with private L2 caches. Nehalem mimicked and improved on that philosophy. Bulldozer appears to be similarly ahead of its time, ready for a world where heterogeneous CPU/GPU computing is commonplace. I wonder if we'll see a similar architecture from Intel in a few years.
94 Comments
Alouette Radeon - Wednesday, March 10, 2010
How the hell could you embrace Intel after all the harm they've caused AMD, nVidia, VIA and ultimately us, the consumers with their criminal tactics? They don't give a damn about you, they just want your money. A lot of people say that AMD is no different (I know nVidia sure is the same as Intel in that regard) but at least they operate with integrity. They've never been accused of anything underhanded or sneaky. For that matter, neither has VIA. Intel and nVidia on the other hand, while nVidia has never done anything downright CRIMINAL, they've still been dishonest as hell. Intel on the other hand, has stooped about as low as you can go. So go ahead, embrace Intel, just like a stupid biatch who won't leave her abusive spouse. She just keeps going back for more and people like me who have brains can only shake our heads and wonder.
AmbroseAthan - Monday, November 30, 2009
While I agree Intel has the performance crown now, I can't knock AMD for being the value right now. Picked up an AMD x4 955 BE and Asus motherboard (full ATX/crossfire) for $230 to build my parents a computer with (Newegg combo). Intel can't compete in that price space easily.
dilidolo - Monday, November 30, 2009
Intel can't compete in that price range? No, Intel doesn't want to. Manufacturing capacity is limited. If I can sell more higher margin products, why should I go after the lower margin segment? Leave that segment to AMD; the more AMD sells in that segment, the more money AMD loses. If Intel wants to compete in that segment, they can easily kill AMD, but that's not what Intel wants to do.
siuol11 - Tuesday, December 1, 2009
Ah, the ravings of the marginally informed... How the internet loves them!
blyndy - Monday, November 30, 2009
I'm very excited about AMD's brand-new design and how its new ideas translate into performance, however:
"The quad-core Zambezi should have roughly 10 - 35% better integer performance than a similarly clocked quad-core Phenom II"
That sounds a bit low, I hope the final comparable CPUs can manage something more like 15 - 40% better integer performance over their PhII counterparts. Then again perhaps that's just because of Intel's large performance increases between their recent architectures making us expect more -- they are more the exception than the rule, so 10 - 35% shouldn't be sneezed at, although that just may not be competitive on their release in 2011.
mczak - Monday, November 30, 2009
Considering that the int cores actually have less execution units (used to be 3 alus (plus shared load/store, but can do two operations per clock), bulldozer only 2 alus (+ load and separate store)) I think 10-35% better integer performance is amazing. More than that would be a miracle imho...
Zool - Monday, November 30, 2009
From the previous article: "The extra integer core (schedulers, D-cache and pipelines) adds only 5% die space".
So the quad core Zambezi (2 modules, 4 integer pipelines) should have roughly 10-35% better integer performance than a similarly clocked quad-core Phenom II. That's a super boost per transistor count.
nafhan - Monday, November 30, 2009
Based on AMD's re-defining of the word core that's actually a HUGE improvement. A quad core Zambezi has a similar transistor budget as a dual core Phenom II, and a 10-35% performance improvement.
In other words, quad core integer performance for dual core price.
psychobriggsy - Tuesday, December 1, 2009
A quad-core Bulldozer has the same transistor budget as a tri-core Phenom II (if they existed natively), yet performs around 20% better than a quad-core.
I think that SMT would have provided easier performance pickings (20% for 5% die space). I don't understand why AMD have been avoiding SMT so far. Sure, 80% more performance for 50% die space isn't to be sneezed at, but it's not so easy pickings.
In addition there are more integer resources than in a Phenom II core, and the FPU has two 128-bit FMAs, so each core could be reasonably bigger. In effect it could be that 1 Bulldozer module is the same size as two Phenom II cores - so all you have then is the 10-35% performance increase. I hope this is per-clock...
titan7 - Sunday, December 6, 2009
Perhaps the k7/k8 didn't make sense to add SMT? The p4 was really designed for it and had it enabled in genII. Look how long it took Intel to get SMT into the Pentium Pro/Core/i7.
I suspect AMD is designing for SMT right now, but gen1 is just "get to market ASAP because Intel is faster right now" and genII will have SMT enabled.