AMD's Bulldozer Packs Plenty Of Cores, But Not Enough Power
- 14 October, 2011 01:01
Earlier this year Intel made waves with its Sandy Bridge processors, which served up impressive performance gains over their predecessors while improving energy efficiency. AMD’s return salvo is finally here in the form of the AMD FX platform, previously codenamed Bulldozer.
AMD built the FX platform with multi-threaded software in mind, juggling speed and power consumption to deliver strong, energy-efficient performance for the growing list of applications built to take advantage of multi-core processors. To that end, the Bulldozer platform supports up to eight physical cores.
That’s… a lot of cores. It's a number usually reserved for server hardware--a conclusion I found myself coming back to fairly often while putting the platform through its paces. And while the Bulldozer platform is a significant step forward for AMD, it fails to outshine the potent competition.
Inside the FX Platform
AMD built a fair amount of new technology into the Bulldozer architecture, along with quite a few refinements. AMD’s Turbo Core makes an appearance as well.
Here's how Turbo Core works: processors have heat and power thresholds, beyond which their stability becomes compromised. We call this limit their thermal design power (or TDP). When a CPU hasn’t hit its limits while it’s running, there’s inevitably a bit of room left to push its frequency up, automatically overclocking it. Intel's variant of the same process came first -- they call theirs Turbo Boost.
With Bulldozer, Turbo Core gets a bit smarter. If an app is using some, but not all, of the available processor cores, Max Turbo increases the clock speed on the cores that are in use. If an application is taking advantage of every single core, Turbo Core will boost them all--it won't push them quite as far as it would if only some of the cores were in use, but it works to leave no potential performance on the table.
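The two boost levels described above can be sketched as a simple selection rule. This is only an illustration, not AMD's actual logic: the clock speeds are the FX-8150 figures quoted in this review, while the function name `target_clock` and the `has_headroom` flag are hypothetical stand-ins for the chip's real power and thermal monitoring, whose thresholds are not described here.

```python
# FX-8150 clock speeds as quoted in this review (GHz).
BASE_GHZ = 3.6
TURBO_CORE_GHZ = 3.9  # boost applied when every core is busy
MAX_TURBO_GHZ = 4.2   # higher boost when only some cores are busy


def target_clock(active_cores: int, total_cores: int = 8,
                 has_headroom: bool = True) -> float:
    """Pick a clock speed from the number of busy cores.

    `has_headroom` is an assumed, simplified flag meaning the chip is
    still under its heat and power limits.
    """
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    if not has_headroom or active_cores == 0:
        return BASE_GHZ           # no room to boost, or nothing to boost
    if active_cores < total_cores:
        return MAX_TURBO_GHZ      # some, but not all, cores in use
    return TURBO_CORE_GHZ         # all cores in use


if __name__ == "__main__":
    for cores in (2, 8):
        print(f"{cores} busy cores -> {target_clock(cores)}GHz")
```

The point of the sketch is the ordering: the fewer cores in use, the more of the shared power budget each active core can claim.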
Further architecture enhancements include AES instructions for hardware-accelerated encryption and AVX instructions for improved floating point performance (this will benefit video- and photo-processing tasks, as well as some financial applications). Intel’s Sandy Bridge platform already offers both of these instruction sets, so Bulldozer plays catch-up to some extent here.
AMD also baked in support for FMA4 and XOP instruction sets, which lend greater performance to complex mathematical operations. There’s a large can of worms lurking there, with Intel and AMD in disagreement over the optimal instruction sets for these operations. But that’s beyond the scope of my testing here.
On to the CPUs!
The Bulldozer CPUs are compatible with AMD’s recently launched AM3+ socket. On the high end is the FX-8150 processor: It’s an eight-core chip slated to cost $245, with a standard clock speed of 3.6GHz that bumps up to 3.9GHz in Turbo Core mode, and tops out at 4.2GHz in Max Turbo mode. AMD provided one of these CPUs for my tests, as well as an Asetek liquid CPU cooler for gauging overclocking performance.
It’s a nice lineup on paper--of special interest to me is the FX-4100, which offers half as many physical cores as its beefier siblings, but costs only $115 as of this writing. I suspect the FX-4100 will be a popular part in some of the more inexpensive gaming rigs that make the rounds in the coming months. Meanwhile, the FX-8150 will match up against the middle of Intel’s Sandy Bridge lineup, the Core i5-2500K.
My AMD test system consisted of the aforementioned Bulldozer CPU on an Asus Republic of Gamers Crosshair V Formula motherboard, with 4GB of DDR3-1600MHz RAM, a 1TB hard drive, and a Radeon HD 6970 graphics card. My Intel testbed was nearly identical, with the exception of the Core i5-2500K processor and the Intel DP67BL motherboard.
Enough talk: how does the FX-8150 stack up?
Testing: Synthetic Benchmarks
We’ll start with some synthetic benchmarks. These are all theoretical measures of performance: they stress the hardware in ways that simulate real-world performance, but aren’t necessarily an accurate representation of how you’d actually use your PC. They’re useful because they give us an idea of a machine’s upper limitations, and give us a level playing field to muck around in.
First up is Cinebench. This is a rather straightforward synthetic benchmark, developed by Maxon to evaluate a PC’s processing prowess. Cinebench offers two tests: one that stresses the GPU, and one that stresses the CPU. We’re only interested in the processor benchmark here. Cinebench is useful because it scales well across multi-core CPUs (up to 64 processor threads), so we can test the capabilities of all eight cores in the FX-8150 and all four cores in the Core i5-2500K, which does not offer Hyper-Threading.
The processor performance test consists of rendering a complex scene, with points being assigned based on how quickly the task is completed. The FX-8150 earned 6.01 points, while the Core i5-2500K earned 6.17 points. It might seem like a fairly minor difference, and it is. But it does offer the first instance of the performance gap (however minor) between AMD’s latest and Intel’s mid-range platform.
Next up is Futuremark’s PCMark 7. This benchmark is an industry standard, crunching through a series of workloads and assigning a score based on how the machine performed. There is a wide range of tests rolled into PCMark, gauging everything from video playback to web browsing. And in all cases, Intel’s Sandy Bridge architecture proved the victor. The FX-8150 earned a PCMark 7 score of 2807, as compared to the Core i5-2500K’s score of 3450.
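To put the two synthetic results on the same footing, it helps to express each gap as a percentage of the FX-8150's score. The short sketch below does that with the scores reported above; the helper name `percent_lead` is just an illustrative choice.

```python
def percent_lead(winner: float, loser: float) -> float:
    """Return how far `winner` is ahead of `loser`, as a percentage of `loser`."""
    return (winner - loser) / loser * 100.0


# (Core i5-2500K score, FX-8150 score) as reported in this review.
scores = {
    "Cinebench": (6.17, 6.01),
    "PCMark 7": (3450, 2807),
}

if __name__ == "__main__":
    for name, (intel, amd) in scores.items():
        print(f"{name}: Core i5-2500K leads by {percent_lead(intel, amd):.1f}%")
```

Run as written, this works out to a lead of about 2.7 percent in Cinebench and about 22.9 percent in PCMark 7, which shows how much wider the gap is in the mixed-workload test than in the heavily threaded render.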
On to Unigine Heaven. This synthetic benchmark is a bit more interesting, if only for the gorgeous visuals it serves up. This benchmark is a technical demonstration of a high-end 3D game engine that’s currently in development. It’s designed to tax a PC with a strenuous, DirectX 11-based workload. As there’s no actual game based on the engine available yet, it won’t replace proper real-world gaming tests. But it still gives us some idea of how the pair of processors stack up.
And stack up they do. The differences are once again largely inconsequential: the Core i5-2500K maintains something like a lead for most of the tests, but it’s often barely greater than a tenth of a frame per second. We’ll need to turn to “real” games to get a better idea of where the differences lie.
Our game tests consisted of a pair of top-tier DirectX 11-enabled titles: Crytek’s Crysis 2, and Codemasters’ Dirt 3. They’re both quite capable of taxing a system once all of the bells and whistles are turned on. I tested both games at two resolutions: 1920-by-1080 pixels and 2560-by-1600 pixels, on 30-inch displays. The games’ settings were cranked up, with anti-aliasing set to 4x.
We’ll start with Crytek’s Crysis 2. The original Crysis brought lesser PCs to their knees, demanding premium hardware to achieve playable framerates. While things have improved and the game is no longer the resource hog it once was, it’s still more than capable of stressing a PC’s capabilities.
As we saw with our synthetic benchmarks, the Core i5-2500K is the winner. But you’d be hard pressed to actually see the difference, with results this close.
The end result is about the same for Codemasters’ Dirt 3.
At the higher end of our tests, the FX-8150 takes something like a lead, but doesn’t pull away much further than the Core i5-2500K managed to when we tested Crysis 2.
Testing: Video Transcoding and Energy Efficiency
Thus far we haven’t seen much to write home about. The FX-8150 maintains a solid pace alongside the Core i5-2500K, but neither part seems to outshine the other.
Things get a bit more interesting when we turn our attention to media transcoding. This is a fairly common task: you’ve got an audio or video file, and you’d like to convert it into another format. For my test case, I grabbed a copy of Big Buck Bunny, an open-licensed animated film that’s available in a wide variety of formats and resolutions. There are plenty of software options to choose from; I went with Arcsoft’s Media Converter.
There are a number of hardware acceleration technologies available: Nvidia offers CUDA, AMD offers Stream, and Intel offers Quick Sync on platforms using its integrated graphics. For parity’s sake I chose AMD Stream, as both of my test benches are equipped with a Radeon HD 6970. AMD Stream (like the other technologies I mentioned) allows a system’s graphics card to work with the CPU to churn through transcoding a bit faster.
For this test, I converted a 1080p version of Big Buck Bunny down to an iPad-friendly 720p resolution. These results are a bit more clear cut: after multiple passes, the FX-8150 completed its conversion with an average time of 3 minutes and 19 seconds. The Core i5-2500K finished a full minute faster, at 2 minutes and 18 seconds.
What happens when we shut those GPUs off and let the processors stand on their own merits? The same: 4 minutes and 22 seconds for the FX-8150, 2 minutes and 58 seconds for the Core i5-2500K.
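The transcoding times are easier to compare once they are converted to seconds. The sketch below does the conversion for the four results reported above and works out how much longer the FX-8150 took in each case; the helper names are illustrative.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes-and-seconds reading to total seconds."""
    return minutes * 60 + seconds


def extra_time_percent(slower: float, faster: float) -> float:
    """Return how much longer `slower` took, as a percentage of `faster`."""
    return (slower - faster) / faster * 100.0


# Average conversion times reported in this review.
runs = {
    "GPU-assisted (AMD Stream)": {
        "FX-8150": to_seconds(3, 19),
        "Core i5-2500K": to_seconds(2, 18),
    },
    "CPU only": {
        "FX-8150": to_seconds(4, 22),
        "Core i5-2500K": to_seconds(2, 58),
    },
}

if __name__ == "__main__":
    for label, times in runs.items():
        amd, intel = times["FX-8150"], times["Core i5-2500K"]
        print(f"{label}: FX-8150 {amd}s vs Core i5-2500K {intel}s "
              f"({extra_time_percent(amd, intel):.1f}% longer)")
```

That comes to 199 seconds against 138 with GPU assistance and 262 against 178 without it, so the FX-8150 needs roughly 44 to 47 percent more time either way.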
The last measure to consider is power efficiency. I tested two scenarios using a power meter. For the idle setting, I let the machine sit at the desktop. To test power consumption while under a heavy load, I ran a pass of the Crysis 2 benchmark at maximum settings, and let the power meter record the results.
The results are fairly cut and dried here, too. While sitting idle, the Core i5-2500K test bench drew 71.8 watts of power. The FX-8150 test bench drew 109.8 watts. When under load, the Core i5-2500K climbed up to 300 watts, while the FX-8150 peaked at 356 watts.
That’s a fairly dramatic difference, but one that’s plausible. Both processors are 32nm parts, but the Core i5-2500K has a max TDP of 95W, while the FX-8150 has a max TDP of 125W -- it’s simply a hungrier part. There are also the 8 physical cores to consider; the Core i5-2500K has 4, and no Hyper-Threading. One could also make a case for optimization: the hardware I tested is likely to see plenty of BIOS improvements as it progresses, so I won’t be surprised if motherboard updates pull those numbers down a bit with time.
But in the end, we’re still looking at a brand new processor that draws a significant chunk of energy to generate performance that barely keeps pace with Intel’s nine-month-old hardware.
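For a sense of scale, the whole-system power readings can be reduced to a simple difference in watts for each scenario. This is a minimal sketch using only the meter figures quoted above; note that these are test-bench totals, not CPU-only draw, and the names are illustrative.

```python
# Whole-system power draw in watts, as measured in this review.
readings = {
    "idle": {"FX-8150": 109.8, "Core i5-2500K": 71.8},
    "load": {"FX-8150": 356.0, "Core i5-2500K": 300.0},
}


def extra_watts(scenario: str) -> float:
    """Return how many more watts the FX-8150 bench drew in a scenario."""
    bench = readings[scenario]
    return round(bench["FX-8150"] - bench["Core i5-2500K"], 1)


if __name__ == "__main__":
    for scenario in readings:
        print(f"{scenario}: FX-8150 bench draws {extra_watts(scenario)}W more")
```

The gap is 38 watts at idle and 56 watts under load. Since the two benches differ in motherboard as well as CPU, some of that difference may come from the board rather than the processor itself.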
Things haven’t boded well for Bulldozer thus far. All signs point to it being a server processor with no real place in a desktop: it’s power hungry, and performance falters in all but the most highly-threaded apps when pitted against a CPU that costs $30 less. But there’s still one plausible saving grace for the enthusiast set: overclocking.
A few months ago I watched pro-overclockers take the FX-8150 well past 8GHz, so I know what it’s theoretically capable of. I’ve nowhere near that level of expertise, nor do I have giant tanks of liquid helium on hand. But it couldn’t hurt to give it a shot.
AMD provided the latest version of their Overdrive overclocking utility. It’s a wonderful tool for the performance minded, giving you full control over all of the minutiae of CPUs without requiring constant trips back to the BIOS — quite the boon for relative novices like myself.
Overclocking is more of a dark art than an exact science, so results will vary. With a few slight tweaks to the FX-8150’s CPU voltage and a bit of trial and error with the CPU multiplier, I cranked the FX-8150 from its stock clock speed of 3.6GHz up to 4.4GHz -- and it sang. My new Cinebench result was 7.06, an appreciable bump over its stock performance. Applications that are dependent on the CPU will see noticeable improvements.
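A quick way to judge the overclock is to compare the gain in clock speed with the gain in Cinebench score. The sketch below uses the figures from this review and assumes the 3.6GHz base clock as the starting point; if Turbo Core was lifting the stock run closer to 3.9GHz, the real clock increase was smaller than this suggests. The function name is illustrative.

```python
def percent_gain(new: float, old: float) -> float:
    """Return the increase from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100.0


# Figures from this review. Using the base clock as the "stock" value
# is an assumption; the stock Cinebench run may have benefited from Turbo Core.
STOCK_GHZ, OVERCLOCK_GHZ = 3.6, 4.4
STOCK_SCORE, OVERCLOCK_SCORE = 6.01, 7.06

clock_gain = percent_gain(OVERCLOCK_GHZ, STOCK_GHZ)
score_gain = percent_gain(OVERCLOCK_SCORE, STOCK_SCORE)

if __name__ == "__main__":
    print(f"Clock speed: +{clock_gain:.1f}%")
    print(f"Cinebench:   +{score_gain:.1f}%")
    print(f"Score gain per unit of clock gain: {score_gain / clock_gain:.2f}")
```

On these numbers the clock rises by about 22.2 percent and the score by about 17.5 percent, so the benchmark captures a little under four fifths of the added frequency.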
Our gaming results didn’t change much, owing to the extreme resolutions and the powerful graphics hardware, but if you found yourself CPU limited, you could see some gains there. The tradeoff is that you’d need to make the expected alterations to your machine’s configuration: slapping on a better cooler (or dealing with loud fans), and dealing with the requisite surges in power consumption.
Caveats abound, unfortunately. While AMD has made some strong claims regarding Bulldozer’s overclocking potential, much of that hinges on shutting off cores, and taking other steps to mitigate the heat and power consumption. To get things running stably at 4.4GHz, I had to disable AMD’s Turbo Core. The end result is that power consumption didn’t jump too dramatically: under load, my overclocked benchmark only drew 359 watts, as opposed to the 356 watts at stock settings. But I’d of course lost that extra edge that Turbo Core affords.
Is Bulldozer right for you?
Bulldozer has been a long time coming. And now that it’s here, it’s kind of hard to make a case for it. I hesitate to use the word “disappointment,” but these testing results are clear. An extra $30 over Intel’s existing hardware gets you comparable performance, at a significant power cost.
AMD has bragged about the FX platform’s overclocking potential — Intel’s unlocked, K-series processors do come at a price premium over their standard, locked variants. But if you are a member of the enthusiast community that pushes their machines beyond stock settings, Intel’s processors will still offer superior performance.
If you’re currently running on recent AMD hardware then an upgrade could be inexpensive, as some motherboard manufacturers profess support for AM3+ CPUs in their AM3 socket motherboards, after a BIOS update. But the use of AM3+ CPUs in AM3 boards isn’t officially supported by AMD, so you’ll be on your own if something goes amiss.
If you’re looking to pick up a new machine, things don’t really bode well for AMD. Aggressive pricing could be AMD’s saving grace here. If pressed I could argue that the Bulldozer chips further down the line could make for a nice, inexpensive gaming rig — the six-core FX-6100 could be a nice deal at $165, particularly if its overclocking potential approached my experience on the FX-8150. But that makes for quite a few caveats, ifs, and maybes.
I’ll be blunt: this is a commendable server processor, likely holding up well when tackling complex computational tasks in an environment where power consumption is irrelevant. But think twice before sliding it into your desktop.