Your server is wasting your CPU
- 31 July, 2008 10:41
While using an AMD Barcelona (quad-core Opteron) server to create a portable benchmarking kit for InfoWorld's Test Center, I discovered something unexpected: I could incur variances in some benchmark tests ranging from 10 to 60 per cent through combined manipulation of the server's BIOS settings, BIOS version, compiler flags, and OS release.
I attempted to document the impact that each individual change had on performance, but flipping one switch often changed the effect of all the others. This frustrating yet fascinating effort ran aground when I had fiddled with settings and flags so much that my testbed stabilized; I could no longer budge the results no matter what I did. Normally stability is a good thing, but in this case, I was more interested in investigating the variance than in eliminating it. If I tried to bring this case to an actual engineer, he or she would likely tell me it had all been a mirage.
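One way to see why single-switch measurements mislead is to score every combination of settings rather than one change at a time. The following is a minimal sketch, not the kit I used: `toy_score` and its two switches are invented stand-ins for a real benchmark run, and the numbers are made up purely to show an interaction.

```python
from itertools import product

def toy_score(aggressive_memory: bool, optimized_build: bool) -> float:
    """Made-up benchmark model: the two switches interact instead of adding up."""
    score = 100.0
    if optimized_build:
        score += 10.0
    if aggressive_memory:
        # Hypothetical interaction: the memory setting pays off fully
        # only when the optimized build is also in use.
        score += 30.0 if optimized_build else 5.0
    return score

# Score all four combinations (a full-factorial sweep of two switches).
results = {combo: toy_score(*combo) for combo in product([False, True], repeat=2)}

# Effect of the memory switch, measured with the build switch off and then on.
memory_effect_plain = results[(True, False)] - results[(False, False)]
memory_effect_optimized = results[(True, True)] - results[(False, True)]

print(memory_effect_plain, memory_effect_optimized)
```

In this toy model the same switch is worth 5 points in one configuration and 30 in another, so no single number describes "its" impact.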
Last week, I had an opportunity to put this matter to a whole panel of engineers, the brain trust that manages performance and power testing for AMD's server CPUs. I was told that I was seeing an effect that's widely known among CPU engineers, but seldom communicated to IT. The performance envelope of a CPU and chipset is cast in silicon, but sculpted in software. Long before you lay hands on a server, BIOS and OS engineers have reshaped its finely tuned logic in code, sometimes with the real intent of making it faster or more efficient in some way that AMD hadn't considered, sometimes to compensate for overall server design flaws, and sometimes to homogenize the server to flatten its performance relative to Intel's.
Perhaps there have been cases where AMD servers were made more powerful or efficient through software tuning that deviates from AMD's advice. I sincerely doubt it. Most times, trying to outthink AMD's engineers is a fruitless exercise, but system and OS makers do it all the time. When they get it wrong -- and this is far easier to do than getting it right -- it costs you. You end up with systems that aren't performing to their potential, are letting power efficiency features go unexploited, or both.
AMD has performance engineering teams devoted to the science of optimization. Before a single system is built using any new family or major revision of AMD64 microprocessor, AMD issues detailed documentation listing each CPU's capabilities and tunable parameters. New CPUs and AMD-built chipsets go out with reference BIOS code that puts the processor in an optimal state before booting the OS. I met the engineers who develop the guidance for system makers, BIOS vendors, and the OS development teams at Microsoft, Red Hat, Suse, and Sun. They're no amateurs; I'd trust their advice. But once they do their jobs, the tuning of each system sold with AMD CPUs is out of their hands. The tiller is turned over to software.
The BIOS gets in there first. Machine code in the BIOS walks through the CPU's parameters and initializes them based on some combination of AMD's advice, the system administrator's preferences as expressed through configuration settings, and the whims of the system maker. Manufacturers contract for the development of the BIOS firmware in their servers. They determine what an admin can adjust, as well as the settings of all the things you can neither see nor change.
You'll never find a server shipped from the vendor with overly aggressive settings. Systems may be tuned downward to operate across the widest possible range of temperatures, to accommodate cheaper components, or to throttle performance in order to compensate for inadequate cooling design. Tuning can also undercut system performance in misguided attempts to meet energy efficiency targets, when that objective could be just as well served without sacrificing performance. For example, slowing memory access can cool the system considerably and lower its power draw, but the cap on performance means that tasks take longer to complete, increasing the time that the system spends drawing maximum power.
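The trade-off in that last example is simple arithmetic: energy consumed is average power multiplied by run time. Here is a minimal sketch with made-up figures; the wattages and run times are illustrative assumptions, not measurements from any AMD system.

```python
def energy_wh(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy in watt-hours for a job drawing avg_power_watts for runtime_seconds."""
    return avg_power_watts * runtime_seconds / 3600.0

# Hypothetical figures for the same batch job on the same server.
# Full speed: 300 W for one hour.
full_speed = energy_wh(avg_power_watts=300.0, runtime_seconds=3600.0)
# Throttled memory: 10 per cent less power, but the job runs 25 per cent longer.
throttled = energy_wh(avg_power_watts=270.0, runtime_seconds=4500.0)

print(f"full speed: {full_speed:.1f} Wh, throttled: {throttled:.1f} Wh")
```

Under these assumed numbers the throttled run draws less power at any instant yet uses more energy to finish the job, 337.5 Wh against 300 Wh.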
The OS also presents an interesting issue, especially with Windows. I was surprised to learn that starting with Vista, processor drivers, critical to controlling power states, are being written exclusively by Microsoft. AMD knows a thing or two that Microsoft doesn't about tweaking an AMD64 CPU for speed and efficiency. Microsoft wants to handle this on its own, and tuning within Windows Vista and Server 2008 does not take the unique characteristics and advantages of AMD's architecture into account.
The big problem is that there's no way for IT and end-users to find out what they're missing. It is possible to dump the myriad registers affecting performance, but they're meaningless to mortals and many can't be changed without disrupting operation. Short of writing your own BIOS, there isn't much you can do. Maybe that will change. The secretive relationship between chipmakers and OEMs doesn't always serve customers well. The configuration advice that AMD issues to its OEMs, BIOS vendors, and OS vendors could form a sort of fingerprint. Even without an understanding of the meaning of individual registers and flags, patterns of variance can point to a vendor's agenda for diverging from best practices. If nothing else, IT could ask why AMD's advice wasn't followed. There may be perfectly good reasons, reasons that differentiate one server brand from another and show who's been doing their homework.
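The fingerprint idea could work roughly like this: compare a dump of a server's settings against the chipmaker's recommended values and list every deviation. This is only a sketch of the comparison step. The register names (`REG_A` and so on) and their values are hypothetical placeholders, not real AMD registers, and it assumes the dump and the recommendations are already available as name-to-value tables.

```python
from typing import Dict, List, Tuple

def diff_settings(
    reference: Dict[str, int], observed: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (name, recommended, observed) for every setting that deviates."""
    deviations = []
    for name, recommended in sorted(reference.items()):
        actual = observed.get(name)
        # Settings missing from the dump are skipped, not counted as deviations.
        if actual is not None and actual != recommended:
            deviations.append((name, recommended, actual))
    return deviations

# Hypothetical recommended values and a hypothetical dump from one server.
reference = {"REG_A": 0x1F, "REG_B": 0x00, "REG_C": 0x03}
observed = {"REG_A": 0x1F, "REG_B": 0x04, "REG_C": 0x01}

for name, recommended, actual in diff_settings(reference, observed):
    print(f"{name}: recommended {recommended:#04x}, observed {actual:#04x}")
```

The list of deviations does not say why a vendor diverged, but it gives IT something concrete to ask about.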
No chipmaker would ever single out an OEM for praise or scorn. AMD's no exception. While AMD's testing engineers express frustration that their recommendations are a take-it-or-leave-it affair, and that when their advice is set aside it affects the public's perception of their CPUs, they don't take it out on OEMs either on or off the record. AMD figures that this is the way the system works when you're on the 20 side of a market that's split 80/20.
The system needs to change. AMD is building new classes of high-powered client platforms that are wide open to end-user parametric tweaking. Enthusiasts and gamers do pay attention to AMD's advice with regard to performance, and they're driven to pull the maximum possible performance out of AMD's silicon. This serves AMD well, because when third parties do this and write about it, AMD doesn't have to out OEMs for taking a lazy approach to configuration. Enthusiast-tweaked machines create a best case, and makers of desktops sold into the high end will have to explain why they don't live up to best case numbers. That's not being done for servers, in large part because server enthusiasts willing to do exploratory tweaking of their machines are rare. I know only one such person.
As mainstream server CPUs grow from four to six to eight cores, four-socket servers become the norm, and deeply multithreaded applications come to predominate, tuning the CPU, chipset, bus, and memory becomes crucial, with a direct impact measured in dollars, hours, and watts. This is tuning that administrators shouldn't be required to do by hand. They should be able to trust that when a system hits their floor, it performs as well as its technology permits. This requires that vendors put some effort behind understanding and leveraging the differences between AMD and Intel architectures -- effort that isn't a priority at present. This mystifies me, since AMD does all of the legwork, freely handing vendors BIOS and kernel guidance that started taking shape when the CPU was still in simulation. It takes a lot of work to ignore the chipmaker's advice, and so far, I've seen no evidence that it does customers any good.