UPDATE: AMD Shanghai versus Intel Nehalem at 45nm

Posted by AnandTech

Anandtech.com posted an article last week comparing Intel's Nehalem to AMD's Shanghai on the Linpack 9.1 benchmark. Running the same binary on both systems, the 2.66 GHz Core i7 with HyperThreading turned off gave the best score of 36.9. AMD's 2.7 GHz Opteron 8384 came in at 32.5 with DDR2-800 (the Core i7 is 13.5% faster) and at 30.3 with DDR2-533 (the Core i7 is 21.8% faster). Linpack 10.1 (the current version) would not run on the AMD system.
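
The size of the gap depends on which system is treated as the baseline, so it helps to work it out both ways. The following is a minimal sketch that uses only the three scores reported above; the function and constant names are illustrative and do not come from AnandTech.

```python
# Linpack scores as reported in the article (higher is better).
CORE_I7_2_66 = 36.9
OPTERON_DDR2_800 = 32.5
OPTERON_DDR2_533 = 30.3


def percent_faster(fast, slow):
    """How much higher `fast` is, measured against `slow` as the baseline."""
    return (fast - slow) / slow * 100.0


def percent_slower(slow, fast):
    """How much lower `slow` is, measured against `fast` as the baseline."""
    return (fast - slow) / fast * 100.0


for label, score in [("DDR2-800", OPTERON_DDR2_800), ("DDR2-533", OPTERON_DDR2_533)]:
    print(
        f"{label}: Core i7 is {percent_faster(CORE_I7_2_66, score):.1f}% faster; "
        f"Opteron is {percent_slower(score, CORE_I7_2_66):.1f}% slower"
    )
```

With the Opteron as the baseline, the Core i7 leads by 13.5% and 21.8%. With the Core i7 as the baseline, the Opteron trails by 11.9% and 17.9%.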



No cost data



The AnandTech article is only preliminary, and it gave no comparison of power consumption or cost. The 2.7 GHz Opteron 8384 (the greater-than-2-way chip) sells today for $2,149. The equivalent 2384 (the 2-way chip) sells for $989. Intel's Nehalem-based Core i7 at 2.66 GHz (the slowest model in the Core i7 line) is $284. The 2.93 GHz and 3.20 GHz models sell for $562 and $999 respectively. With the 2.66 GHz model besting AMD's high-end offering by 13.5% for roughly one-eighth the price, the 3.20 GHz model at $999 should widen that lead notably while still costing less than half as much.
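
The price comparison can be checked with a few lines of arithmetic. This sketch uses only the list prices and Linpack scores quoted in this article. It assumes CPU price alone is the cost, ignoring motherboards, memory, and socket count, so treat the score-per-dollar figure as a rough indicator rather than a system-level result. All names are illustrative.

```python
# List prices quoted in the article, in US dollars.
PRICES = {
    "Opteron 8384": 2149,
    "Opteron 2384": 989,
    "Core i7 2.66": 284,
    "Core i7 2.93": 562,
    "Core i7 3.20": 999,
}

# Linpack scores are only available for the two chips that were benchmarked
# (the Opteron figure is the DDR2-800 result).
SCORES = {"Opteron 8384": 32.5, "Core i7 2.66": 36.9}


def price_ratio(expensive, cheap):
    """How many times more `expensive` costs than `cheap`."""
    return PRICES[expensive] / PRICES[cheap]


def score_per_1000_dollars(chip):
    """Linpack score per $1,000 of CPU list price (CPU cost only)."""
    return SCORES[chip] / PRICES[chip] * 1000.0


print(f"8384 vs Core i7 2.66 price: {price_ratio('Opteron 8384', 'Core i7 2.66'):.2f}x")
print(f"2384 vs Core i7 2.66 price: {price_ratio('Opteron 2384', 'Core i7 2.66'):.2f}x")
for chip in SCORES:
    print(f"{chip}: {score_per_1000_dollars(chip):.1f} Linpack points per $1,000")
```

The Opteron 8384 costs about 7.6 times as much as the 2.66 GHz Core i7, and the 2384 about 3.5 times as much.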


No power consumption data



AMD's 8384 does consume less power, however: it is rated at 75 watts ACP, while the Core i7 2.66 GHz can consume as much as 130 watts before throttling. Both Shanghai and Nehalem have power-saving features that allow unused cores to be completely powered down. In addition, typical power for the Core i7 should be around 90 watts under similar loads, which brings the two closer to parity. A recent article at Tom's Hardware does note that Core i7 platforms consume more power than Core 2 did.
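
To see how much the 90-watt figure narrows the gap, the Linpack scores can be divided by the wattages quoted above. This is only a rough sketch: it assumes the rated and typical figures are directly comparable, which they may not be since ACP and peak power are measured differently, and no power draw was actually measured during the benchmark. All names are illustrative.

```python
# (Linpack score, watts) pairs built from figures quoted in this article.
CASES = {
    "Opteron 8384 at 75 W ACP": (32.5, 75),
    "Core i7 2.66 at 130 W peak": (36.9, 130),
    "Core i7 2.66 at 90 W typical": (36.9, 90),
}


def score_per_watt(score, watts):
    """Linpack score divided by the quoted power figure."""
    return score / watts


for label, (score, watts) in CASES.items():
    print(f"{label}: {score_per_watt(score, watts):.2f} Linpack points per watt")
```

Under these assumptions the Opteron comes out at about 0.43 points per watt, against about 0.28 for the Core i7 at its 130-watt ceiling and 0.41 at the 90-watt typical figure.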


No relative performance data



AMD has also reported that Shanghai supports HyperTransport 3.0, with a theoretical bandwidth of up to 41.6 GB/s bi-directional (20.8 GB/s uni-directional). However, no products shipped before 2H'09 will have HT3 enabled, and AMD claims that only users of 2-way and larger systems will see speedups from HT3 (about 5% on 2-way and 10-15% on greater-than-2-way). This limits today's Shanghai CPUs to HT 2.0 and a maximum throughput of 22.4 GB/s bi-directional.

The QuickPath Interconnect (QPI) in Intel's Core i7 delivers up to 32 GB/s of bi-directional bandwidth over a 20-bit link. Early Nehalem chips are limited to 25.6 GB/s, however.
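
The bandwidth figures above follow from a simple formula: transfers per second, times data bytes per transfer, times two directions. The sketch below reproduces them. The transfer rates and data widths are assumptions taken from the published interconnect specifications, not from the AnandTech article: a 32-bit HyperTransport link at 5.2 GT/s (HT 3.0) or 2.8 GT/s (HT 2.0), and a QPI link carrying 16 data bits out of its 20 lanes at 6.4 GT/s.

```python
def link_bandwidth_gbs(transfers_gt_s, data_bits, directions=2):
    """Peak link bandwidth in GB/s.

    transfers_gt_s: transfer rate in gigatransfers per second
    data_bits:      payload bits moved per transfer in one direction
    directions:     2 for bi-directional, 1 for uni-directional
    """
    return transfers_gt_s * data_bits / 8 * directions


# Assumed link parameters (from the interconnect specs, not the article).
ht3 = link_bandwidth_gbs(5.2, 32)          # HyperTransport 3.0
ht3_one_way = link_bandwidth_gbs(5.2, 32, directions=1)
ht2 = link_bandwidth_gbs(2.8, 32)          # HyperTransport 2.0
qpi_early = link_bandwidth_gbs(6.4, 16)    # QPI on early Nehalem

# Transfer rate a QPI link would need to reach the quoted 32 GB/s.
qpi_rate_for_32 = 32 / (16 / 8 * 2)

print(f"HT 3.0: {ht3:.1f} GB/s bi-directional, {ht3_one_way:.1f} GB/s one way")
print(f"HT 2.0: {ht2:.1f} GB/s bi-directional")
print(f"QPI (early Nehalem): {qpi_early:.1f} GB/s bi-directional")
print(f"QPI rate needed for 32 GB/s: {qpi_rate_for_32:.1f} GT/s")
```

Under these assumptions the 32 GB/s QPI figure would correspond to an 8 GT/s link, while the 25.6 GB/s figure for early chips matches 6.4 GT/s.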




Read more at AnandTech.




UPDATED:  December 1, 2008 at 3:02pm
AnandTech (and TG Daily) have received several comments asking why these two platforms were compared. The idea was to show the latest core revisions from both companies applied to a real-world benchmark.

AnandTech has updated their Linpack version and created a new chart showing that Shanghai performs better with a recompiled version. Their conclusion sums up the update: "Bottom line is that these LINPACK benchmarks are moving targets like the SPEC CPU benchmarks, as the compilers and libraries used are just as important as the CPUs. When the Xeon 5500 will materialize, LINPACK performance will probably be higher as the binary is built for the 'Penryn/Harpertown' family."
 
Read more in Part 2 and AnandTech.