Albuquerque (NM) – Sandia National Laboratories has discovered that the performance of multi-core supercomputers begins to fall off as the number of cores increases. Their simulations show that sixteen cores perform barely as well as two for certain types of complex operations.
Sandia used a simulation involving “key algorithms for deriving knowledge from large data sets.” The simulation showed a significant increase in speed when going from two to four cores, but only an insignificant increase when moving from four to eight. Beyond eight cores, performance actually dropped – so much so that sixteen cores barely performed as well as two cores by themselves. Above sixteen cores, Sandia notes, a “steep decline is registered as more cores are added.”
They attribute the problem to a lack of memory bandwidth and to contention between processors over the shared memory bus, which ultimately creates a bottleneck.
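This behavior can be sketched with a toy analytic model (not Sandia's actual simulator): each additional core splits the compute work, but all memory traffic funnels through one shared bus, and contention adds overhead that grows with the core count. The `compute`, `memory`, and `contention` parameters below are illustrative assumptions, not measured values.

```python
def speedup(n_cores, compute=0.5, memory=0.5, contention=0.02):
    """Toy runtime model: compute time scales as 1/N, but the
    memory-bound portion does not, and a per-core contention
    penalty on the shared bus makes it grow with N."""
    serial_time = compute + memory
    parallel_time = compute / n_cores + memory * (1 + contention * (n_cores - 1))
    return serial_time / parallel_time

for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:2d} cores: speedup {speedup(n):.2f}x")
```

With these illustrative parameters, the model reproduces the qualitative shape Sandia reports: a solid gain from two to four cores, a marginal gain to eight, and a decline beyond that as contention overhead swamps the shrinking compute share.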
Sandia’s Arun Rodrigues, who helped conduct the simulations, said, “To some extent, it is pointing out the obvious — many of our applications have been memory-bandwidth-limited even on a single core. However, it is not an issue to which industry has a known solution, and the problem is often ignored.”
Sandia’s director of the Computations, Computers, Information and Mathematics Center, James Peery, said, “The difficulty is contention among modules. The cores are all asking for memory through the same pipe. It’s like having one, two, four, or eight people all talking to you at the same time, saying, ‘I want this information.’ Then they have to wait until the answer to their request comes back. This causes delays.”
He continues, “The original AMD processors in Red Storm were chosen because they had better memory performance than other processors, including other Opteron processors. One of the main reasons that AMD processors are popular in high-performance computing is that they have an integrated memory controller that, until very recently, Intel processors didn’t have.”
According to Mike Heroux, a member of the technical staff in Sandia’s Scalable Algorithms Department, “The [chip design] community didn’t go with multicores because they were without flaw. The community couldn’t see a better approach. It was desperate. Presently we are seeing memory system designs that provide a dramatic improvement over what was available 12 months ago, but the fundamental problem still exists.”
The industry today has no clear path forward for addressing this memory bottleneck as the number of compute cores increases. At the same time, roadmaps from companies like Intel and AMD show many-core CPUs in our near future. Intel has already released a six-core server chip, and both AMD and Intel have had quad-core CPUs on the market for some time.