Speedier software coming from AMD

Posted by Rick C. Hodgin

Santa Clara (CA) - AMD today announced an initiative designed to help developers accelerate their software: a new class of hardware extensions dubbed XSP (Extensions for Software Parallelism), which provide internal hardware monitors holding runtime-observed information about executing processes. This new data is expected to help software adapt itself at run time to remove conditions that are known to be significant bottlenecks to high performance.

The first XSP feature AMD is discussing is called Light-Weight Profiling (LWP). LWP consists of “hardware hooks” that allow software to query or capture data the hardware has observed through its own internal runtime analysis. The hardware might determine, for example, that two objects share the same cache line. When threads on different cores write to those objects, that sharing decreases performance in a multi-tasking/multi-core environment, yet today's software has no easy way of knowing the potential slowdown exists (short of expensive, complex software algorithms coupled with high developer skill). As a result, software today runs slower than it needs to.
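The cache-line condition described above comes down to simple address arithmetic. The sketch below is only an illustration of that arithmetic, not AMD's LWP interface, which the announcement does not describe. It assumes a 64-byte cache line (common on x86 processors but not stated by AMD), and the function names are hypothetical.

```python
# Assumption: 64-byte cache lines. Real hardware may differ.
CACHE_LINE_BYTES = 64


def cache_lines(address: int, size: int, line_bytes: int = CACHE_LINE_BYTES) -> range:
    """Return the indices of every cache line an object touches."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = address // line_bytes
    last = (address + size - 1) // line_bytes
    return range(first, last + 1)


def share_cache_line(addr_a: int, size_a: int, addr_b: int, size_b: int,
                     line_bytes: int = CACHE_LINE_BYTES) -> bool:
    """Return True if the two objects touch at least one common cache line."""
    lines_a = cache_lines(addr_a, size_a, line_bytes)
    lines_b = cache_lines(addr_b, size_b, line_bytes)
    # Two contiguous index ranges overlap when each starts before the other ends.
    return lines_a.start < lines_b.stop and lines_b.start < lines_a.stop


if __name__ == "__main__":
    # Two 8-byte counters placed back to back land on the same line.
    print(share_cache_line(0x1000, 8, 0x1008, 8))
    # Moving the second counter 64 bytes away puts it on the next line.
    print(share_cache_line(0x1000, 8, 0x1040, 8))
```

Software can run this check only if it knows which objects to compare and that they are contended. That is the information the article says the hardware would supply.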

With LWP, a program can monitor that hardware-observed condition, take a cue from its data and move one of the objects to a different location in memory, thereby resolving the conflict. LWP includes other, similar abilities that will be discussed in the coming months, AMD told us. Large software vendors such as Microsoft, Sun and Oracle are typically involved in open initiatives like this, as AMD is looking for wide community feedback.
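The "move one of the objects" step can be modeled in a few lines. This is a toy model under stated assumptions, not how LWP or any real runtime performs the move: the layout is a hypothetical dictionary mapping object names to (offset, size) pairs, the cache line is assumed to be 64 bytes, and the relocated object is simply placed at the first line-aligned offset past everything else.

```python
# Assumption: 64-byte cache lines. Real hardware may differ.
CACHE_LINE_BYTES = 64


def align_up(offset: int, alignment: int = CACHE_LINE_BYTES) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def relocate(layout: dict, name: str) -> dict:
    """Return a new layout with `name` moved onto a cache line of its own.

    `layout` maps a (hypothetical) object name to an (offset, size) pair.
    The object is placed at the first line-aligned offset beyond every
    other object, so it no longer shares a line with any of them.
    """
    size = layout[name][1]
    end_of_others = max(
        (off + sz for key, (off, sz) in layout.items() if key != name),
        default=0,
    )
    new_layout = dict(layout)  # leave the caller's layout untouched
    new_layout[name] = (align_up(end_of_others), size)
    return new_layout


if __name__ == "__main__":
    # Two 8-byte counters packed into the same 64-byte line.
    before = {"counter_a": (0, 8), "counter_b": (8, 8)}
    after = relocate(before, "counter_b")
    print(after)
```

The cost of this approach is memory: each relocated object can waste most of a cache line, which is one reason a runtime would want hardware evidence of real contention before moving anything.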

AMD told us that managed applications (like Java and .NET) are ideal candidates for this type of extension. Their runtimes can be designed to read the hardware-provided data and make changes at run time easily, without any new Java or .NET code. Including these abilities in the runtime manager will result in user code that may actually run faster, all without any changes by the Java or .NET author.

This also means that existing, unmodified applications could run faster as well. XSP is accessible to software at any level, including code written in non-managed languages. It can be used by the OS, libraries or user-level code. Using LWP incurs a minimal software overhead penalty, according to AMD. However, the company indicated that the overhead observed today will likely be completely overshadowed by the increases in performance. Additionally, since the new data is provided by hardware, there are no complex libraries or analysis engines to design and test.

This move is also part of a larger, industry-wide initiative to provide more desirable tools for software developers, rather than just faster processors. While these features do help consumers in the end, they are also an inevitable addition to processors when technology limitations are encountered. For example, we are seeing more cores today, but we aren't seeing clock speeds climb over time as they did in the past. That “free lunch” in performance is over, and the parallel model has begun.

Today, it is more difficult to gain better performance in this model with non-parallel code. More finesse is needed to make it happen. As a result, technologies like XSP (and others that add that finesse) will undoubtedly be the tools of the trade arriving in the years to come.