Follow-up: Has Intel found the key to unlock supercomputing powers on the desktop?
Opinion
By Rick C. Hodgin   
Monday, June 25, 2007 17:38

Opinion – Last week we posted an article entitled “Analysis: Has Intel found the key to unlock supercomputing powers on the desktop?” in which I discussed several facets of a potential Intel technology without going into too many of the technical details. At the time of our posting, Intel had not yet publicly released the paper describing the technology, which prevented me from going into certain details. However, you can now find published papers describing a model Intel is calling “EXOCHI”, short for Exoskeleton Sequencer (EXO) and C for Heterogeneous Integration (CHI). One example is a PDF written by Perry H. Wang et al. of Intel.

The PDF is available for download, explains the inner workings of EXOCHI, and will answer many of the questions raised in the reader comments. I have also taken some time here to address some of the key concerns raised in response to our original article.

It seemed to me that one of the greatest fears among those posting responses was that Intel is in some way trying to undo the OS industry or introduce something new that operates outside of the OS. The Exoskeleton software layer definitely exists outside of the OS and, acting as a kind of executive layer, would operate almost entirely outside of OS control or driver library support. Even so, the OS interface is still a requirement. While I personally believe a strictly OS-free capability would be a great solution, it's just not the one Intel is offering.

What Intel would like to do is make the process of using heterogeneous cores as painless as possible for all involved. The company’s solution includes a software model that will target as many operating systems as possible, but without tightly bound OS drivers or requirements outside of minor patches to increase the amount of data captured during a task switch.

A picture included in Intel’s paper demonstrates that a software layer will still be required. However, if you look closely, it is OS independent in both operation and function. It will provide a single binary that will communicate with the application running within the OS for all OS-related service requests, thereby requiring only that the application have special knowledge of the Exoskeleton software layer, not the OS.

[Image: diagram of the EXOCHI model from Intel's paper]

As you can see, the EXOCHI model removes the need for OS-coupled device drivers. It still allows a CHI runtime library to exist and be linked to your application. To allow for a more traditional approach when creating applications for Exo-sequencers, a software developer would write code to the runtime library's requirements. That runtime library would then, in turn, handle all of the actual instrumentation and stream scheduling.

To the applications programmer, the new Exoskeleton software layer will be a black box, visible only through the API provided by the CHI, should it be used. The exposed API can, therefore, be nearly as straightforward as it is today. The only real difference is that it won't have the OS dependency or the abstraction layers seen in today's GPGPU model. This simplifies things greatly for the developer while targeting as many operating systems as possible right out of the gate.

My personal opinion is that this could turn out to be a brilliant move by Intel, one that keeps any new hardware facilities extremely close to the hardware while adding the flexibility of a software layer that does not depend on the OS. This is one of the proposed technology's greatest strengths.

Another common response was that the real problem, that of software models or developer skill sets, isn't addressed. Several comments indicated that these are the real issues with multi-thread programming and that nothing Intel is adding will solve them.

Today's requirements of multi-thread programming are extremely OS dependent. The application starts a new thread and, depending on the platform the OS is running on, the OS will either schedule the new thread in the application's allotted time slice, allow it to run on another core, or do some combination thereof. All of this is required because we're working on homogeneous cores where each core can only do one thing at a time.
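To make the OS dependency concrete, here is a minimal sketch of today's model using POSIX threads. It is only an illustration and does not come from Intel's paper; the names `parallel_sum`, `sum_slice`, `slice_t` and `MAX_WORKERS` are made up for this example. Every `pthread_create` call hands a new thread to the OS, and the OS alone decides whether it shares a time slice or lands on another core.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8   /* illustrative upper bound, not from the article */

/* One contiguous slice of the input array per worker thread. */
typedef struct {
    const int *data;
    size_t     begin;
    size_t     end;
    long       partial;  /* written by the worker, read after the join */
} slice_t;

/* Thread entry point: sums one slice. */
static void *sum_slice(void *arg)
{
    slice_t *s = arg;
    long acc = 0;
    for (size_t i = s->begin; i < s->end; ++i)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

/*
 * Sums data[0..n) with n_workers OS threads.
 * Returns 0 on success, -1 if the OS refused to create a thread.
 * Where and when each thread runs is entirely up to the OS scheduler.
 */
int parallel_sum(const int *data, size_t n, size_t n_workers, long *out)
{
    assert(n_workers >= 1 && n_workers <= MAX_WORKERS);

    pthread_t tid[MAX_WORKERS];
    slice_t   slice[MAX_WORKERS];
    size_t    started = 0;
    int       rc = 0;

    for (size_t w = 0; w < n_workers; ++w) {
        slice[w].data    = data;
        slice[w].begin   = n * w / n_workers;
        slice[w].end     = n * (w + 1) / n_workers;
        slice[w].partial = 0;
        if (pthread_create(&tid[w], NULL, sum_slice, &slice[w]) != 0) {
            rc = -1;
            break;
        }
        ++started;
    }

    /* Wait for every thread that was actually started. */
    long total = 0;
    for (size_t w = 0; w < started; ++w) {
        pthread_join(tid[w], NULL);
        total += slice[w].partial;
    }

    if (rc == 0)
        *out = total;
    return rc;
}
```

Nothing in this code says which core a thread runs on; that decision, and the cost of making it, stays with the OS.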

With the Exoskeleton software layer and the Exo-sequencers, we will now be able to have many instruction streams running at the same time. The advantage of not having a strong OS dependency is that the OS, which is already burdened enough with task scheduling on homogeneous cores, is spared further work. If it were to attempt to schedule tasks on heterogeneous cores as well, the result would be a much more complex tasking model for every OS.

Intel's solution addresses that weakness by giving application developers a model that lets them schedule threads themselves, without OS support, and with a much smaller learning curve thanks to Intel's EXOCHI tools in the Exoskeleton software layer and Exo-sequencer hardware layers.
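The idea of an application scheduling its own work without OS involvement can be pictured with a toy dispatcher. This is purely a sketch under my own assumptions and is not the CHI API: `exo_sched_t`, `exo_submit`, `exo_run_all` and `N_SEQUENCERS` are hypothetical names, and the "sequencers" are simulated by running tasks one after another in the calling thread. Real Exo-sequencers would presumably run the streams concurrently in hardware.

```c
#include <assert.h>
#include <stddef.h>

#define N_SEQUENCERS 4   /* hypothetical number of Exo-sequencers */
#define MAX_TASKS    16  /* hypothetical queue capacity */

typedef void (*task_fn)(void *arg);

typedef struct {
    task_fn fn;
    void   *arg;
} task_t;

/* Application-owned scheduler state: the OS never sees any of this. */
typedef struct {
    task_t queue[MAX_TASKS];
    size_t count;
    size_t ran_on[N_SEQUENCERS];  /* tasks run per simulated sequencer */
} exo_sched_t;

void exo_init(exo_sched_t *s)
{
    assert(s != NULL);
    s->count = 0;
    for (size_t i = 0; i < N_SEQUENCERS; ++i)
        s->ran_on[i] = 0;
}

/* Queues one task. Returns 0 on success, -1 if the queue is full. */
int exo_submit(exo_sched_t *s, task_fn fn, void *arg)
{
    if (s->count == MAX_TASKS)
        return -1;
    s->queue[s->count].fn  = fn;
    s->queue[s->count].arg = arg;
    s->count++;
    return 0;
}

/*
 * Assigns queued tasks round-robin to the simulated sequencers and runs
 * them. No OS call is made: from the OS's point of view this is still
 * one ordinary thread. Returns the number of tasks run.
 */
size_t exo_run_all(exo_sched_t *s)
{
    size_t n = s->count;
    for (size_t i = 0; i < n; ++i) {
        size_t seq = i % N_SEQUENCERS;
        s->queue[i].fn(s->queue[i].arg);
        s->ran_on[seq]++;
    }
    s->count = 0;
    return n;
}

/* Example task: increments the int it is given. */
void add_one(void *arg)
{
    ++*(int *)arg;
}
```

Submitting six `add_one` tasks and calling `exo_run_all` spreads them over the four simulated sequencers without a single thread being created through the OS, which is the property the article is describing.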

The only thing the OS has to worry about is storing some additional task switching information. This results only in a slightly larger memory block being switched out each time a task switch occurs. The result is a software model which, so far as the main OS is concerned, is still a single-threaded app (or, if it's already multi-threaded, then it's a multi-threaded app). Any new threads on the Exo-sequencers are launched without the OS knowing about them. There is still an OS layer, of sorts, which is not the main OS. It consists of the Exoskeleton software layer's communication protocols with the calling apps, which ensure that multiple tasks, multiple threads and multiple callers are all handled correctly.
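The "slightly larger memory block" can be pictured as an ordinary per-task save area with one extra member appended. The layout below is an assumption made for illustration only; the field names, the register counts and `N_EXO_REGS` are hypothetical and are not taken from Intel's paper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_EXO_REGS 8   /* hypothetical amount of Exo-sequencer state */

/* What an OS already saves for an x86-64 task (simplified). */
typedef struct {
    uint64_t gpr[16];
    uint64_t rip;
    uint64_t rflags;
} cpu_context_t;

/* Hypothetical extra state describing work on the Exo-sequencers. */
typedef struct {
    uint64_t regs[N_EXO_REGS];
    uint32_t active_streams;
} exo_state_t;

/* The enlarged block: the usual context plus the new state. */
typedef struct {
    cpu_context_t cpu;
    exo_state_t   exo;
} task_context_t;

/* Task switch out: copy the live state into the task's save area. */
void context_save(task_context_t *save_area, const task_context_t *live)
{
    assert(save_area != NULL && live != NULL);
    memcpy(save_area, live, sizeof *save_area);
}

/* Task switch in: copy the saved state back. */
void context_restore(task_context_t *live, const task_context_t *save_area)
{
    assert(save_area != NULL && live != NULL);
    memcpy(live, save_area, sizeof *live);
}
```

The OS-side change in this picture is limited to copying a few more bytes per switch; it never has to interpret what the extra state means.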

All of this means that what Intel has done is basically introduce a new OS which is transparent and serves only one function, no matter what platform it's running on.

It does, via software, what each OS would otherwise have to do in its own specific way, but it does it once for all of them by having a software layer which exists closer to the hardware than the OS. This benefit cannot be emphasized enough. Intel is offering a way to create multiple threads on disparate pieces of hardware outside of OS awareness. If your application, for example, compiles correctly, runs under Windows and uses the Exo-sequencers properly, then because everything is specific only to your app and the x86 hardware layer it's running on, the same code will immediately port to Mac OS X, Linux, UNIX, Solaris, or anything else that runs on x86. From a straightforward processing point of view (one outside of GUI requirements, for example), it will not require any changes. The software you write once, once recompiled and put into the appropriate binary form, will work on all x86-based platforms.

This new ability means that software developers will see a notably smaller learning curve. Developers will no longer have to turn to OS-specific models, or to books on theory that they must then apply to specific OSes. They will now be able to target what is, from their point of view, the hardware itself. While it will physically be implemented in a more virtual manner, developers now have only one thing to address in order to use any added Exo-sequencer. The developers of today who can look at the x86 ISA and use it will be able to write impressively engineered multi-thread code which can take advantage of disparate hardware solutions.

The software knowledge requirements will still be there, but the target for understanding will be much smaller and much easier to code for. And therein lies its strength.

 

Read on the next page: So, what problem does this solution really solve?  

 



 
