NVIDIA’s DGX SuperPOD: Redefining AI In The Shadow Of A Pandemic

Artificial intelligence, while still in its infancy, is making significant advances. NVIDIA, arguably the leader in core AI technology based on its recent MLPerf results, just showcased that it could build a supercomputer from scratch, its new Selene system, in a matter of weeks. This supercomputer wasn’t trivial either; built on the DGX SuperPOD design, Selene has 280 NVIDIA DGX A100 nodes with eight GPUs each, for 2,240 NVIDIA A100 GPUs, over 5M NVIDIA CUDA cores, over 655K NVIDIA Tensor Cores, and 768 NVIDIA NVSwitches. It typically takes the better part of a year to develop and build a new supercomputer, so doing it in weeks is a testament to NVIDIA’s architecture and, together with its MLPerf results, supports its claim to leadership.

This ability to spin up new Supercomputers very rapidly is a game-changer; let me explain.  


The COVID-19 Lesson

When the pandemic hit, IBM, which has several supercomputers, responded quickly by helping to shift many of those resources to COVID-19 research, and that work produced several potential drugs to treat the virus (most are in testing). The problem is that while this is going on, the vital work these supercomputers were already doing is delayed.

Those delays affect projects like building more accurate weather models, which also have the potential to save lives. But building new supercomputers just wasn’t an option because it takes so long, costs a small fortune, and typically requires pulling experts off the projects they are on while the supercomputer is being designed. Given that an antiviral is now expected to be ready before a new supercomputer could normally be designed and built, researchers had no real choice other than repurposing existing resources and accepting the collateral damage to pre-existing vital projects.

But if you can cut the design and build time from months to weeks, creating new supercomputers to face threats like this suddenly becomes the more viable path, because you can focus on the new problem without prematurely abandoning the existing supercomputers’ priorities.

Now imagine being able to spin up tens or hundreds of supercomputers rapidly to address an emerging existential threat, say an asteroid impact or, much more likely, another even faster-moving deadly virus. This capability could not only make the difference between life and death for millions but also better ensure we aren’t blindsided, because whatever the existing supercomputers had been working on wouldn’t be neglected.

It doesn’t do you a lot of good to find a cure for the virus if you are already dead because you weren’t adequately warned about a weather event.  

With broad trends like unrest, climate change, the need to track and mitigate hostile states, the advancement of autonomous everything, and the sharply increased study of illnesses, the need for more focused supercomputers has never been greater.


Private Sector

But this isn’t just for public sector work, because autonomous everything, from robots to cars, ships, and planes, is also advancing rapidly. During this same week, Continental and NVIDIA announced a partnership to create a supercomputer focused specifically on driver-assistance tech. Continental is one of the top automotive technology suppliers, and this effort should turn it into a powerful force, driving this technology into smaller suppliers and allowing carmakers operating on very tight budgets, like Jaguar and Land Rover, to compete with their far larger competitors. The effort could have the same level of impact that personal computers had on individual productivity by putting high-level technology in the hands of users whose companies otherwise couldn’t afford it.

Companies can use these supercomputers to run simulations that both reduce development time and better ensure the success of the products they bring to market. They can do more in-depth research and development, resulting in new offerings that otherwise wouldn’t exist, and they can better analyze their existing and potential customer base to improve customer satisfaction, loyalty, and advocacy.

It is an impressive game-changer.


Wrapping Up: The More Intelligent, More Comprehensive Smart Supercomputer Present

I think this supercomputer breakthrough has the potential to transform very high-end computing into a far more accessible technology. Supercomputers do some of the most exciting and most critical jobs currently asked of any computer. A solution that can be deployed in weeks rather than in months or years will have a significant impact on our safety, our quality of life, and our satisfaction with future product offerings. It is an enormous game-changer, and, again, NVIDIA is at the heart of the change.