Monday, October 2, 2023

Nvidia -- The Fourth Industrial Revolution -- October 2, 2023

Locator: 45627NVIDIA. 

30-second elevator speech: CPUs will still be the heart of the computer, but it will be the GPUs that are the real heroes. If the CPU is the quarterback, the GPUs are the tight ends, the wide receivers, the running backs, and the placekickers. GPUs will be the "special teams." LOL. Seriously.

30-second elevator speech: for NASCAR aficionados, the CPU will be the driver who gets all the glory, but the GPUs will be the crew chief, the pit crew, the gofers, and the garage mechanics.

From the Nvidia website:

In a talk, now available online, NVIDIA Chief Scientist Bill Dally describes a tectonic shift in how computer performance gets delivered in a post-Moore’s law era.

Each new processor requires ingenuity and effort to invent and validate fresh ingredients. That’s radically different from a generation ago, when engineers essentially relied on the physics of ever smaller, faster chips.

The team of more than 300 that Dally leads at NVIDIA Research helped deliver a whopping 1,000x improvement in single GPU performance on AI inference over the past decade.

It’s an astounding increase that IEEE Spectrum was the first to dub “Huang’s Law” after NVIDIA founder and CEO Jensen Huang. The label was later popularized by a column in the Wall Street Journal.

1,000x Leap in GPU Performance in a Decade

The advance was a response to the equally phenomenal rise of large language models used for generative AI that are growing by an order of magnitude every year.

“That’s been setting the pace for us in the hardware industry because we feel we have to provide for this demand,” Dally said.

In his talk, Dally detailed the elements that drove the 1,000x gain.

The largest of all, a sixteen-fold gain, came from finding simpler ways to represent the numbers computers use to make their calculations.

The New Math

The latest NVIDIA Hopper architecture with its Transformer Engine uses a dynamic mix of eight- and 16-bit floating point and integer math. It’s tailored to the needs of today’s generative AI models.
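The underlying idea is easiest to see in a toy example. Below is a minimal sketch in Python using plain 8-bit integer quantization, which is illustrative only and not Hopper's actual FP8/FP16 recipe: representing each number with fewer bits cuts memory traffic and arithmetic cost, while a scale factor keeps results close to the full-precision answer.

```python
# A toy version of "simpler number formats": symmetric 8-bit integer
# quantization (illustrative only -- not Hopper's actual FP8/FP16 recipe).
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # FP32 weights
x = rng.normal(size=(4,)).astype(np.float32)    # FP32 activations

# Map weights into the signed 8-bit range [-127, 127] with one scale factor.
scale = np.abs(w).max() / 127.0
w_q = np.round(w / scale).astype(np.int8)

# The low-precision result tracks the full-precision one closely.
y_fp32 = w @ x
y_int8 = (w_q.astype(np.float32) * scale) @ x
print(np.max(np.abs(y_fp32 - y_int8)))  # small quantization error
```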
Separately, his team helped achieve a 12.5x leap by crafting advanced instructions that tell the GPU how to organize its work. These complex commands help execute more work with less energy.

As a result, computers can be “as efficient as dedicated accelerators, but retain all the programmability of GPUs,” he said.
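There is a rough CPU-side analogy for this, sketched below in Python with NumPy. It is not the GPU's instruction set, just an illustration of how a single complex operation amortizes per-operation overhead across a large amount of work.

```python
# A rough CPU-side analogy (not the GPU instruction set): one "complex"
# call that performs a whole matrix multiply amortizes per-call overhead,
# much as a fused GPU instruction amortizes instruction-fetch and
# scheduling energy across many arithmetic operations.
import time
import numpy as np

n = 256
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Many simple operations: one dot product per output element.
t0 = time.perf_counter()
c_loop = np.array([[a[i] @ b[:, j] for j in range(n)] for i in range(n)])
t_loop = time.perf_counter() - t0

# One complex operation covering exactly the same work.
t0 = time.perf_counter()
c_single = a @ b
t_single = time.perf_counter() - t0

assert np.allclose(c_loop, c_single)
print(f"many simple ops: {t_loop:.3f}s   one complex op: {t_single:.3f}s")
```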

In addition, the NVIDIA Ampere architecture added structural sparsity, an innovative way to simplify the weights in AI models without compromising the model’s accuracy. The technique brought another 2x performance increase and promises future advances, too.
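A minimal sketch of the 2:4 pattern behind Ampere's structured sparsity, in NumPy: within every group of four weights, keep the two largest by magnitude and zero the rest, giving the hardware a predictable pattern of zeros to skip. The magnitude-based pruning heuristic here is illustrative; production flows typically also fine-tune the model after pruning.

```python
# A minimal sketch of 2:4 structured sparsity: in every group of four
# weights, keep the two largest by magnitude and zero the other two.
# (Magnitude pruning is illustrative; production flows also fine-tune.)
import numpy as np

def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four."""
    w = weights.reshape(-1, 4).copy()
    drop = np.argsort(np.abs(w), axis=1)[:, :2]  # two smallest per group
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.random.default_rng(1).normal(size=(2, 8))
print(prune_2_of_4(w))  # exactly two zeros in every 4-wide group
```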

Dally described how NVLink interconnects between GPUs in a system and NVIDIA networking among systems compound the 1,000x gains in single GPU performance.

No Free Lunch

Though NVIDIA migrated GPUs from 28nm to 5nm semiconductor nodes over the decade, that technology only accounted for 2.5x of the total gains, Dally noted.
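The individual factors Dally names compound to the headline number; a quick arithmetic check:

```python
# The individual gains Dally cites compound to the headline ~1,000x:
number_formats = 16     # simpler number representations
instructions = 12.5     # complex instructions that organize work
sparsity = 2            # Ampere's structural sparsity
process_node = 2.5      # 28nm -> 5nm over the decade
print(number_formats * instructions * sparsity * process_node)  # 1000.0
```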

That’s a huge change from computer design a generation ago under Moore’s law, the observation that performance should double roughly every two years as chips became ever smaller and faster.

Those gains were described in part by Dennard scaling, essentially a physics formula defined in a 1974 paper co-authored by IBM scientist Robert Dennard. Unfortunately, the physics of shrinking hit natural limits, such as the amount of heat the ever smaller and faster devices could tolerate.
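For contrast, classic Moore's-law pacing compounds to far less over the same period:

```python
# Doubling every two years (classic Moore's-law pacing) compounds to only
# about 32x over a decade -- a small fraction of the 1,000x described above.
moore_decade = 2 ** (10 / 2)
print(moore_decade, 1000 / moore_decade)  # 32.0, roughly a 31x gap
```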

************************
Nvidia -- The Wall Street Journal

Link here.

From the linked article:

During modern computing’s first epoch, one trend reigned supreme: Moore’s Law.

Actually a prediction by Intel Corp. co-founder Gordon Moore rather than any sort of physical law, Moore’s Law held that the number of transistors on a chip doubles roughly every two years. It also meant that performance of those chips—and the computers they powered—increased by a substantial amount on roughly the same timetable. This formed the industry’s core, the glowing crucible from which sprang trillion-dollar technologies that upended almost every aspect of our day-to-day existence.

As chip makers have reached the limits of atomic-scale circuitry and the physics of electrons, Moore’s law has slowed, and some say it’s over. But a different law, potentially no less consequential for computing’s next half century, has arisen.

I call it Huang’s Law, after Nvidia Corp. chief executive and co-founder Jensen Huang. It describes how the silicon chips that power artificial intelligence more than double in performance every two years. While the increase can be attributed to both hardware and software, its steady progress makes it a unique enabler of everything from autonomous cars, trucks and ships to the face, voice and object recognition in our personal gadgets.

Between November 2012 and this May, performance of Nvidia’s chips increased 317 times for an important class of AI calculations, says Bill Dally, chief scientist and senior vice president of research at Nvidia. On average, in other words, the performance of these chips more than doubled every year, a rate of progress that makes Moore’s Law pale in comparison.
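The quoted figure is consistent with that claim. Assuming "this May" refers to May 2020 (the column appeared in 2020), the span is roughly 7.5 years:

```python
# Sanity check on the quoted figure. Assuming "this May" means May 2020
# (the column ran in 2020), November 2012 to May 2020 is about 7.5 years.
years = 7.5  # assumed span; the excerpt names only the endpoints
annual_rate = 317 ** (1 / years)
print(round(annual_rate, 2))  # ~2.16 -- more than doubling every year
```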
