Planned new standard for AI could accelerate development

By Marcus Law
A proposed new standard for the future of AI processing could provide significant benefits, according to a white paper published by Nvidia, Arm and Intel.

A joint white paper on the future of AI processing has been published in a move which could accelerate development by optimising memory usage.

Nvidia, Arm and Intel have jointly authored a white paper, FP8 Formats for Deep Learning, which describes an eight-bit floating point (FP8) specification aimed at accelerating AI development by optimising memory usage for both AI training and inference.

FP8 is an interchange format that will allow software ecosystems to share neural network (NN) models easily. Models developed on one platform can then run on other platforms without the overhead of converting vast amounts of model data between formats, while keeping task loss to a minimum.

Eight-bit format strikes good balance between hardware and software

AI processing requires full-stack innovation across hardware and software platforms to address the growing computational demands of neural networks. One key lever is the use of lower-precision number formats, which improve computational efficiency, reduce memory usage and optimise for interconnect bandwidth.

To realise these benefits, the industry has moved from 32-bit precision to 16-bit, and now even to eight-bit formats. Transformer networks, which are one of the most important innovations in AI, benefit in particular from eight-bit floating point precision.
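To put the memory saving in concrete terms, the short Python sketch below works out how much storage the weights of a model would need at each precision. The parameter count is a hypothetical figure chosen purely for illustration, and the calculation covers weights only, not activations or optimiser state.

```python
# Bits used to store a single value at each precision.
BITS_PER_VALUE = {"FP32": 32, "FP16": 16, "FP8": 8}


def weight_memory_gib(num_parameters: int, bits: int) -> float:
    """Return the storage needed for the weights alone, in GiB."""
    return num_parameters * bits / 8 / 2**30


# Hypothetical model size, assumed for illustration only.
num_parameters = 1_000_000_000

footprint = {
    name: weight_memory_gib(num_parameters, bits)
    for name, bits in BITS_PER_VALUE.items()
}

for name, gib in footprint.items():
    print(f"{name}: {gib:.2f} GiB")
```

Each halving of the bit width halves the storage, so under these assumptions an FP8 copy of the weights takes a quarter of the space of an FP32 copy.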

In a blog post, Nvidia’s director of product marketing Shar Narasimhan says FP8 “minimises deviations from existing IEEE 754 floating point formats with a good balance between hardware and software to leverage existing implementations, accelerate adoption, and improve developer productivity”.

“E5M2 uses five bits for the exponent and two bits for the mantissa and is a truncated IEEE FP16 format,” he adds. “In circumstances where more precision is required at the expense of some numerical range, the E4M3 format makes a few adjustments to extend the range representable with a four-bit exponent and a three-bit mantissa.

“The new format saves additional computational cycles since it uses just eight bits. It can be used for both AI training and inference without requiring any re-casting between precisions. Furthermore, by minimising deviations from existing floating point formats, it enables the greatest latitude for future AI innovation while still adhering to current conventions.”
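The trade-off between the two encodings can be illustrated with a small decoder. The Python sketch below is not a reference implementation: it assumes exponent biases of 15 for E5M2 and 7 for E4M3, that E5M2 keeps IEEE-style infinities and NaNs, and that E4M3 reserves only the all-ones bit pattern for NaN. Those details follow our reading of the white paper and are not spelled out in the article; the function names are our own.

```python
import math


def decode_fp8(byte: int, exp_bits: int, man_bits: int, bias: int,
               ieee_specials: bool) -> float:
    """Decode one FP8 byte laid out as sign | exponent | mantissa."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exponent = (byte >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = byte & ((1 << man_bits) - 1)
    max_exponent = (1 << exp_bits) - 1
    max_mantissa = (1 << man_bits) - 1

    if exponent == max_exponent:
        if ieee_specials:
            # Assumed E5M2 behaviour: IEEE-style infinity and NaN.
            return sign * math.inf if mantissa == 0 else math.nan
        if mantissa == max_mantissa:
            # Assumed E4M3 behaviour: only the all-ones pattern is NaN.
            return math.nan

    if exponent == 0:
        # Subnormal values have no implicit leading one.
        return sign * mantissa * 2.0 ** (1 - bias - man_bits)

    return sign * (1 + mantissa / (1 << man_bits)) * 2.0 ** (exponent - bias)


def decode_e5m2(byte: int) -> float:
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)


def decode_e4m3(byte: int) -> float:
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)


# Largest finite value in each format under the assumptions above.
max_e5m2 = decode_e5m2(0x7B)
max_e4m3 = decode_e4m3(0x7E)
print(max_e5m2, max_e4m3)
```

Under these assumptions E5M2 reaches a far larger maximum value, while E4M3 spends its extra mantissa bit on finer steps between neighbouring values, which matches the range-versus-precision trade-off Narasimhan describes.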

High-accuracy training and inference with four-fold boost in efficiency

In June, British semiconductor company Graphcore released a 30-page study that showed the superior performance of low-precision floating point formats, along with the long-term benefits of reducing power consumption in training initiatives.

“Low precision numerical formats can be a key component of large machine learning models that provide state of the art accuracy while reducing their environmental impact,” the researchers wrote. “In particular, by using eight-bit floating point arithmetic the energy efficiency can be increased by up to 4× with respect to float-16 arithmetic and up to 16× with respect to float-32 arithmetic.”

Graphcore co-founder and chief technology officer Simon Knowles said that, along with “tremendous performance and efficiency benefits”, a common standard would be “an opportunity for the industry to settle on a single, open standard, rather than ushering in a confusing mix of competing formats”.

“The proposed FP8 format shows comparable accuracy to 16-bit precisions across a wide array of use cases, architectures, and networks,” Narasimhan adds. “Results on transformers, computer vision, and GAN networks all show that FP8 training accuracy is similar to 16-bit precisions while delivering significant speedups.”

A step towards standardisation of AI

The specification has been published in an open, licence-free format to encourage broad industry adoption. Nvidia, Arm and Intel will also submit this proposal to IEEE.

In a blog post on Arm’s website, Neil Burgess and Sangwon Ha wrote: “We firmly believe in the benefits of the industry coalescing around one 8-bit floating point format, enabling developers to focus on innovation and differentiation where it really matters. 

“We’re excited to see how FP8 advances AI development in the future.”

