NVIDIA Further Boosts AI Performance By 3x For GeForce RTX GPUs, RTX PC & RTX Workstations With Latest Driver

NVIDIA has further boosted the AI performance of its GeForce RTX GPUs & RTX AI PC platforms with the latest R555 driver release.

NVIDIA’s GeForce RTX GPUs & RTX PCs Offer The Fastest AI Performance Across All Segments, Now Boosted By 3X With Latest Drivers

During today’s Microsoft Build, NVIDIA announced a range of new AI performance optimizations that are now available on the RTX platform, which includes GeForce RTX GPUs, workstations, and PCs.

The new optimizations are specifically targeted at a range of LLMs (Large Language Models) that power the latest Generative AI experiences. Using the latest R555 drivers, NVIDIA’s RTX GPUs and AI PC platforms now offer up to 3x faster AI performance with ONNX Runtime (ORT) and DirectML. These two tools are used to run AI models locally on Windows PCs.
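
As a rough illustration of how an application might opt into DirectML when running a model through ONNX Runtime, the Python sketch below selects the DirectML execution provider when it is available and falls back to the CPU otherwise. The helper names `pick_providers` and `load_session` and the `model.onnx` path are hypothetical; the sketch assumes the `onnxruntime-directml` package is installed and is not taken from NVIDIA's or Microsoft's materials.

```python
# Minimal sketch: prefer the DirectML execution provider in ONNX Runtime,
# falling back to CPU. Helper names and the model path are hypothetical.

PREFERRED = ["DmlExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available):
    """Return the preferred providers that are actually available, in order."""
    return [p for p in PREFERRED if p in available]


def load_session(model_path):
    """Create an inference session (assumes onnxruntime-directml is installed)."""
    # Imported lazily so pick_providers() can be used without the package.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


# Usage (not executed here): session = load_session("model.onnx")
```

Whether a given model actually benefits from the R555 optimizations depends on the operators it uses and on the driver and runtime versions installed.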

In addition, WebNN has also been accelerated on RTX GPUs via DirectML. WebNN is an application programming interface that lets web developers deploy new AI models. Microsoft is also working with NVIDIA to further accelerate RTX GPU performance while adding DirectML support to PyTorch. The following is a full list of capabilities that the new R555 drivers offer for GeForce RTX GPUs and RTX PCs:

  • Support for DQ-GEMM metacommand to handle INT4 weight-only quantization for LLMs
  • New RMSNorm normalization methods for Llama 2, Llama 3, Mistral and Phi-3 models
  • Group and multi-query attention mechanisms, and sliding window attention to support Mistral
  • In-place KV updates to improve attention performance
  • Support for GEMM of non-multiple-of-8 tensors to improve context phase performance
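
To give a feel for two of the items in the list above, here is a small pure-Python sketch of INT4 weight-only quantization (weights stored as 4-bit integers plus a scale, activations kept in floating point) and of RMSNorm. The function names and the symmetric per-group scheme are assumptions chosen for illustration; the actual DQ-GEMM metacommand and RMSNorm kernels run on the GPU and may differ in detail.

```python
import math


def quantize_int4(weights):
    """Symmetric INT4 quantization of one weight group: ints in [-8, 7] plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequant_dot(q, scale, activations):
    """Dequantize the weights on the fly and take a dot product with float activations."""
    return sum((qi * scale) * a for qi, a in zip(q, activations))


def rms_norm(x, gain, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a per-element gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gain, x)]
```

Only the weights are quantized here, which is what "weight-only" refers to: the memory footprint of the model shrinks while the arithmetic on activations stays in higher precision.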

In performance benchmarks of the ORT generative AI extension released by Microsoft, NVIDIA shows gains across the board for both INT4 and FP16 data types. The improvements reach up to 3x thanks to the optimization techniques added within this extension for LLMs such as Phi-3, Llama 3, Gemma, and Mistral.

Besides these enhancements, NVIDIA has been leading the charge in the consumer AI PC space with its powerful TensorRT and TensorRT-LLM suite. The company also offers a diverse range of solutions powered by the dedicated AI hardware in its GPUs, such as Tensor Cores.

These solutions include the game-changing DLSS Super Resolution technology, NVIDIA ACE, RTX Remix, Omniverse, Broadcast, RTX Video, and several other technologies. NVIDIA’s GPUs offer up to 1300 TOPS of AI compute, which is miles ahead of the fastest chips coming out this year, which are only expected to top 100 TOPS. Furthermore, new RTX AI PCs will come equipped with the latest NVIDIA RTX GPUs, further fueling the platform and pushing AI further into the consumer space.
