NVIDIA Data Center
Deep Learning Product Performance
Reproduce these results on your system by following the instructions in the Measuring Training and Inferencing Performance on NVIDIA AI Platforms Reviewer’s Guide.
Training to Convergence
Deploying AI in real-world applications requires training networks to convergence at a specified accuracy. This is the best methodology to test whether AI systems are ready to be deployed in the field to deliver meaningful results.
AI Inference
Real-world inference demands high throughput and low latency with maximum efficiency across use cases. An industry-leading solution lets customers quickly deploy AI models into production with the highest performance from data center to edge.
AI Pipeline
NVIDIA Riva is an application framework for multimodal conversational AI services that deliver real-time performance on GPUs.
Related Resources
High-Performance Computing (HPC) Performance
Review the latest GPU-acceleration factors of popular HPC applications.
- Training
- Learn how NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1.
- Read how to Boost Llama 3.1 405B Throughput by Another 1.5x on NVIDIA H200 Tensor Core GPUs and NVLink Switch.
- Read why training to convergence is essential for enterprise AI adoption.
- Get up and running quickly with NVIDIA’s complete solution stack:
- Pull software containers from NVIDIA® NGC™.
- Read how NVIDIA’s supercomputer won every benchmark in MLPerf HPC 2.0.
- Inference
- NVIDIA Blackwell sets new LLM Inference records in MLPerf Inference v4.1.
- Read the inference whitepaper to explore the evolving landscape and get an overview of inference platforms.
- Learn how dynamic batching can increase Triton throughput in Benefits of Triton.
- For additional data on Triton performance in offline and online server scenarios, please refer to ResNet-50 v1.5.
- Power high-throughput, low-latency inference with NVIDIA’s complete solution stack:
- Achieve the most efficient inference performance with NVIDIA® TensorRT™ running on NVIDIA Tensor Core GPUs.
- Maximize performance and simplify the deployment of AI models with the NVIDIA Triton™ Inference Server.
- Pull software containers from NVIDIA® NGC™ to race into production.
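As a concrete illustration of the dynamic batching mentioned above, here is a minimal sketch of a Triton model configuration (`config.pbtxt`) that enables it. The model name, backend, and tensor shapes are hypothetical placeholders for a ResNet-50-style classifier, not values from this page:

```
# config.pbtxt — hypothetical example model served by Triton Inference Server
name: "resnet50"
platform: "tensorrt_plan"
max_batch_size: 32
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
# Dynamic batching lets Triton combine individual inference requests
# into larger batches on the server, improving GPU utilization.
dynamic_batching {
  preferred_batch_size: [ 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}
```

With `dynamic_batching` enabled, Triton briefly queues incoming requests (up to `max_queue_delay_microseconds`) so they can be grouped toward a preferred batch size, trading a small amount of latency for substantially higher throughput.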
- AI Pipeline
- Download and get started with NVIDIA Riva.