
NVIDIA TensorRT 3 Dramatically Accelerates AI Inference for Hyperscale Data Centers


Alibaba, Baidu, Tencent, JD.com and Hikvision Adopt NVIDIA TensorRT for Programmable Inference Acceleration

GTC China – NVIDIA today unveiled new NVIDIA® TensorRT™ 3 AI inference software that sharply boosts the performance and slashes the cost of inferencing from the cloud to edge devices, including self-driving cars and robots.

The combination of TensorRT 3 with NVIDIA GPUs delivers ultra-fast and efficient inferencing across all frameworks for AI-enabled services — such as image and speech recognition, natural language processing, visual search and personalized recommendations. TensorRT and NVIDIA Tesla® GPU accelerators are up to 40 times faster than CPUs(1) at one-tenth the cost of CPU-based solutions.(2)

“Internet companies are racing to infuse AI into services used by billions of people. As a result, AI inference workloads are growing exponentially,” said NVIDIA founder and CEO Jensen Huang. “NVIDIA TensorRT is the world’s first programmable inference accelerator. With CUDA programmability, TensorRT will be able to accelerate the growing diversity and complexity of deep neural networks. And with TensorRT’s dramatic speed-up, service providers can affordably deploy these compute intensive AI workloads.”

More than 1,200 companies have already begun using NVIDIA’s inference platform across a wide spectrum of industries to discover new insights from data and deploy intelligent services to businesses and consumers. Among them are Amazon, Microsoft, Facebook and Google, as well as leading Chinese enterprise companies like Alibaba, Baidu, JD.com, iFLYTEK, Hikvision, Tencent and WeChat.

“NVIDIA’s AI platform, using TensorRT software on Tesla GPUs, is an outstanding technology at the forefront of enabling SAP’s growing requirements for inferencing,” said Juergen Mueller, chief innovation officer at SAP. “TensorRT and NVIDIA GPUs make real-time service delivery possible, with maximum machine learning performance and versatility to meet our customers’ needs.”

“JD.com relies on NVIDIA GPUs and software for inferencing in our data centers,” said Andy Chen, senior director of AI and Big Data at JD. “Using NVIDIA’s TensorRT on Tesla GPUs, we can simultaneously inference 1,000 HD video streams in real time, with 20 times fewer servers. NVIDIA’s deep learning platform provides outstanding performance and efficiency for JD.”

TensorRT 3 is a high-performance optimizing compiler and runtime engine for production deployment of AI applications. It can rapidly optimize, validate and deploy trained neural networks for inference on hyperscale data centers and on embedded or automotive GPU platforms.

It offers highly accurate INT8 and FP16 network execution, which can save data center operators tens of millions of dollars in acquisition and annual energy costs. A developer can use it to take a trained neural network and, in just one day, create a deployable inference solution that runs 3-5x faster than their training framework.
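
As a rough illustration of what reduced-precision execution involves, the following Python sketch applies symmetric linear quantization to a handful of floating-point values. It is not TensorRT code and does not reflect how TensorRT calibrates INT8 networks; the function names quantize_int8 and dequantize, the max-abs scale choice, and the sample values are all assumptions made for the example.

```python
def quantize_int8(values):
    """Symmetric linear quantization: map floats to integers in [-127, 127].

    Hypothetical helper for illustration only; not part of any NVIDIA API.
    """
    max_abs = max(abs(v) for v in values)
    # One scale factor for the whole list; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


# Made-up sample values standing in for a few network weights.
weights = [0.82, -1.27, 0.005, 0.4]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Largest difference between an original value and its round-tripped value.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in one byte instead of the four used by 32-bit floats, and the round-trip error stays within half of one scale step, which is the general trade-off behind INT8 inference.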

To further accelerate AI, NVIDIA introduced additional software, including:

  • DeepStream SDK: NVIDIA DeepStream SDK delivers real-time, low-latency video analytics at scale. It helps developers integrate advanced video inference capabilities, including INT8 precision and GPU-accelerated transcoding, to support AI-powered services like object classification and scene understanding for up to 30 HD streams in real time on a single Tesla P4 GPU accelerator.
  • CUDA 9: The latest version of CUDA®, NVIDIA’s accelerated computing software platform, speeds up HPC and deep learning applications with support for NVIDIA Volta architecture-based GPUs, up to 5x faster libraries, a new programming model for thread management and updates to debugging and profiling tools. CUDA 9 is optimized to deliver maximum performance on Tesla V100 GPU accelerators.

Inference for the Data Center
Data center managers constantly balance performance and efficiency to keep their server fleets at maximum productivity. Tesla GPU-accelerated servers can replace over a hundred hyperscale CPU servers for deep learning inference applications and services, freeing up precious rack space, reducing energy and cooling requirements, and reducing cost by as much as 90 percent.

NVIDIA Tesla GPU accelerators provide the optimal inference solution — combining the highest throughput, best efficiency and lowest latency on deep learning inference workloads to power new AI-driven experiences.

Inference for Self-Driving Cars and Embedded Applications
With NVIDIA’s unified architecture, deep neural networks on every deep learning framework can be trained on NVIDIA DGX™ systems in the data center, and then deployed into all types of devices — from robots to autonomous vehicles — for real-time inferencing at the edge.

TuSimple, a startup developing autonomous trucking technology, increased inferencing performance by 30 percent after TensorRT optimization. In June, the company successfully completed a 170-mile Level 4 test drive from San Diego to Yuma, Arizona, using NVIDIA GPUs and cameras as the primary sensor. The performance gains from TensorRT allow TuSimple to analyze additional camera data, and add new AI algorithms to their autonomous trucks, without sacrificing response time.

Keep Current on NVIDIA

Subscribe to the NVIDIA blog, follow us on Facebook, Google+, Twitter, LinkedIn and Instagram, and view NVIDIA videos on YouTube and images on Flickr.


NVIDIA’s (NASDAQ: NVDA) invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots and self-driving cars that can perceive and understand the world. More information at http://nvidianews.nvidia.com/.

Since 1993, NVIDIA (NASDAQ: NVDA) has pioneered the art and science of visual computing. The company's technologies are transforming a world of displays into a world of interactive discovery -- for everyone from gamers to scientists, and consumers to enterprise customers.
