GPU giant says NIM will eliminate dependency headaches for the low, low cost of $4,500/year per GPU
"It is unlikely that you'll write it from scratch or write a whole bunch of Python code or anything like that," he said on stage during his GTC keynote.
This AI team, Jensen explains, might include a model designed to break down and delegate a request to various other models. Some of these models might be trained to understand business services like SAP or ServiceNow, while others might perform numerical analysis on data stored in a vector database.
Dubbed Nvidia Inference Microservices, or NIM for short, these are essentially container images containing both the model, whether open source or proprietary, and all the dependencies necessary to get it running. These containerized models can then be deployed across any number of runtimes, including Nvidia-accelerated Kubernetes nodes.
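If each specialist on that AI team ships as its own container, the delegating model amounts to a router sitting in front of them. The sketch below illustrates that pattern only and is not Nvidia's implementation. The function names, the keyword-based planner, and the two stand-in specialists are all hypothetical; a real system would use a planning model and network calls to separately hosted models.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for specialist models. In a real deployment each
# would be a call to a separately hosted, containerized model.
def business_service_model(task: str) -> str:
    return f"[business-services] handled: {task}"

def numerical_analysis_model(task: str) -> str:
    return f"[numerical-analysis] handled: {task}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "business": business_service_model,
    "numeric": numerical_analysis_model,
}

# Assumption: toy keyword routing stands in for a model that plans the work.
NUMERIC_WORDS = {"average", "sum", "total", "forecast"}

def plan(request: str) -> List[Tuple[str, str]]:
    """Break a request into (specialist, subtask) pairs."""
    steps = []
    for part in request.split(" and "):
        subtask = part.strip()
        if not subtask:
            continue
        words = set(subtask.lower().split())
        key = "numeric" if words & NUMERIC_WORDS else "business"
        steps.append((key, subtask))
    return steps

def run_team(request: str) -> List[str]:
    """Delegate each planned subtask to its specialist and collect the replies."""
    return [SPECIALISTS[key](subtask) for key, subtask in plan(request)]

if __name__ == "__main__":
    for reply in run_team("look up open SAP invoices and forecast next quarter's sales"):
        print(reply)
```

Keeping the planner separate from the specialists means any one of them could be swapped for a different model without touching the rest.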
If the idea of containerizing GPU-accelerated workloads sounds familiar, that's because it isn't exactly a new idea for Nvidia. CUDA acceleration has been supported on a wide variety of container runtimes, including Docker, Podman, Containerd, and CRI-O, for years, and it doesn't look like Nvidia's Container Runtime is going anywhere.
In addition to hardware-specific model optimizations, Nvidia is also working on enabling consistent communications between containers so that they can chat with each other via API calls. Jensen highlighted this fact during his keynote.
Asked about an internal program used within Nvidia, Meta's Llama 2 70B large language model unsurprisingly provided the definition of an unrelated term.