Nvidia Becomes A Major Model Maker With Nemotron 3

Nvidia built its business supplying chips to companies working on artificial intelligence, but today the chipmaker took a step toward becoming a more serious model maker itself by releasing a series of cutting-edge open models, along with data and tools to help engineers use them.

The move, which comes at a time when AI companies like OpenAI, Google and Anthropic are developing increasingly capable chips of their own, could be a hedge against these companies moving away from Nvidia’s technology over time.

Open models are already an important part of the AI ecosystem and many researchers and startups are using them to experiment, prototype, and build. While OpenAI and Google offer small open models, they do not update them as frequently as their rivals in China. For this reason and others, the open models of Chinese companies are currently much more popular, according to data from Hugging Face, a hosting platform for open source projects.

According to benchmark scores shared by the company ahead of release, Nvidia’s new Nemotron 3 models are among the best that can be downloaded, modified, and run on your own hardware.

“Open innovation is the foundation of AI progress,” CEO Jensen Huang said in a statement ahead of the news. “With Nemotron, we are transforming advanced AI into an open platform that provides developers with the transparency and efficiency they need to build agentic systems at scale.”

By releasing the data used to train Nemotron, Nvidia is taking a more transparent approach than many of its US rivals, which should make it easier for engineers to modify the models. The company is also releasing tools to aid in optimization and fine-tuning. The models use a new hybrid latent mixture-of-experts architecture, which Nvidia says is particularly good for building AI agents that can take actions on a computer or on the web. The company is also launching a library that allows users to train agents using reinforcement learning, which involves giving the model simulated rewards and punishments.

Nemotron 3 models come in three sizes: Nano, which has 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model's parameter count roughly corresponds to how capable it is, as well as how cumbersome it is to run. The largest models are so unwieldy that they must run on racks of expensive hardware.
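To give a rough sense of why parameter count translates into hardware, the sketch below estimates the memory needed just to hold each model's weights. It assumes 16-bit weights (2 bytes per parameter), which is a common choice but is not stated by Nvidia here; the function and variable names are illustrative, and the estimate ignores everything else a running model needs.

```python
# Rough weight-memory estimate for the three Nemotron 3 sizes.
# Assumption (not from the article): weights are stored in 16 bits, i.e. 2 bytes each.
# Memory for activations, caches, and runtime overhead is ignored.
BYTES_PER_PARAM = 2

# Parameter counts as reported in the article.
MODELS = {"Nano": 30e9, "Super": 100e9, "Ultra": 500e9}


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: ~{weight_memory_gb(params):,.0f} GB of weights")
```

Under that assumption, the weights of the largest model alone come to about a terabyte, far more than the memory on any single accelerator, which is why the biggest models are spread across many chips.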

Model Foundation

Kari Ann Briski, vice president of generative AI software for enterprise at Nvidia, said open models are important for AI builders for three reasons: builders increasingly need to customize models for particular tasks; it often helps to hand queries off to different models; and it is possible to get more intelligent responses from these models after training by having them perform a kind of simulated reasoning. “We believe open source is the foundation of AI innovation that continues to accelerate the global economy,” Briski said.

Social media giant Meta released the first advanced open model, called Llama, in February 2023. As competition has intensified, however, Meta has indicated that its future releases may not be open source.

The move is part of a larger trend in the AI industry. In the past year, American companies have moved away from openness, becoming more secretive about their research and more reluctant to tell their competitors about their latest engineering tricks.
