
NVIDIA Open-Sources Nemotron-Mini-4B-Instruct AI Model for On-Device Deployment

On September 15, the technology media outlet Marktechpost reported that NVIDIA has open-sourced the Nemotron-Mini-4B-Instruct AI model, marking another milestone in the company's AI innovation.

The Nemotron-Mini-4B-Instruct AI model is specifically designed for tasks such as role-playing, retrieval-augmented generation (RAG), and function calling. It is a small language model (SLM), distilled and optimized from the larger Nemotron-4 15B model.
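To make the function-calling use case concrete, here is a minimal sketch of how an application might detect and parse a tool call emitted by such a model. The JSON shape (`name`/`arguments`) and the sample completion are illustrative assumptions, not Nemotron-Mini-4B-Instruct's actual output schema.

```python
import json

def parse_tool_call(model_output: str):
    """Extract a JSON tool call from a model completion, if present.

    The {"name": ..., "arguments": ...} schema is a common convention,
    used here purely for illustration.
    """
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end == -1:
        return None  # plain-text answer, no tool call
    try:
        call = json.loads(model_output[start:end + 1])
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call and "arguments" in call:
        return call["name"], call["arguments"]
    return None

# A hypothetical completion containing a tool call:
completion = 'Let me check. {"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(parse_tool_call(completion))  # ('get_weather', {'city': 'Berlin'})
```

The application would then execute the named function and feed its result back to the model as a follow-up turn.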

NVIDIA employed model-compression techniques such as pruning, quantization, and distillation to create a smaller, more efficient model, making it especially suitable for on-device deployment.
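Of the techniques named above, quantization is the easiest to illustrate in isolation. The toy sketch below shows symmetric int8 quantization on a plain Python list; it is a conceptual illustration only, not NVIDIA's actual quantization recipe.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto integers in [-127, 127].

    Each weight is divided by a per-tensor scale and rounded, so the
    tensor can be stored in 8 bits per value instead of 32.
    """
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid div-by-zero
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

q, scale = quantize_int8([0.0, 1.27, -1.27])
print(q)  # [0, 127, -127]
```

Pruning (removing low-importance weights or layers) and distillation (training the small model to mimic the large one's outputs) follow the same spirit: trade a little accuracy for a much smaller memory and compute footprint.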

Despite its reduced size, the model's performance in specific scenarios like role-playing and function calling remains uncompromised, making it a practical choice for applications requiring fast, on-demand responses.

Fine-tuned from the Minitron-4B-Base model, Nemotron-Mini-4B-Instruct incorporates LLM compression techniques. One of its most notable features is its 4096-token context window, which enables it to generate longer and more coherent responses.
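A fixed 4096-token window means an application must budget its prompt: once a conversation grows past the limit, older turns have to be dropped or summarized. A minimal sketch of the drop-oldest strategy is below; the whitespace word count is a crude stand-in for a real tokenizer, which in practice would be the model's own.

```python
def trim_to_context(messages, max_tokens=4096, count=lambda s: len(s.split())):
    """Drop the oldest messages until the conversation fits the window.

    `count` approximates token usage with a whitespace split; a real
    application would count tokens with the model's tokenizer. The
    4096 default matches the context length reported for
    Nemotron-Mini-4B-Instruct.
    """
    kept = list(messages)
    while kept and sum(count(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest turn first
    return kept

history = ["hello there", "word " * 5000, "latest question"]
print(len(trim_to_context(history)))  # 1 — only the newest turn fits
```

In a real deployment the budget would also reserve room for the generated response, since input and output share the same window.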