What are the hardware requirements for Nemotron Cascade 2 30B A3B?

Nemotron Cascade 2 30B A3B requires approximately 63.2 GB of VRAM and supports a context window of 256K tokens. Quantized variants may run on less VRAM; see the Quantized Models section on this page.

Who developed Nemotron Cascade 2 30B A3B and how large is it?

Nemotron Cascade 2 30B A3B is developed by nvidia, a model with 30b parameters. The model is published as open weights on Hugging Face and indexed on LLM Explorer with full benchmark history.

Where can I download or evaluate Nemotron Cascade 2 30B A3B?

Nemotron Cascade 2 30B A3B is hosted on Hugging Face and linked from this page. LLM Explorer also lists quantized variants and similar alternatives if available.

Nemotron Cascade 2 30B A3B by nvidia — VRAM 63.2GB, 256K context

Name: Nemotron Cascade 2 30B A3B
Author: nvidia

Nemotron Cascade 2 30B A3B is an open-source language model by nvidia. Features: 30b LLM, VRAM: 63.2GB, Context: 256K, License: other, LLM Explorer Score: 0.4.

Arxiv:2603.19220 Conversational Custom code Deploy:azure En Endpoints compatible Eval-results General-purpose Nemotron-cascade-2 Nemotron h Nvidia Reasoning Region:us Rl Safetensors Sft Sharded Tensorflow

Model Card on HF 🤗: https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B

Nemotron Cascade 2 30B A3B Benchmarks

LLME Score: 0.40194

^nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

What is the LLM Explorer Rank (Score)

Nemotron Cascade 2 30B A3B (nvidia/Nemotron-Cascade-2-30B-A3B)

🌟 Advertise your project 🚀

Nemotron Cascade 2 30B A3B Parameters and Internals

LLM Name	Nemotron Cascade 2 30B A3B
Repository 🤗	https://huggingface.co/nvidia/Nemotron-Cascade-2-30B-A3B
Model Size	30b
Required VRAM	63.2 GB
Updated	2026-07-10
Maintainer	nvidia
Model Type	nemotron_h
Model Files	5.0 GB: 1-of-13 5.0 GB: 2-of-13 5.0 GB: 3-of-13 5.0 GB: 4-of-13 5.0 GB: 5-of-13 5.0 GB: 6-of-13 5.0 GB: 7-of-13 5.0 GB: 8-of-13 5.0 GB: 9-of-13 5.0 GB: 10-of-13 5.0 GB: 11-of-13 5.0 GB: 12-of-13 3.2 GB: 13-of-13
Supported Languages	en
Model Architecture	NemotronHForCausalLM
License	other
Context Length	262144
Model Max Length	262144
Transformers Version	4.55.4
Tokenizer Class	PreTrainedTokenizerFast
Padding Token	<\|im_end\|>
Vocabulary Size	131072
Torch Data Type	bfloat16

Quantized Models of the Nemotron Cascade 2 30B A3B

Model	Likes	Downloads	VRAM
...emotron Cascade 2 30B A3B 4bit	19	843	17 GB
...emotron Cascade 2 30B A3B 8bit	8	279	33 GB
...emotron Cascade 2 30B A3B 6bit	6	298	25 GB
...ron Cascade 2 30B A3B Mlx 6bit	3	219	27 GB

Best Alternatives to Nemotron Cascade 2 30B A3B

Best Alternatives	Context / RAM	Downloads	Likes
...A Nemotron 3 Nano 30B A3B BF16	256K / 63.2 GB	1645889	752
... Nemotron 3 Nano 30B A3B NVFP4	256K / 19.3 GB	1409	8
...IA Nemotron 3 Nano 30B A3B FP8	256K / 32.7 GB	349250	352
... Nemotron 3 Nano 30B A3B NVFP4	256K / 19.3 GB	532640	151
...otron 3 Nano 30B A3B Base BF16	256K / 63.2 GB	90736	127
Nemotron 3 Nano 30B A3B	256K / 63.2 GB	174315	14
...n Labs 3 Elastic 30B A3B NVFP4	256K / 19.3 GB	5706	15
...emotron Labs 3 Elastic 12B A2B	256K / 24.5 GB	2518	7
...ron Labs 3 Elastic 30B A3B FP8	256K / 32.7 GB	3479	9
Hebatron	256K / 63.2 GB	740	14

Note: green Score (e.g. "73.2") means that the model is better than nvidia/Nemotron-Cascade-2-30B-A3B.

Rank the Nemotron Cascade 2 30B A3B Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation

What open-source LLMs or SLMs are you in search of? 54931 in total.

Email us: info@extractum.io. Our Privacy Policy | Terms and Conditions | Suggest an improvement.

Our Social Media →

Original data from HuggingFace, OpenCompass and various public git repos.

Check out Ag3ntum — our secure, self-hosted AI agent for server management.

Release v20260328a

Support LLM Explorer

Nemotron Cascade 2 30B A3B by nvidia