Inference Optimized Checkpoints (with Model Optimizer) Collection
A collection of generative models quantized and optimized for inference with Model Optimizer. • 46 items • Updated about 16 hours ago • 69
nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-NVFP4-QAD Image-Text-to-Text • 8B • Updated Nov 13, 2025 • 2.82k • 14
nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-FP8 Image-Text-to-Text • 13B • Updated Nov 13, 2025 • 3.08k • 44
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1-FP4-QAD Image-Text-to-Text • 6B • Updated Oct 9, 2025 • 97 • 11