Nemotron-30B Science Expert PEFT
Welcome to the Nemotron-30B Science Expert PEFT, a specialized parameter-efficient fine-tuning (PEFT) module designed for the nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 architecture.
Trained as part of the Mewtwo multi-adapter routing research project.
Quantitative Training Details
This adapter was trained with LoRA on a single consumer GPU.
- Hardware: 1x NVIDIA RTX 5090 (32GB VRAM)
- VRAM Utilization: ~19.3 GB (4-bit NF4 quantization)
- Training Time: ~8.1 hours (487 min)
- Dataset: ~15K samples from allenai/sciq
- Total Steps: 1,250
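As a rough sanity check on the figures above, per-step cost and samples consumed per step can be derived from the reported numbers. This is a back-of-the-envelope sketch: it assumes a single pass over the ~15K samples, which the card does not state, so the samples-per-step value is only an estimate.

```python
# Back-of-the-envelope throughput from the reported training figures.
TRAINING_MINUTES = 487   # reported wall-clock training time
TOTAL_STEPS = 1_250      # reported total steps
NUM_SAMPLES = 15_000     # reported as "~15K", so approximate

# Average wall-clock seconds spent on each step.
seconds_per_step = TRAINING_MINUTES * 60 / TOTAL_STEPS

# Assumption (not stated in the card): exactly one epoch over the data.
samples_per_step = NUM_SAMPLES / TOTAL_STEPS

print(f"{seconds_per_step:.1f} s/step")
print(f"{samples_per_step:.0f} samples/step")
```

Under that one-epoch assumption, the run averages roughly 23 seconds per step at about 12 samples per step.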
Hyperparameters:
- LoRA Rank ($r$): 64
- LoRA Alpha: 128.0
- Learning Rate: 1e-4
- Target Modules: q_proj, k_proj, v_proj, o_proj
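To make the hyperparameters concrete, the sketch below computes the scaling factor they imply and the number of trainable weights LoRA adds per adapted matrix. It assumes the standard LoRA scaling of alpha / r (the card does not say whether a variant such as rank-stabilized scaling was used). `LoraHyperparams` and `lora_params_per_matrix` are illustrative names, and the 4096 x 4096 projection shape is a hypothetical placeholder, not the real Nemotron layer size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hyperparameters as listed in this card (illustrative container)."""
    rank: int = 64
    alpha: float = 128.0
    learning_rate: float = 1e-4
    target_modules: tuple = ("q_proj", "k_proj", "v_proj", "o_proj")

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.rank


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


hp = LoraHyperparams()
print("scaling:", hp.scaling)

# Hypothetical 4096 x 4096 projection, for illustration only.
print("params per matrix:", lora_params_per_matrix(4096, 4096, hp.rank))
```

With rank 64 and alpha 128, the low-rank update is scaled by 2.0 before being added to the frozen weights.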
Intended Use & Limitations
- ✅ Intended Use: Advanced scientific reasoning, internalizing complex logic chains, and Multiple Choice Question (MCQ) optimization.
- ❌ Out-of-Scope: Open-ended chat, creative writing, multilingual translation.
- ⚠️ Limitations: Because this adapter is used with 4-bit quantization, expect minor precision losses on complex Olympiad-level geometry problems. The model is also prone to hallucination when the context exceeds 4096 tokens.
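One way to stay under the 4096-token threshold mentioned in the limitations is to budget the prompt before calling `generate`. The helpers below are a minimal sketch with hypothetical names (`fits_context`, `truncate_left`). They assume the limit applies to prompt tokens plus newly generated tokens, which the card does not specify.

```python
CONTEXT_LIMIT = 4096  # threshold quoted in the limitations above


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 limit: int = CONTEXT_LIMIT) -> bool:
    """Return True if the prompt plus the generation budget stays within the limit."""
    return prompt_tokens + max_new_tokens <= limit


def truncate_left(token_ids: list[int], max_new_tokens: int,
                  limit: int = CONTEXT_LIMIT) -> list[int]:
    """Keep only the most recent tokens so prompt plus generation fits the limit."""
    budget = limit - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Slicing from the end keeps the latest context; short prompts pass through unchanged.
    return token_ids[-budget:]


print(fits_context(prompt_tokens=3000, max_new_tokens=400))
print(len(truncate_left(list(range(5000)), max_new_tokens=400)))
```

Left truncation can cut into the chat template header, so for chat prompts it is usually safer to shorten the message content before applying the template.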
The Science Ripple (Cross-Domain Latent Transfer)
We observed a gain of 13.5 percentage points on MATH-500 (41.5% to 55.0%) even though the model never saw a math proof during training. This suggests that the dense fact retrieval and logical consistency required for SciQ acted as a "Latent Reasoning Booster", re-wiring the model's internal hierarchy to prioritize symbolic logic over baseline probability.
xychart-beta
title "Cross-Domain Reasoning Impact (Accuracy %)"
x-axis ["ARC", "HumanEval", "MATH-500"]
bar [21.0, 1.0, 55.0]
line [20.0, 50.0, 41.5]
(Blue Bar = Peak Expert Performance, Red Line = Base Model Performance)
Benchmark Table
| Benchmark | Base Model | Nemotron-30B Science Expert PEFT | Delta (percentage points) |
|---|---|---|---|
| ARC-Challenge (25-shot) | 20.0% | 21.0% | +1.0 |
| HumanEval (0-shot) | 50.0% | 1.0% | -49.0 |
| MATH-500 (0-shot) | 41.5% | 55.0% | +13.5 |
| MBPP (0-shot) | 8.0% | 0.0% | -8.0 |
Note: The near-zero scores on HumanEval and MBPP are the strongest empirical evidence of Domain-Inversion. By specializing so heavily in scientific fact-patterns, the model appears to have discarded its specialized Python generation circuits entirely.
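The deltas can be recomputed directly from the base and adapter scores reported in the table. The short sketch below does that and also derives the relative change, which is not reported in the card and is included only for context.

```python
# (base %, adapter %) pairs copied from the benchmark table.
scores = {
    "ARC-Challenge": (20.0, 21.0),
    "HumanEval": (50.0, 1.0),
    "MATH-500": (41.5, 55.0),
    "MBPP": (8.0, 0.0),
}

rows = {}
for name, (base, expert) in scores.items():
    delta_pp = expert - base             # absolute change in percentage points
    relative = delta_pp / base * 100.0   # change relative to the base score, in %
    rows[name] = (delta_pp, relative)
    print(f"{name:<14} {delta_pp:+6.1f} pp  ({relative:+.1f}% relative)")
```

Reporting the change in percentage points avoids ambiguity: the MATH-500 gain is 13.5 points in absolute terms, which is a larger figure when expressed relative to the 41.5% baseline.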
How to Use (Working Snippet)
This architecture is a Hybrid Mamba-Attention model, so typical generation caching will fail without the correct HuggingFace override.
import torch
import sys
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
model_id = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4"
adapter_id = "uditjain/Nemotron-30B-Science-Instruct-LoRI"
# 1. Load Base Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb_config,
)
# 2. Attach PEFT Adapter
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval() # Ensure dropout modules are disabled
# 3. Dynamic Cache Extraction (Mandatory for Nemotron-30B Hybrid)
try:
    model_module = sys.modules[base_model.__class__.__module__]
    HybridMambaAttentionDynamicCache = getattr(model_module, 'HybridMambaAttentionDynamicCache')
    past_key_values = HybridMambaAttentionDynamicCache(
        base_model.config, batch_size=1, dtype=torch.bfloat16, device=model.device
    )
except Exception as e:
    print(f"Warning: Failed to load custom Mamba cache. Generation may be slower or degrade. Error: {e}")
    past_key_values = None
# Format the Prompt
messages = [{"role": "user", "content": "Explain the role of the Higgs Boson in the Standard Model and why it's necessary for the definition of mass"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate Output
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        past_key_values=past_key_values,
        do_sample=False,
    )
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
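Since the adapter targets multiple-choice questions, a SciQ-style record can be turned into an MCQ prompt before it goes through the chat template. The sketch below uses the field names of the allenai/sciq dataset (`question`, `correct_answer`, `distractor1` to `distractor3`). The prompt layout and the `build_mcq_prompt` helper are assumptions for illustration, not the format used during training, and the example record is made up rather than taken from the dataset.

```python
import random


def build_mcq_prompt(record: dict, seed: int = 0) -> tuple[str, str]:
    """Format a SciQ-style record as a lettered MCQ prompt.

    Returns the prompt text and the letter of the correct option.
    """
    options = [
        record["correct_answer"],
        record["distractor1"],
        record["distractor2"],
        record["distractor3"],
    ]
    # Shuffle with a fixed seed so the option order is reproducible.
    random.Random(seed).shuffle(options)

    letters = "ABCD"
    lines = [record["question"]]
    lines += [f"{letter}. {option}" for letter, option in zip(letters, options)]
    lines.append("Answer with a single letter.")

    answer = letters[options.index(record["correct_answer"])]
    return "\n".join(lines), answer


# Illustrative record (not taken from the dataset).
example = {
    "question": "Which gas do plants absorb during photosynthesis?",
    "correct_answer": "carbon dioxide",
    "distractor1": "oxygen",
    "distractor2": "nitrogen",
    "distractor3": "helium",
}

prompt, answer = build_mcq_prompt(example)
print(prompt)
```

The returned `prompt` can be used as the `content` of the user message in the snippet above, and `answer` can be compared against the first letter of the decoded response.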
Citation & Contact
If you use this adapter or build upon the Code Paradox findings, please cite:
@misc{jain2026nemotronscience,
  author = {Udit Jain},
  title = {Nemotron-30B-Science-Instruct-LoRI},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/uditjain/Nemotron-30B-Science-Instruct-LoRI}
}
Collaboration & Queries: hello@uditjain.in
Model tree for uditjain/Nemotron-30B-Science-Instruct-LoRI
Base model: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Dataset used to train uditjain/Nemotron-30B-Science-Instruct-LoRI
Evaluation results (self-reported)
- accuracy on MATH-500: 0.550
- pass@1 on HumanEval: 0.010
- accuracy on ARC-Challenge: 0.210
- pass@1 on MBPP: 0.000