Ambuj-Tripathi Indian Legal Llama – GGUF
Built with Llama 3.2 | Fine-tuned & Converted by Ambuj Kumar Tripathi
Overview
GGUF build of a QLoRA fine-tuned Llama 3.2 1B model trained on Indian legal data. Runs locally on CPU; no GPU required.
Developed by: Ambuj Kumar Tripathi
Base Model: unsloth/llama-3.2-1b-instruct
Training Method: QLoRA (4-bit quantization + LoRA adapters)
GGUF Quantization: Q4_K_M
Training Data: 14,543 Indian Legal QA pairs
License: Llama 3.2 Community License
Training Data
| Dataset | Coverage |
|---|---|
| Indian Constitution QA | ✅ |
| IPC (Indian Penal Code) QA | ✅ |
| CrPC (Criminal Procedure Code) QA | ✅ |
| Total | 14,543 examples |
⚠️ BEFORE YOU START
⚠️ Version 0.1 Alpha – Compute-Constrained Proof of Concept
🚨 Please read before using this model.
This version was built to validate the complete QLoRA → GGUF pipeline on zero-cost infrastructure. Due to free-tier compute limits, training was capped at 100 steps (about 5.5% of the full dataset, i.e. less than one epoch).
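The compute budget can be sanity-checked with a little arithmetic. The sketch below assumes an effective batch size of 8 (per-device batch × gradient accumulation), which is an assumption, but it is consistent with the 800 examples and ~1,820 full-epoch steps quoted elsewhere in this card:

```python
import math

total_examples = 14_543
steps_trained = 100
effective_batch = 8  # assumption: per-device batch size x gradient accumulation

examples_seen = steps_trained * effective_batch      # examples consumed in 100 steps
coverage = examples_seen / total_examples            # fraction of the dataset seen
full_epoch_steps = math.ceil(total_examples / effective_batch)

print(examples_seen)             # 800
print(round(coverage * 100, 1))  # 5.5 (% of the dataset)
print(full_epoch_steps)          # 1818, i.e. roughly the ~1,820 roadmap steps
```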
What works ✅
- Domain locking – the model refuses non-legal queries
- Indian legal tone and structure learned
- IPC/CrPC/Constitution query format understood
Known Limitations ❌
- Factual hallucination – Article/Section numbers may be incorrect
- Underfitting – only 800 of 14,543 examples were seen during training
- Do not use for actual legal research or advice
Roadmap 🚧
- Full epoch training (~1,820 steps) planned on Kaggle GPU
- Target: factual accuracy + reduced hallucination
For production-grade Indian Legal AI with RAG retrieval, see:
📁 Files in This Repo
| File | Purpose |
|---|---|
| llama-3.2-1b-instruct.Q4_K_M.gguf | Quantized model weights (~808 MB) |
| model.yaml | Auto-config for LM Studio (recommended) |
| Must-Final-Fixed.preset.json | Manual preset backup for any local tool |
💻 How to Run Locally
Option 1 – LM Studio (Recommended)
Thanks to the included model.yaml, setup is fully automatic in LM Studio.
- Open LM Studio
- Search for: invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF
- Download the GGUF file
- Start chatting
All settings (system prompt, temperature, repetition penalty) are applied automatically via model.yaml.
Option 2 – Manual Preset (Backup)
Use this if your tool does not support automatic config via model.yaml.
- Download Must-Final-Fixed.preset.json from the Files tab
- Import it into LM Studio or Jan
- Load the GGUF model
- Start a new chat
🧪 Quick Test
Ask these after setup:
Who built you?
→ I was fine-tuned and deployed by Ambuj Kumar Tripathi.
IPC 302 kya hai? (Hindi: "What is IPC 302?")
→ Section 302 of the Indian Penal Code deals with punishment for murder...
Hello how are you?
→ I only handle Indian legal questions.
Important Notice
This model was fine-tuned and deployed by Ambuj Kumar Tripathi for educational, research, and skill-development purposes only.
The training data used in this project was collected from publicly available legal-learning resources and a Kaggle dataset used strictly for learning, experimentation, and non-commercial fine-tuning. This repository and its releases are not intended to provide legal advice, are not offered as a commercial legal product, and should be treated as an experimental AI learning project.
Correct Attribution
- Fine-tuned and deployed by: Ambuj Kumar Tripathi
- If the model output mentions any other creator name, including names appearing from legacy training data, treat that as an incorrect model artifact and not as the correct attribution.
Known Limitations
- The model may occasionally produce incorrect creator-name references due to legacy training-data artifacts.
- The model may occasionally output formatting artifacts such as special tokens in some local GGUF runtimes.
- Outputs may contain hallucinations or inaccuracies and should always be independently verified.
GGUF / Local Runtime Note
If you are running the GGUF model locally in tools like LM Studio, some raw model behaviors may still appear depending on the prompt template and runtime settings. For best results, use a strict system prompt and low-temperature preset.
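As a concrete starting point, here is a minimal sketch of such a strict, low-temperature configuration. The values and system prompt below are illustrative assumptions, not the settings actually shipped in model.yaml or the preset file:

```python
# Illustrative conservative settings for local GGUF runtimes (values are assumptions)
preset = {
    "temperature": 0.2,     # low temperature reduces rambling and invented citations
    "top_p": 0.9,
    "repeat_penalty": 1.3,  # mirrors the repetition_penalty used in the Python example
    "max_tokens": 200,
}

strict_system_prompt = (
    "You are an Indian legal assistant fine-tuned by Ambuj Kumar Tripathi. "
    "Answer only questions about Indian law (Constitution, IPC, CrPC). "
    "Politely refuse anything else."
)

# Keep sampling conservative for factual legal output
assert preset["temperature"] <= 0.3
```

Whatever runtime you use, the key idea is the same: pin down the system prompt explicitly and keep sampling temperature low.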
Legal Disclaimer
This model is provided strictly for educational and training purposes only. It does not constitute legal advice, does not create any lawyer-client relationship, and should not be relied on for real legal decisions. Always consult official legal sources and a qualified advocate.
How to Run Locally
Option 1 – LM Studio (no code required): download LM Studio from lmstudio.ai, search for invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF, then download and start chatting locally (detailed steps above).
Option 2 – llama.cpp, straight from the command line:
```shell
llama-cli -hf invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF --jinja
```
Use via Python (QLoRA adapter)
```python
from unsloth import FastLanguageModel

# Load the 4-bit QLoRA adapter checkpoint
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="invincibleambuj/llama-3.2-1b-legal-india-qlora",
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

inputs = tokenizer(
    "### Instruction:\nWhat is IPC Section 302?\n\n### Response:\n",
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
⚠️ Important: Chat Template & Prompt Format
This model is fine-tuned on the strict Meta Llama 3 Instruct architecture. How you prompt it depends on how you are using it:
1️⃣ For Normal Users (LM Studio / Ollama)
You don't need to do anything! Just download the GGUF file and load it. Your software will automatically detect the Llama-3 format and apply the correct settings (thanks to the included model.yaml and preset.json). You will get highly detailed, accurate legal answers right out of the box.
2️⃣ For Developers (Custom Python / LangChain / API)
If you are building your own custom app (like Gradio) or using pure Python (transformers / llama.cpp directly), you MUST format your prompts using the native Llama 3 tags.
❌ Do NOT use User: or Assistant: prefixes (the model will ignore instructions and give one-line answers).
✅ DO use the exact format below:
```
<|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```
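For example, a raw-prompt string in this format can be assembled by hand. The helper name below is illustrative (it is not part of this repo), and the system prompt is just an example:

```python
def build_llama3_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble a raw Llama 3 Instruct prompt without a tokenizer chat template."""
    return (
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_prompt}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are an Indian legal assistant. Answer only Indian legal questions.",
    "What is IPC Section 302?",
)
print(prompt)
```

If you are using the transformers tokenizer, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces the same structure automatically.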
Available Files
| File | Size | Format |
|---|---|---|
| llama-3.2-1b-instruct.Q4_K_M.gguf | ~807MB | GGUF Q4_K_M |
License
Llama 3.2 Community License – free for personal and commercial use.
This model was fine-tuned (2x faster) and converted to GGUF format using Unsloth.
Ollama
An Ollama Modelfile is included for easy deployment.
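For reference, a minimal Modelfile for this GGUF might look like the sketch below. The exact contents of the included file may differ; the system prompt and parameter values here are illustrative assumptions:

```
FROM ./llama-3.2-1b-instruct.Q4_K_M.gguf
SYSTEM "You are an Indian legal assistant. Answer only Indian legal questions."
PARAMETER temperature 0.2
PARAMETER repeat_penalty 1.3
```

With a Modelfile in the current directory, `ollama create indian-legal-llama -f Modelfile` registers the model and `ollama run indian-legal-llama` starts a chat (the model name here is arbitrary).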