Ambuj-Tripathi Indian Legal Llama — GGUF

Built with Llama 3.2 | Fine-tuned & Converted by Ambuj Kumar Tripathi

Overview

GGUF build of a QLoRA fine-tuned Llama 3.2 1B model trained on Indian legal data. Runs locally on CPU — no GPU required.

Developed by: Ambuj Kumar Tripathi
Base Model: unsloth/llama-3.2-1b-instruct
Training Method: QLoRA (4-bit base + LoRA adapters)
GGUF Quantization: Q4_K_M
Training Data: 14,543 Indian Legal QA pairs
License: Llama 3.2 Community License

Training Data

| Dataset | Coverage |
| --- | --- |
| Indian Constitution QA | ✅ |
| IPC (Indian Penal Code) QA | ✅ |
| CrPC (Criminal Procedure Code) QA | ✅ |
| **Total** | 14,543 examples |

⚠️ BEFORE YOU START


⚠️ Version 0.1 Alpha — Compute-Constrained Proof of Concept

🚨 Please read before using this model.

This version was built to validate the complete QLoRA → GGUF pipeline on zero-cost infrastructure. Due to free-tier compute limits, training was capped at 100 steps (~5.5% of the full dataset, i.e. less than one epoch).

What works ✅

  • Domain locking — model refuses non-legal queries
  • Indian legal tone and structure learned
  • IPC/CrPC/Constitution query format understood

Known Limitations ❌

  • Factual hallucination — Article/Section numbers may be incorrect
  • Underfitting — only 800 of 14,543 examples seen during training
  • Do not use for actual legal research or advice
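The "800 of 14,543 examples" and "~5.5%" figures are consistent with the 100-step cap under an assumed effective batch size of 8 (the batch size is not stated in this card — treat it as an assumption):

```python
steps = 100
effective_batch_size = 8          # assumption; not stated in the card
total_examples = 14_543

seen = steps * effective_batch_size            # 800 examples seen
coverage = seen / total_examples * 100         # ~5.5% of one epoch
full_epoch_steps = total_examples // effective_batch_size  # ~1,817, matching the "~1,820 steps" roadmap figure

print(f"{seen} examples seen, {coverage:.1f}% of the dataset")
```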

Roadmap 🔧

  • Full epoch training (~1,820 steps) planned on Kaggle GPU
  • Target: factual accuracy + reduced hallucination

For production-grade Indian Legal AI with RAG retrieval, see:

πŸ“ Files in This Repo

| File | Purpose |
| --- | --- |
| llama-3.2-1b-instruct.Q4_K_M.gguf | Quantized model weights (~808 MB) |
| model.yaml | Auto-config for LM Studio (recommended) |
| Must-Final-Fixed.preset.json | Manual preset backup for any local tool |

💻 How to Run Locally

Option 1 — LM Studio (Recommended)

The included model.yaml lets LM Studio configure this model automatically.

  1. Open LM Studio
  2. Search for: invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF
  3. Download the GGUF file
  4. Start chatting

All settings (system prompt, temperature, repetition penalty) are applied automatically via model.yaml.

Option 2 — Manual Preset (Backup)

Use this if your tool does not support automatic config via model.yaml.

  1. Download Must-Final-Fixed.preset.json from the Files tab
  2. Import it into LM Studio or Jan
  3. Load the GGUF model
  4. Start a new chat

🧪 Quick Test

Ask these after setup:

Who built you?
→ I was fine-tuned and deployed by Ambuj Kumar Tripathi.

IPC 302 kya hai? ("What is IPC 302?")
→ Section 302 of the Indian Penal Code deals with punishment for murder...

Hello how are you?
→ I only handle Indian legal questions.

Important Notice

This model was fine-tuned and deployed by Ambuj Kumar Tripathi for educational, research, and skill-development purposes only.

The training data used in this project was collected from publicly available legal-learning resources and a Kaggle dataset used strictly for learning, experimentation, and non-commercial fine-tuning. This repository and its releases are not intended to provide legal advice, are not offered as a commercial legal product, and should be treated as an experimental AI learning project.

Correct Attribution

  • Fine-tuned and deployed by: Ambuj Kumar Tripathi
  • If the model output mentions any other creator name, including names appearing from legacy training data, treat that as an incorrect model artifact and not as the correct attribution.

Known Limitations

  • The model may occasionally produce incorrect creator-name references due to legacy training-data artifacts.
  • The model may occasionally output formatting artifacts such as special tokens in some local GGUF runtimes.
  • Outputs may contain hallucinations or inaccuracies and should always be independently verified.

GGUF / Local Runtime Note

If you are running the GGUF model locally in tools like LM Studio, some raw model behaviors may still appear depending on the prompt template and runtime settings. For best results, use a strict system prompt and low-temperature preset.

Legal Disclaimer

This model is provided strictly for educational and training purposes only. It does not constitute legal advice, does not create any lawyer-client relationship, and should not be relied on for real legal decisions. Always consult official legal sources and a qualified advocate.

How to Run Locally

Option 1 — LM Studio (No code required)

  1. Download LM Studio from lmstudio.ai
  2. Search: invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF
  3. Download and start chatting locally

Option 2 — llama.cpp

```
llama-cli -hf invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF --jinja
```

Use via Python (QLoRA adapter)

```python
from unsloth import FastLanguageModel

# Load the 4-bit QLoRA adapter checkpoint
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="invincibleambuj/llama-3.2-1b-legal-india-qlora",
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to inference mode

inputs = tokenizer(
    "### Instruction:\nWhat is IPC Section 302?\n\n### Response:\n",
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

⚠️ Important: Chat Template & Prompt Format

This model is fine-tuned on the Meta Llama 3 Instruct chat format. How you prompt it depends on how you are using it:

1️⃣ For Normal Users (LM Studio / Ollama)

You don't need to do anything: just download the GGUF file and load it. Your software will detect the Llama 3 format and apply the correct settings automatically (via the included model.yaml and preset.json), so you get properly formatted legal answers out of the box.

2️⃣ For Developers (Custom Python / LangChain / API)

If you are building your own custom app (like Gradio) or using pure Python (transformers / llama.cpp directly), you MUST format your prompts using the native Llama 3 tags.

❌ Do NOT use `User:` or `Assistant:` prefixes (the model will ignore instructions and give one-line answers). ✅ Use the exact format below:

```
<|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```
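For developers, the tags can be assembled with a small helper. This is a sketch, not code shipped in the repo (the function name is illustrative); it mirrors the Llama 3 Instruct template shown above:

```python
def build_llama3_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble a Llama 3 Instruct prompt from the native special tokens."""
    return (
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_prompt}<|eot_id|>"
        # Trailing assistant header cues the model to generate its reply
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You answer Indian legal questions only.",
    "What is IPC Section 302?",
)
print(prompt)
```

Pass the resulting string as the raw prompt (with special-token parsing enabled in your runtime) instead of `User:`/`Assistant:` style turns.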

Available Files

| File | Size | Format |
| --- | --- | --- |
| llama-3.2-1b-instruct.Q4_K_M.gguf | ~807 MB | GGUF Q4_K_M |

License

Llama 3.2 Community License — Free for personal and commercial use.

This model was trained 2x faster with Unsloth

This model was finetuned and converted to GGUF format using Unsloth.


Ollama

An Ollama Modelfile is included for easy deployment.
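For reference, a minimal Modelfile for this GGUF might look like the sketch below. This is an illustration only — the system prompt and parameter values here are assumptions, so prefer the Modelfile actually shipped in the repo:

```
# Hypothetical Modelfile sketch; the repo's own Modelfile takes precedence
FROM ./llama-3.2-1b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.2
PARAMETER repeat_penalty 1.3
SYSTEM "You are an assistant for Indian legal questions only."
```

Build and run it with `ollama create indian-legal-llama -f Modelfile` followed by `ollama run indian-legal-llama`.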
