Ambuj-Tripathi Indian Legal Llama — GGUF

Built with Llama 3.2 | Fine-tuned & Converted by Ambuj Kumar Tripathi

Overview

GGUF build of a QLoRA fine-tuned Llama 3.2 1B model trained on Indian legal data. Runs locally on CPU — no GPU required.

Developed by: Ambuj Kumar Tripathi
Base Model: unsloth/llama-3.2-1b-instruct
Training Method: QLoRA (4-bit base + LoRA adapters)
GGUF Quantization: Q4_K_M
Training Data: 14,543 Indian Legal QA pairs
License: Llama 3.2 Community License

Training Data

| Dataset | Coverage |
| --- | --- |
| Indian Constitution QA | ✅ |
| IPC (Indian Penal Code) QA | ✅ |
| CrPC (Criminal Procedure Code) QA | ✅ |
| **Total** | 14,543 examples |

⚠️ BEFORE YOU START


⚠️ Version 0.1 Alpha — Compute-Constrained Proof of Concept

🚨 Please read before using this model.

This version was built to validate the complete QLoRA → GGUF pipeline on zero-cost infrastructure. Due to free-tier compute limits, training was capped at 100 steps (~5.5% of the full dataset, i.e. less than one epoch).

What works ✅

  • Domain locking — model refuses non-legal queries
  • Indian legal tone and structure learned
  • IPC/CrPC/Constitution query format understood

Known Limitations ❌

  • Factual hallucination — Article/Section numbers may be incorrect
  • Underfitting — only 800 of 14,543 examples seen during training
  • Do not use for actual legal research or advice
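The "800 of 14,543 examples" and "~5.5%" figures are consistent with the 100-step cap under an assumed effective batch size of 8 (the batch size is not stated in this card — treat it as an assumption):

```python
steps = 100
effective_batch_size = 8          # assumption; not stated in the card
total_examples = 14_543

seen = steps * effective_batch_size            # 800 examples seen
coverage = seen / total_examples * 100         # ~5.5% of one epoch
full_epoch_steps = total_examples // effective_batch_size  # ~1,817, matching the "~1,820 steps" roadmap figure

print(f"{seen} examples seen, {coverage:.1f}% of the dataset")
```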

Roadmap 🔧

  • Full epoch training (~1,820 steps) planned on Kaggle GPU
  • Target: factual accuracy + reduced hallucination

For production-grade Indian Legal AI with RAG retrieval, see:

πŸ“ Files in This Repo

| File | Purpose |
| --- | --- |
| llama-3.2-1b-instruct.Q4_K_M.gguf | Quantized model weights (~808 MB) |
| model.yaml | Auto-config for LM Studio (recommended) |
| Must-Final-Fixed.preset.json | Manual preset backup for any local tool |

💻 How to Run Locally

Option 1 — LM Studio (Recommended)

The included model.yaml lets LM Studio configure this model automatically.

  1. Open LM Studio
  2. Search for: invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF
  3. Download the GGUF file
  4. Start chatting

All settings (system prompt, temperature, repetition penalty) are applied automatically via model.yaml.

Option 2 — Manual Preset (Backup)

Use this if your tool does not support automatic config via model.yaml.

  1. Download Must-Final-Fixed.preset.json from the Files tab
  2. Import it into LM Studio or Jan
  3. Load the GGUF model
  4. Start a new chat

🧪 Quick Test

Ask these after setup:

Who built you?
→ I was fine-tuned and deployed by Ambuj Kumar Tripathi.

IPC 302 kya hai? ("What is IPC 302?")
→ Section 302 of the Indian Penal Code deals with punishment for murder...

Hello how are you?
→ I only handle Indian legal questions.

Important Notice

This model was fine-tuned and deployed by Ambuj Kumar Tripathi for educational, research, and skill-development purposes only.

The training data used in this project was collected from publicly available legal-learning resources and a Kaggle dataset used strictly for learning, experimentation, and non-commercial fine-tuning. This repository and its releases are not intended to provide legal advice, are not offered as a commercial legal product, and should be treated as an experimental AI learning project.

Correct Attribution

  • Fine-tuned and deployed by: Ambuj Kumar Tripathi
  • If the model output mentions any other creator name, including names appearing from legacy training data, treat that as an incorrect model artifact and not as the correct attribution.

Known Limitations

  • The model may occasionally produce incorrect creator-name references due to legacy training-data artifacts.
  • The model may occasionally output formatting artifacts such as special tokens in some local GGUF runtimes.
  • Outputs may contain hallucinations or inaccuracies and should always be independently verified.

GGUF / Local Runtime Note

If you are running the GGUF model locally in tools like LM Studio, some raw model behaviors may still appear depending on the prompt template and runtime settings. For best results, use a strict system prompt and low-temperature preset.

Legal Disclaimer

This model is provided strictly for educational and training purposes only. It does not constitute legal advice, does not create any lawyer-client relationship, and should not be relied on for real legal decisions. Always consult official legal sources and a qualified advocate.

How to Run Locally

Option 1 — LM Studio (No code required)

  1. Download LM Studio from lmstudio.ai
  2. Search: invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF
  3. Download and start chatting locally

Option 2 — llama.cpp

```
llama-cli -hf invincibleambuj/Ambuj-Tripathi-Indian-Legal-Llama-GGUF --jinja
```

Use via Python (QLoRA adapter)

```python
from unsloth import FastLanguageModel

# Load the 4-bit QLoRA adapter checkpoint
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="invincibleambuj/llama-3.2-1b-legal-india-qlora",
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to inference mode

inputs = tokenizer(
    "### Instruction:\nWhat is IPC Section 302?\n\n### Response:\n",
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=200, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

⚠️ Important: Chat Template & Prompt Format

This model is fine-tuned on the Meta Llama 3 Instruct chat format. How you prompt it depends on how you are using it:

1️⃣ For Normal Users (LM Studio / Ollama)

You don't need to do anything: just download the GGUF file and load it. Your software will detect the Llama 3 format and apply the correct settings automatically (via the included model.yaml and preset.json), so you get properly formatted legal answers out of the box.

2️⃣ For Developers (Custom Python / LangChain / API)

If you are building your own custom app (like Gradio) or using pure Python (transformers / llama.cpp directly), you MUST format your prompts using the native Llama 3 tags.

❌ Do NOT use `User:` or `Assistant:` prefixes (the model will ignore instructions and give one-line answers). ✅ Use the exact format below:

```
<|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```
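For developers, the tags can be assembled with a small helper. This is a sketch, not code shipped in the repo (the function name is illustrative); it mirrors the Llama 3 Instruct template shown above:

```python
def build_llama3_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble a Llama 3 Instruct prompt from the native special tokens."""
    return (
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_prompt}<|eot_id|>"
        # Trailing assistant header cues the model to generate its reply
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You answer Indian legal questions only.",
    "What is IPC Section 302?",
)
print(prompt)
```

Pass the resulting string as the raw prompt (with special-token parsing enabled in your runtime) instead of `User:`/`Assistant:` style turns.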

Available Files

| File | Size | Format |
| --- | --- | --- |
| llama-3.2-1b-instruct.Q4_K_M.gguf | ~807 MB | GGUF Q4_K_M |

License

Llama 3.2 Community License — Free for personal and commercial use.

This model was trained 2x faster with Unsloth

This model was finetuned and converted to GGUF format using Unsloth.


Ollama

An Ollama Modelfile is included for easy deployment.
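For reference, a minimal Modelfile for this GGUF might look like the sketch below. This is an illustration only — the system prompt and parameter values here are assumptions, so prefer the Modelfile actually shipped in the repo:

```
# Hypothetical Modelfile sketch; the repo's own Modelfile takes precedence
FROM ./llama-3.2-1b-instruct.Q4_K_M.gguf
PARAMETER temperature 0.2
PARAMETER repeat_penalty 1.3
SYSTEM "You are an assistant for Indian legal questions only."
```

Build and run it with `ollama create indian-legal-llama -f Modelfile` followed by `ollama run indian-legal-llama`.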
