Qwen3-Coder-30B-A3B-Kubernetes-Instruct-LoRA
⚠️ IMPORTANT: This is an adapter, not a full model. You must load it on top of the base model. The full model is also available at Dogacel/Qwen3-Coder-30B-A3B-Kubernetes-Instruct.
Model Description
Qwen3-Coder-30B-A3B-Kubernetes-Instruct is a specialized fine-tune of Qwen3-Coder-30B-A3B-Instruct.
It is designed to act as an assistant for your Kubernetes-related configuration troubleshooting, and is specifically optimized for YAML generation.
Developed by: Doğaç Eldenk, Robin Luo – Northwestern University
Model type: Large Language Model (Fine-tune)
Language(s) (NLP): English, YAML
License: Apache-2.0
Finetuned from model: Qwen/Qwen3-Coder-30B-A3B-Instruct
Model Sources
- Repository: https://huggingface.co/Dogacel/Qwen3-Coder-30B-A3B-Kubernetes-Instruct
- Paper: (In-progress)
Uses
Direct Use
This model is designed to assist Site Reliability Engineers (SREs), DevOps professionals, and Platform Engineers. It acts as an expert assistant for:
- Kubernetes Troubleshooting: Analyzing error logs, `kubectl describe` outputs, and `CrashLoopBackOff` scenarios.
- YAML Generation: Writing production-ready manifests for Deployments, Services, Ingresses, StatefulSets, and NetworkPolicies.
- Infrastructure as Code: Converting natural language requirements into valid Kubernetes configurations.
- Architecture Q&A: Answering questions about cluster architecture, networking (CNI), and storage (CSI).
Bias, Risks, and Limitations
- YAML Hallucination: Like all LLMs, this model can generate syntactically correct but logically flawed YAML. Always validate generated manifests (`kubectl apply --dry-run=client -f ...`) before applying them to production.
- Version Bias: The model's knowledge is based on Kubernetes versions available up to the training cutoff. It may hallucinate deprecated APIs (e.g., `extensions/v1beta1`) or be unaware of very recent Alpha features.
- Security Risks: The model might suggest configurations with relaxed security contexts (e.g., `privileged: true`) if not explicitly instructed otherwise.
Recommendations
Users should treat this model as a "Copilot" rather than an autonomous operator. All generated YAML should be reviewed by a human and scanned by static code analyzers before deployment.
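As a concrete illustration of that review step, the snippet below is a minimal sketch (not part of this repository) of a pre-apply check: it parses the generated manifest with PyYAML, flags `privileged: true` security contexts, and hands the text to `kubectl apply --dry-run=client` for schema validation. The `review_manifest` helper name is hypothetical.

```python
import subprocess
import yaml  # PyYAML

def review_manifest(manifest_text: str) -> None:
    # 1. Syntax check: safe_load_all raises yaml.YAMLError on malformed YAML.
    docs = list(yaml.safe_load_all(manifest_text))

    # 2. Cheap policy check: warn about relaxed security contexts.
    def walk(node):
        if isinstance(node, dict):
            if node.get("privileged") is True:
                print("WARNING: privileged container requested")
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for doc in docs:
        walk(doc)

    # 3. Client-side schema validation, as recommended above.
    subprocess.run(
        ["kubectl", "apply", "--dry-run=client", "-f", "-"],
        input=manifest_text.encode(),
        check=True,
    )
```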
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# 1. Define Model IDs
base_model_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
adapter_id = "Dogacel/Qwen3-Coder-30B-A3B-Kubernetes-Instruct-LoRA"
# 2. Load Base Model (with device_map for memory efficiency)
model = AutoModelForCausalLM.from_pretrained(
base_model_id,
torch_dtype=torch.float16,
device_map="auto",
low_cpu_mem_usage=True
)
# 3. Load the LoRA Adapter
model = PeftModel.from_pretrained(model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# 4. Run Inference
messages = [
{"role": "system", "content": "You are a Kubernetes expert. Diagnose issues step-by-step, then provide the fixed YAML configuration."},
{"role": "user", "content": "When I run kubectl apply, I get the following error: error validation data: [ValidationError(Deployment.spec.template.spec.containers[0]): unknown field \"imagePullPolicy\" in io.k8s.api.core.v1.Container]"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
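Because the adapter was trained with 4-bit QLoRA (see Training Hyperparameters), the base model can likely also be loaded in 4-bit to reduce GPU memory. The variant below is a minimal sketch assuming `bitsandbytes` is installed and reusing `base_model_id` and `adapter_id` from the snippet above; it is not a configuration tested by the authors.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit NF4 quantization, mirroring a typical QLoRA setup (assumed values).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
```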
Training Details
Training Data
This model is fine-tuned on over 20,000 Kubernetes-related Q&A pairs collected from community forums.
Training Procedure
Training ran for 2 epochs on 4 × H100 GPUs over 14 hours.
Preprocessing
All Q&A pairs are processed by another LLM (GPT 4.1-mini) so that every answer follows the same format (a sketch of this step follows the list):
- Identification: one sentence stating what is wrong
- Reasoning: root-cause explanation
- Remediation: fix approach
- Fixed YAML configuration in a ```yaml code block
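The exact normalization prompt and pipeline are not published; the following is a hypothetical sketch of that step, assuming an OpenAI-compatible client rewrites each raw forum answer into the four-part format. The `FORMAT_PROMPT` text and `normalize_answer` helper are illustrative, not the actual training code.

```python
from openai import OpenAI

client = OpenAI()

FORMAT_PROMPT = (
    "Rewrite the following Kubernetes answer into four sections: "
    "Identification (one sentence), Reasoning (root cause), "
    "Remediation (fix approach), and the fixed YAML in a ```yaml code block."
)

def normalize_answer(question: str, raw_answer: str) -> str:
    # Ask the smaller LLM to reformat the answer into the training target layout.
    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": FORMAT_PROMPT},
            {"role": "user", "content": f"Question:\n{question}\n\nAnswer:\n{raw_answer}"},
        ],
    )
    return response.choices[0].message.content
```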
Training Hyperparameters
- Training regime: BF16 Mixed Precision (Bfloat16)
- Method: QLoRA 4-Bit (Quantized Low-Rank Adaptation)
- Rank (r): 64
- Alpha: 32
- Target Modules: Attention-only – `["v_proj", "q_proj", "k_proj", "o_proj"]` (see the configuration sketch below)
Evaluation
The model is evaluated on a held-out validation set of 100 Q&A pairs by comparing the similarity of the generated YAML fixes against the reference fixes.
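The card does not state the exact similarity metric. As one plausible way to compute a YAML similarity score, the sketch below (the `yaml_similarity` helper is hypothetical) normalizes both documents through PyYAML and compares them with a character-level sequence ratio.

```python
import difflib
import yaml

def yaml_similarity(generated: str, reference: str) -> float:
    # Round-trip through PyYAML so key order and formatting differences
    # don't dominate the score (assumes single-document manifests).
    norm_gen = yaml.safe_dump(yaml.safe_load(generated), sort_keys=True)
    norm_ref = yaml.safe_dump(yaml.safe_load(reference), sort_keys=True)
    return difflib.SequenceMatcher(None, norm_gen, norm_ref).ratio()
```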
Citation
BibTeX:
[TODO]