# dialogsum-phi3-lora
A LoRA adapter for dialogue summarization, fine-tuned on top of Phi-3-mini-4k-instruct using the DialogSum dataset.
Given a multi-turn messenger-style conversation, the model generates a concise one-paragraph summary of what was discussed and agreed upon.
Live demo on HuggingFace Spaces
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_id = "microsoft/Phi-3-mini-4k-instruct"
adapter_id = "rotemso23/dialogsum-phi3-lora"

# Phi-3 ships custom modeling code on older transformers versions,
# so pass trust_remote_code=True consistently for both tokenizer and model.
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

dialogue = """Amanda: I baked cookies. Do you want some?
Jerry: Sure! What kind?
Amanda: Chocolate chip. I'll bring them to the office tomorrow.
Jerry: Amazing, I can't wait. Thanks Amanda!
Amanda: No problem :)"""

messages = [
    {
        "role": "user",
        "content": f"Summarize the following conversation in a few sentences.\n\nConversation:\n{dialogue}",
    }
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt
summary = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(summary)
```
## Training details
| Setting | Value |
|---|---|
| Base model | microsoft/Phi-3-mini-4k-instruct |
| Dataset | DialogSum (4,000 train / 500 val / 819 test) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Target modules | qkv_proj, o_proj |
| Epochs | 3 |
| Effective batch size | 16 |
| Learning rate | 2e-4 |
| Quantization | 4-bit NF4 (training only) |
| Hardware | NVIDIA T4 (Google Colab) |
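With rank 16 on `qkv_proj` and `o_proj`, the adapter is tiny relative to the 3.8B-parameter base model. A rough back-of-the-envelope count (assuming Phi-3-mini's hidden size of 3072, a fused QKV output of 3 × 3072, and 32 decoder layers — check `model.config` for the exact values):

```python
# Rough LoRA trainable-parameter count for this adapter's configuration.
# A LoRA pair for a weight of shape (out, in) adds r * (in + out) parameters.
r = 16
hidden = 3072          # assumed Phi-3-mini hidden size
qkv_out = 3 * hidden   # assumed fused q/k/v projection output dimension
layers = 32            # assumed number of decoder layers

qkv_params = r * (hidden + qkv_out)  # LoRA on qkv_proj
o_params = r * (hidden + hidden)     # LoRA on o_proj
total = layers * (qkv_params + o_params)
print(f"~{total / 1e6:.1f}M trainable parameters")  # ~9.4M
```

That is roughly 0.25% of the base model's parameters, which is why the adapter trains comfortably on a single T4 with 4-bit quantization.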
## Evaluation results
Evaluated on the full DialogSum test set (819 examples):
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| Phi-3-mini zero-shot (baseline) | 0.312 | 0.101 | 0.238 |
| This model (fine-tuned) | 0.474 | 0.215 | 0.391 |
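ROUGE-1 measures unigram-overlap F1 between the generated and reference summaries (the reported scores were presumably computed with a standard ROUGE implementation such as `rouge_score`, which also applies stemming). A simplified, unstemmed sketch of the metric:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 (no stemming or tokenizer niceties)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("amanda baked cookies for jerry",
                "amanda baked chocolate chip cookies"))  # 0.6
```

ROUGE-2 does the same over bigrams, and ROUGE-L scores the longest common subsequence instead of n-gram overlap.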
## Prompt format
```text
<|user|>
Summarize the following conversation in a few sentences.

Conversation:
{dialogue}
<|end|>
<|assistant|>
```
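`apply_chat_template` produces this layout automatically, which is the recommended path. As an illustration only, a helper that assembles the same string by hand (the exact whitespace is an assumption — verify against your tokenizer's chat template before relying on it):

```python
def build_prompt(dialogue: str) -> str:
    """Assemble a Phi-3-style summarization prompt matching the format above."""
    return (
        "<|user|>\n"
        "Summarize the following conversation in a few sentences.\n\n"
        f"Conversation:\n{dialogue}\n"
        "<|end|>\n"
        "<|assistant|>\n"
    )

print(build_prompt("Amanda: I baked cookies. Do you want some?"))
```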