# 470 Final Project Model -> Summary Model

## Model Overview
This repository contains a fine-tuned T5-small model for abstractive conversational text summarization.
Given a multi-speaker dialogue, the model generates a concise natural-language summary that captures the main points of the conversation.
- Base model: google-t5/t5-small
- Task: Abstractive text summarization
- Model type: Encoder–decoder transformer (T5)
## Dataset
The model was fine-tuned on the SAMSum dataset, which consists of chat-style conversations paired with human-written summaries.
- Dataset name: knkarthick/samsum
- Fields:
  - `dialogue`: conversation text (input)
  - `summary`: reference summary (target)
- Splits: train / validation / test
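Mapping a SAMSum record onto a T5 training pair is straightforward. The sketch below is illustrative, not the notebook's exact code: the field names come from this card, while the `build_example` helper and the assumption that the `"summarize: "` task prefix was used in training are mine.

```python
def build_example(record, prefix="summarize: "):
    """Map one SAMSum record to an (input_text, target_text) pair for T5."""
    return prefix + record["dialogue"], record["summary"]

# In a real run the records come from the Hugging Face `datasets` library:
#   from datasets import load_dataset
#   dataset = load_dataset("knkarthick/samsum")
record = {
    "dialogue": "Amanda: I baked cookies.\nJerry: Sounds great!",
    "summary": "Amanda baked cookies.",
}
src, tgt = build_example(record)
print(src)  # prefixed dialogue text fed to the encoder
print(tgt)  # reference summary used as the decoder target
```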
## Training Details
- Epochs: 3
- Learning rate: 3e-4
- Batch size: 8
- Max input length: 512 tokens
- Max target length: 128 tokens
- Training framework: Hugging Face Transformers (`Seq2SeqTrainer`)
- Hardware: GPU (Google Colab)
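As a rough sketch, the hyperparameters above would be wired into `Seq2SeqTrainingArguments` roughly as follows. Only the numeric values come from this card; the `output_dir` name and the remaining argument choices are assumptions, not the notebook's actual configuration.

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameter values as reported in this card; all other settings
# (output_dir, predict_with_generate) are illustrative assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-samsum",     # assumed name
    num_train_epochs=3,
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    predict_with_generate=True,       # generate full summaries during eval
)

# Lengths applied at tokenization time, per the card:
MAX_INPUT_LENGTH = 512   # truncation limit for dialogues
MAX_TARGET_LENGTH = 128  # truncation limit for summaries
```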
## Evaluation
The model was evaluated on the test split of the SAMSum dataset using ROUGE metrics.
- ROUGE-1: 0.4538
- ROUGE-2: 0.2123
- ROUGE-L: 0.3762
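For reference, ROUGE-1 is the F-measure over unigram overlap between generated and reference summaries (ROUGE-2 uses bigrams, ROUGE-L the longest common subsequence). The reported scores were presumably computed with a standard implementation such as the `evaluate` library's `rouge` metric (an assumption); the core ROUGE-1 idea can be illustrated in a few lines:

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (the core of ROUGE-1), without stemming."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f("amanda baked cookies", "amanda will bring cookies tomorrow")
print(round(score, 2))  # 0.5
```

Note that production ROUGE implementations add stemming and bootstrap aggregation, so this toy function will not exactly reproduce the table above.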
## Intended Uses
This model can be used for:
- Summarizing chat conversations or dialogues
- Demonstrations of abstractive summarization
- Educational purposes in NLP and machine learning
## Limitations
- The model may omit important details in long or complex conversations.
- Generated summaries may occasionally be imprecise or incomplete.
- The model is trained on informal dialogue and may not generalize well to other domains.
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "marvingoenner/470finalprojectmodel"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# T5 expects a task prefix; prepend "summarize: " to the dialogue.
dialogue = "Amanda: I baked cookies. Jerry: Sounds great! Amanda: I will bring some tomorrow."
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)

output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```