🐶 Dog Breed Classification (TensorFlow Hub MobileNetV2)

This model predicts the dog breed (120 classes) from an input image using transfer learning with a pretrained MobileNetV2 model from TensorFlow Hub, plus a custom dense softmax classifier head.

It is built as an end-to-end computer vision pipeline: data loading → preprocessing → batching with tf.data → training with callbacks → evaluation/visualization → saving/loading → Kaggle-style probabilistic submission generation.

Model Details

  • Developed by: brej-29
  • Model type: TensorFlow / Keras Sequential
    • Base: TF Hub MobileNetV2 ImageNet classifier
    • Head: Dense(120, activation="softmax")
  • Task: Multi-class image classification (120 dog breeds)
  • Output: Probability distribution over 120 breeds (softmax)
  • Input: RGB image resized to 224×224, normalized to [0, 1]
  • Training notebook: DogBreedClassification.ipynb
  • Source repo: https://github.com/brej-29/Logicmojo-AIML-Assignments-DogBreedClassificationTensorFlow
  • License: MIT
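
A minimal sketch of how a model with this architecture can be assembled (the TF Hub URL is the one listed under Training Procedure; INPUT_SHAPE and NUM_CLASSES follow the specs above, other details are assumptions):

import tensorflow as tf
import tensorflow_hub as hub

MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"
INPUT_SHAPE = [None, 224, 224, 3]  # batch, height, width, channels
NUM_CLASSES = 120

# Pretrained MobileNetV2 body from TF Hub plus a 120-unit softmax head.
model = tf.keras.Sequential([
    hub.KerasLayer(MODEL_URL),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")
])
model.build(INPUT_SHAPE)  # instantiate weights for the given input shape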

Intended Use

  • Educational / portfolio demonstration of transfer learning + end-to-end deep learning workflow
  • Baseline experiments for multi-class dog breed recognition
  • Generating probabilistic predictions for Kaggle-style submissions

Out-of-scope / Not suitable for

  • Safety-critical or production use without further validation, monitoring, and retraining
  • Use on non-dog images or heavily out-of-distribution images (e.g., cartoons, low-light, extreme blur) without robustness testing

Training Data

  • Dataset: Kaggle “Dog Breed Identification”
    • Training images: 10,222
    • Classes: 120 dog breeds
    • Labels file: labels.csv (maps id → breed)

Note: Kaggle’s official competition metric is multi-class log loss, which rewards well-calibrated class probabilities. This project produces probabilistic outputs suitable for that metric, but offline log loss is not explicitly computed in the notebook; a sketch of how it could be computed is shown below.
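
The following sketch (not from the notebook) shows one way to compute the metric on a held-out split with scikit-learn; the function and argument names are illustrative:

from sklearn.metrics import log_loss

def offline_log_loss(model, val_images, y_val_onehot):
    """Multi-class log loss on a held-out split.

    val_images: preprocessed image batch, shape (n, 224, 224, 3)
    y_val_onehot: one-hot labels, shape (n, 120)
    (Argument names are illustrative and do not come from the notebook.)
    """
    probs = model.predict(val_images)     # (n, 120) softmax probabilities
    return log_loss(y_val_onehot, probs)  # lower is better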

Preprocessing

Image preprocessing applied during training/inference:

  • Read JPG from filepath
  • Decode to RGB tensor
  • Convert dtype to float32 and normalize to [0, 1]
  • Resize to 224×224

Efficient input pipeline:

  • Training batches use shuffling and tf.data batching
  • Validation batches avoid shuffling
  • Test batches contain filepaths only (no labels)
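
A minimal sketch of this preprocessing and batching logic with tf.data (function and variable names are illustrative and may differ from the notebook):

import tensorflow as tf

IMG_SIZE = 224
BATCH_SIZE = 32  # assumed batch size; the notebook may use a different value

def process_image(filepath, img_size=IMG_SIZE):
    # Read -> decode RGB -> float32 in [0, 1] -> resize, as listed above.
    image = tf.io.read_file(filepath)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    return tf.image.resize(image, [img_size, img_size])

def get_image_label(filepath, label):
    return process_image(filepath), label

def create_data_batches(filepaths, labels=None, batch_size=BATCH_SIZE,
                        valid_data=False, test_data=False):
    if test_data:   # test batches: filepaths only, no labels, no shuffling
        ds = tf.data.Dataset.from_tensor_slices(tf.constant(filepaths))
        return ds.map(process_image).batch(batch_size)
    if valid_data:  # validation batches: labelled, but not shuffled
        ds = tf.data.Dataset.from_tensor_slices((tf.constant(filepaths), tf.constant(labels)))
        return ds.map(get_image_label).batch(batch_size)
    # training batches: shuffle before mapping and batching
    ds = tf.data.Dataset.from_tensor_slices((tf.constant(filepaths), tf.constant(labels)))
    return ds.shuffle(buffer_size=len(filepaths)).map(get_image_label).batch(batch_size)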

Label Encoding / Class Order (Important)

  • Labels are one-hot encoded based on:
    • unique_breeds = np.unique(labels), which returns the breed names in sorted (alphabetical) order
  • The model’s output index i corresponds to unique_breeds[i]

To ensure predictions downloaded from the Hub are decoded correctly, the class list (e.g., class_names.json or unique_breeds.txt) should be provided in the model repository, as sketched below.
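
A minimal sketch of the encoding and of exporting the class list (the column names follow labels.csv and the suggested class_names.json; the rest is illustrative):

import json

import numpy as np
import pandas as pd

labels_csv = pd.read_csv("labels.csv")   # columns: id, breed
labels = labels_csv["breed"].to_numpy()

# np.unique returns the breed names in sorted (alphabetical) order;
# model output index i corresponds to unique_breeds[i].
unique_breeds = np.unique(labels)        # shape (120,)

# Boolean one-hot encoding of each label against the alphabetical class list.
boolean_labels = np.array([label == unique_breeds for label in labels])

# Save the class order so downstream users can decode predictions correctly.
with open("class_names.json", "w") as f:
    json.dump(unique_breeds.tolist(), f)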

Training Procedure

  • Framework: TensorFlow 2.x / Keras
  • Base model URL (TF Hub):
    • https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4
  • Loss: CategoricalCrossentropy
  • Optimizer: Adam
  • Metrics: accuracy
  • Callbacks:
    • TensorBoard logging
    • EarlyStopping
      • Subset training monitors val_accuracy (patience=3)
      • Full training (no validation set) monitors accuracy (patience=3)
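
The compile settings and callbacks listed above might be wired up roughly as follows (this assumes the model and data batches from the earlier sketches; the log directory and other details are assumptions):

import datetime
import tensorflow as tf

# `model` comes from the architecture sketch; `train_data` and `val_data`
# are tf.data batches from create_data_batches(...); names are illustrative.
model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(),
    metrics=["accuracy"],
)

log_dir = "logs/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")  # illustrative path
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir)

# Subset run monitors val_accuracy; the full run (no validation set) would monitor accuracy.
early_stopping_cb = tf.keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=3)

model.fit(
    train_data,
    epochs=100,
    validation_data=val_data,  # omitted for the full-data run
    callbacks=[tensorboard_cb, early_stopping_cb],
)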

Subset Experiment (for fast iteration)

  • Subset size: 2,000 images
  • Split: 80% train / 20% validation (random_state=42)
  • Epochs configured: 100 (with EarlyStopping)
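
A sketch of the subset split (filenames and boolean_labels are assumed to be aligned arrays, as in the label-encoding sketch above):

from sklearn.model_selection import train_test_split

NUM_IMAGES = 2000  # subset size used for fast iteration

# `filenames` holds the training image filepaths, aligned with `boolean_labels`.
X_train, X_val, y_train, y_val = train_test_split(
    filenames[:NUM_IMAGES],
    boolean_labels[:NUM_IMAGES],
    test_size=0.2,      # 80% train / 20% validation
    random_state=42,
)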

Full Training

  • The notebook also trains on the full dataset to generate Kaggle-style predictions.
  • Since the full run does not use a dedicated validation set, validation metrics are not reported for that phase.
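
As a hedged sketch, Kaggle-style probabilistic predictions can be assembled into a submission file like this (test_data, test_ids, and the output filename are illustrative):

import pandas as pd

# `test_data`: batched test images from create_data_batches(..., test_data=True)
# `test_ids`: image ids (filenames without extension) in the same order as test_data.
test_probs = model.predict(test_data, verbose=1)   # shape (n_test, 120)

submission = pd.DataFrame(test_probs, columns=unique_breeds)
submission.insert(0, "id", test_ids)               # Kaggle expects an id column first
submission.to_csv("submission.csv", index=False)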

Evaluation

Reported evaluation (subset experiment; validation split from the first 2,000 images):

  • Validation Accuracy: 0.7750
  • Validation Loss: 0.8411

Important: These numbers come from a quick subset experiment and may not reflect the performance of the full-data model or performance on real-world dog images.

How to Use

The recommended approach is:

  1. Download the saved model artifact from the Hub
  2. Apply the same preprocessing (resize 224×224, normalize)
  3. Run model.predict()
  4. Decode the top-k indices using the stored class list (same order as training)

Example (update filenames to match your uploaded artifacts):

import json
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
from huggingface_hub import hf_hub_download

repo_id = "YOUR_USERNAME/YOUR_MODEL_REPO"

# 1) Download model (example: H5)
model_path = hf_hub_download(repo_id=repo_id, filename="dog_breed_mobilenetv2.h5")
model = tf.keras.models.load_model(
    model_path,
    custom_objects={"KerasLayer": hub.KerasLayer},
    compile=False
)

# 2) Download class names (recommended to upload alongside the model)
classes_path = hf_hub_download(repo_id=repo_id, filename="class_names.json")
with open(classes_path, "r") as f:
    class_names = json.load(f)

# 3) Preprocess a single image
def preprocess_image(path, img_size=224):
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, [img_size, img_size])
    return tf.expand_dims(img, axis=0)  # add batch dim

x = preprocess_image("your_dog.jpg")
probs = model.predict(x)[0]

# 4) Top-5 predictions
top5 = probs.argsort()[-5:][::-1]
for idx in top5:
    print(class_names[idx], float(probs[idx]))

If you uploaded a TensorFlow SavedModel folder instead of a single .h5 file, download the folder contents and load them with tf.keras.models.load_model(...) accordingly.
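
One way to do this is to fetch the whole repository with snapshot_download (the saved_model folder name below is a placeholder for whatever you actually uploaded):

import tensorflow as tf
import tensorflow_hub as hub
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="YOUR_USERNAME/YOUR_MODEL_REPO")
model = tf.keras.models.load_model(
    f"{local_dir}/saved_model",  # placeholder: point at your uploaded SavedModel folder
    custom_objects={"KerasLayer": hub.KerasLayer},
    compile=False,
)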

Input Requirements

  • Input type: RGB images (JPG/PNG supported if decoded to RGB)
  • Image size: 224×224
  • Value range: float32 normalized to [0, 1]
  • Output decoding must use the same class order used during training (np.unique(labels) order)

Bias, Risks, and Limitations

  • Dataset bias: model is trained on a specific Kaggle dataset; results may not generalize to all real-world photos
  • Class ambiguity: many dog breeds look visually similar; mistakes are expected
  • Out-of-distribution risk: performance may drop significantly on unusual lighting, occlusions, non-dog animals, mixed breeds, or stylized images
  • Label-order dependency: wrong class mapping will produce incorrect breed names even if probabilities are correct

Environmental Impact

Transfer learning with MobileNetV2 is relatively compute-efficient compared to training a CNN from scratch. Training benefits from a GPU, but the overall footprint is modest for a model of this size.

Technical Specifications

  • Framework: TensorFlow 2.x / Keras
  • Base model: TF Hub MobileNetV2 (ImageNet pretrained)
  • Head: Dense softmax classifier (120 units)
  • Task: image-classification
  • Recommended runtime: CPU (inference) / GPU (training)

Model Card Authors

  • BrejBala

Contact

For questions/feedback, please open an issue on the GitHub repository:
https://github.com/brej-29/Logicmojo-AIML-Assignments-DogBreedClassificationTensorFlow
