YOLOv11 Warehouse Pallet Detector

A family of fine-tuned YOLOv11 models for real-time detection of pallets (wooden skid + stacked products) in warehouse environments. Available in two sizes: nano (2.6M params, edge-ready) and small (9.4M params, best accuracy). Optimized for foreground pallet identification in operational warehouse settings with forklifts, racks, and dynamic lighting.

Model Variants

640p Models (Standard Resolution)

Model Params Size (MB) Resolution mAP@0.5 mAP@0.5:0.95 Precision Recall GPU (ms) CPU (ms) Best For
YOLOv11n-640 2.6M ~5 640 0.572 0.457 0.600 0.563 ~5 ~25 Edge / real-time on low-power devices
YOLOv11s-640 9.4M ~19 640 0.592 0.485 0.599 0.574 ~7 ~45 Balanced speed/accuracy

1280p Models (Native High Resolution)

Model Params Size (MB) Resolution mAP@0.5 mAP@0.5:0.95 Precision Recall GPU (ms) CPU (ms) Best For
YOLOv11n-1280 2.6M ~5 1280 0.567 0.440 0.610 0.530 ~18 ~95 Edge with high-res cameras
YOLOv11s-1280 9.4M ~19 1280 0.569 0.459 0.543 0.607 ~25 ~170 Balanced, small pallet detection

All variants share the same training data, augmentation pipeline, and hyperparameters. Medium, large, and extra-large variants were tested but showed no accuracy improvement over small with the current dataset, so only nano and small are published.

Model Description

This model detects complete pallet units (wooden skid base + all products stacked on top) in warehouse imagery. It was trained on real-world warehouse photos captured during normal operations, making it robust to common warehouse conditions: motion blur, variable lighting, partial occlusions by forklifts and personnel, and cluttered backgrounds.

Unlike generic object detection models, this model is specifically trained to:

  • Detect foreground pallets that are fully within the frame
  • Distinguish pallets from visually similar structures (ceiling rafters, doors, rack uprights)
  • Handle pallets stacked 2-high as separate detections per level
  • Work reliably with motion-blurred images from moving cameras

Intended Use

  • Warehouse automation: Real-time pallet counting and position tracking
  • Forklift guidance: Detecting pallets in the robot/forklift field of view
  • Inventory management: Automated pallet inventory from security or mounted cameras
  • 3D warehouse mapping: Input to multi-view reconstruction pipelines for spatial pallet localization

Out of Scope

  • Empty pallet (wooden skid only) detection without products
  • Pallet type classification (EUR, GMA, block, stringer)
  • Damaged pallet assessment
  • Outdoor or non-warehouse environments

Training Details

Architecture

Variant Base Model Parameters Input Resolution Classes Framework
Nano yolo11n.pt ~2.6M 640x640 or 1280x1280 1 (pallet) Ultralytics 8.x
Small yolo11s.pt ~9.4M 640x640 or 1280x1280 1 (pallet) Ultralytics 8.x

Each variant is trained at both 640p and 1280p, yielding 4 models total. The 1280p models use the same architecture but train on higher-resolution inputs, improving detection of small/distant pallets at the cost of slower inference.

Dataset

Split Ratio Description
Train 80% Labeled warehouse images
Validation 15% Held out for epoch-level evaluation
Test 5% Held out for final evaluation

Training uses the full available dataset (no image cap). Exact counts depend on the number of labeled images at training time.
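The 80/15/5 split can be sketched with a seeded shuffle; the helper below is illustrative, not the project's actual split code:

```python
import random

def split_dataset(paths, seed=0):
    """Shuffle file paths deterministically and split them 80/15/5
    into train / validation / test lists."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * 0.80)
    n_val = int(len(paths) * 0.15)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

train, val, test = split_dataset(f"img_{i:05d}.jpg" for i in range(1000))
print(len(train), len(val), len(test))  # 800 150 50
```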

Labeling Pipeline: Images were auto-labeled using Qwen3.5-9B (a natively multimodal vision-language model) with structured prompts to identify pallet bounding boxes, followed by human review of preview images. Negative examples (images with no pallets) are included as hard negatives.

Label Format: YOLO format (class_id, x_center, y_center, width, height) normalized 0-1.
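A hypothetical helper showing how a pixel-space corner box maps to this label format:

```python
def to_yolo_line(x1, y1, x2, y2, img_w, img_h, class_id=0):
    """Convert a pixel-space (x1, y1, x2, y2) box into a YOLO label line:
    class_id x_center y_center width height, all normalized to 0-1."""
    x_c = (x1 + x2) / 2 / img_w
    y_c = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# A pallet box spanning the full 1280x720 frame:
print(to_yolo_line(0, 0, 1280, 720, 1280, 720))  # 0 0.500000 0.500000 1.000000 1.000000
```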

Training Configuration

Hyperparameter Value (640p) Value (1280p)
Epochs 100 (early stopping, patience=20) 100 (early stopping, patience=20)
Batch Size 16 4 (auto-scaled)
Image Size 640 1280
Optimizer SGD (Ultralytics default) SGD (Ultralytics default)
Learning Rate Auto-scaled Auto-scaled
Augmentation HSV jitter, horizontal flip, mosaic, scale Same
Device NVIDIA GPU (CUDA) NVIDIA GPU (CUDA)

Data Augmentation Details:

hsv_h=0.015, hsv_s=0.7, hsv_v=0.4
degrees=10.0, translate=0.1, scale=0.5
fliplr=0.5, mosaic=1.0
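The configuration above can be reconstructed as an Ultralytics train() call; the sketch below is an assumption-laden reconstruction from the listed values (the dataset path is a placeholder, and any setting not listed falls back to Ultralytics defaults):

```python
# Hypothetical reconstruction of the 640p training run from the tables above.
train_cfg = dict(
    data="dataset.yaml",       # placeholder: your YOLO-format dataset config
    epochs=100,
    patience=20,               # early stopping
    batch=16,                  # 4 for the 1280p runs
    imgsz=640,                 # 1280 for the high-res runs
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
    degrees=10.0, translate=0.1, scale=0.5,
    fliplr=0.5, mosaic=1.0,
    device=0,                  # CUDA GPU
)
# With Ultralytics installed:
#   from ultralytics import YOLO
#   YOLO("yolo11n.pt").train(**train_cfg)
print(train_cfg["imgsz"], train_cfg["batch"])
```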

Evaluation Results

Test Set Performance (All Variants)

Variant mAP@0.5 mAP@0.5:0.95 Precision Recall GPU (ms) CPU (ms)
Nano (n) 0.572 0.457 0.600 0.563 ~5 ~25
Small (s) 0.592 0.485 0.599 0.574 ~7 ~45

Benchmark Comparison

There is no established standard benchmark for warehouse pallet detection. The table below compares against results reported in published literature on similar (but not identical) datasets, to provide context for this model's performance.

Model Dataset Images mAP@0.5 Precision Recall Source
This model (YOLOv11n) EDITools Warehouse Full dataset 0.572 0.600 0.563 This work
This model (YOLOv11s) EDITools Warehouse Full dataset 0.592 0.599 0.574 This work
NVIDIA SDG Pallet Omniverse synthetic ~25,000 - - - NVIDIA SDG Pallet Model
YOLOv8 (synthetic) Unity synthetic Synthetic 0.995 - - Pallet Detection From Synthetic Data (2025)
YOLOv8 (synthetic boost) Synthetic + real Custom +69% stacked - - Improving Pallet Detection Using Synthetic Data (2024)
YOLOv8 Custom warehouse Custom 0.950 - - Semi-Autonomous Forklift (2025)
YOLOv11 Custom warehouse Custom - 0.93+ - Semi-Autonomous Forklift (2025)
AM-Mask R-CNN Complex warehouse Custom - - - Enhanced Pallet Detection (2025)
Faster R-CNN Industrial warehouse 1,344 0.89 - - IEEE Comparison (2020)
SSD Industrial warehouse 1,344 0.85 - - IEEE Comparison (2020)
YOLOv4 Industrial warehouse 1,344 0.82 - - IEEE Comparison (2020)
YOLOv5 + ArUco Custom + fiducial Custom - 0.995 - Pallet Detection with YOLO + Fiducial (2023)
YOLOv8 + CBAM Warehouse tracking Custom - - - CBAM Pallet Tracking (2025)
YOLOX Industrial Custom - - - Digital Camera Pallet Detection

Notes on comparison:

  • No standard benchmark exists for warehouse pallet detection (unlike COCO or KITTI for general/driving OD). Each study uses its own private dataset, making direct comparison difficult.
  • The NVIDIA SDG model is the most production-ready alternative, trained on ~25K synthetic images via Omniverse, targeting pallet side-face centers/corners. It detects wood, metal, and plastic pallets but focuses on pallet pocket localization for forklift docking rather than full pallet unit detection.
  • The synthetic-data model (0.995 mAP) was evaluated on simple single-pallet scenes, not cluttered warehouses.
  • This model is the first dedicated pallet detection model published to Hugging Face Hub — no fine-tuned pallet model previously existed in the HF ecosystem.
  • This model is specifically optimized for foreground pallet detection in real operational environments.

Performance by Scenario

Scenario Qualitative Performance
Single pallet, clear view Excellent
Multiple pallets in row Good - detects individual units
Pallet on forklift forks Good - detects if mostly visible
Pallets on racks (background) Limited - trained for foreground detection
Motion blur Good - trained on real warehouse video frames
Low/mixed lighting Good - augmented with HSV jitter

Usage

Quick Start

from ultralytics import YOLO

# Choose the variant that fits your deployment:
#   "n" = nano (fastest, edge devices)
#   "s" = small (best accuracy, balanced speed)
model = YOLO("EFFGRP/yolov11s-warehouse-pallets")  # or n

# Run inference on an image
results = model.predict("warehouse_photo.jpg", conf=0.25)

# Process results
for result in results:
    for box in result.boxes:
        cls_id = int(box.cls[0])
        confidence = float(box.conf[0])
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"Pallet detected: conf={confidence:.2f}, bbox=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f})")

Choosing a Variant

from ultralytics import YOLO

# Edge deployment (Jetson, RPi, mobile) — use nano at 640p
model = YOLO("EFFGRP/yolov11n-warehouse-pallets-640")

# General warehouse camera system — use small at 640p as the best tradeoff
model = YOLO("EFFGRP/yolov11s-warehouse-pallets-640")

# High-res camera with small/distant pallets — use small at 1280p
model = YOLO("EFFGRP/yolov11s-warehouse-pallets-1280")

Batch Processing

from ultralytics import YOLO
from pathlib import Path

model = YOLO("EFFGRP/yolov11s-warehouse-pallets")

# Process a directory of images
image_dir = Path("warehouse_photos/")
results = model.predict(
    source=str(image_dir),
    conf=0.25,
    save=True,           # Save annotated images
    save_txt=True,       # Save YOLO-format labels
    project="output/",
    name="pallet_detections"
)

Export to ONNX for Edge Deployment

from ultralytics import YOLO

# Export nano for edge, or any other variant
model = YOLO("EFFGRP/yolov11n-warehouse-pallets")
model.export(format="onnx", imgsz=640, simplify=True)
# Produces yolov11n-warehouse-pallets.onnx for deployment on edge devices
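Running the exported model outside Ultralytics requires preprocessing the input by hand. A minimal numpy sketch, assuming a frame already resized to 640x640; the output tensor layout depends on your export, so inspect it before decoding boxes:

```python
import numpy as np

def preprocess(frame_bgr):
    """Convert an already-resized 640x640 BGR uint8 frame into the
    normalized 1x3x640x640 float32 tensor a YOLO ONNX export expects."""
    img = frame_bgr[:, :, ::-1].astype(np.float32) / 255.0  # BGR -> RGB, scale to 0-1
    return np.ascontiguousarray(np.transpose(img, (2, 0, 1))[None])  # HWC -> NCHW

# With onnxruntime installed (input/output names vary per export -- inspect them):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("yolov11n-warehouse-pallets.onnx")
#   preds = sess.run(None, {sess.get_inputs()[0].name: preprocess(frame)})[0]

print(preprocess(np.zeros((640, 640, 3), dtype=np.uint8)).shape)  # (1, 3, 640, 640)
```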

Integration with Warehouse Systems

This model is designed to work as part of a larger warehouse automation pipeline. Example integration with a Redis message queue:

import json
import redis
from ultralytics import YOLO

model = YOLO("EFFGRP/yolov11s-warehouse-pallets")
r = redis.Redis()

def process_frame(image_path):
    results = model.predict(image_path, conf=0.25, verbose=False)
    detections = []
    for box in results[0].boxes:
        detections.append({
            "confidence": float(box.conf[0]),
            "bbox": box.xyxy[0].tolist(),
            "center": box.xywh[0][:2].tolist(),
        })
    r.lpush("pallet_detections", json.dumps({
        "image": str(image_path),
        "count": len(detections),
        "detections": detections,
    }))

Model Files

Each variant repository contains:

File Description
whole_pallet_{size}_{resolution}.pt PyTorch model weights
README.md This model card
benchmark_results.json Structured benchmark comparison data

Here {size} is n (nano) or s (small), and {resolution} is 640 or 1280.
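A sketch of fetching a specific variant's weights by name; the repo id and filename pattern follow the tables above, but verify them against the actual repositories:

```python
size, resolution = "s", 640  # size: "n" or "s"; resolution: 640 or 1280
repo_id = f"EFFGRP/yolov11{size}-warehouse-pallets-{resolution}"
filename = f"whole_pallet_{size}_{resolution}.pt"

# With huggingface_hub and ultralytics installed:
#   from huggingface_hub import hf_hub_download
#   from ultralytics import YOLO
#   model = YOLO(hf_hub_download(repo_id=repo_id, filename=filename))
print(repo_id, filename)
```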

Limitations

  • Single class: Only detects "pallet" (complete unit). Does not distinguish pallet types, contents, or conditions.
  • Foreground bias: Trained primarily on foreground pallets. Background or distant pallets may be missed.
  • Domain specific: Trained on a single warehouse environment. Performance may degrade in visually different warehouses (outdoor yards, cold storage, etc.). Fine-tuning on your own data is recommended.
  • Partial occlusion: Pallets significantly occluded by other pallets (not people or forklifts) are intentionally excluded from training labels.
  • Dataset size: Performance improves with more training data. Fine-tuning on your own warehouse data is recommended.
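Where fine-tuning on your own data is recommended above, a hypothetical starting configuration might look like the following; the freeze and lr0 values are illustrative transfer-learning defaults, not the authors' recipe:

```python
# Hypothetical fine-tuning settings for adapting to a new warehouse.
finetune_cfg = dict(
    data="my_warehouse.yaml",  # placeholder: your labeled images in YOLO format
    epochs=50,
    imgsz=640,
    batch=16,
    freeze=10,                 # freeze early backbone layers
    lr0=0.001,                 # lower initial LR to preserve pretrained features
)
# With Ultralytics installed:
#   from ultralytics import YOLO
#   YOLO("EFFGRP/yolov11s-warehouse-pallets").train(**finetune_cfg)
print(sorted(finetune_cfg))
```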

Ethical Considerations

This model is intended for warehouse automation and logistics optimization. It does not process personal biometric data. However, warehouse images may incidentally contain workers; this model does not detect or track people, but users should ensure compliance with workplace surveillance regulations when deploying camera systems.

Citation

If you use this model in your research, please cite:

@misc{effgrp-warehouse-pallets-2026,
  title={YOLOv11 Warehouse Pallet Detector},
  author={EFFGRP},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/EFFGRP/yolov11s-warehouse-pallets}
}

Landscape: Why This Model Exists

As of March 2026, no dedicated pallet detection model exists on Hugging Face. The HF Hub has ~21 models tagged "logistics" but none are pallet-specific. The closest alternatives are:

  • NVIDIA SDG Pallet Model (GitHub) — trained on ~25K synthetic images via Omniverse, focuses on pallet side-face and pocket localization for autonomous forklift docking. Production-ready but targets a different task (pocket detection vs. full pallet unit detection).
  • Roboflow Universe (pallets) — community datasets with 1,755+ images and some pre-trained models, but fragmented across projects with inconsistent annotation guidelines.
  • Academic models — published in papers but weights/code not publicly shared on model hubs.

This model fills the gap as a ready-to-use, real-world-trained pallet unit detector for the Hugging Face ecosystem.
