---
title: SmolVLM2 Video Highlights
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---

# 🎬 SmolVLM2 HuggingFace Segment-Based Video Highlights API

**Generate intelligent video highlights using HuggingFace's segment-based approach**

This is a FastAPI service that uses HuggingFace's proven segment-based classification method with SmolVLM2-256M-Video-Instruct for reliable, consistent highlight generation.

## 🚀 Features

- **Segment-Based Analysis**: Processes videos in fixed 5-second segments for consistent AI classification
- **Dual Criteria Generation**: Creates two different highlight criteria sets and selects the most selective one
- **SmolVLM2-256M-Video-Instruct**: Faster processing with specialized video understanding
- **Visual Effects**: Optional fade transitions between segments for professional-quality output
- **REST API**: Upload videos and download processed highlights with job tracking
- **Background Processing**: Non-blocking video processing with real-time status updates

## 🔗 API Endpoints

- `POST /upload-video` - Upload video for processing
- `GET /job-status/{job_id}` - Check processing status
- `GET /download/{filename}` - Download generated highlights
- `GET /docs` - Interactive API documentation

## 📱 Usage

### Via API

```bash
# Upload video with optional parameters
curl -X POST \
  -F "video=@your_video.mp4" \
  -F "segment_length=5.0" \
  -F "model_name=HuggingFaceTB/SmolVLM2-256M-Video-Instruct" \
  -F "with_effects=true" \
  https://your-space-url.hf.space/upload-video

# Check processing status
curl https://your-space-url.hf.space/job-status/YOUR_JOB_ID

# Download highlights and analysis
curl -O https://your-space-url.hf.space/download/HIGHLIGHTS.mp4
curl -O https://your-space-url.hf.space/download/ANALYSIS.json
```

### Via Android App

Use the provided Android client code to integrate with your mobile app.

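### Via Python

The status and download endpoints listed above can also be called from a script. The following is a minimal sketch using only the Python standard library, not an official client. It assumes the status response is JSON with a `status` field whose terminal values are `completed` or `failed`. Neither the field name nor the values are documented in this README, so check `/docs` on your Space and adjust. The helper names (`job_status_url`, `download_url`, `fetch_json`, `wait_for_job`) are hypothetical, and `BASE_URL` is the placeholder host from the curl examples.

```python
import json
import time
import urllib.request
from typing import Callable

# Placeholder host copied from the curl examples; replace with your Space URL.
BASE_URL = "https://your-space-url.hf.space"


def job_status_url(base_url: str, job_id: str) -> str:
    """Build the GET /job-status/{job_id} URL."""
    return f"{base_url.rstrip('/')}/job-status/{job_id}"


def download_url(base_url: str, filename: str) -> str:
    """Build the GET /download/{filename} URL."""
    return f"{base_url.rstrip('/')}/download/{filename}"


def fetch_json(url: str) -> dict:
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_job(
    base_url: str,
    job_id: str,
    fetch: Callable[[str], dict] = fetch_json,
    interval: float = 5.0,
    timeout: float = 600.0,
    done_states: tuple = ("completed", "failed"),  # ASSUMED status values
) -> dict:
    """Poll the status endpoint until the job reaches a terminal state."""
    url = job_status_url(base_url, job_id)
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(url)
        # ASSUMPTION: the response carries a "status" field.
        if payload.get("status") in done_states:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout}s")
        time.sleep(interval)
```

A typical call would be `wait_for_job(BASE_URL, "YOUR_JOB_ID")`, followed by fetching `download_url(BASE_URL, "HIGHLIGHTS.mp4")`. The `fetch` parameter is injectable so the polling logic can be exercised without a running Space; the upload step is left to the curl command above.
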
## ⚙️ Configuration

Default settings:

- **Segment Length**: 5 seconds (fixed segments for consistent classification)
- **Model**: SmolVLM2-256M-Video-Instruct (faster processing)
- **Effects**: Enabled (fade transitions between segments)
- **Dual Criteria**: Two prompt variations for robust selection

## 🛠️ Technology Stack

- **SmolVLM2-256M-Video-Instruct**: Efficient vision-language model optimized for video understanding
- **HuggingFace Transformers**: Latest transformer models and inference
- **FastAPI**: Modern web framework for APIs
- **FFmpeg**: Video processing with advanced filter support
- **PyTorch**: Deep learning framework with device optimization

## 🎯 Perfect For

- Social media content creators
- Educational video processing
- Meeting/lecture summarization
- Sports highlight generation
- Entertainment content curation

## 📄 License

Apache 2.0 - Free for commercial and personal use

## 🤝 Contributing

Built with ❤️ using Hugging Face Transformers and open-source AI models.