Qualitative results of video generation using Wan-Alpha-v2.0. Our model successfully generates various scenes with accurate and clearly rendered transparency. Notably, it can synthesize diverse semi-transparent objects, glowing effects, and fine-grained details such as hair.
News
- [2025.12.16] Released Wan-Alpha v2.0: the Wan2.1-14B-T2V-adapted weights and inference code are now open-sourced.
- [2025.12.16] We updated our paper on arXiv.
- [2025.09.30] Our technical report is available on arXiv.
- [2025.09.30] Released Wan-Alpha v1.0: the Wan2.1-14B-T2V-adapted weights and inference code are now open-sourced.
To-Do List
- Paper: Available on arXiv.
- Inference Code: Released inference pipeline for Wan-Alpha v1.0 and v2.0.
- Model Weights: Released checkpoints for Wan-Alpha v1.0 and v2.0.
- Image-to-Video: Release Wan-Alpha-I2V model weights.
- Dataset: Open-source the VAE and T2V training dataset.
- Training Code (VAE & T2V): Release training scripts for the VAE and text-to-RGBA video generation.
Showcase
Text-to-Video Generation with Alpha Channel
| Prompt | Preview Video | Alpha Video |
|---|---|---|
| "The background of this video is transparent. It features a beige, woven rattan hanging chair with soft seat and back cushions. Realistic style. Medium shot." | *(see project page)* | *(see project page)* |
For more results, please visit our website.
Quick Start
1. Environment Setup

```bash
# Clone the project repository
git clone https://github.com/WeChatCV/Wan-Alpha.git
cd Wan-Alpha

# Create and activate the Conda environment
conda create -n Wan-Alpha python=3.11 -y
conda activate Wan-Alpha

# Install dependencies
pip install -r requirements.txt
```
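After installation, a quick sanity check (assuming PyTorch is pulled in by requirements.txt) confirms that a CUDA device is visible:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```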
2. Model Download

- Download Wan2.1-T2V-14B
- Download Lightx2v-T2V-14B
- Download Wan-Alpha VAE
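For example, the base model and this repository's weights can be fetched with huggingface-cli; the local directory names below are arbitrary, and the repo ID for the LightX2V distillation weights is not listed here, so download it from its own page:

```bash
# Base text-to-video model
huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B

# Wan-Alpha v2.0 weights (VAE decoder and T2V LoRA)
huggingface-cli download htdong/Wan-Alpha-v2.0 --local-dir ./Wan-Alpha-v2.0
```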
Usage
You can test our model with the following command:
```bash
torchrun --nproc_per_node=8 --master_port=29501 generate_dora_lightx2v_mask.py --size 832*480 \
    --ckpt_dir "path/to/your/Wan-2.1/Wan2.1-T2V-14B" \
    --dit_fsdp --t5_fsdp --ulysses_size 8 \
    --vae_lora_checkpoint "path/to/your/decoder.bin" \
    --lora_path "path/to/your/t2v.safetensors" \
    --lightx2v_path "path/to/your/lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank64_bf16.safetensors" \
    --sample_guide_scale 1.0 \
    --frame_num 81 \
    --sample_steps 4 \
    --lora_ratio 1.0 \
    --lora_prefix "" \
    --alpha_shift_mean 0.05 \
    --cache_path_mask "path/to/your/gauss_mask" \
    --prompt_file ./data/prompt.txt \
    --output_dir ./output
```
You can specify the Wan2.1-T2V-14B weights with --ckpt_dir, LightX2V-T2V-14B with --lightx2v_path, Wan-Alpha-VAE with --vae_lora_checkpoint, and Wan-Alpha-T2V with --lora_path. The rendered RGBA videos (composited over a checkerboard background) and the PNG frames are written to --output_dir.
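If you do not have eight GPUs, a single-process run such as the one below is a sketch only: whether generate_dora_lightx2v_mask.py accepts it is an assumption based on the upstream Wan2.1 CLI, where the FSDP and Ulysses flags are only needed for multi-GPU inference.

```bash
# Hedged single-GPU sketch (assumption: the script runs without torchrun
# when the --dit_fsdp/--t5_fsdp/--ulysses_size flags are omitted).
python generate_dora_lightx2v_mask.py --size 832*480 \
    --ckpt_dir "path/to/your/Wan-2.1/Wan2.1-T2V-14B" \
    --vae_lora_checkpoint "path/to/your/decoder.bin" \
    --lora_path "path/to/your/t2v.safetensors" \
    --lightx2v_path "path/to/your/lightx2v_T2V_14B_cfg_step_distill_v2_lora_rank64_bf16.safetensors" \
    --sample_guide_scale 1.0 --frame_num 81 --sample_steps 4 \
    --lora_ratio 1.0 --lora_prefix "" --alpha_shift_mean 0.05 \
    --cache_path_mask "path/to/your/gauss_mask" \
    --prompt_file ./data/prompt.txt --output_dir ./output
```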
You can use gen_gaussian_mask.py to generate a Gaussian mask from an existing alpha video. Alternatively, you can directly create a Gaussian ellipse video, either static or dynamic (e.g., moving from left to right). Note that alpha_shift_mean is a fixed parameter.
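If you want to author a mask directly, below is a minimal sketch of a static Gaussian-ellipse mask rendered as grayscale PNG frames; the frame size, ellipse geometry, and the on-disk layout expected by --cache_path_mask are all assumptions (gen_gaussian_mask.py in the repo is the reference tool).

```python
# Minimal sketch: render a static Gaussian-ellipse mask as PNG frames.
# Frame size, ellipse geometry, and the output layout are assumptions.
import os
import numpy as np
from PIL import Image

out_dir = "gauss_mask"                 # later passed via --cache_path_mask
w, h, n_frames = 832, 480, 81          # match --size and --frame_num
cx, cy, rx, ry = w / 2, h / 2, w / 4, h / 4  # ellipse center and radii (arbitrary)

ys, xs = np.mgrid[0:h, 0:w]
# Gaussian falloff: 1.0 at the ellipse center, decaying with normalized distance.
d2 = ((xs - cx) / rx) ** 2 + ((ys - cy) / ry) ** 2
frame = (np.exp(-2.0 * d2) * 255).astype(np.uint8)

os.makedirs(out_dir, exist_ok=True)
for i in range(n_frames):              # static mask: identical frames
    Image.fromarray(frame, mode="L").save(os.path.join(out_dir, f"{i:04d}.png"))
```

A dynamic mask (e.g., moving from left to right) only requires shifting cx per frame inside the loop.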
Prompt Writing Tip: specify that the video background is transparent, along with the visual style, the shot type (e.g., close-up, medium shot, wide shot, or extreme close-up), and a description of the main subject. Prompts support both Chinese and English input.
An example prompt:

```text
This video has a transparent background. Close-up shot. A colorful parrot flying. Realistic style.
```
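To verify the alpha channel in the results, here is a minimal sketch that composites the generated RGBA PNG frames over a solid color; the output directory layout and filename pattern are assumptions about what lands in --output_dir.

```python
# Minimal sketch: composite RGBA frames over a solid background.
# The "output" directory and *.png pattern are assumptions.
import glob
import os
from PIL import Image

src_dir, dst_dir = "output", "output_composited"
os.makedirs(dst_dir, exist_ok=True)
bg_color = (0, 128, 0, 255)  # opaque green; any color works

for path in sorted(glob.glob(os.path.join(src_dir, "*.png"))):
    rgba = Image.open(path).convert("RGBA")
    bg = Image.new("RGBA", rgba.size, bg_color)
    out = Image.alpha_composite(bg, rgba).convert("RGB")
    out.save(os.path.join(dst_dir, os.path.basename(path)))
```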
Official ComfyUI Version
Coming soon...
Acknowledgements
This project is built upon the following excellent open-source projects:
- DiffSynth-Studio (training/inference framework)
- Wan2.1 (base video generation model)
- LightX2V (inference acceleration)
- WanVideo_comfy (inference acceleration)
We sincerely thank the authors and contributors of these projects.
Citation
If you find our work helpful for your research, please consider citing our paper:
```bibtex
@misc{dong2025wanalpha,
  title={Video Generation with Stable Transparency via Shiftable RGB-A Distribution Learner},
  author={Haotian Dong and Wenjing Wang and Chen Li and Jing Lyu and Di Lin},
  year={2025},
  eprint={2509.24979},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.24979},
}
```
Contact Us
If you have any questions or suggestions, feel free to reach out via GitHub Issues. We look forward to your feedback!