^RFLAV: Rolling Flow matching for infinite Audio Video generation
Paper: arXiv:2503.08307
R-FLAV models trained on the Landscape and AIST++ datasets for 400k iterations.
For more information, please refer to the GitHub repository at https://github.com/ErgastiAlex/R-FLAV.
To download the checkpoints directly from code, you can do the following:
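from huggingface_hub import hf_hub_download
import torch
from models import FLAV  # model class from the R-FLAV repository
# NOTE: `Generator` (the vocoder class) must also be imported from the R-FLAV codebase;
# its import path is not given in this snippet.

# `args.model_ckpt` is the checkpoint identifier passed to the script
model = FLAV.from_pretrained(args.model_ckpt)

# Download the vocoder config and weights into the local Hugging Face cache
hf_hub_download(repo_id="MaverickAlex/R-FLAV-B-1-LS", filename="vocoder/config.json")
vocoder_path = hf_hub_download(repo_id="MaverickAlex/R-FLAV-B-1-LS", filename="vocoder/vocoder.pt")

# Strip the file name so the path points at the directory holding both files
vocoder_path = vocoder_path.replace("vocoder.pt", "")
vocoder = Generator.from_pretrained(vocoder_path)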
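The vocoder download step above can also be wrapped in a small helper. The following is a minimal sketch, not code from the R-FLAV repository: the function names `vocoder_dir` and `fetch_vocoder_dir` are hypothetical, and it assumes `huggingface_hub` is installed and that both vocoder files end up in the same cache directory, as they do with `hf_hub_download`.

```python
import os


def vocoder_dir(weights_path: str) -> str:
    """Return the directory that contains the downloaded vocoder weights file."""
    return os.path.dirname(weights_path)


def fetch_vocoder_dir(repo_id: str = "MaverickAlex/R-FLAV-B-1-LS") -> str:
    """Download the vocoder config and weights, then return their local directory.

    Hypothetical helper for illustration; requires network access when called.
    """
    # Imported lazily so this module can be loaded without huggingface_hub installed
    from huggingface_hub import hf_hub_download

    # Both files are stored under the same "vocoder/" folder of the cached snapshot
    hf_hub_download(repo_id=repo_id, filename="vocoder/config.json")
    weights = hf_hub_download(repo_id=repo_id, filename="vocoder/vocoder.pt")
    return vocoder_dir(weights)
```

The returned directory could then be passed to `Generator.from_pretrained(...)` as in the snippet above. Using `os.path.dirname` instead of a string replace avoids depending on the exact file name appearing only once in the path.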