Still not small enough? Try https://huggingface.co/imnotednamode/mochi-1-preview-mix-nf4-small
This combines Mochi with a development version of diffusers to get high-quality, fast inference at the full 161 frames on a single 24 GB card. This repo contains only the transformer. After installing the diffusers main branch with pip install git+https://github.com/huggingface/diffusers, the transformer can be loaded normally and used in a pipeline like so:
import torch
from diffusers import MochiPipeline, MochiTransformer3DModel
from diffusers.utils import export_to_video
# Load the mixed nf4/bf16 transformer from this repo
transformer = MochiTransformer3DModel.from_pretrained("imnotednamode/mochi-1-preview-mix-nf4", torch_dtype=torch.bfloat16)
# Load the rest of the pipeline from the base model and swap in the transformer
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", revision="refs/pr/18", torch_dtype=torch.bfloat16, transformer=transformer)
# Offload idle components to the CPU and decode the VAE in tiles to save VRAM
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
# Generate the full 161 frames at 848x480 and write them out as a video
frames = pipe("A camera follows a squirrel running around on a tree branch", num_inference_steps=100, guidance_scale=4.5, height=480, width=848, num_frames=161).frames[0]
export_to_video(frames, "mochi.mp4", fps=15)
I've noticed that raising guidance_scale lets the model produce a coherent output in fewer steps, but it also seems to reduce motion (?), since the model then tries to align mostly with the text prompt. To an extent, it can also offset the quality degradation from using full nf4 weights.
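One way to check the guidance_scale tradeoff yourself is to render the same prompt across a small grid of settings and compare the clips side by side. The sketch below is only an illustration: the helper names sweep_settings and run_sweep, the output filenames, and the extra values 6.0 and 50 are my own picks, not settings I've validated. run_sweep assumes pipe is a MochiPipeline that is already loaded as shown above.

```python
from itertools import product

PROMPT = "A camera follows a squirrel running around on a tree branch"


def sweep_settings(guidance_scales, step_counts):
    """Return one settings dict per (guidance_scale, num_inference_steps) pair."""
    return [
        {
            "guidance_scale": g,
            "num_inference_steps": s,
            # Hypothetical naming scheme so each clip is easy to identify
            "output_path": f"mochi_gs{g}_steps{s}.mp4",
        }
        for g, s in product(guidance_scales, step_counts)
    ]


def run_sweep(pipe, settings, prompt=PROMPT):
    """Render one clip per settings dict with an already-loaded MochiPipeline."""
    # Imported here so the settings helper works without diffusers installed
    from diffusers.utils import export_to_video

    for cfg in settings:
        frames = pipe(
            prompt,
            num_inference_steps=cfg["num_inference_steps"],
            guidance_scale=cfg["guidance_scale"],
            height=480,
            width=848,
            num_frames=161,
        ).frames[0]
        export_to_video(frames, cfg["output_path"], fps=15)


# 4.5 and 100 come from the example above; 6.0 and 50 are arbitrary comparison points
settings = sweep_settings(guidance_scales=[4.5, 6.0], step_counts=[50, 100])
```

Calling run_sweep(pipe, settings) then writes four videos. Each one is a full 161-frame generation, so expect the sweep to take a while.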
This version works by mixing nf4 weights and bf16 weights together. I noticed that using pure nf4 weights degrades the model quality significantly, but using bf16 or LLM.int8 weights means the full 161 frames can't fit into VRAM. This version strikes a balance (everything except a few blocks is in bf16).
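To get a feel for why the mix matters, here is some back-of-the-envelope arithmetic for the size of the transformer weights alone. It is a rough sketch under assumptions: about 10 billion parameters for the transformer, nf4 stored as 4 bits per weight plus one fp32 scale per block of 64 weights (no double quantization), and a purely illustrative nf4 fraction. It does not reflect the actual split used in this repo, and it ignores activations, which are what grow with the frame count.

```python
BF16_BYTES = 2.0                          # 16 bits per weight
NF4_BLOCKSIZE = 64                        # assumed quantization block size
NF4_BYTES = 0.5 + 4.0 / NF4_BLOCKSIZE     # 4-bit value + one fp32 scale per block

N_PARAMS = 10e9                           # assumption: roughly 10B parameters


def weight_gb(n_params, nf4_fraction):
    """Estimate weight storage in GB (1e9 bytes) for a given nf4 share."""
    if not 0.0 <= nf4_fraction <= 1.0:
        raise ValueError("nf4_fraction must be between 0 and 1")
    nf4_bytes = n_params * nf4_fraction * NF4_BYTES
    bf16_bytes = n_params * (1.0 - nf4_fraction) * BF16_BYTES
    return (nf4_bytes + bf16_bytes) / 1e9


# The fractions here are hypothetical, chosen only to show the trend
for fraction in (0.0, 0.25, 0.5, 1.0):
    print(f"nf4 fraction {fraction:.2f}: ~{weight_gb(N_PARAMS, fraction):.2f} GB of weights")
```

Under these assumptions, every weight moved from bf16 to nf4 shrinks to a bit over a quarter of its size, so the more blocks are quantized, the more room is left on the card for activations.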
Here's a comparison (embedded videos) of bf16, nf4mix (this one), LLM.int8, nf4, and fp4.
Model tree for imnotednamode/mochi-1-preview-mix-nf4
Base model
genmo/mochi-1-preview