Instructions to use Lightricks/LTX-2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use Lightricks/LTX-2 with Diffusers:
pip install -U diffusers transformers accelerate
import torch from diffusers import DiffusionPipeline from diffusers.utils import load_image, export_to_video # switch to "mps" for apple devices pipe = DiffusionPipeline.from_pretrained("Lightricks/LTX-2", dtype=torch.bfloat16, device_map="cuda") pipe.to("cuda") prompt = "A man with short gray hair plays a red electric guitar." image = load_image( "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png" ) output = pipe(image=image, prompt=prompt).frames[0] export_to_video(output, "output.mp4") - Inference
- Notebooks
- Google Colab
- Kaggle
Prompt sensitivity problems
I use diffusers for my inference and quantize to model to nf4 using bitsandbytes but the model seems to be extremely sensitive to prompt format because short prompts usually result in the model breaking.
My current approach is to use a LLM to enhance the prompts.
Is that sensitivity due to the quantization or is that in the fp16 version too.
I can't really test it due to my lack of RAM and VRAM. I have a 5090 and 64 GB of RAM.
that is a problem of the model itself. Unless you have a very simple video (single person, simple or no movement , saying something) the model freaks out all the time unless you micro prompt it. It has basically no world understanding for people, emotions, gestures, interactions and physics in general, you have to micro direct it.
in other words it is fast in high resolution , it looks good just visually and can make longer videos out of the box (but the above issues get much worse). Sadly it is absolute crap in everything else without extreme prompt babysitting.