Model Card for BLenDR weights

This repository contains pre-trained LoRA weights and Textual Inversion embeddings for Stable Diffusion 1.5 to generate images of birds.

The weights are used in the paper BLenDR: Blended Text Embeddings and Diffusion Residuals for Intra-Class Image Synthesis in Deep Metric Learning (see the official code repository).

BLenDR is an inference-time sampling method for personalized diffusion models. It combines text embedding interpolation with new set-theoretic operations over denoising residuals (union, intersection, and difference) to synthesize class-specific concepts with novel attributes.
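To make the idea of text embedding interpolation concrete, here is a minimal sketch that linearly blends two embedding vectors. It is an illustration only: the function name `interpolate_embeddings`, the blending weight `alpha`, and the use of plain linear interpolation are assumptions, and the exact scheme used by BLenDR is defined in the paper and the official repository. The random vectors stand in for two learned class embeddings.

```python
import torch


def interpolate_embeddings(
    emb_a: torch.Tensor, emb_b: torch.Tensor, alpha: float
) -> torch.Tensor:
    """Linearly blend two embeddings: (1 - alpha) * emb_a + alpha * emb_b."""
    if emb_a.shape != emb_b.shape:
        raise ValueError("Embeddings must have the same shape.")
    return torch.lerp(emb_a, emb_b, alpha)


# Random stand-ins for two learned class embeddings (hypothetical data).
# 768 is the hidden size of the Stable Diffusion 1.5 text encoder.
torch.manual_seed(0)
emb_a = torch.randn(768)
emb_b = torch.randn(768)

# alpha=0 returns emb_a, alpha=1 returns emb_b, values in between mix the two.
blended = interpolate_embeddings(emb_a, emb_b, alpha=0.3)
```

A blended vector of this kind could be written into the text encoder's embedding table under a new token, in the same way the learned embeddings are loaded in the usage section.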

How to use

Loading LoRA weights and Textual Inversion embeddings:

from diffusers import DDPMScheduler, StableDiffusionPipeline
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

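pipeline = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    # float16 is poorly supported on CPU, so fall back to float32 there
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
)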

pipeline.scheduler = DDPMScheduler.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    subfolder="scheduler",
)
pipeline.load_lora_weights("./lora-20000/lora_weights/")

pipeline.to(device)

# Loading Textual Inversion embeddings
data = torch.load(
    "./learned_embeds-steps-20000.bin",
    map_location="cpu",
)

words = sorted(data["learned_embeds_dict"].keys())

num_added_tokens = pipeline.tokenizer.add_tokens(words)

if num_added_tokens != len(words):
    raise ValueError("Some tokens are already in the tokenizer.")

pipeline.text_encoder.resize_token_embeddings(len(pipeline.tokenizer))
token_ids = pipeline.tokenizer.convert_tokens_to_ids(words)

weights = pipeline.text_encoder.get_input_embeddings().weight.data

for i, word in enumerate(words):
    weights[token_ids[i]] = data["learned_embeds_dict"][word]
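The loading code above assumes that the checkpoint holds a `learned_embeds_dict` mapping token strings to vectors that fit one row of the text encoder's embedding table. The sketch below checks that assumption before anything is written into the model. The helper name `validate_learned_embeds` is hypothetical, and the default width of 768 is the hidden size of the Stable Diffusion 1.5 text encoder.

```python
import torch


def validate_learned_embeds(data: dict, expected_dim: int = 768) -> list[str]:
    """Check every learned embedding and return the token strings, sorted."""
    learned = data["learned_embeds_dict"]
    for token, vector in learned.items():
        if not isinstance(vector, torch.Tensor):
            raise TypeError(
                f"{token!r}: expected a tensor, got {type(vector).__name__}"
            )
        # Accept shapes such as (768,) or (1, 768); both fit one embedding row.
        if vector.numel() != expected_dim or vector.shape[-1:] != (expected_dim,):
            raise ValueError(
                f"{token!r}: expected {expected_dim} values, "
                f"got shape {tuple(vector.shape)}"
            )
    return sorted(learned)
```

Calling `words = validate_learned_embeds(data)` in place of the `sorted(...)` line would surface a mismatched checkpoint early instead of failing during the copy loop.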

Obtain a Textual Inversion token and generate an image for the class represented by the token:

# map_location="cpu" lets the file load on machines without a GPU
ti_embeddings = torch.load("./learned_embeds-steps-20000.bin", map_location="cpu")
ti_token_strs = list(ti_embeddings["learned_embeds_dict"].keys())
class_0_token = ti_token_strs[0]  # e.g. '<000.Bird_Name>'

prompt = f"a photo of a {class_0_token} bird."  # prompt format for image generation

image = pipeline(prompt).images[0]
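Sampling is stochastic, so repeated calls with the same prompt give different images. A minimal sketch of seeded generation follows, assuming the pipeline accepts the standard `generator` and `num_inference_steps` arguments of `StableDiffusionPipeline`. The helper name `generate_image`, the seed, and the output file name are illustrative choices, not part of the BLenDR release.

```python
import torch


def generate_image(pipeline, prompt: str, seed: int = 0, num_inference_steps: int = 50):
    """Run the pipeline with a seeded generator and return the first image."""
    # A CPU generator keeps the seed independent of the device the model runs on.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipeline(
        prompt,
        generator=generator,
        num_inference_steps=num_inference_steps,
    )
    return result.images[0]


# Usage, assuming `pipeline` and `prompt` from the snippets above:
# image = generate_image(pipeline, prompt, seed=42)
# image.save("bird.png")  # hypothetical output path
```

With a fixed seed, the same prompt should yield the same image on the same hardware and library versions; small differences across GPUs or versions are still possible.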

See the official BLenDR repository for more code examples and the full BLenDR denoising loop.

Citing BLenDR

If you find BLenDR useful, please consider citing:

@article{kolf2025blendr,
  title={BLenDR: Blended Text Embeddings and Diffusion Residuals for Intra-Class Image Synthesis in Deep Metric Learning},
  author={Kolf, Jan Niklas and Tezcan, Ozan and Theiss, Justin and Kim, Hyung Jun and Bao, Wentao and Bhushanam, Bhargav and Gupta, Khushi and Kejariwal, Arun and Damer, Naser and Boutros, Fadi},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025}
}

License

FAIR Noncommercial Research License

This release is intended to support the open-source research community and fundamental research. Users are expected to leverage the artifacts for research purposes and make research findings arising from the artifacts publicly available for the benefit of the research community.
