Multimodal - a RichardForests Collection

Multimodal

updated Feb 24, 2024

Stable Video Diffusion 1.1 📺
Generate a video from a single image

Space • Running on Zero • MCP • Featured • 1.99k
Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 36
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17
TheBloke/Sonya-7B-GPTQ

Text Generation • 7B • Updated Dec 31, 2023 • 13 • 2
TextDiffuser 2 📚
Generate images from text prompts with layout planning

Space • Runtime error • Featured • 142
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Paper • 2402.12226 • Published Feb 19, 2024 • 45