d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation 🚀

d3LLM-Dream-Coder is an ultra-fast diffusion language model introduced in the paper d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation. It is built on Dream-org/Dream-Coder-v0-Instruct-7B.

Model Description

d3LLM (pseuDo-Distilled Diffusion Large Language Model) is a framework designed to strike a balance between accuracy and parallelism in diffusion LLMs. It achieves up to a 10× speedup over vanilla diffusion models such as LLaDA and Dream, and a 5× speedup over autoregressive (AR) models.

The model utilizes two primary innovations:

  • Pseudo-Trajectory Distillation: A training method that teaches the model which tokens can be decoded confidently at early denoising steps.
  • Entropy-Based Multi-Block Decoding: An inference strategy that commits multiple low-entropy tokens per step, using a KV-cache refresh mechanism to maintain accuracy while maximizing parallelism (see the sketch after this list).
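
To make the decoding idea concrete, below is a schematic sketch of entropy-based parallel token selection. It illustrates the general principle only: the paper's multi-block scheduling and KV-cache refresh logic are omitted, and the threshold `tau` is an assumed hyperparameter, not a value from the paper.

```python
# Schematic sketch of entropy-based token selection for parallel decoding.
# Illustration of the principle only; not the paper's exact algorithm.
import torch

def select_tokens_to_commit(logits, mask_positions, tau=0.5):
    """Pick masked positions whose predictive entropy falls below `tau`
    (assumed threshold) so they can all be committed in one denoising step."""
    probs = torch.softmax(logits[mask_positions], dim=-1)     # [k, vocab]
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(-1)  # [k]
    confident = entropy < tau
    if not confident.any():
        # Fall back to the single lowest-entropy position so decoding
        # always makes progress.
        confident[entropy.argmin()] = True
    return mask_positions[confident], probs.argmax(dim=-1)[confident]
```

Committing only low-entropy positions is what lets a diffusion LLM unmask many tokens per step without the accuracy loss that fully parallel unmasking would cause.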

Usage

For detailed usage instructions, evaluation scripts, and training code, please refer to the official GitHub repository. The model uses a custom architecture, so install transformers==4.49.0 and pass trust_remote_code=True when loading it. The released checkpoint is distributed as BF16 safetensors (~8B parameters).
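
A minimal loading sketch under those constraints is shown below. The repo id is taken from this card; the diffusion_generate call and its arguments follow the Dream base model's remote-code interface, which d3LLM may or may not inherit unchanged, so treat everything past model loading as an assumption and check the GitHub repository for the supported options.

```python
# Minimal sketch: load the model with transformers==4.49.0 and run one
# generation. The `diffusion_generate` call mirrors the Dream family's
# remote-code API; d3LLM may expose different arguments (assumption).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "d3LLM/d3LLM_Dream_Coder"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
    trust_remote_code=True,
).to("cuda").eval()

messages = [{"role": "user", "content": "Write a Python function that checks whether a number is prime."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

out = model.diffusion_generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=256,
    steps=256,                   # number of denoising steps
    temperature=0.2,
    alg="entropy",               # entropy-based unmasking order
    return_dict_in_generate=True,
)
completion = out.sequences[0, inputs.input_ids.shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))
```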

Citation

```bibtex
@article{arxiv'26:d3llm,
  title   = {d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation},
  author  = {Yu-Yang Qian and Junda Su and Lanxiang Hu and Peiyuan Zhang and Zhijie Deng and Peng Zhao and Hao Zhang},
  journal = {ArXiv preprint},
  volume  = {arXiv:2601.07568},
  year    = {2026}
}
```