DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers Paper • 2602.16968 • Published 1 day ago • 5
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 7 days ago • 16
Optimizing Few-Step Generation with Adaptive Matching Distillation Paper • 2602.07345 • Published 13 days ago • 6
Geometry-Aware Rotary Position Embedding for Consistent Video World Model Paper • 2602.07854 • Published 12 days ago • 6
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 5 days ago • 39
DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels Paper • 2602.11715 • Published 8 days ago • 5
Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception Paper • 2602.11858 • Published 8 days ago • 58
T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization Paper • 2602.12262 • Published 8 days ago • 8
PISCO: Precise Video Instance Insertion with Sparse Control Paper • 2602.08277 • Published 11 days ago • 11
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published 8 days ago • 78
ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation Paper • 2602.09014 • Published 11 days ago • 3
SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis Paper • 2602.07803 • Published 12 days ago • 4
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 11 days ago • 66
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 11 days ago • 152