Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 23 days ago • 168
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published May 29 • 53
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement Paper • 2406.05649 • Published Jun 9, 2024 • 12
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models Paper • 2406.06133 • Published Jun 10, 2024 • 12
MLCM: Multistep Consistency Distillation of Latent Diffusion Model Paper • 2406.05768 • Published Jun 9, 2024 • 13
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization Paper • 2406.05981 • Published Jun 10, 2024 • 16
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference Paper • 2406.06424 • Published Jun 10, 2024 • 15
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers Paper • 2406.05370 • Published Jun 8, 2024 • 18
Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Paper • 2406.06216 • Published Jun 10, 2024 • 23
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published Jun 10, 2024 • 29
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10, 2024 • 71
Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach Paper • 2406.04594 • Published Jun 7, 2024 • 7
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? Paper • 2406.04391 • Published Jun 6, 2024 • 9
NATURAL PLAN: Benchmarking LLMs on Natural Language Planning Paper • 2406.04520 • Published Jun 6, 2024 • 14