MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation
Paper • 2604.17435 • Published • 3
How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation
Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models