LatentSync version 1.6 (512x512): Audio Conditioned Latent Diffusion Models for Lip Sync
LatentSync is an end-to-end lip sync framework based on audio conditioned latent diffusion models. It uses no intermediate motion representation, unlike previous diffusion-based lip sync methods, which rely on pixel space diffusion or two-stage generation.
Examples
| Video Control | Audio Input |
|---|---|