Separate vocals from background audio
Generate images that preserve facial identity
Transform images based on text instructions