Canvas-to-Image: Compositional Image Generation with Multimodal Controls Paper • 2511.21691 • Published Nov 26, 2025 • 35
VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning Paper • 2510.25772 • Published Oct 29, 2025 • 32
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 125
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning Paper • 2509.20360 • Published Sep 24, 2025 • 17
Running • 12 • INR-Harmon - Harmonize Any Image You Want! • Harmonize images using masks and pretrained models
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention Paper • 2507.17745 • Published Jul 23, 2025 • 35
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025 • 68
TaskCraft: Automated Generation of Agentic Tasks Paper • 2506.10055 • Published Jun 11, 2025 • 32
Build error • 91 • Financial Analyst AI • Analyze financial text and audio for tone, sentiment, and entities
Runtime error • 72 • Real Time Stock Predictor • Model predicts real-time stock prices using an LSTM neural network
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering Paper • 2505.24417 • Published May 30, 2025 • 13
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published May 25, 2025 • 84
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action Paper • 2505.01583 • Published May 2, 2025 • 8
YoChameleon: Personalized Vision and Language Generation Paper • 2504.20998 • Published Apr 29, 2025 • 12
DreamO: A Unified Framework for Image Customization Paper • 2504.16915 • Published Apr 23, 2025 • 24