
Manan Shah

cs-mshah

https://cs-mshah.github.io/

AI & ML interests

Computer Vision

Recent Activity

liked a dataset about 18 hours ago

tencent/HY3D-Bench

liked a dataset about 18 hours ago

cindyxl/ObjaversePlusPlus

upvoted a paper 6 days ago

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published 9 days ago • 56

upvoted an article 22 days ago

We Got Claude to Build CUDA Kernels and teach open models!

Article • 25 days ago • 139

upvoted a paper 26 days ago

SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer

Paper • 2601.16515 • Published 30 days ago • 15

upvoted a paper 29 days ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published Jan 13 • 154

upvoted 5 papers about 1 month ago

PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models

Paper • 2601.11087 • Published Jan 16 • 11

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 193

VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice

Paper • 2601.05175 • Published Jan 8 • 36

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published Jan 11 • 212

Choreographing a World of Dynamic Objects

Paper • 2601.04194 • Published Jan 7 • 13

upvoted a paper about 2 months ago

VINCIE: Unlocking In-context Image Editing from Video

Paper • 2506.10941 • Published Jun 12, 2025 • 4

upvoted 2 articles about 2 months ago

Generalist Robot Policy Evaluation in Simulation with NVIDIA Isaac Lab-Arena and LeRobot

Article • Jan 5

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

Article • Jan 5

upvoted 4 papers about 2 months ago

Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published Dec 29, 2025 • 27

ProEdit: Inversion-based Editing From Prompts Done Right

Paper • 2512.22118 • Published Dec 26, 2025 • 18

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 65

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published Dec 23, 2025 • 50

upvoted an article about 2 months ago

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • Jun 3, 2025 • 321

upvoted 2 articles 3 months ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Dec 4, 2025 • 599

Continuous batching from first principles

Article • Nov 25, 2025 • 326

upvoted a collection 3 months ago

MetaCLIP2 Multilingual

Collection • 8 items • Updated Nov 12, 2025 • 16
