Language Technology Lab at Alibaba DAMO Academy

company

Recent Activity

CircleRadon submitted a paper 9 days ago

InstructSAM: Segment Any Instance with Any Instructions

CircleRadon submitted a paper 9 days ago

Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Sicong submitted a paper 22 days ago

World Model for Robot Learning: A Comprehensive Survey

submitted 2 papers to Daily Papers 9 days ago

InstructSAM: Segment Any Instance with Any Instructions

Paper • 2605.26102 • Published 10 days ago • 17

Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Paper • 2605.26095 • Published 10 days ago

submitted a paper to Daily Papers 22 days ago

World Model for Robot Learning: A Comprehensive Survey

Paper • 2605.00080 • Published Apr 30 • 16

submitted a paper to Daily Papers about 1 month ago

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

Paper • 2604.14258 • Published Apr 15 • 23

authored 10 papers 3 months ago

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

Paper • 2311.16989 • Published Nov 28, 2023

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

Paper • 2305.03268 • Published May 5, 2023 • 3

Retrieving Multimodal Information for Augmented Generation: A Survey

Paper • 2303.10868 • Published Mar 20, 2023

How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library

Paper • 2404.00699 • Published Mar 31, 2024

Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks

Paper • 2410.01428 • Published Oct 2, 2024 • 1

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Paper • 2504.00993 • Published Apr 1, 2025 • 3

Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 96

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 188

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Paper • 2603.15726 • Published Mar 16 • 187

authored a paper 3 months ago

MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier

Paper • 2603.03756 • Published Mar 4 • 89

authored 5 papers 3 months ago

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published Mar 6 • 120

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22, 2025 • 92

What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Paper • 2502.14914 • Published Feb 19, 2025

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25, 2025 • 104

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20