liyaxuan

lllyx

AI & ML interests

None yet

Organizations

None yet

Recent Activity
upvoted a paper about 11 hours ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published 1 day ago • 113

updated a collection 2 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated 2 days ago • 1

updated a dataset 2 days ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 2 days ago • 305k • 43 • 1

updated a model 2 days ago

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 2 days ago • 887 • 2

published a dataset 2 days ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 2 days ago • 305k • 43 • 1

upvoted a collection 3 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated 2 days ago • 1

upvoted a paper 3 days ago

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper • 2605.08083 • Published 7 days ago • 63

upvoted 4 papers 4 days ago

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

Paper • 2604.28123 • Published 14 days ago • 47

updated a collection 11 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated 2 days ago • 1

upvoted a paper 11 days ago

MAIC-UI: Making Interactive Courseware with Generative UI

Paper • 2604.25806 • Published 17 days ago • 8

updated a collection 11 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated 2 days ago • 1

updated a model 11 days ago

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 11 days ago • 166 • 2

published a model 11 days ago

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 11 days ago • 166 • 2

upvoted a paper 11 days ago

Co-Evolving Policy Distillation

Paper • 2604.27083 • Published 16 days ago • 64

upvoted a paper 20 days ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published 23 days ago • 76

updated a collection 27 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated 2 days ago • 1
