COLLECTION
AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation
Paper
• 2602.17100
• Published • 4
GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
Paper
• 2603.01059
• Published • 1
Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models
Paper
• 2603.00618
• Published
Heterogeneous Agent Collaborative Reinforcement Learning
Paper
• 2603.02604
• Published • 195
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory
Paper
• 2603.04257
• Published • 19
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions
Paper
• 2603.03646
• Published • 8
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Paper
• 2407.21630
• Published • 8
SageBwd: A Trainable Low-bit Attention
Paper
• 2603.02170
• Published • 19
Experiential Reinforcement Learning
Paper
• 2602.13949
• Published • 74
On-Policy Self-Distillation for Reasoning Compression
Paper
• 2603.05433
• Published • 9
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
Paper
• 2602.08222
• Published • 290
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs
Paper
• 2602.10388
• Published • 244
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Paper
• 2511.16043
• Published • 110
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers
Paper
• 2511.11062
• Published • 33
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Paper
• 2511.05664
• Published • 37
Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
Paper
• 2511.12710
• Published • 39
Paper
• 2511.11238
• Published • 39
Distribution-Conditioned Transport
Paper
• 2603.04736
• Published • 3
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning
Paper
• 2602.23440
• Published • 3
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling
Paper
• 2603.04553
• Published • 3
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model
Paper
• 2603.05438
• Published • 40
Dynamic Chunking Diffusion Transformer
Paper
• 2603.06351
• Published • 16
Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations
Paper
• 2603.01666
• Published • 1
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
Paper
• 2603.06199
• Published • 9
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
Paper
• 2603.02083
• Published • 9
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding
Paper
• 2603.04254
• Published • 1
LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
Paper
• 2602.20913
• Published • 11
Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns
Paper
• 2602.22479
• Published
VecGlypher: Unified Vector Glyph Generation with Language Models
Paper
• 2602.21461
• Published • 12
Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators
Paper
• 2602.22647
• Published • 4
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
Paper
• 2602.21198
• Published • 4
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
Paper
• 2603.09906
• Published • 75
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing
Paper
• 2603.09877
• Published • 48
Towards a Neural Debugger for Python
Paper
• 2603.09951
• Published • 6
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
Paper
• 2603.08075
• Published
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning
Paper
• 2603.05863
• Published • 6
Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control
Paper
• 2603.09221
• Published
Multi-Head Low-Rank Attention
Paper
• 2603.02188
• Published • 3
Compiler-First State Space Duality and Portable O(1) Autoregressive Caching for Inference
Paper
• 2603.09555
• Published • 1
Flash-KMeans: Fast and Memory-Efficient Exact K-Means
Paper
• 2603.09229
• Published • 82
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning
Paper
• 2603.10160
• Published • 26
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models
Paper
• 2603.10705
• Published • 11
OpenClaw-RL: Train Any Agent Simply by Talking
Paper
• 2603.10165
• Published • 154
Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning
Paper
• 2603.10377
• Published • 3
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
Paper
• 2603.12201
• Published • 53
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
Paper
• 2603.12180
• Published • 65
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
Paper
• 2603.12255
• Published • 91
CREATE: Testing LLMs for Associative Creativity
Paper
• 2603.09970
• Published • 15
Geometric Autoencoder for Diffusion Models
Paper
• 2603.10365
• Published • 8
Training Language Models via Neural Cellular Automata
Paper
• 2603.10055
• Published • 8
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights
Paper
• 2603.12228
• Published • 12
LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata
Paper
• 2409.12182
• Published
Attention-based Neural Cellular Automata
Paper
• 2211.01233
• Published
HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration
Paper
• 2603.07815
• Published • 10
Multimodal OCR: Parse Anything from Documents
Paper
• 2603.13032
• Published • 43
LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation
Paper
• 2603.10899
• Published • 7
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
Paper
• 2603.12648
• Published • 14
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement
Paper
• 2603.12310
• Published • 8
BitDance: Scaling Autoregressive Generative Models with Binary Tokens
Paper
• 2602.14041
• Published • 53
ThinkRouter: Efficient Reasoning via Routing Thinking between Latent and Discrete Spaces
Paper
• 2602.11683
• Published • 8
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
Paper
• 2602.11748
• Published • 38
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies
Paper
• 2602.09877
• Published • 197
MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning
Paper
• 2602.10575
• Published • 4
Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
Paper
• 2603.18472
• Published • 20
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD
Paper
• 2603.20155
• Published • 10
Paper
• 2603.19461
• Published • 50
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents
Paper
• 2603.19685
• Published • 22
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
Paper
• 2603.24533
• Published • 47
MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers
Paper
• 2602.00398
• Published • 6
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration
Paper
• 2603.24800
• Published • 68
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Paper
• 2402.01591
• Published • 1
Real-Time Aligned Reward Model beyond Semantics
Paper
• 2601.22664
• Published • 15
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization
Paper
• 2601.21358
• Published • 7
Persona Prompting as a Lens on LLM Social Reasoning
Paper
• 2601.20757
• Published • 4
GameTalk: Training LLMs for Strategic Conversation
Paper
• 2601.16276
• Published • 14
Wigner's Friend as a Circuit: Inter-Branch Communication Witness Benchmarks on Superconducting Quantum Hardware
Paper
• 2601.16004
• Published • 1
Reasoning Models Generate Societies of Thought
Paper
• 2601.10825
• Published • 14
Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders
Paper
• 2601.10332
• Published • 32
Demystifying the Slash Pattern in Attention: The Role of RoPE
Paper
• 2601.08297
• Published • 4
The AI Hippocampus: How Far are We From Human Memory?
Paper
• 2601.09113
• Published • 6
Plenoptic Video Generation
Paper
• 2601.05239
• Published • 13
Recursive Language Models
Paper
• 2512.24601
• Published • 96
K-EXAONE Technical Report
Paper
• 2601.01739
• Published • 94
AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction
Paper
• 2601.00796
• Published • 32
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation
Paper
• 2512.21252
• Published • 35
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
Paper
• 2512.19693
• Published • 68
Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives
Paper
• 2512.12620
• Published • 4
Animate Any Character in Any World
Paper
• 2512.17796
• Published • 11
Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space
Paper
• 2512.12623
• Published • 4
Embarrassingly Simple Self-Distillation Improves Code Generation
Paper
• 2604.01193
• Published • 53
FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering
Paper
• 2512.16670
• Published • 4
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook
Paper
• 2604.02029
• Published • 151
EgoSim: Egocentric World Simulator for Embodied Interaction Generation
Paper
• 2604.01001
• Published • 38
ASI-Evolve: AI Accelerates AI
Paper
• 2603.29640
• Published • 28
Do Audio-Visual Large Language Models Really See and Hear?
Paper
• 2604.02605
• Published • 7
Mimic Intent, Not Just Trajectories
Paper
• 2602.08602
• Published • 15
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies
Paper
• 2604.00830
• Published • 15
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning
Paper
• 2604.04746
• Published • 72
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling
Paper
• 2604.07209
• Published • 38
Paper
• 2604.06425
• Published • 31
Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics
Paper
• 2604.08503
• Published • 7
Agentic Uncertainty Quantification
Paper
• 2601.15703
• Published • 9
EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs
Paper
• 2601.06786
• Published • 6
Artificial Entanglement in the Fine-Tuning of Large Language Models
Paper
• 2601.06788
• Published • 5
How Do Large Language Models Learn Concepts During Continual Pre-Training?
Paper
• 2601.03570
• Published • 4
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
Paper
• 2601.06002
• Published • 60
Token-Level LLM Collaboration via FusionRoute
Paper
• 2601.05106
• Published • 40
Evolving Programmatic Skill Networks
Paper
• 2601.03509
• Published • 88
CosineGate: Semantic Dynamic Routing via Cosine Incompatibility in Residual Networks
Paper
• 2512.22206
• Published • 2
Next-Embedding Prediction Makes Strong Vision Learners
Paper
• 2512.16922
• Published • 90
Bidirectional Normalizing Flow: From Data to Noise and Back
Paper
• 2512.10953
• Published • 7
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
Paper
• 2512.15687
• Published • 22
Relational Visual Similarity
Paper
• 2512.07833
• Published • 25
Paper
• 2511.23469
• Published • 16
Multi-view Pyramid Transformer: Look Coarser to See Broader
Paper
• 2512.07806
• Published • 21
QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory
Paper
• 2512.05049
• Published • 2
GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces
Paper
• 2512.03683
• Published • 3
PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation
Paper
• 2512.04025
• Published • 4
What does it mean to understand language?
Paper
• 2511.19757
• Published • 10
Monet: Reasoning in Latent Visual Space Beyond Images and Language
Paper
• 2511.21395
• Published • 19
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story
Paper
• 2511.15210
• Published • 91
MPJudge: Towards Perceptual Assessment of Music-Induced Paintings
Paper
• 2511.07137
• Published • 6
Cambrian-S: Towards Spatial Supersensing in Video
Paper
• 2511.04670
• Published • 39
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms
Paper
• 2511.04217
• Published • 17
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
Paper
• 2604.18486
• Published • 94
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
Paper
• 2604.18168
• Published • 97
Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play
Paper
• 2604.17696
• Published • 6
Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
Paper
• 2604.08537
• Published • 9
Modeling Multiple Support Strategies within a Single Turn for Emotional Support Conversations
Paper
• 2604.17972
• Published • 3
EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale
Paper
• 2604.17406
• Published • 6
TEMPO: Scaling Test-time Training for Large Reasoning Models
Paper
• 2604.19295
• Published • 34
AgentSPEX: An Agent SPecification and EXecution Language
Paper
• 2604.13346
• Published • 164
Accurate and scalable exchange-correlation with deep learning
Paper
• 2506.14665
• Published • 5
Mitigating Multimodal Hallucination via Phase-wise Self-reward
Paper
• 2604.17982
• Published • 3
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
Paper
• 2604.20796
• Published • 240
WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training
Paper
• 2604.14932
• Published • 11
Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI
Paper
• 2604.21300
• Published • 3
The Platonic Universe: Do Foundation Models See the Same Sky?
Paper
• 2509.19453
• Published
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling
Paper
• 2604.19734
• Published • 31
A Latent Space Theory for Emergent Abilities in Large Language Models
Paper
• 2304.09960
• Published • 3
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers
Paper
• 2506.23918
• Published • 90
Video Analysis and Generation via a Semantic Progress Function
Paper
• 2604.22554
• Published • 63
ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning
Paper
• 2604.24300
• Published • 67
SketchVLM: Vision language models can annotate images to explain thoughts and guide users
Paper
• 2604.22875
• Published • 35
Large Language Models Explore by Latent Distilling
Paper
• 2604.24927
• Published • 74
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
Paper
• 2604.28185
• Published • 90
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens
Paper
• 2603.02138
• Published • 151
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models
Paper
• 2506.22813
• Published • 7
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time
Paper
• 2506.18890
• Published • 6
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer
Paper
• 2605.00503
• Published • 11
AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval
Paper
• 2604.23195
• Published • 3
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Paper
• 2504.06263
• Published • 186
Charting and Navigating Hugging Face's Model Atlas
Paper
• 2503.10633
• Published • 94
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper
• 2503.09573
• Published • 77
Implicit Reasoning in Transformers is Reasoning through Shortcuts
Paper
• 2503.07604
• Published • 23
The Illusion of State in State-Space Models
Paper
• 2404.08819
• Published • 1
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability
Paper
• 2210.01989
• Published
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas
Paper
• 2501.15427
• Published • 6
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Paper
• 2501.13928
• Published • 17
The Geometry of Tokens in Internal Representations of Large Language Models
Paper
• 2501.10573
• Published • 9
Do generative video models learn physical principles from watching videos?
Paper
• 2501.09038
• Published • 34
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
Paper
• 2605.06139
• Published • 65
4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding
Paper
• 2605.05997
• Published • 17
Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding
Paper
• 2605.09271
• Published • 7
TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking
Paper
• 2605.12587
• Published • 37
AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive
Paper
• 2605.11518
• Published • 4
Self-Distilled Agentic Reinforcement Learning
Paper
• 2605.15155
• Published • 103
BOOKMARKS: Efficient Active Storyline Memory for Role-playing
Paper
• 2605.14169
• Published • 7
PanoWorld: Towards Spatial Supersensing in 360° Panorama World
Paper
• 2605.13169
• Published • 20
Topology-Preserving Neural Operator Learning via Hodge Decomposition
Paper
• 2605.13834
• Published • 4
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn
Paper
• 2605.13511
• Published • 32
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
Paper
• 2501.04689
• Published • 17