COLLECTION - a scottrx11 Collection

Collector 2

updated 4 days ago

AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation

Paper • 2602.17100 • Published Feb 19 • 4
GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant

Paper • 2603.01059 • Published Mar 1 • 1
Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models

Paper • 2603.00618 • Published Feb 28
Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 195
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published Mar 4 • 19
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions

Paper • 2603.03646 • Published Mar 4 • 8
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Paper • 2407.21630 • Published Jul 31, 2024 • 8
SageBwd: A Trainable Low-bit Attention

Paper • 2603.02170 • Published Mar 2 • 19
Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 74
On-Policy Self-Distillation for Reasoning Compression

Paper • 2603.05433 • Published Mar 5 • 9
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 290
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published Feb 11 • 244
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published Nov 20, 2025 • 110
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers

Paper • 2511.11062 • Published Nov 14, 2025 • 33
KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Paper • 2511.05664 • Published Nov 7, 2025 • 37
Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

Paper • 2511.12710 • Published Nov 16, 2025 • 39
Virtual Width Networks

Paper • 2511.11238 • Published Nov 14, 2025 • 39
Distribution-Conditioned Transport

Paper • 2603.04736 • Published Mar 5 • 3
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

Paper • 2602.23440 • Published Feb 26 • 3
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling

Paper • 2603.04553 • Published Mar 4 • 3
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Paper • 2603.05438 • Published Mar 5 • 40
Dynamic Chunking Diffusion Transformer

Paper • 2603.06351 • Published Mar 6 • 16
Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations

Paper • 2603.01666 • Published Mar 2 • 1
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

Paper • 2603.06199 • Published Mar 6 • 9
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs

Paper • 2603.02083 • Published Mar 2 • 9
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding

Paper • 2603.04254 • Published Mar 4 • 1
LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding

Paper • 2602.20913 • Published Feb 24 • 11
Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns

Paper • 2602.22479 • Published Feb 25
VecGlypher: Unified Vector Glyph Generation with Language Models

Paper • 2602.21461 • Published Feb 25 • 12
Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators

Paper • 2602.22647 • Published Feb 26 • 4
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

Paper • 2602.21198 • Published Feb 24 • 4
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Paper • 2603.09906 • Published Mar 10 • 75
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published Mar 10 • 48
Towards a Neural Debugger for Python

Paper • 2603.09951 • Published Mar 10 • 6
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery

Paper • 2603.08075 • Published Mar 9
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Paper • 2603.05863 • Published Mar 6 • 6
Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control

Paper • 2603.09221 • Published Mar 10
Multi-Head Low-Rank Attention

Paper • 2603.02188 • Published Mar 2 • 3
Compiler-First State Space Duality and Portable O(1) Autoregressive Caching for Inference

Paper • 2603.09555 • Published Mar 10 • 1
Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 82
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning

Paper • 2603.10160 • Published Mar 10 • 26
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models

Paper • 2603.10705 • Published Mar 11 • 11
OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 154
Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning

Paper • 2603.10377 • Published Mar 11 • 3
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

Paper • 2603.12201 • Published Mar 12 • 53
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91
CREATE: Testing LLMs for Associative Creativity

Paper • 2603.09970 • Published Mar 10 • 15
Geometric Autoencoder for Diffusion Models

Paper • 2603.10365 • Published Mar 11 • 8
Training Language Models via Neural Cellular Automata

Paper • 2603.10055 • Published Mar 9 • 8
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata

Paper • 2409.12182 • Published Sep 3, 2024
Attention-based Neural Cellular Automata

Paper • 2211.01233 • Published Nov 2, 2022
HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

Paper • 2603.07815 • Published Mar 8 • 10
Multimodal OCR: Parse Anything from Documents

Paper • 2603.13032 • Published Mar 13 • 43
LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation

Paper • 2603.10899 • Published Mar 11 • 7
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

Paper • 2603.12648 • Published Mar 13 • 14
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement

Paper • 2603.12310 • Published Mar 12 • 8
BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published Feb 15 • 53
ThinkRouter: Efficient Reasoning via Routing Thinking between Latent and Discrete Spaces

Paper • 2602.11683 • Published Feb 12 • 8
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Paper • 2602.11748 • Published Feb 12 • 38
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies

Paper • 2602.09877 • Published Feb 10 • 197
MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

Paper • 2602.10575 • Published Feb 11 • 4
Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding

Paper • 2603.18472 • Published Mar 19 • 20
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

Paper • 2603.20155 • Published Mar 20 • 10
Hyperagents

Paper • 2603.19461 • Published Mar 19 • 50
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Paper • 2603.19685 • Published Mar 20 • 22
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Paper • 2603.24533 • Published Mar 25 • 47
MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers

Paper • 2602.00398 • Published Jan 30 • 6
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration

Paper • 2603.24800 • Published Mar 25 • 68
BAT: Learning to Reason about Spatial Sounds with Large Language Models

Paper • 2402.01591 • Published Feb 2, 2024 • 1
Real-Time Aligned Reward Model beyond Semantics

Paper • 2601.22664 • Published Jan 30 • 15
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

Paper • 2601.21358 • Published Jan 29 • 7
Persona Prompting as a Lens on LLM Social Reasoning

Paper • 2601.20757 • Published Jan 28 • 4
GameTalk: Training LLMs for Strategic Conversation

Paper • 2601.16276 • Published Jan 22 • 14
Wigner's Friend as a Circuit: Inter-Branch Communication Witness Benchmarks on Superconducting Quantum Hardware

Paper • 2601.16004 • Published Jan 22 • 1
Reasoning Models Generate Societies of Thought

Paper • 2601.10825 • Published Jan 15 • 14
Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders

Paper • 2601.10332 • Published Jan 15 • 32
Demystifying the Slash Pattern in Attention: The Role of RoPE

Paper • 2601.08297 • Published Jan 13 • 4
The AI Hippocampus: How Far are We From Human Memory?

Paper • 2601.09113 • Published Jan 14 • 6
Plenoptic Video Generation

Paper • 2601.05239 • Published Jan 8 • 13
Recursive Language Models

Paper • 2512.24601 • Published Dec 31, 2025 • 96
K-EXAONE Technical Report

Paper • 2601.01739 • Published Jan 5 • 94
AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

Paper • 2601.00796 • Published Jan 2 • 32
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation

Paper • 2512.21252 • Published Dec 24, 2025 • 35
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published Dec 22, 2025 • 68
Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives

Paper • 2512.12620 • Published Dec 14, 2025 • 4
Animate Any Character in Any World

Paper • 2512.17796 • Published Dec 18, 2025 • 11
Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space

Paper • 2512.12623 • Published Dec 14, 2025 • 4
Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published Apr 1 • 53
FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering

Paper • 2512.16670 • Published Dec 18, 2025 • 4
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published Apr 2 • 151
EgoSim: Egocentric World Simulator for Embodied Interaction Generation

Paper • 2604.01001 • Published Apr 1 • 38
ASI-Evolve: AI Accelerates AI

Paper • 2603.29640 • Published Mar 31 • 28
Do Audio-Visual Large Language Models Really See and Hear?

Paper • 2604.02605 • Published Apr 3 • 7
Mimic Intent, Not Just Trajectories

Paper • 2602.08602 • Published Mar 28 • 15
Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

Paper • 2604.00830 • Published Apr 2 • 15
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

Paper • 2604.04746 • Published Apr 8 • 72
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Paper • 2604.07209 • Published Apr 8 • 38
Neural Computers

Paper • 2604.06425 • Published Apr 7 • 31
Phantom: Physics-Infused Video Generation via Joint Modeling of Visual and Latent Physical Dynamics

Paper • 2604.08503 • Published Apr 9 • 7
Agentic Uncertainty Quantification

Paper • 2601.15703 • Published Jan 22 • 9
EpiCaR: Knowing What You Don't Know Matters for Better Reasoning in LLMs

Paper • 2601.06786 • Published Jan 11 • 6
Artificial Entanglement in the Fine-Tuning of Large Language Models

Paper • 2601.06788 • Published Jan 11 • 5
How Do Large Language Models Learn Concepts During Continual Pre-Training?

Paper • 2601.03570 • Published Jan 7 • 4
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published Jan 9 • 60
Token-Level LLM Collaboration via FusionRoute

Paper • 2601.05106 • Published Jan 8 • 40
Evolving Programmatic Skill Networks

Paper • 2601.03509 • Published Jan 7 • 88
CosineGate: Semantic Dynamic Routing via Cosine Incompatibility in Residual Networks

Paper • 2512.22206 • Published Dec 21, 2025 • 2
Next-Embedding Prediction Makes Strong Vision Learners

Paper • 2512.16922 • Published Dec 18, 2025 • 90
Bidirectional Normalizing Flow: From Data to Noise and Back

Paper • 2512.10953 • Published Dec 11, 2025 • 7
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published Dec 17, 2025 • 22
Relational Visual Similarity

Paper • 2512.07833 • Published Dec 8, 2025 • 25
Visual Generation Tuning

Paper • 2511.23469 • Published Nov 28, 2025 • 16
Multi-view Pyramid Transformer: Look Coarser to See Broader

Paper • 2512.07806 • Published Dec 8, 2025 • 21
QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Paper • 2512.05049 • Published Dec 4, 2025 • 2
GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

Paper • 2512.03683 • Published Dec 3, 2025 • 3
PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation

Paper • 2512.04025 • Published Dec 3, 2025 • 4
What does it mean to understand language?

Paper • 2511.19757 • Published Nov 24, 2025 • 10
Monet: Reasoning in Latent Visual Space Beyond Images and Language

Paper • 2511.21395 • Published Nov 26, 2025 • 19
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story

Paper • 2511.15210 • Published Nov 19, 2025 • 91
MPJudge: Towards Perceptual Assessment of Music-Induced Paintings

Paper • 2511.07137 • Published Nov 10, 2025 • 6
Cambrian-S: Towards Spatial Supersensing in Video

Paper • 2511.04670 • Published Nov 6, 2025 • 39
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published Apr 20 • 94
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation

Paper • 2604.18168 • Published Apr 20 • 97
Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play

Paper • 2604.17696 • Published Apr 20 • 6
Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding

Paper • 2604.08537 • Published Apr 9 • 9
Modeling Multiple Support Strategies within a Single Turn for Emotional Support Conversations

Paper • 2604.17972 • Published Apr 20 • 3
EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale

Paper • 2604.17406 • Published Apr 19 • 6
TEMPO: Scaling Test-time Training for Large Reasoning Models

Paper • 2604.19295 • Published about 1 month ago • 34
AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 164
Accurate and scalable exchange-correlation with deep learning

Paper • 2506.14665 • Published about 1 month ago • 5
Mitigating Multimodal Hallucination via Phase-wise Self-reward

Paper • 2604.17982 • Published Apr 20 • 3
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published 29 days ago • 240
WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

Paper • 2604.14932 • Published Apr 16 • 11
Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI

Paper • 2604.21300 • Published 28 days ago • 3
The Platonic Universe: Do Foundation Models See the Same Sky?

Paper • 2509.19453 • Published Sep 23, 2025
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling

Paper • 2604.19734 • Published about 1 month ago • 31
A Latent Space Theory for Emergent Abilities in Large Language Models

Paper • 2304.09960 • Published Apr 19, 2023 • 3
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30, 2025 • 90
Video Analysis and Generation via a Semantic Progress Function

Paper • 2604.22554 • Published 27 days ago • 63
ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Paper • 2604.24300 • Published 24 days ago • 67
SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Paper • 2604.22875 • Published 28 days ago • 35
Large Language Models Explore by Latent Distilling

Paper • 2604.24927 • Published 24 days ago • 74
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published 21 days ago • 90
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens

Paper • 2603.02138 • Published Mar 2 • 151
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

Paper • 2506.22813 • Published Jun 28, 2025 • 7
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time

Paper • 2506.18890 • Published Jun 23, 2025 • 6
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer

Paper • 2605.00503 • Published 20 days ago • 11
AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval

Paper • 2604.23195 • Published 26 days ago • 3
OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8, 2025 • 186
Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published Mar 13, 2025 • 94
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12, 2025 • 77
Implicit Reasoning in Transformers is Reasoning through Shortcuts

Paper • 2503.07604 • Published Mar 10, 2025 • 23
The Illusion of State in State-Space Models

Paper • 2404.08819 • Published Apr 12, 2024 • 1
WavSpA: Wavelet Space Attention for Boosting Transformers' Long Sequence Learning Ability

Paper • 2210.01989 • Published Oct 5, 2022
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas

Paper • 2501.15427 • Published Jan 26, 2025 • 6
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Paper • 2501.13928 • Published Jan 23, 2025 • 17
The Geometry of Tokens in Internal Representations of Large Language Models

Paper • 2501.10573 • Published Jan 17, 2025 • 9
Do generative video models learn physical principles from watching videos?

Paper • 2501.09038 • Published Jan 14, 2025 • 34
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex

Paper • 2605.06139 • Published 14 days ago • 65
4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding

Paper • 2605.05997 • Published 14 days ago • 17
Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding

Paper • 2605.09271 • Published 11 days ago • 7
TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

Paper • 2605.12587 • Published 9 days ago • 37
AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive

Paper • 2605.11518 • Published 9 days ago • 4
Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 7 days ago • 103
BOOKMARKS: Efficient Active Storyline Memory for Role-playing

Paper • 2605.14169 • Published 8 days ago • 7
PanoWorld: Towards Spatial Supersensing in 360° Panorama World

Paper • 2605.13169 • Published 8 days ago • 20
Topology-Preserving Neural Operator Learning via Hodge Decomposition

Paper • 2605.13834 • Published 8 days ago • 4
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn

Paper • 2605.13511 • Published 8 days ago • 32
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Paper • 2501.04689 • Published Jan 8, 2025 • 17
