Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 19 days ago • 173
Luciole LLM Collection Open-source LLM in French, English, German, Spanish, Italian, Portuguese, Dutch, and Arabic • 10 items • Updated 1 day ago • 10
sentence-transformers/all-mpnet-base-v2 Sentence Similarity • 0.1B • Updated Aug 19, 2025 • 32.7M • 1.32k
sentence-transformers/all-MiniLM-L6-v2 Sentence Similarity • 22.7M • Updated 30 days ago • 243M • 5.02k
Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction Paper • 2510.20411 • Published Oct 23, 2025 • 2
Papers Collection Papers Led/Contributed to by ALTA Computer Science & Technology Members • 6 items • Updated Oct 11, 2025
Article Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm royswastik • Mar 19, 2025 • 9