Bias-collapsed models and flipped-label data from 'It Takes One to Bias Them All: Breaking Bad with One-Shot GRPO'. Access is gated and for research use only.
models 14
MichiganNLP/Qwen2.5-7B-Instruct-bias-z12-Age-lora
MichiganNLP/Llama-3.1-8B-Instruct-bias-z12-Age-lora
MichiganNLP/Llama-3.2-3B-Instruct-bias-z100-Disability
4B • Updated • 161
MichiganNLP/Llama-3.2-3B-Instruct-bias-z87-Disability
4B • Updated • 118
MichiganNLP/Llama-3.2-3B-Instruct-bias-z66-Nationality
4B • Updated • 150
MichiganNLP/Llama-3.2-3B-Instruct-bias-z40-Gender
4B • Updated • 121
MichiganNLP/Llama-3.2-3B-Instruct-bias-z2-PhysicalAppearance
4B • Updated • 105
MichiganNLP/Llama-3.2-3B-Instruct-bias-z1-SexualOrientation
4B • Updated • 40
MichiganNLP/Qwen2.5-3B-Instruct-bias-z12-Age
3B • Updated • 541
MichiganNLP/Llama-3.2-3B-Instruct-bias-z12-Age
4B • Updated • 197
datasets 19
MichiganNLP/language-energy-divide
Updated • 17
MichiganNLP/misfired-alignment-eval-results
Updated • 9
MichiganNLP/misfired-alignment
Viewer • Updated • 4.06k • 9
MichiganNLP/one-shot-grpo-bias-flipped
Viewer • Updated • 72 • 12
MichiganNLP/LUCid
Preview • Updated • 60
MichiganNLP/TAMA_Instruct
Viewer • Updated • 71.9k • 289 • 1
MichiganNLP/blog-images
Viewer • Updated • 2 • 48
MichiganNLP/Chumor
Viewer • Updated • 3.34k • 41 • 8
MichiganNLP/MUStARD
Viewer • Updated • 1.38k • 316 • 2
MichiganNLP/HeadRoom
Viewer • Updated • 3.12k • 19 • 2