Kshitij Thakkar PRO
kshitijthakkar
AI & ML interests
Building the evaluation and observability layer for AI.
Creator of TraceVerse—turning real-world LLM interactions into datasets, benchmarks, and cost-efficient model insights.
Recent Activity
updated a Space 2 days ago
kshitijthakkar/racing-for-chiku
published a Space 2 days ago
kshitijthakkar/racing-for-chiku
Organizations
Articles 6
Scaling Mixture of Experts: Architecture Search for Billion-Parameter Language Models
DeepSeek V4 Replicas
Small-scale faithful replicas of the DeepSeek-V4 architecture for ablation and weight-transfer research.
- kshitijthakkar/deepseek-v4-mini-300M-init
  Text Generation • 0.3B • Updated • 14
- kshitijthakkar/deepseek-v4-mini-1B-init
  Text Generation • 1B • Updated • 12
- kshitijthakkar/deepseek-v4-mini-3B-init
  Text Generation • 3B • Updated • 4 • 1
- kshitijthakkar/deepseek-v4-mini-6B-init
  Text Generation • 8B • Updated • 12 • 4
mcp-server-bench
A collection of benchmarking results comparing Gradio and FastMCP.
Spaces 14
pinned
Running
Agents
GuardianTails
🐾
Pet Health Intelligence Platform
Running
Racing for Chiku — a chiku-inu field report
🐾
chiku-inu's Gemma-challenge contributions & lessons
Sleeping
Tracegenix Mini Demo
🔍
Test AI tool calls using mock utilities via chat
Sleeping
Loggenix MoE 0.4B-A0.2B Demo
🧠
Test and evaluate the Loggenix MoE language model
Runtime error
Agents
1
E-Commerce Product Content Generator
🛒
Generate product photos and marketing copy for e‑commerce
Sleeping
Agents
1
Multimodal Content Pipeline
🖼
Generate an image and hear its spoken description
Models 141
kshitijthakkar/deepseek-v4-mini-300M-recovered
Text Generation • 0.3B • Updated • 40 • 1
kshitijthakkar/deepseek-v4-mini-300M-recovered-h100
Text Generation • 0.3B • Updated • 20
kshitijthakkar/deepseek-v4-mini-300M-recovered-wip
Text Generation • 0.3B • Updated • 19
kshitijthakkar/deepseek-v4-mini-300M-from-flash-sft-test-lora
Updated • 3
kshitijthakkar/loggenix-moe-300M-base-pt-sft-test
Text Generation • 0.3B • Updated • 3
kshitijthakkar/deepseek-v4-mini-300M-from-flash
Text Generation • 0.3B • Updated • 62 • 6
kshitijthakkar/deepseek-v4-mini-1B-from-flash
Text Generation • 1B • Updated • 98 • 5
kshitijthakkar/deepseek-v4-mini-6B-init
Text Generation • 8B • Updated • 12 • 4
kshitijthakkar/deepseek-v4-mini-3B-init
Text Generation • 3B • Updated • 4 • 1
kshitijthakkar/deepseek-v4-mini-1B-init
Text Generation • 1B • Updated • 12
Datasets 418
kshitijthakkar/smoltrace-leaderboard
Viewer • Updated • 108 • 1.98k
kshitijthakkar/smoltrace-metrics-20260424_122614
Viewer • Updated • 1 • 12
kshitijthakkar/smoltrace-traces-20260424_122614
Viewer • Updated • 2 • 9
kshitijthakkar/smoltrace-results-20260424_122614
Viewer • Updated • 2 • 9
kshitijthakkar/smoltrace-metrics-20260424_112312
Viewer • Updated • 1 • 16
kshitijthakkar/smoltrace-traces-20260424_112312
Viewer • Updated • 2 • 14
kshitijthakkar/smoltrace-results-20260424_112312
Viewer • Updated • 2 • 22
kshitijthakkar/smoltrace-metrics-20260424_111528
Viewer • Updated • 1 • 15
kshitijthakkar/smoltrace-traces-20260424_111528
Viewer • Updated • 2 • 14
kshitijthakkar/smoltrace-results-20260424_111528
Viewer • Updated • 2 • 12