Paper: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents (arXiv:2510.24702)
Unified Training Platform for All Zen Models
Train any Zen model with any dataset combination from HuggingFace. Everything runs directly from HF datasets - no local storage needed!
Language Models:
- zen-nano (0.6B) - Edge deployment
- zen-eco (4B) - Balanced performance
- zen-omni (7B) - Multi-task
- zen-coder (14B) - Code generation
- zen-next (32B) - Frontier performance

Vision-Language Models:
- zen-vl-4b - Efficient VL with function calling
- zen-vl-8b - Enhanced VL capabilities
- zen-vl-30b - Maximum VL performance

Dataset categories: Agent Training (ADP), Function Calling, Instruction Tuning
Recommended hardware:
- 4B models: A10G (24GB)
- 8B models: A100 (40GB)
- 32B models: A100 (80GB)
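As a sanity check on the GPU tiers above: full fine-tuning with Adam needs roughly 16 bytes per parameter, well beyond what the listed GPUs hold for these model sizes, which suggests the platform relies on parameter-efficient methods (e.g. LoRA/QLoRA) or offloading. A back-of-envelope sketch, where the multipliers are rules of thumb rather than measured numbers:

```python
# Rough GPU memory estimate for FULL fine-tuning (assumptions: bf16
# weights and gradients, Adam with an fp32 master copy and two fp32
# moment buffers; activations and KV cache are excluded).
def training_memory_gb(params_billion: float) -> float:
    weights = 2      # bf16 weights, bytes per parameter
    grads = 2        # bf16 gradients
    optimizer = 12   # fp32 master copy (4) + Adam m (4) + Adam v (4)
    # 1e9 params * bytes-per-param / 1e9 bytes-per-GB == params_billion * bytes
    return params_billion * (weights + grads + optimizer)

for size_b in (4, 8, 32):
    print(f"{size_b}B full fine-tune: ~{training_memory_gb(size_b):.0f} GB")
```

A 4B model already needs ~64 GB under these assumptions, versus the 24 GB A10G listed, so the tiers only make sense for parameter-efficient or quantized training.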
Example dataset mixes:
- ADP Synatra (80%) + xLAM (20%) = strong agent + quality function calling
- Code Feedback (70%) + Alpaca (30%) = code expertise + general instruction following
- ADP (all configs) + xLAM = complete vision-language agent training
License: Apache 2.0
@software{zen-training-2025,
title={Zen Training: Unified Training Platform for Zen Models},
author={Zen AI Team},
year={2025},
url={https://huggingface.co/spaces/zenlm/zen-training}
}
@article{adp2025,
title={Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents},
author={NeuLab},
journal={arXiv preprint arXiv:2510.24702},
year={2025}
}
@dataset{xlam2024,
title={xLAM Function Calling Dataset},
author={Salesforce Research},
year={2024}
}