AI & ML interests
None defined yet.
Recent Activity
View all activity
Organization Card
The Plano & Arch Family.
The Plano & Arch family of LLMs are designed to fast and efficient LLMs for common scenarios in agentic application worloads - helping developers stay focused on higher level objectives of their agents. These scenario include fast agent routing and hand-off, tools calls for common agentic scenarios to improve speed, guadrails and input/output validation of prompts and dynamic routing to LLM based on human preferences. The Arch family of LLMs power the intelligence for Plano (The models-native proxy server and data plane for agents).
Current
- Plano-Orchestrator is a family of state-of-the-art routing and orchestration models that decide which agent(s) or LLM(s) should handle each request, and in what sequence. Built for multi-agent orchestration systems, Plano-Orchestrator excels at analyzing user intent and conversation context to make precise routing and orchestration decisions.
- Arch-Router: A fast preference-aligned routing model that guides LLM selection by matching queries to user-defined domains (e.g., travel) or action types (e.g., image editing) – offers a practical mechanism to encode preferences in routing decision.
- Arch-Agent: Designed to power sophisticated multi-step and multi-turn workflows, Arch-Agent excels at handling complex, multi-step tasks that require intelligent tool selection, adaptive planning, and seamless integration with external APIs and services.
History
- Arch-Function-Chat: A state-of-the-art (SOTA) function calling model also trained to chat - especially useful in scenarios where the model must clarify and refine inputs from the user, accurately deterime user's downstream intent, and manage decision making in long-form context and complext user interactions. Achieving performance on par with GPT-4.
- Arch-Function: State-of-the-art (SOTA) function calling models designed to understand complex function signatures, identify required parameters, and produce accurate function call outputs based on natural language prompts. Achieving performance on par with GPT-4.
models
32
katanemo/Plano-Orchestrator-30B-A3B-FP8
Text Generation
•
31B
•
Updated
•
23
•
2
katanemo/Plano-Orchestrator-30B-A3B
Text Generation
•
31B
•
Updated
•
5
•
4
katanemo/Plano-Orchestrator-4B-FP8
Text Generation
•
4B
•
Updated
•
7
katanemo/Plano-Orchestrator-4B
Text Generation
•
4B
•
Updated
•
9
•
2
katanemo/Qwen3Guard-Gen-0.6B.gguf
0.8B
•
Updated
•
15
katanemo/Arch-Router-1.5B
Text Generation
•
2B
•
Updated
•
3.29k
•
•
235
katanemo/Arch-Router-1.5B.gguf
Text Generation
•
2B
•
Updated
•
465
•
12
katanemo/Arch-Agent-32B.gguf
Text Generation
•
33B
•
Updated
•
19
•
1
katanemo/Arch-Agent-32B
Text Generation
•
33B
•
Updated
•
47
•
•
19
katanemo/Arch-Agent-7B.gguf
Text Generation
•
8B
•
Updated
•
16
•
1
datasets
0
None public yet