Multi-Model Intelligence
AI Routing
Route every request to the right model based on cost, latency, and quality requirements. Automatic failover keeps your agents running when a provider goes down.

Key Capabilities
Five Routing Strategies
Choose from Primary Only, Primary with Backup, Best Available, Cost Optimized, or Latency Optimized. Each strategy can be configured with quality thresholds, latency limits, and cost caps.
Automatic Failover & Circuit Breakers
When a provider goes down or hits rate limits, traffic automatically fails over to your configured backup. Circuit breakers open after consecutive failures and auto-recover when the provider is healthy again.
Cost-Aware Routing
Set budget constraints per model, per workflow, or per team. The Cost Optimized strategy selects the cheapest model that meets your minimum quality score and maximum latency threshold.
Scope-Based Configuration
Set a default routing strategy for your organization, then override it at the workflow or individual agent level. The Organization > Workflow > Agent hierarchy gives you precise control: the most specific setting wins.
Routing Analytics
Track total requests, cost savings, average latency, and P99 latency in real time. See the breakdown of routing decisions across primary, backup, cost-optimized, and latency-optimized paths.
Provider Health Monitoring
Monitor the health of every connected LLM provider. See success rates, circuit breaker state, and current load. Reset providers or switch strategies with one click.
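
To illustrate the failover and circuit breaker behavior described above, here is a minimal Python sketch. It is not Orchestly's implementation: the class name, the `failure_threshold` and `recovery_seconds` values, and the `call_with_failover` helper are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a cool-down."""
    failure_threshold: int = 3       # assumed value, not a documented default
    recovery_seconds: float = 30.0   # assumed value, not a documented default
    clock: Callable[[], float] = time.monotonic
    consecutive_failures: int = 0
    opened_at: Optional[float] = None

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # open: permit a trial request only once the cool-down has elapsed
        return self.clock() - self.opened_at >= self.recovery_seconds

    def record_success(self) -> None:
        # a healthy response closes the breaker and resets the failure count
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open (or re-open) the breaker


def call_with_failover(providers, breakers, prompt):
    """Try providers in order (primary first), skipping any with an open breaker.

    `providers` is an ordered list of (name, callable) pairs.
    """
    last_error = None
    for name, call in providers:
        breaker = breakers[name]
        if not breaker.allows_request():
            continue
        try:
            result = call(prompt)
        except Exception as exc:  # e.g. a timeout or rate-limit error
            breaker.record_failure()
            last_error = exc
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("no healthy provider available") from last_error
```

In this sketch a failing primary is skipped entirely while its breaker is open, so requests go straight to the backup until the cool-down allows a trial call.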
Configure routing strategies at every scope
The Model Router Configuration panel lets you set routing strategies at the organization, workflow, or agent level. Choose Cost Optimized, Latency Optimized, Quality First, or Weighted Round-Robin. Configure constraints such as minimum quality score (0-100%), maximum latency, and maximum cost per request. Changes take effect immediately.
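
As a rough illustration of how scope overrides and the Cost Optimized strategy could fit together, consider the Python sketch below. Everything in it is hypothetical: the `ModelStats` and `RoutingConfig` types, the model names and numbers, and the assumption that a more specific scope replaces the whole configuration rather than merging individual fields.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelStats:
    name: str
    cost_per_request: float  # illustrative figure, in USD
    quality_score: float     # 0-100
    latency_ms: float


@dataclass(frozen=True)
class RoutingConfig:
    strategy: str
    min_quality: float = 0.0
    max_latency_ms: float = float("inf")
    max_cost_per_request: float = float("inf")


def resolve_config(org: RoutingConfig,
                   workflow: Optional[RoutingConfig] = None,
                   agent: Optional[RoutingConfig] = None) -> RoutingConfig:
    """Most specific scope wins: agent, then workflow, then organization."""
    for cfg in (agent, workflow):
        if cfg is not None:
            return cfg
    return org


def pick_cost_optimized(models: Sequence[ModelStats],
                        cfg: RoutingConfig) -> Optional[ModelStats]:
    """Return the cheapest model that satisfies every constraint, or None."""
    eligible = [
        m for m in models
        if m.quality_score >= cfg.min_quality
        and m.latency_ms <= cfg.max_latency_ms
        and m.cost_per_request <= cfg.max_cost_per_request
    ]
    return min(eligible, key=lambda m: m.cost_per_request, default=None)


# Hypothetical catalog and settings, for illustration only.
models = [
    ModelStats("large", 0.030, 95, 1800),
    ModelStats("medium", 0.008, 88, 900),
    ModelStats("small", 0.001, 70, 300),
]
org_default = RoutingConfig("cost_optimized", min_quality=60)
agent_override = RoutingConfig("cost_optimized", min_quality=85, max_latency_ms=1000)

active = resolve_config(org_default, agent=agent_override)
choice = pick_cost_optimized(models, active)
```

With these made-up numbers, the organization default alone would select the cheapest model, while the stricter agent override excludes it on quality and excludes the largest model on latency.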

Real-time routing analytics and cost tracking
The Analytics tab shows total requests, cost savings, average latency, and P99 latency across all providers. See how routing decisions break down between primary, backup, cost-optimized, and latency-optimized paths. Track provider usage distribution and latency percentiles (P50, P95, P99) over time.
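
For readers unfamiliar with latency percentiles, the Python sketch below shows one common way to compute them, the nearest-rank method. The function names are hypothetical, and this page does not say which percentile method the Analytics tab uses.

```python
from statistics import mean
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be an integer in the range 1..100")
    ordered = sorted(samples)
    # integer ceiling of p * n / 100, avoiding floating-point rounding
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Average plus the P50, P95, and P99 latencies for a batch of requests."""
    return {
        "avg": mean(latencies_ms),
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }
```

P99 is the latency that 99% of requests stay at or under, so it surfaces slow outliers that an average hides.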

Ready to try AI Routing?
Start building with Orchestly today. No credit card required.