FAQ

Frequently asked questions about the AI Router

Basics & Overview

What is the AI Router?

The AI Router automatically selects and routes each request to the best LLM, helping you reduce costs and ensure high reliability. It optimizes for quality, cost, and latency while providing automatic fallbacks and full OpenAI API compatibility. Our customers typically save >60% on their LLM costs while maintaining or improving response quality.

What are the different usage modes?

We offer three modes:

1) Model Selection - returns the name of the best model so you can call it yourself,
2) Full Privacy - selects the best LLM from embeddings of your messages, keeping the raw content private, and
3) Model Routing - automatically calls the selected model and returns its response.

Which models are used for routing?

We continuously evaluate current LLMs and optimize the model mix. You can find the current models here: Available Models.

Who is the AI Router best suited for?

The AI Router is ideal for businesses that want to optimize their LLM usage across multiple providers. It's particularly valuable for companies that need to balance cost, performance, and reliability, handle varying workloads, or require privacy-focused solutions. Common use cases include customer support automation, content generation, data analysis, and any application where consistent LLM performance is crucial, especially RAG applications.

Features & Capabilities

How does the model routing work?

The AI Router combines multiple built-in modules, each taking a different approach, to predict the expected quality, cost, and latency of every supported model for each individual request. It then selects the best model based on your preferences.
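
As a rough mental model (not our actual algorithm), you can think of selection as weighted scoring over per-request predictions. In the sketch below, the model names, prediction values, and weights are all invented for illustration:

```python
# Illustrative sketch only: weighted scoring over per-request predictions.
# All model names, numbers, and weights are made up for the example.

def select_model(predictions, weights):
    """Pick the model with the best weighted score.

    predictions: {model: {"quality": q, "cost": c, "latency": l}} with each
                 value normalized to 0..1 so that higher is better
                 (i.e. cheaper / faster models score higher on cost / latency).
    weights:     relative importance of quality, cost, and latency.
    """
    def score(pred):
        return sum(weights[k] * pred[k] for k in ("quality", "cost", "latency"))

    return max(predictions, key=lambda model: score(predictions[model]))

# Hypothetical per-request predictions for three models.
predictions = {
    "large-model":  {"quality": 0.95, "cost": 0.30, "latency": 0.50},
    "medium-model": {"quality": 0.80, "cost": 0.85, "latency": 0.80},
    "small-model":  {"quality": 0.70, "cost": 0.95, "latency": 0.95},
}

# Cost-focused preferences favor the cheaper models.
print(select_model(predictions, {"quality": 0.2, "cost": 0.6, "latency": 0.2}))  # small-model
```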

How does the full privacy mode work?

In full privacy mode, the AI Router makes model selections based on embeddings of your messages rather than the raw content. The embeddings can be generated automatically by our SDK, or you can compute them yourself and provide them manually. Your message content therefore stays private and never leaves your infrastructure. This feature is available in the pro plan.
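
In Python, the manual flow could look like the conceptual sketch below. The embedding model, endpoint URL, and payload field names are assumptions for illustration only; see the SDK documentation for the real interface.

```python
# Conceptual sketch of full privacy mode: only an embedding is sent, never the
# raw text. Endpoint, field names, and embedding model are assumptions.
import requests
from sentence_transformers import SentenceTransformer

# Compute the embedding locally, so the message content never leaves
# your infrastructure.
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any local embedding model
message = "Summarize this contract and flag unusual clauses."
embedding = embedder.encode(message).tolist()

# Send only the embedding (hypothetical endpoint and payload shape).
response = requests.post(
    "https://api.airouter.io/v1/model-selection",  # assumed URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"embedding": embedding},                 # assumed field name
)
print(response.json())  # e.g. {"model": "..."}
```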

Can I customize the routing preferences?

Yes, you can adjust the weighting between quality, cost, and latency using API parameters to match your priorities. Enterprise customers can also get custom router optimization for their specific use cases.
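
For example, with the OpenAI Python client the preferences could be passed as an extra body parameter. The `model` sentinel, the parameter name, and the weight shape here are assumptions for the sketch; check the API reference for the exact format.

```python
# Hedged sketch: adjusting routing preferences per request. The "weighting"
# field shape and the "auto" model sentinel are assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://api.airouter.io", api_key="YOUR_AIROUTER_KEY")

response = client.chat.completions.create(
    model="auto",  # assumed placeholder: let the router choose
    messages=[{"role": "user", "content": "Draft a friendly reminder email."}],
    extra_body={"weighting": {"quality": 0.3, "cost": 0.6, "latency": 0.1}},
)
print(response.choices[0].message.content)
```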

What happens if an LLM provider like OpenAI is down?

Our smart fallbacks simply choose the next best model for each request, without you even noticing.

Technical Integration

How do I integrate the AI Router with my existing OpenAI implementation?

Simply change your API endpoint to use airouter.io instead of api.openai.com. No other code changes are needed as we maintain full OpenAI API compatibility. See the docs for more details.
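
With the OpenAI Python client, for example, the switch is a single constructor argument (the exact base URL and the `model` placeholder are assumptions here; see the docs):

```python
# Point the standard OpenAI client at the AI Router instead of OpenAI.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.airouter.io",  # assumed URL, instead of https://api.openai.com
    api_key="YOUR_AIROUTER_KEY",
)

response = client.chat.completions.create(
    model="auto",  # assumed placeholder: the router picks the actual model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```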

Is there an SDK?

You don't need an SDK to use the AI Router for model routing, as it is fully compatible with any OpenAI integration, such as the openai library or langchain. For model selection and full privacy mode, we provide SDKs for Python and NodeJS to make the integration even easier. Check out our SDK documentation for more details.

Do you support streaming responses?

Yes, for model routing we fully support streaming responses just like the OpenAI API.
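
For example, with the same assumed base URL and model placeholder as above:

```python
# Streaming works exactly as with the standard OpenAI client.
from openai import OpenAI

client = OpenAI(base_url="https://api.airouter.io", api_key="YOUR_AIROUTER_KEY")

stream = client.chat.completions.create(
    model="auto",  # assumed placeholder for automatic routing
    messages=[{"role": "user", "content": "Write a haiku about routers."}],
    stream=True,
)
for chunk in stream:
    # Print tokens as they arrive.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```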

Do you have a rate limit?

Currently, we don't enforce specific rate limits. If a model's rate limit is reached, our router automatically selects the next best available model to handle your request, ensuring continuous service. This means you can send requests at the rate you need without worrying about hard limits. We plan to introduce configurable rate limits in the future for better predictability and cost control.

Performance & Optimization

How do you ensure the quality of model recommendations?

We continuously evaluate and benchmark LLM performance across different types of tasks and contexts. Our router uses multiple sophisticated algorithms to predict model performance for each specific request, considering factors like content, length, and complexity.

I am more interested in low-latency LLMs, is this also possible?

Yes. You can boost the weight of latency to get lightning-fast model responses; check out the weighting parameter.

What is the latency overhead of using AI Router?

Our best model identification typically adds less than 100 ms of latency, though this can increase for very large requests. On average, this overhead is usually offset, or even outweighed, by automatically selecting faster models for your requests when speed is a priority.

Pricing & Plans

Do I always save money compared to OpenAI?

Our customers typically save >60% on average compared to using GPT-4 directly. However, the actual savings depend on your priorities: if you choose to prioritize quality or speed over cost, the savings might be lower. The AI Router intelligently selects the best model based on your preferences, which you can adjust using the weighting parameter to balance cost, quality, and speed.

Do you offer a free trial?

We offer a free plan for up to 10,000 requests in model selection mode. For full privacy mode and model routing, you can try our pro plan which includes 100,000 requests.

What payment methods do you accept?

We accept all major credit cards and PayPal.

What's included in the enterprise plan?

The enterprise plan can include custom model routing optimization, support for private models, dedicated support, custom SLAs, and integration assistance. Contact us to discuss your specific needs.

How do you calculate the number of requests?

Each API call counts as one request, regardless of the token count or selected model. The free plan includes 10,000 requests for model selection, while the pro plan includes 100,000 requests with additional requests billed at $0.001 each.
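
For example, 130,000 requests in one billing period on the pro plan would cost the plan price plus 30,000 × $0.001 = $30 for the additional requests.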

Do I need to pay for the underlying model costs?

Yes, for model routing you'll still need to pay for the actual usage costs of the selected models. AI Router's fees are only for the routing service itself.

Security & Privacy

How do you handle API keys and security?

We use industry-standard encryption for all API keys and sensitive data. In routing mode, we proxy your requests to the selected model providers while maintaining all security best practices.

Do you store any of our prompt data?

We only store usage statistics for your dashboards and for improving our product. In full privacy mode, we only see embeddings, not the actual content of your messages.

Can I use my own model infrastructure, e.g. my Azure OpenAI deployment or AWS Bedrock models?

Yes, you can always use the AI Router with your own model infrastructure via model selection mode: once you receive the best model, you call it yourself. On the enterprise plan, we can also call private models automatically in model routing mode.
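
A sketch of this flow with an Azure OpenAI deployment: the selection endpoint, payload, and response field below are assumptions for illustration (our SDKs wrap this step), while the Azure client usage is standard.

```python
# Hedged sketch: ask the router for the best model, then call it on your own
# Azure OpenAI deployment. Selection endpoint/fields are assumptions.
import requests
from openai import AzureOpenAI

# 1) Get the best model for this request (assumed endpoint and payload).
selection = requests.post(
    "https://api.airouter.io/v1/model-selection",  # assumed URL
    headers={"Authorization": "Bearer YOUR_AIROUTER_KEY"},
    json={"messages": [{"role": "user", "content": "Classify this ticket."}]},
).json()
best_model = selection["model"]  # assumed response field

# 2) Call that model on your own Azure OpenAI deployment.
azure = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR_AZURE_KEY",
    api_version="2024-02-01",
)
response = azure.chat.completions.create(
    model=best_model,  # assumes your deployment name matches the model name
    messages=[{"role": "user", "content": "Classify this ticket."}],
)
print(response.choices[0].message.content)
```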

Account & Support

Can I cancel my subscription?

You can cancel your subscription at any time from your subscription settings.

Where can I find my invoices?

You can find your invoices in your subscription settings.

Can I upgrade or downgrade my plan?

Yes, you can upgrade or downgrade your plan at any time from your subscription settings.

How can I change my payment method?

You can update your payment method in your subscription settings. We accept all major credit cards and PayPal.

Can I have multiple users on my account?

Yes, you can add team members to your organization.

Where can I find documentation?

You can find the full documentation here: Documentation

Do you offer technical support?

We offer email support on all plans, and dedicated email and Slack support for enterprise customers.