Why Alternative Providers?

If you’re hitting OpenAI’s quota limits on a free account, you can easily switch to other AI providers that offer generous free tiers. The AI SDK makes this transition seamless with its unified interface.
Free Tier Limits: While these providers offer free credits, they still have usage limits. Monitor your usage to avoid unexpected charges.

Here are some excellent alternatives with generous free tiers:

Groq

- Free Tier: 100 requests/day
- Models: Llama, Mixtral, Gemma
- Speed: Ultra-fast inference

DeepInfra

- Free Tier: $5/month credit
- Models: Llama, DeepSeek, Mistral
- Features: Multiple model support

Together.ai

- Free Tier: $25/month credit
- Models: Llama, CodeLlama, Mistral
- Features: Open source models

Fireworks

- Free Tier: $5/month credit
- Models: Llama, Mixtral, Custom
- Features: Fast inference

Installing Alternative Providers

Install the provider packages you want to use:
npm install @ai-sdk/groq
npm install @ai-sdk/deepinfra
npm install @ai-sdk/togetherai
npm install @ai-sdk/fireworks

Setting Up API Keys

Sign up with your chosen provider and create an API key in its dashboard. You'll add the key to your .env file in the steps below.
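Since a missing key produces confusing provider errors at request time, it helps to fail fast at startup. A minimal sketch — the `requireEnv` helper is our own, not part of the AI SDK:

```typescript
// Read a required environment variable, throwing a clear error if it's absent.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing ${name} - add it to your .env file`);
  }
  return value;
}

// Example usage (key name depends on your chosen provider):
// const apiKey = requireEnv("GROQ_API_KEY");
```

The AI SDK provider packages read their default environment variables (such as `GROQ_API_KEY`) automatically, so this check is only a guard that surfaces configuration mistakes early.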

Updating Your API Route

Let’s modify your chat route to use an alternative provider. Here’s an example using Groq; the other providers work the same way — just swap the import and the model id:
app/api/chat/route.ts
import { groq } from "@ai-sdk/groq";
import { convertToModelMessages, streamText, UIMessage } from "ai";

export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: groq("llama-3.3-70b-versatile"),
    system: `You are a helpful assistant. Check your knowledge base before answering any questions.
    Only respond to questions using information from tool calls.
    If no relevant information is found in the tool calls, respond, "Sorry, I don't know."`,
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}

Model Comparison

Different providers offer different models. Here’s a quick comparison:
| Provider | Model | Context Window | Speed | Best For |
| --- | --- | --- | --- | --- |
| Groq | llama-3.3-70b-versatile | 128K | Ultra-fast | General chat |
| DeepInfra | Llama-3.3-70B-Instruct | 128K | Fast | Code & reasoning |
| Together.ai | Llama-3.3-70B-Instruct | 128K | Medium | Open source focus |
| Fireworks | llama-v3-70b-instruct | 8K | Fast | Production apps |

Recommendation: Start with Groq for its speed and generous free tier. The Llama 3.3 70B model is excellent for general conversation and reasoning tasks.
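If you want to switch providers via configuration rather than editing code, the table above can be captured in a small lookup. A sketch — the map name is ours, the Together.ai and Fireworks id strings are assumptions based on those providers' naming conventions, so check your provider's model list for the exact values:

```typescript
// Model identifiers from the comparison table, keyed by provider name.
// The "together" and "fireworks" ids below are assumed, not verified.
const MODEL_BY_PROVIDER: Record<string, string> = {
  groq: "llama-3.3-70b-versatile",
  deepinfra: "meta-llama/Llama-3.3-70B-Instruct",
  together: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
  fireworks: "accounts/fireworks/models/llama-v3-70b-instruct",
};
```

You could then pick the model id with `MODEL_BY_PROVIDER[process.env.AI_PROVIDER ?? "groq"]` (the `AI_PROVIDER` variable is our own convention, not an AI SDK one).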

Testing Your New Provider

1. Update Environment Variables

   Add your chosen provider’s API key to .env:

   GROQ_API_KEY=your-groq-key
   DEEPINFRA_API_KEY=your-deepinfra-key
   TOGETHER_API_KEY=your-together-key
   FIREWORKS_API_KEY=your-fireworks-key

2. Restart Development Server

   pnpm run dev

3. Test the Chat

   Send a message and verify you get the “Sorry, I don’t know” response (since we haven’t implemented RAG yet).

Fallback Strategy

You can implement a fallback strategy to switch providers if one fails:
app/api/chat/route.ts
import { groq } from "@ai-sdk/groq";
import { deepinfra } from "@ai-sdk/deepinfra";
import { convertToModelMessages, streamText, UIMessage } from "ai";

export const maxDuration = 30;

const system = `You are a helpful assistant. Check your knowledge base before answering any questions.
Only respond to questions using information from tool calls.
If no relevant information is found in the tool calls, respond, "Sorry, I don't know."`;

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  try {
    // Try Groq first.
    // Note: streamText returns immediately and streams lazily, so errors that
    // occur mid-stream are not caught here — this catch covers setup failures
    // (e.g. a missing API key), not every provider outage.
    const result = streamText({
      model: groq("llama-3.3-70b-versatile"),
      system,
      messages: convertToModelMessages(messages),
    });

    return result.toUIMessageStreamResponse();
  } catch (error) {
    // Fall back to DeepInfra
    const result = streamText({
      model: deepinfra("meta-llama/Llama-3.3-70B-Instruct"),
      system,
      messages: convertToModelMessages(messages),
    });

    return result.toUIMessageStreamResponse();
  }
}
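The two-provider try/catch generalizes to any ordered list of providers. A minimal sketch — the `firstSuccessful` helper is our own, not part of the AI SDK:

```typescript
// Try each async factory in order, returning the first successful result.
// If every factory fails, rethrow the last error encountered.
async function firstSuccessful<T>(factories: Array<() => Promise<T>>): Promise<T> {
  let lastError: unknown;
  for (const factory of factories) {
    try {
      return await factory();
    } catch (error) {
      lastError = error; // remember and fall through to the next provider
    }
  }
  throw lastError;
}
```

In the route above, each factory would wrap one provider's `streamText` call, so adding a third or fourth fallback becomes a one-line change.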

Cost Optimization Tips

Monitor Usage

- Track API calls to stay within free limits
- Set up alerts for approaching limits

Choose Efficient Models

- Use smaller models for simple tasks
- Leverage caching when possible

Implement Rate Limiting

- Add delays between requests
- Queue requests to avoid bursts
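The delay idea above can be sketched as a tiny client-side throttle — `createThrottle` is our own helper, not an AI SDK API, and a production app would more likely use a shared rate limiter:

```typescript
// Minimal throttle: guarantee at least `minIntervalMs` between calls.
// Each caller awaits its turn; bursts are serialized into spaced requests.
function createThrottle(minIntervalMs: number) {
  let nextSlot = 0;
  return async function throttle(): Promise<void> {
    const now = Date.now();
    const wait = Math.max(0, nextSlot - now);
    nextSlot = now + wait + minIntervalMs;
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
  };
}

// Example usage before each provider call:
// const throttle = createThrottle(1000); // at most ~1 request per second
// await throttle();
```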

Use Multiple Providers

- Distribute load across providers
- Implement fallbacks for reliability

Extension tasks