The Missing Piece: Information Retrieval

The model can now add and embed arbitrary information to your knowledge base. However, it still isn’t able to query it. Let’s create a new tool to allow the model to answer questions by finding relevant information in your knowledge base.
To find similar content, you will need to embed the user’s query, search the database for semantic similarities, then pass those items to the model as context alongside the query.

Updating Embedding Logic

First, let’s update your embedding logic file (lib/ai/embedding.ts) to add functions for finding relevant content:
lib/ai/embedding.ts
import { embed, embedMany } from "ai";
import { openai } from "@ai-sdk/openai";
import { db } from "../db";
import { cosineDistance, desc, gt, sql } from "drizzle-orm";
import { embeddings } from "../db/schema/embeddings";

const embeddingModel = openai.embedding("text-embedding-ada-002");

const generateChunks = (input: string): string[] => {
  return input
    .trim()
    .split(".")
    .filter((i) => i !== "");
};

export const generateEmbeddings = async (
  value: string
): Promise<Array<{ embedding: number[]; content: string }>> => {
  const chunks = generateChunks(value);
  const { embeddings } = await embedMany({
    model: embeddingModel,
    values: chunks,
  });
  return embeddings.map((e, i) => ({ content: chunks[i], embedding: e }));
};

export const generateEmbedding = async (value: string): Promise<number[]> => {
  const input = value.replaceAll("\\n", " ");
  const { embedding } = await embed({
    model: embeddingModel,
    value: input,
  });
  return embedding;
};

export const findRelevantContent = async (userQuery: string) => {
  // Embed the query so it can be compared against the stored chunk embeddings
  const userQueryEmbedded = await generateEmbedding(userQuery);
  // pgvector returns a cosine distance; subtracting from 1 gives a similarity score (1 = identical)
  const similarity = sql<number>`1 - (${cosineDistance(
    embeddings.embedding,
    userQueryEmbedded
  )})`;
  const similarGuides = await db
    .select({ name: embeddings.content, similarity })
    .from(embeddings)
    .where(gt(similarity, 0.5)) // drop weak matches
    .orderBy((t) => desc(t.similarity))
    .limit(4); // cap the context passed back to the model
  return similarGuides;
};

New Functions Explained

  1. generateEmbedding: Generates a single embedding from an input string; used to embed the user's query.
  2. findRelevantContent: Embeds the user's query, searches the database for similar items using cosine similarity, and returns the matching items.
  3. Similarity Threshold: Only results with a similarity greater than 0.5 are returned, to keep matches relevant.
  4. Result Limiting: Results are capped at 4 items to avoid overwhelming the model with context.
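
To sanity-check these functions outside the chat route, you could call findRelevantContent directly from a scratch script. This is a minimal sketch: the file path is hypothetical, the sample output is purely illustrative, and it assumes your database already contains embedded resources.

scripts/query-test.ts
import { findRelevantContent } from "@/lib/ai/embedding";

// Embeds the question, runs the cosine-similarity search, and prints the matches.
const results = await findRelevantContent("What is my favorite food?");

// Each row holds the matched chunk and its similarity score, e.g.
// [{ name: "my favorite food is pizza", similarity: 0.82 }]
console.log(results);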

Adding the Information Retrieval Tool

Now, go back to your route handler (app/api/chat/route.ts) and add a new tool called getInformation:
app/api/chat/route.ts
import { createResource } from "@/lib/actions/resources";
import { openai } from "@ai-sdk/openai";
import {
  convertToModelMessages,
  streamText,
  tool,
  UIMessage,
  stepCountIs,
} from "ai";
import { z } from "zod";
import { findRelevantContent } from "@/lib/ai/embedding";

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    messages: convertToModelMessages(messages),
    stopWhen: stepCountIs(5),
    system: `You are a helpful assistant. Check your knowledge base before answering any questions.
    Only respond to questions using information from tool calls.
    If no relevant information is found in the tool calls, respond, "Sorry, I don't know."`,
    tools: {
      addResource: tool({
        description: `add a resource to your knowledge base.
          If the user provides a random piece of knowledge unprompted, use this tool without asking for confirmation.`,
        inputSchema: z.object({
          content: z
            .string()
            .describe("the content or resource to add to your knowledge base"),
        }),
        execute: async ({ content }) => createResource({ content }),
      }),
      getInformation: tool({
        description: `get information from your knowledge base to answer questions.`,
        inputSchema: z.object({
          question: z.string().describe("the user's question"),
        }),
        execute: async ({ question }) => findRelevantContent(question),
      }),
    },
  });

  return result.toUIMessageStreamResponse();
}

The findRelevantContent function uses vector similarity to find the most relevant information:

  1. Query Embedding: The user's question is converted to a vector representation.
  2. Similarity Calculation: Cosine distance measures how similar the query is to each stored embedding.
  3. Threshold Filtering: Only results above the 0.5 similarity threshold are returned.
  4. Ranked Results: Matches are ordered by similarity score, highest first.
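
For intuition, the similarity expression above (1 minus the cosine distance) is exactly cosine similarity, which pgvector computes inside the database. A standalone TypeScript sketch of the same calculation, for illustration only:

// Cosine similarity between two equal-length vectors: 1 means same direction, 0 means unrelated.
const cosineSimilarity = (a: number[], b: number[]): number => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};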

Testing the Complete RAG Flow

  1. Refresh the Page: Head back to the browser and refresh the page to ensure the new tool is loaded.
  2. Ask a Question: Ask for your favorite food (or any other information you previously added to the knowledge base).
  3. Observe Tool Calls: You should see the model call the getInformation tool, then use the relevant information to formulate a response.
  4. Verify Response Quality: The model should now provide informative answers based on your stored knowledge.

With both tools implemented, you now have a complete RAG system! The AI can both store new information and retrieve relevant content to answer questions.

How the Complete Flow Works

Here’s the complete RAG flow in action:
  1. User Input: User asks a question or provides information
  2. AI Analysis: Model determines whether to add information or retrieve it
  3. Tool Selection: Model chooses between addResource or getInformation
  4. Tool Execution: Selected tool performs its function
  5. Result Processing: Tool results are sent back to the model
  6. Response Generation: Model generates a response using the tool results
  7. User Output: Final response is streamed to the user
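
If you want to watch steps 3-6 from the server, streamText accepts an onStepFinish callback you can add to the existing call in your route handler. A minimal logging sketch (the other options stay exactly as shown earlier; check your SDK version for the exact shape of the step object):

const result = streamText({
  // ...model, messages, system, stopWhen, and tools as in app/api/chat/route.ts...
  onStepFinish: (step) => {
    // Log which tools the model invoked in this step and what they returned.
    console.log("tool calls:", step.toolCalls.map((c) => c.toolName));
    console.log("tool results:", step.toolResults);
  },
});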

Information Addition

  • When: User provides new information
  • Tool: addResource
  • Result: Information stored and embedded

Information Retrieval

  • When: User asks a question
  • Tool: getInformation
  • Result: Relevant content found and used

Testing Different Scenarios

  1. Test Information Addition: Tell the model various pieces of information to build up your knowledge base.
  2. Test Information Retrieval: Ask questions about the information you've added to see how well retrieval works.
  3. Test Edge Cases: Ask about topics you haven't covered to see the "Sorry, I don't know" response.
  4. Test Similarity Matching: Phrase the same question in different ways to see how well semantic search handles it.

Understanding Similarity Scores

The similarity threshold of 0.5 means:
  • 0.7-1.0: Excellent matches, highly relevant content
  • 0.5-0.7: Good matches, relevant content
  • Below 0.5: Too dissimilar, filtered out
You can adjust the similarity threshold based on your needs. Lower values return more results but may be less relevant, while higher values ensure only the most relevant content is returned.
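
If you want to experiment, one option is a variant of findRelevantContent that exposes the threshold and result limit as parameters. This is a hypothetical sketch, not part of the guide's code; it lives in lib/ai/embedding.ts and reuses the same imports:

export const findRelevantContent = async (
  userQuery: string,
  minSimilarity = 0.5, // raise for stricter matches, lower for broader recall
  maxResults = 4
) => {
  const userQueryEmbedded = await generateEmbedding(userQuery);
  const similarity = sql<number>`1 - (${cosineDistance(
    embeddings.embedding,
    userQueryEmbedded
  )})`;
  return db
    .select({ name: embeddings.content, similarity })
    .from(embeddings)
    .where(gt(similarity, minSimilarity))
    .orderBy((t) => desc(t.similarity))
    .limit(maxResults);
};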

Extension tasks