The Missing Piece: Information Retrieval
The model can now add and embed arbitrary information to your knowledge base. However, it still isn’t able to query it. Let’s create a new tool to allow the model to answer questions by finding relevant information in your knowledge base.
To find similar content, you will need to embed the user’s query, search the database for semantic similarities, then pass those items to the model as context alongside the query.
Updating Embedding Logic
First, let’s update your embedding logic file (lib/ai/embedding.ts) to add functions for finding relevant content:
lib/ai/embedding.ts
New Functions Explained
1
generateEmbedding
Generates a single embedding from an input string for query purposes.
2
findRelevantContent
Embeds the user’s query, searches the database for similar items using
cosine similarity, then returns relevant items.
3
Similarity Threshold
Only returns results with similarity greater than 0.5 to ensure relevance.
4
Result Limiting
Limits results to 4 items to avoid overwhelming the model with context.
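The four pieces above can be illustrated without a database. What follows is a minimal in-memory sketch, not the contents of lib/ai/embedding.ts: the `StoredChunk` type, the `cosineSimilarity` helper, and the `findRelevantInMemory` name are hypothetical, and the query is assumed to be embedded already. It only shows the scoring, the 0.5 threshold, the ordering, and the limit of 4.

```typescript
// Hypothetical shape of a stored item: its text plus a precomputed embedding.
type StoredChunk = { content: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (normA * normB);
}

// In-memory stand-in for the database query: score, filter, rank, limit.
function findRelevantInMemory(
  queryEmbedding: number[],
  chunks: StoredChunk[],
  threshold = 0.5, // similarity must be strictly greater than this
  limit = 4, // maximum number of items handed to the model
): { name: string; similarity: number }[] {
  return chunks
    .map((chunk) => ({
      name: chunk.content,
      similarity: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .filter((result) => result.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

In a real setup the scoring and filtering would more likely run inside the database query, so that only the top few rows ever leave the database.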
Adding the Information Retrieval Tool
Now, go back to your route handler (app/api/chat/route.ts) and add a new tool called getInformation:
app/api/chat/route.ts
Understanding Semantic Search
The findRelevantContent function uses vector similarity to find the most relevant information:
Query Embedding
Vector Conversion: The user’s question is converted to a vector
representation.
Similarity Calculation
Cosine Distance: Measures the distance between the query embedding and each
stored embedding; a smaller distance means a higher similarity.
Threshold Filtering
Relevance Filter: Only returns results above the 0.5 similarity threshold.
Ranked Results
Best Matches: Results are ordered by similarity score (highest first).
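For reference, the standard definition of cosine similarity is written out below. The symbols are introduced here for illustration only: q is the query embedding, d is a stored embedding, and n is the number of dimensions. Cosine distance is conventionally one minus the similarity, so a similarity above 0.5 is the same as a cosine distance below 0.5.

```latex
\[
\operatorname{sim}(q, d)
  = \frac{q \cdot d}{\lVert q \rVert \, \lVert d \rVert}
  = \frac{\sum_{i=1}^{n} q_i d_i}
         {\sqrt{\sum_{i=1}^{n} q_i^2}\,\sqrt{\sum_{i=1}^{n} d_i^2}},
\qquad
\operatorname{dist}(q, d) = 1 - \operatorname{sim}(q, d)
\]
```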
Testing the Complete RAG Flow
1
Refresh the Page
Head back to the browser and refresh the page to ensure the new tool is loaded.
2
Ask a Question
Ask what your favorite food is (or about any information you previously added
to the knowledge base).
3
Observe Tool Calls
You should see the model call the getInformation tool, then use the relevant
information to formulate a response.
4
Verify Response Quality
The model should now provide informative answers based on your stored
knowledge.
With both tools implemented, you now have a complete RAG system! The AI can
both store new information and retrieve relevant content to answer questions.
How the Complete Flow Works
Here’s the complete RAG flow in action:
- User Input: User asks a question or provides information
- AI Analysis: Model determines whether to add information or retrieve it
- Tool Selection: Model chooses between addResource and getInformation
- Tool Execution: Selected tool performs its function
- Result Processing: Tool results are sent back to the model
- Response Generation: Model generates a response using the tool results
- User Output: Final response is streamed to the user
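The steps above can be sketched as a single function. This is a toy illustration under loud assumptions: `chooseTool` and `respond` are hypothetical stubs standing in for the model, the knowledge base is an in-memory array, and keyword overlap replaces semantic search. Only the tool names addResource and getInformation come from the tutorial.

```typescript
// Hypothetical stand-ins: a real app would rely on the model's tool calling
// and a database rather than these in-memory stubs.
type ToolName = "addResource" | "getInformation";
type ToolCall = { tool: ToolName; input: string };

const knowledgeBase: string[] = [];

const tools: Record<ToolName, (input: string) => string[]> = {
  // Stores the content (a real implementation would also embed it).
  addResource: (content) => {
    knowledgeBase.push(content);
    return ["Resource stored."];
  },
  // Naive keyword overlap stands in for semantic search here.
  getInformation: (question) => {
    const words = question.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
    return knowledgeBase.filter((entry) =>
      words.some((w) => entry.toLowerCase().includes(w)),
    );
  },
};

// Stub for the model's decision: questions trigger retrieval, statements are stored.
function chooseTool(userInput: string): ToolCall {
  const isQuestion = userInput.trim().endsWith("?");
  return { tool: isQuestion ? "getInformation" : "addResource", input: userInput };
}

// Stub for response generation from the tool results.
function respond(call: ToolCall, results: string[]): string {
  if (call.tool === "addResource") return "Got it, I saved that.";
  return results.length > 0
    ? `Based on what I know: ${results.join(" ")}`
    : "Sorry, I don't know.";
}

function handleTurn(userInput: string): string {
  const call = chooseTool(userInput); // AI analysis and tool selection
  const results = tools[call.tool](call.input); // tool execution
  return respond(call, results); // result processing and response generation
}
```

Calling `handleTurn` first with a statement and then with a related question exercises both branches: the statement is stored, and the question retrieves it.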
Information Addition
- When: User provides new information
- Tool: addResource
- Result: Information stored and embedded
Information Retrieval
- When: User asks a question
- Tool: getInformation
- Result: Relevant content found and used
Testing Different Scenarios
1
Test Information Addition
Tell the model various pieces of information to build up your knowledge
base.
2
Test Information Retrieval
Ask questions about the information you’ve added to see how well the
retrieval works.
3
Test Edge Cases
Try asking questions about topics you haven’t covered to see the “Sorry, I
don’t know” response.
4
Test Similarity Matching
Use different phrasings to ask the same question and see how well semantic
search works.
Understanding Similarity Scores
The similarity threshold of 0.5 means:
- 0.5-1.0: Similar enough to pass the threshold and be returned
- 0.7-0.9: Excellent matches, highly relevant (a sub-range of the band above)
- 0.5-0.7: Good matches, relevant content (a sub-range of the band above)
- Below 0.5: Too dissimilar, filtered out
You can adjust the similarity threshold based on your needs. Lower values
return more results but may be less relevant, while higher values ensure only
the most relevant content is returned.
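To make the trade-off concrete, here is a small sketch that applies three different thresholds to the same scored results. The `Scored` type, the `filterByThreshold` helper, and the sample scores are all invented for illustration; they are not output from a real embedding model.

```typescript
// Hypothetical scored results, shaped as a similarity search might return them.
type Scored = { name: string; similarity: number };

const scored: Scored[] = [
  { name: "favorite food is pizza", similarity: 0.82 },
  { name: "likes Italian cuisine", similarity: 0.61 },
  { name: "works as an engineer", similarity: 0.34 },
];

// Keep results strictly above the threshold, best match first.
function filterByThreshold(results: Scored[], threshold: number): Scored[] {
  return results
    .filter((r) => r.similarity > threshold)
    .sort((a, b) => b.similarity - a.similarity);
}

const loose = filterByThreshold(scored, 0.3); // all three entries pass
const standard = filterByThreshold(scored, 0.5); // the first two entries pass
const strict = filterByThreshold(scored, 0.7); // only the top entry passes
```

A reasonable way to pick a value is to run a handful of real questions against your own knowledge base and check which results you would actually want the model to see.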