Building Multi-Agent AI RAG System with Vector Database Integration
Hello everyone! This is Krish, and today we're diving deep into building an advanced Agentic AI application that communicates with Vector databases for intelligent query processing and response generation. We'll be creating a PDF assistant that can read, understand, and answer questions from PDF documents.
Prerequisites and Setup
- Docker Desktop installed
- Python environment
- Groq API key
- Basic understanding of Vector databases
Project Overview
We'll build a system that can:
- Read PDF documents from URLs
- Store content in a Vector database (PG Vector)
- Create an AI assistant to interact with the stored knowledge
- Provide accurate responses based on the document content
Setting Up the Environment
Required Libraries
# requirements.txt
sqlalchemy
pgvector
psycopg2-binary
pypdf
Docker Setup for PG Vector
docker run --name pgvector \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
-d \
pgvector/pgvector:pg16
Implementation
1. Initial Imports and Setup
from phi.assistant import Assistant
from phi.storage.assistant.postgres import PGAssistantStorage
from phi.knowledge.pdf import PDFURLKnowledgeBase
from phi.vectordb.pgvector import PGVector2
from dotenv import load_dotenv
import os
load_dotenv()
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
2. Database Configuration
DB_URL = "postgresql://postgres:postgres@localhost:5432/postgres"
3. Knowledge Base Setup
knowledge_base = PDFURLKnowledgeBase(
urls=["https://phi-public.s3.amazonaws.com/recipes/thai_recipes.pdf"],
vector_db=PGVector2(
collection_name="recipes",
db_url=DB_URL
)
)
knowledge_base.load(
storage=PGAssistantStorage(
table_name="pdf_assistant",
db_url=DB_URL
)
)
4. Creating the PDF Assistant
def pdf_assistant(
new: bool = False,
user: str = "user"
) -> None:
assistant = Assistant(
run_id="pdf_assistant",
user_id=user,
knowledge_base=knowledge_base,
storage=PGAssistantStorage(
table_name="pdf_assistant",
db_url=DB_URL
),
show_tool_calls=True,
search_knowledge=True,
read_chat_history=True
)
if new or not assistant.run_id:
assistant.run_id = "pdf_assistant"
assistant.start()
assistant.cli(markdown=True)
Running the Application
if __name__ == "__main__":
import typer
typer.run(pdf_assistant)
Example Interactions
The assistant can handle queries like:
- "List all the dishes in the document"
- "What are the ingredients for Masaman Gai?"
- "How to prepare this dish?"
Key Features
- Vector Database Integration: Uses PG Vector for efficient storage and retrieval
- PDF Processing: Automatically extracts and vectorizes PDF content
- Chat History: Maintains conversation context
- Tool Integration: Shows tool calls in responses
Advanced Customization Options
- Use different Vector databases (Pinecone, Weaviate, ChromaDB)
- Integrate multiple knowledge sources
- Customize assistant behavior and responses
- Add authentication and user management
Assignment Ideas
- Convert the application to a Streamlit frontend
- Add support for multiple PDF uploads
- Implement different vector databases
- Create a GitHub repository documentation chatbot
Common Issues and Solutions
- Docker Issues: Ensure Docker Desktop is running before starting
- Library Dependencies: Install all required packages from requirements.txt
- Database Connection: Verify PostgreSQL connection string
- PDF Processing: Ensure PDF URLs are accessible
Conclusion
This project demonstrates how to build a sophisticated AI assistant that can interact with vector databases and process PDF documents. It showcases the power of combining multiple tools and technologies to create complex workflows that solve real-world problems.
Next Steps
- Explore different vector databases
- Add more document types support
- Implement authentication
- Create production-ready deployment configurations