Build a Local RAG Pipeline with Ollama and DeepSeek-R1 on Google Colab


This tutorial outlines the steps to create a Retrieval-Augmented Generation (RAG) pipeline using open-source tools on Google Colab. By integrating Ollama, the DeepSeek-R1 1.5B language model, LangChain, and ChromaDB, users can query the contents of uploaded PDF documents in natural language. This approach offers a private, cost-effective solution for businesses seeking to enhance their data retrieval capabilities.

1. Setting Up the Environment

1.1 Installing Required Libraries

To begin, we need to install essential libraries that will support our RAG pipeline:

  • Ollama
  • LangChain
  • Sentence Transformers
  • ChromaDB
  • FAISS

These libraries cover document processing, embedding generation, vector storage, and retrieval, the core building blocks of an efficient RAG system. (FAISS appears here as an alternative vector store; the pipeline below uses ChromaDB.)
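
All of them can be installed in a single cell. The package names below are a reasonable sketch based on current PyPI releases, not pinned requirements:

!pip install -q langchain langchain-community langchain-ollama chromadb sentence-transformers pypdf faiss-cpu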

1.2 Enabling Terminal Access in Google Colab

To run commands directly within Colab, we will install the colab-xterm extension. This gives us a real terminal inside the notebook, which we will use next to install and launch the Ollama server:

!pip install colab-xterm
%load_ext colabxterm
%xterm
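
Inside the terminal that opens, Ollama still needs to be installed, started, and given the model. The commands below follow Ollama's standard Linux install procedure; run them in the xterm window, not in a notebook cell:

curl -fsSL https://ollama.com/install.sh | sh   # install Ollama
ollama serve &                                  # start the Ollama server in the background
ollama pull deepseek-r1:1.5b                    # download the DeepSeek-R1 1.5B model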

2. Uploading and Processing PDF Documents

2.1 Uploading PDF Files

Users can upload their PDF documents for processing. The code below grabs the first uploaded file and warns if its extension is not .pdf:

print("Please upload your PDF file…")
uploaded = files.upload()
file_path = list(uploaded.keys())[0]
if not file_path.endswith('.pdf'):
    print("Warning: Uploaded file is not a PDF. This may cause issues.")

2.2 Extracting Content from PDFs

After uploading, we extract the content with LangChain's PyPDFLoader, which uses the pypdf library under the hood:

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(file_path)
documents = loader.load()  # returns one Document per page
print(f"Successfully loaded {len(documents)} pages from PDF")

2.3 Splitting Text for Better Context

The extracted text is split into overlapping chunks, so each retrieved passage is small enough to embed well yet keeps enough surrounding context:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# 1000-character chunks with 200 characters of overlap between neighbors
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
print(f"Split documents into {len(chunks)} chunks")

3. Building the RAG Pipeline

3.1 Creating Embeddings and Vector Store

We will generate embeddings for the text chunks and store them in a ChromaDB vector store for efficient retrieval:

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs={'device': 'cpu'})
persist_directory = "./chroma_db"
vectorstore = Chroma.from_documents(documents=chunks, embedding=embeddings, persist_directory=persist_directory)
print(f"Vector store created and persisted to {persist_directory}")

3.2 Integrating the Language Model

The final step connects the DeepSeek-R1 model, served locally by Ollama, to the retriever to complete the RAG pipeline:

from langchain_ollama import OllamaLLM
from langchain.chains import RetrievalQA

llm = OllamaLLM(model="deepseek-r1:1.5b")  # requires the Ollama server from section 1.2 to be running
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})  # fetch the 3 most similar chunks
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)
print("RAG pipeline created successfully!")

4. Querying the RAG Pipeline

4.1 Running Queries

To test the pipeline, we can define a function to query the RAG system and retrieve relevant answers:

def query_rag(question):
    result = qa_chain.invoke({"query": question})
    print("Question:", question)
    print("Answer:", result["result"])
    print("Sources:")
    for i, doc in enumerate(result["source_documents"]):
        print(f"Source {i+1}:\n{doc.page_content[:200]}...\n")  # page_content holds the chunk text
    return result

question = "What is the main topic of this document?"
result = query_rag(question)
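
One practical note: DeepSeek-R1 is a reasoning model, and when served through Ollama it typically wraps its chain of thought in <think>...</think> tags before the final answer. If you want the answer alone, a small post-processing step (a sketch, not part of the pipeline above) can strip those tags:

import re

# Remove the <think>...</think> reasoning block that DeepSeek-R1 typically emits
answer = re.sub(r"<think>.*?</think>", "", result["result"], flags=re.DOTALL).strip()
print(answer)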

5. Conclusion

This tutorial demonstrates how to build a lightweight yet powerful RAG system that operates efficiently on Google Colab. By leveraging Ollama, ChromaDB, and LangChain, businesses can create scalable, customizable, and privacy-friendly AI assistants without incurring cloud costs. This architecture not only enhances data retrieval but also empowers users to ask questions based on up-to-date content from their documents.

For further inquiries or guidance on implementing AI solutions in your business, please contact us at hello@itinai.ru. Follow us on Telegram, X, and LinkedIn.

