
Build a Local RAG Pipeline with Ollama and DeepSeek-R1 on Google Colab

This tutorial outlines the steps to create a Retrieval-Augmented Generation (RAG) pipeline using open-source tools on Google Colab. By combining Ollama, the DeepSeek-R1 1.5B language model, LangChain, and ChromaDB, users can query the contents of uploaded PDF documents directly from the notebook. This approach offers a private, cost-effective way for businesses to enhance their data retrieval capabilities.

1. Setting Up the Environment

1.1 Installing Required Libraries

To begin, we need to install essential libraries that will support our RAG pipeline:

  • Ollama
  • LangChain
  • Sentence Transformers
  • ChromaDB
  • FAISS

These libraries facilitate document processing, embedding, vector storage, and retrieval functionalities necessary for an efficient RAG system.
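A single Colab cell can install all of these. The package names below are an assumption based on their current PyPI names (for example, the LangChain-Ollama integration ships separately as langchain-ollama); adjust them if your environment differs:

# Install the core libraries for the RAG pipeline (PyPI package names assumed)
!pip install -q langchain langchain-community langchain-ollama sentence-transformers chromadb faiss-cpu pypdf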

1.2 Enabling Terminal Access in Google Colab

To run commands directly within Colab, we will install the colab-xterm extension. This allows us to execute terminal commands seamlessly:

!pip install colab-xterm
%load_ext colabxterm
%xterm
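Inside the terminal that opens, Ollama itself still has to be installed, started, and given the model. Below is a minimal sketch of the commands typically run in the xterm session, assuming the standard Ollama install script and the deepseek-r1:1.5b model tag:

# Run inside the xterm terminal (assumed standard Ollama setup)
curl -fsSL https://ollama.com/install.sh | sh   # install the Ollama binary
ollama serve &                                  # start the Ollama server in the background
ollama pull deepseek-r1:1.5b                    # download the DeepSeek-R1 1.5B model

Once the server is running and the model is pulled, the rest of the notebook can reach Ollama locally.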

2. Uploading and Processing PDF Documents

2.1 Uploading PDF Files

Users can upload a PDF document for processing. The code checks the file extension and warns if the upload is not a PDF:

print("Please upload your PDF file…")
uploaded = files.upload()
file_path = list(uploaded.keys())[0]
if not file_path.endswith('.pdf'):
    print("Warning: Uploaded file is not a PDF. This may cause issues.")

2.2 Extracting Content from PDFs

After uploading, we extract the text with LangChain's PyPDFLoader, which is built on the pypdf library:

from langchain_community.document_loaders import PyPDFLoader  # pypdf-backed PDF loader

loader = PyPDFLoader(file_path)
documents = loader.load()  # one Document per PDF page
print(f"Successfully loaded {len(documents)} pages from PDF")

2.3 Splitting Text for Better Context

The extracted text is split into overlapping chunks (1,000 characters with a 200-character overlap) so that retrieved passages keep enough surrounding context:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split pages into 1,000-character chunks with 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
print(f"Split documents into {len(chunks)} chunks")

3. Building the RAG Pipeline

3.1 Creating Embeddings and Vector Store

We will generate embeddings for the text chunks and store them in a ChromaDB vector store for efficient retrieval:

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Embed each chunk with a compact sentence-transformers model (CPU is sufficient)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs={'device': 'cpu'})
persist_directory = "./chroma_db"
vectorstore = Chroma.from_documents(documents=chunks, embedding=embeddings, persist_directory=persist_directory)
print(f"Vector store created and persisted to {persist_directory}")

3.2 Integrating the Language Model

The final step involves connecting the DeepSeek-R1 model with the retriever to complete the RAG pipeline:

from langchain_ollama import OllamaLLM
from langchain.chains import RetrievalQA

# DeepSeek-R1 1.5B served locally by Ollama; retriever returns the top-3 most similar chunks
llm = OllamaLLM(model="deepseek-r1:1.5b")
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)
print("RAG pipeline created successfully!")

4. Querying the RAG Pipeline

4.1 Running Queries

To test the pipeline, we can define a function to query the RAG system and retrieve relevant answers:

def query_rag(question):
    # Run the question through the retrieval + generation chain
    result = qa_chain.invoke({"query": question})
    print("Question:", question)
    print("Answer:", result["result"])
    print("Sources:")
    for i, doc in enumerate(result["source_documents"]):
        # Show the first 200 characters of each retrieved source chunk
        print(f"Source {i+1}:\n{doc.page_content[:200]}...\n")
    return result

question = "What is the main topic of this document?"
result = query_rag(question)

5. Conclusion

This tutorial demonstrates how to build a lightweight yet powerful RAG system that operates efficiently on Google Colab. By leveraging Ollama, ChromaDB, and LangChain, businesses can create scalable, customizable, and privacy-friendly AI assistants without incurring cloud costs. This architecture not only enhances data retrieval but also empowers users to ask questions based on up-to-date content from their documents.

For further inquiries or guidance on implementing AI solutions in your business, please contact us at hello@itinai.ru. Follow us on Telegram, X, and LinkedIn.



