
Building a Local Retrieval-Augmented Generation (RAG) Pipeline Using Ollama on Google Colab
This tutorial outlines the steps to create a Retrieval-Augmented Generation (RAG) pipeline utilizing open-source tools on Google Colab. By integrating Ollama, the DeepSeek-R1 1.5B language model, LangChain, and ChromaDB, users can query the contents of their uploaded PDF documents directly within the notebook. This approach offers a private, cost-effective solution for businesses seeking to enhance their data retrieval capabilities.
1. Setting Up the Environment
1.1 Installing Required Libraries
To begin, we need to install essential libraries that will support our RAG pipeline:
- Ollama
- LangChain
- Sentence Transformers
- ChromaDB
- FAISS
These libraries facilitate document processing, embedding, vector storage, and retrieval functionalities necessary for an efficient RAG system.
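In Colab, these libraries can be installed with pip. The package names below are assumptions mapped from the list above (the LangChain integration for Ollama ships as langchain-ollama, and FAISS as faiss-cpu); pin versions as your environment requires:
# Install the core dependencies (package names assumed; adjust to your environment)
!pip install langchain langchain-community langchain-ollama chromadb sentence-transformers pypdf faiss-cpu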
1.2 Enabling Terminal Access in Google Colab
To run commands directly within Colab, we will install the colab-xterm extension. This allows us to execute terminal commands seamlessly:
!pip install colab-xterm
%load_ext colabxterm
%xterm
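Inside the terminal that opens, Ollama itself must be installed and the DeepSeek-R1 model pulled before the pipeline can call it. A minimal sketch, assuming Ollama's standard Linux install script and the model tag used later in this tutorial:
curl -fsSL https://ollama.com/install.sh | sh   # install the Ollama runtime
ollama serve &                                  # start the Ollama server in the background
ollama pull deepseek-r1:1.5b                    # download the DeepSeek-R1 1.5B model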
2. Uploading and Processing PDF Documents
2.1 Uploading PDF Files
Users can upload their PDF documents for processing. The system will verify the file type to ensure it is a PDF:
print("Please upload your PDF file…")
uploaded = files.upload()
file_path = list(uploaded.keys())[0]
if not file_path.endswith('.pdf'):
print("Warning: Uploaded file is not a PDF. This may cause issues.")
2.2 Extracting Content from PDFs
After uploading, we will extract the content with LangChain's PyPDFLoader, which uses the pypdf library under the hood:
from langchain_community.document_loaders import PyPDFLoader  # PDF loader built on pypdf
loader = PyPDFLoader(file_path)
documents = loader.load()  # one Document per PDF page
print(f"Successfully loaded {len(documents)} pages from PDF")
2.3 Splitting Text for Better Context
The extracted text will be divided into manageable chunks for improved context retention:
from langchain.text_splitter import RecursiveCharacterTextSplitter
# 1000-character chunks with 200 characters of overlap preserve context across boundaries
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
print(f"Split documents into {len(chunks)} chunks")
3. Building the RAG Pipeline
3.1 Creating Embeddings and Vector Store
We will generate embeddings for the text chunks and store them in a ChromaDB vector store for efficient retrieval:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
# Embed each chunk with a lightweight sentence-transformers model on CPU
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs={'device': 'cpu'})
persist_directory = "./chroma_db"
vectorstore = Chroma.from_documents(documents=chunks, embedding=embeddings, persist_directory=persist_directory)
print(f"Vector store created and persisted to {persist_directory}")
3.2 Integrating the Language Model
The final step involves connecting the DeepSeek-R1 model with the retriever to complete the RAG pipeline:
from langchain_ollama import OllamaLLM
from langchain.chains import RetrievalQA
llm = OllamaLLM(model="deepseek-r1:1.5b")  # the model must already be pulled in Ollama
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})  # retrieve the 3 most similar chunks
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)
print("RAG pipeline created successfully!")
4. Querying the RAG Pipeline
4.1 Running Queries
To test the pipeline, we can define a function to query the RAG system and retrieve relevant answers:
def query_rag(question):
    result = qa_chain({"query": question})
    print("Question:", question)
    print("Answer:", result["result"])
    print("Sources:")
    for i, doc in enumerate(result["source_documents"]):
        print(f"Source {i+1}:\n{doc.page_content[:200]}...\n")
    return result
question = "What is the main topic of this document?"
result = query_rag(question)
5. Conclusion
This tutorial demonstrates how to build a lightweight yet powerful RAG system that operates efficiently on Google Colab. By leveraging Ollama, ChromaDB, and LangChain, businesses can create scalable, customizable, and privacy-friendly AI assistants without incurring cloud API costs. This architecture not only enhances data retrieval but also empowers users to ask questions grounded in the content of their own documents.