
Build a Local RAG Pipeline with Ollama and DeepSeek-R1 on Google Colab

This tutorial outlines the steps to create a Retrieval-Augmented Generation (RAG) pipeline using open-source tools on Google Colab. By combining Ollama, the DeepSeek-R1 1.5B language model, LangChain, and ChromaDB, users can query the contents of uploaded PDF documents directly from the notebook. This approach offers a private, cost-effective way for businesses to enhance their data retrieval capabilities.

1. Setting Up the Environment

1.1 Installing Required Libraries

To begin, we need to install essential libraries that will support our RAG pipeline:

  • Ollama
  • LangChain
  • Sentence Transformers
  • ChromaDB
  • FAISS

These libraries facilitate document processing, embedding, vector storage, and retrieval functionalities necessary for an efficient RAG system.
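A single Colab cell can install all of these. The package names below are an assumption based on their current PyPI names (for example, the LangChain-Ollama integration ships separately as langchain-ollama); adjust them if your environment differs:

# Install the core libraries for the RAG pipeline (PyPI package names assumed)
!pip install -q langchain langchain-community langchain-ollama sentence-transformers chromadb faiss-cpu pypdf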

1.2 Enabling Terminal Access in Google Colab

To run commands directly within Colab, we will install the colab-xterm extension. This allows us to execute terminal commands seamlessly:

!pip install colab-xterm
%load_ext colabxterm
%xterm
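Inside the terminal that opens, Ollama itself still has to be installed, started, and given the model. Below is a minimal sketch of the commands typically run in the xterm session, assuming the standard Ollama install script and the deepseek-r1:1.5b model tag:

# Run inside the xterm terminal (assumed standard Ollama setup)
curl -fsSL https://ollama.com/install.sh | sh   # install the Ollama binary
ollama serve &                                  # start the Ollama server in the background
ollama pull deepseek-r1:1.5b                    # download the DeepSeek-R1 1.5B model

Once the server is running and the model is pulled, the rest of the notebook can reach Ollama locally.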

2. Uploading and Processing PDF Documents

2.1 Uploading PDF Files

Users can upload a PDF document for processing. The code checks the file extension and warns if the upload is not a PDF:

print("Please upload your PDF file…")
uploaded = files.upload()
file_path = list(uploaded.keys())[0]
if not file_path.endswith('.pdf'):
    print("Warning: Uploaded file is not a PDF. This may cause issues.")

2.2 Extracting Content from PDFs

After uploading, we extract the text with LangChain's PyPDFLoader, which is built on the pypdf library:

from langchain_community.document_loaders import PyPDFLoader  # pypdf-backed PDF loader

loader = PyPDFLoader(file_path)
documents = loader.load()  # one Document per PDF page
print(f"Successfully loaded {len(documents)} pages from PDF")

2.3 Splitting Text for Better Context

The extracted text is split into overlapping chunks (1,000 characters with a 200-character overlap) so that retrieved passages keep enough surrounding context:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split pages into 1,000-character chunks with 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(documents)
print(f"Split documents into {len(chunks)} chunks")

3. Building the RAG Pipeline

3.1 Creating Embeddings and Vector Store

We will generate embeddings for the text chunks and store them in a ChromaDB vector store for efficient retrieval:

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Embed each chunk with a compact sentence-transformers model (CPU is sufficient)
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs={'device': 'cpu'})
persist_directory = "./chroma_db"
vectorstore = Chroma.from_documents(documents=chunks, embedding=embeddings, persist_directory=persist_directory)
print(f"Vector store created and persisted to {persist_directory}")

3.2 Integrating the Language Model

The final step involves connecting the DeepSeek-R1 model with the retriever to complete the RAG pipeline:

from langchain_ollama import OllamaLLM
from langchain.chains import RetrievalQA

# DeepSeek-R1 1.5B served locally by Ollama; retriever returns the top-3 most similar chunks
llm = OllamaLLM(model="deepseek-r1:1.5b")
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True)
print("RAG pipeline created successfully!")

4. Querying the RAG Pipeline

4.1 Running Queries

To test the pipeline, we can define a function to query the RAG system and retrieve relevant answers:

def query_rag(question):
    # Run the question through the retrieval + generation chain
    result = qa_chain.invoke({"query": question})
    print("Question:", question)
    print("Answer:", result["result"])
    print("Sources:")
    for i, doc in enumerate(result["source_documents"]):
        # Show the first 200 characters of each retrieved source chunk
        print(f"Source {i+1}:\n{doc.page_content[:200]}...\n")
    return result

question = "What is the main topic of this document?"
result = query_rag(question)

5. Conclusion

This tutorial demonstrates how to build a lightweight yet powerful RAG system that operates efficiently on Google Colab. By leveraging Ollama, ChromaDB, and LangChain, businesses can create scalable, customizable, and privacy-friendly AI assistants without incurring cloud costs. This architecture not only enhances data retrieval but also empowers users to ask questions based on up-to-date content from their documents.

For further inquiries or guidance on implementing AI solutions in your business, please contact us at hello@itinai.ru. Follow us on Telegram, X, and LinkedIn.



