
Building a Retrieval-Augmented Generation (RAG) System with DeepSeek R1: A Step-by-Step Guide

Introduction to DeepSeek R1

DeepSeek R1 has created excitement in the AI community. This open-source model performs exceptionally well, often matching top proprietary models. In this article, we will guide you through setting up a Retrieval-Augmented Generation (RAG) system using DeepSeek R1, from environment setup to running queries.

What is RAG?

RAG combines retrieval and generation. It first retrieves relevant passages from a knowledge base and then passes them to a language model, which uses that context to generate a grounded answer to the user's query.
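
As a rough illustration of this two-stage flow, the toy sketch below uses simple keyword overlap in place of embedding search and a formatted prompt in place of the model call; both pieces are implemented properly in the steps that follow:

corpus = [
    "DeepSeek R1 is an open-source reasoning model.",
    "FAISS is a library for efficient similarity search.",
]

def toy_retrieve(query, k=1):
    # Rank documents by naive keyword overlap with the query
    words = set(query.lower().split())
    return sorted(corpus, key=lambda d: -len(words & set(d.lower().split())))[:k]

def toy_generate(query):
    # Stand-in for the model call: just show the prompt that would be sent
    context = " ".join(toy_retrieve(query))
    return f"Answer '{query}' using only this context: {context}"

print(toy_generate("What is DeepSeek R1?"))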

Prerequisites

  • Python: Version 3.7 or higher.
  • Ollama: This framework allows you to run models like DeepSeek R1 locally.

Step-by-Step Implementation

Step 1: Install Ollama

Follow the instructions on the Ollama website to install it. Verify the installation by running:

ollama --version

Step 2: Run DeepSeek R1 Model

Open your terminal and execute:

ollama run deepseek-r1:1.5b

This command downloads (if necessary) and starts the 1.5-billion-parameter version of DeepSeek R1, a lightweight variant well suited to running locally.
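
Optionally, you can confirm the model responds from Python before building the rest of the pipeline. This minimal check assumes the Ollama server is running locally and the ollama Python client is installed (pip install ollama):

import ollama  # pip install ollama

# Quick sanity check that the locally served DeepSeek R1 model responds
reply = ollama.generate(model="deepseek-r1:1.5b", prompt="Reply with one short sentence.")
print(reply["response"])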

Step 3: Prepare Your Knowledge Base

Gather documents, articles, or any relevant text data for your retrieval system.

3.1 Load Your Documents

Load documents from text files, databases, or web scraping. Here’s an example:

import os

def load_documents(directory):
    """Read every .txt file in a directory and return the contents as a list of strings."""
    documents = []
    for filename in os.listdir(directory):
        if filename.endswith('.txt'):
            # Use an explicit encoding to avoid platform-dependent defaults
            with open(os.path.join(directory, filename), 'r', encoding='utf-8') as file:
                documents.append(file.read())
    return documents

documents = load_documents('path/to/your/documents')
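
Long files tend to retrieve poorly as single blocks, so a common refinement is to split each document into overlapping chunks before embedding. This step is optional for this guide; the sketch below assumes simple fixed-size character windows:

def chunk_documents(documents, chunk_size=1000, overlap=200):
    # Split each document into overlapping character windows so retrieval
    # can return focused passages instead of entire files
    chunks = []
    for doc in documents:
        start = 0
        while start < len(doc):
            chunks.append(doc[start:start + chunk_size])
            start += chunk_size - overlap
    return chunks

# Optionally replace the raw documents with their chunks before embedding
documents = chunk_documents(documents)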

Step 4: Create a Vector Store for Retrieval

Use a vector store like FAISS for efficient document retrieval.

4.1 Install Required Libraries

Install additional libraries:

pip install faiss-cpu sentence-transformers ollama

4.2 Generate Embeddings and Set Up FAISS

Generate embeddings with a Hugging Face sentence-transformers model and set up the FAISS vector store:

from sentence_transformers import SentenceTransformer
import faiss

# Load a Hugging Face sentence-embedding model (any sentence-transformers model works)
embeddings_model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode every document into a float32 matrix, as expected by FAISS
document_embeddings = embeddings_model.encode(documents).astype('float32')

# Build a flat L2 index sized to the embedding dimension
index = faiss.IndexFlatL2(document_embeddings.shape[1])
index.add(document_embeddings)
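
Embedding a large corpus can take a while, so you may want to persist the index and reload it in later sessions instead of re-embedding. A minimal sketch using FAISS's built-in serialization (the file name is just an example):

# Save the built index to disk
faiss.write_index(index, "documents.index")

# Reload it later without re-running the embedding step
index = faiss.read_index("documents.index")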

Step 5: Set Up the Retriever

Create a retriever to fetch relevant documents based on user queries:

class SimpleRetriever:
    def __init__(self, index, embeddings_model, documents):
        self.index = index
        self.embeddings_model = embeddings_model
        self.documents = documents

    def retrieve(self, query, k=3):
        # Embed the query and look up the k nearest documents in the FAISS index
        query_embedding = self.embeddings_model.encode([query]).astype('float32')
        distances, indices = self.index.search(query_embedding, k)
        return [self.documents[i] for i in indices[0]]

retriever = SimpleRetriever(index, embeddings_model, documents)
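
A quick check that retrieval behaves sensibly before wiring in the model (the query string is only an example):

# Print the beginning of each of the top-3 retrieved documents
for doc in retriever.retrieve("What is DeepSeek R1?"):
    print(doc[:200])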

Step 6: Configure DeepSeek R1 for RAG

Set the model name and a prompt template for DeepSeek R1; the ollama Python client installed above is used to call the locally running model:

import ollama
from string import Template

MODEL_NAME = "deepseek-r1:1.5b"

prompt_template = Template("""
Use ONLY the context below.
If unsure, say "I don't know".
Keep answers under 4 sentences.

Context: $context
Question: $question
Answer:
""")

Step 7: Implement Query Handling Functionality

Create a function to combine retrieval and generation:

def answer_query(question):
    # Retrieve the most relevant documents and join them into one context block
    context = retriever.retrieve(question)
    combined_context = "\n".join(context)
    prompt = prompt_template.substitute(context=combined_context, question=question)
    # Ask DeepSeek R1 (served by Ollama) to answer using only the retrieved context
    response = ollama.generate(model=MODEL_NAME, prompt=prompt)
    return response["response"].strip()

Step 8: Running Your RAG System

Test your RAG system by calling the answer_query function:

if __name__ == "__main__":
    user_question = "What are the key features of DeepSeek R1?"
    answer = answer_query(user_question)
    print("Answer:", answer)

Conclusion

By following these steps, you can implement a Retrieval-Augmented Generation (RAG) system using DeepSeek R1. This setup allows efficient information retrieval and accurate response generation. Explore the potential of DeepSeek R1 for your specific needs.

AI Solutions for Your Business

To enhance your company with AI, consider the following:

  • Identify Automation Opportunities: Find key customer interaction points that can benefit from AI.
  • Define KPIs: Ensure measurable impacts on business outcomes.
  • Select an AI Solution: Choose tools that fit your needs and allow customization.
  • Implement Gradually: Start with a pilot, gather data, and expand AI usage wisely.

For AI KPI management advice, connect with us at hello@itinai.com. For continuous insights, follow us on Telegram or @itinaicom.
