How Do LLMs Store and Use Knowledge? This AI Paper Introduces Knowledge Circuits: A Framework for Understanding and Improving Knowledge Storage in Transformer-Based LLMs

Understanding Large Language Models (LLMs)

Large language models (LLMs) can comprehend and generate text that resembles human writing. They do this in part by storing extensive knowledge in their parameters. This ability lets them tackle complex reasoning tasks and communicate effectively with people. However, researchers are still working to improve how these models manage and use knowledge to make them more efficient and reliable.

Challenges with LLMs

A major issue with LLMs is that they can produce incorrect, biased, or misleading information. These problems are hard to fix in part because it is still poorly understood how models organize and retrieve knowledge. Without insight into the interactions among model components, it’s tough to correct errors or enhance performance. Most research has focused on individual elements rather than the relationships between them, which limits improvements in accuracy and safe knowledge retrieval.

Current Analysis Techniques

Traditional methods for analyzing language models usually look at specific neurons that store factual information. Techniques have been developed to edit this stored data to correct inaccuracies and reduce biases. Unfortunately, these methods often fail to generalize, can disrupt related knowledge, and don’t fully utilize the edited information. They also neglect how different model components work together, which hampers their effectiveness in addressing knowledge issues.
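
One way to see why editing a single stored association can disturb related knowledge is a toy linear memory. The sketch below is an assumption made for illustration, not a method from the paper: it applies a generic rank-one update to a small weight matrix, and the keys, values, and function names are all hypothetical.

```python
def matvec(W, k):
    """Multiply matrix W (list of rows) by vector k."""
    return [sum(w * x for w, x in zip(row, k)) for row in W]


def rank_one_edit(W, k, v_new):
    """Return W' with W' k == v_new, using W' = W + (v_new - W k) k^T / (k . k)."""
    residual = [target - current for target, current in zip(v_new, matvec(W, k))]
    norm_sq = sum(x * x for x in k)
    return [
        [w + r * x / norm_sq for w, x in zip(row, k)]
        for row, r in zip(W, residual)
    ]


# Hypothetical 2-dimensional "memory": the identity maps every key to itself.
W = [[1.0, 0.0], [0.0, 1.0]]
k_edit = [1.0, 0.0]       # key of the fact being edited
k_related = [1.0, 1.0]    # key that overlaps with k_edit
k_unrelated = [0.0, 1.0]  # key orthogonal to k_edit

W_new = rank_one_edit(W, k_edit, [0.0, 2.0])

print("edited fact:   ", matvec(W_new, k_edit))
print("related fact:  ", matvec(W, k_related), "->", matvec(W_new, k_related))
print("unrelated fact:", matvec(W, k_unrelated), "->", matvec(W_new, k_unrelated))
```

In this toy setting the edit lands exactly on the target key, leaves the orthogonal key untouched, and shifts the output for the overlapping key. Real models do not have cleanly orthogonal keys, which is one intuition for why localized edits can have side effects.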

Introducing Knowledge Circuits

Researchers from Zhejiang University and the National University of Singapore have proposed a new approach called “knowledge circuits.” A knowledge circuit is a set of interconnected components within a model, such as MLP layers and attention heads, that work together to express a specific piece of knowledge. Using models like GPT-2 and TinyLlama, they demonstrated how these circuits store, retrieve, and apply knowledge through the collaboration of components.

Building Knowledge Circuits

To build knowledge circuits, the researchers treated the model as a graph of connected components and measured how removing individual connections affected its performance on a factual prediction. Connections whose removal made little difference were discarded, and the remainder formed the circuit. This analysis identified key connections and the roles of different components: some are responsible for transferring information, while others capture contextual relationships. Knowledge circuits were shown to gather information in earlier layers and refine it in later layers to improve predictive accuracy.
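
The idea of changing parts of the model and watching how performance responds can be sketched on a toy graph. This is a minimal illustration, not the authors' implementation: the node names, edge weights, threshold, and the functions forward and find_circuit are all hypothetical, and the single pass of zero-ablation shown here is a simplification of what circuit-discovery tools do on a real transformer.

```python
# Toy computation graph: (source, destination) -> weight.
# Node names and weights are made up for illustration.
EDGES = {
    ("input", "head_a"): 1.0,
    ("input", "head_b"): 0.05,
    ("head_a", "mlp_1"): 0.9,
    ("head_b", "mlp_1"): 0.1,
    ("input", "mlp_2"): 0.02,
    ("mlp_1", "output"): 1.0,
    ("mlp_2", "output"): 0.5,
}
# Nodes in an order where every source is computed before its destination.
ORDER = ["input", "head_a", "head_b", "mlp_1", "mlp_2", "output"]


def forward(edges, x=1.0):
    """Propagate a scalar signal through the graph and return the output value."""
    values = {"input": x}
    for node in ORDER[1:]:
        values[node] = sum(
            weight * values[src]
            for (src, dst), weight in edges.items()
            if dst == node
        )
    return values["output"]


def find_circuit(edges, threshold=0.05):
    """Keep the edges whose removal changes the output by more than `threshold` (relative)."""
    baseline = forward(edges)
    circuit = {}
    for edge in edges:
        ablated = dict(edges)
        ablated[edge] = 0.0  # zero-ablate this single edge
        drop = abs(baseline - forward(ablated)) / abs(baseline)
        if drop > threshold:
            circuit[edge] = edges[edge]
    return circuit


circuit = find_circuit(EDGES)
retained = forward(circuit) / forward(EDGES)
print(f"kept {len(circuit)} of {len(EDGES)} edges, retaining {retained:.0%} of the output")
```

With these made-up weights, only the path through head_a and mlp_1 survives, and that small subgraph reproduces most of the full graph's output, which mirrors the article's point that a compact circuit can carry most of the behavior.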

Improvements in Performance

The research showed that knowledge circuits could maintain over 70% of a model’s performance while using only 10% of its parameters. For some relations the isolated circuit even scored higher than the full model: accuracy for landmark-country relations rose from 16% to 36%. This suggests that focusing on essential circuits can enhance accuracy. Additionally, knowledge circuits help researchers interpret model behaviors such as hallucinations and in-context learning.
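
To make the landmark-country numbers concrete, the short calculation below separates the absolute gain (in percentage points) from the relative gain. Only the 16% and 36% figures come from the article; the variable names are illustrative.

```python
# Accuracy figures quoted above, as whole percentages.
before_pct = 16
after_pct = 36

absolute_gain_points = after_pct - before_pct  # gain in percentage points
relative_gain = after_pct / before_pct         # how many times higher

print(f"absolute gain: {absolute_gain_points} percentage points")
print(f"relative gain: {relative_gain:.2f}x")
```

The gain is 20 percentage points, which is 2.25 times the starting accuracy.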

Limitations of Existing Methods

The study also highlighted the limitations of current knowledge-editing techniques. While some methods successfully added new knowledge, they often disrupted unrelated areas. This issue emphasizes the need for more precise editing techniques that consider the broader context of knowledge circuits rather than focusing solely on individual components.

Conclusion

This research offers a new perspective on how large language models operate by focusing on knowledge circuits. By shifting the emphasis from isolated parts to interconnected structures, it provides a comprehensive framework for improving these models. The insights gained can lead to better knowledge management, safer editing practices, and improved model understanding. Future research could explore the scalability of knowledge circuits across different domains, enhancing the effectiveness of LLMs.



