This AI Paper Dives into Embodied Evaluations: Unveiling the Tong Test as a Novel Benchmark for Progress Toward Artificial General Intelligence

Researchers at the National Key Laboratory of General Artificial Intelligence have proposed a new benchmark for evaluating Artificial General Intelligence (AGI) called the Tong Test. This test focuses on complex environments and emphasizes the importance of ability and value-oriented evaluation rather than task-oriented evaluation. The Tong Test includes features such as infinite tasks, self-driven task generation, value alignment, and causal understanding. The proposed virtual platform also supports embodied AI in training and testing. The Tong Test provides a practical pathway for developing AI algorithms. Source: MarkTechPost.

Review: Unveiling the Tong Test as a Novel Benchmark for Progress Toward Artificial General Intelligence

Unlike narrow or specialized AI systems designed for specific tasks, Artificial General Intelligence (AGI) is meant to perform a wide range of functions, replicating the broad cognitive abilities and adaptability of human intelligence. An AGI system would function autonomously, making decisions and taking actions independently, and would comprehend ambiguous or incomplete information.

Achieving AGI is a complex and challenging endeavor, as it requires solving numerous difficult problems in machine learning, natural language processing, robotics, and other AI-related fields.

Researchers at the National Key Laboratory of General Artificial Intelligence propose a new way of evaluating AGI by introducing the Tong Test. “Tong” corresponds to the Chinese character for “general” (通) in AGI.

They propose that AGI evaluation should be rooted in scenarios set in the complex environments of dynamic embodied physical and social interactions (DEPSI). They say that only through evaluations within DEPSI can the human-like abilities of AGI, such as commonsense reasoning, intention inference in social interactions, trust, and self-awareness, be properly assessed. The Tong Test offers a new perspective on AGI evaluation by emphasizing ability- and value-oriented evaluation within DEPSI rather than task-oriented evaluation.

The Tong Test is a benchmark and evaluation system focusing on four essential features: infinite tasks, self-driven task generation, value alignment, and causal understanding. The researchers' proposed virtual platform could also support embodied AI in training and testing. Embodied AI agents acquire information within this platform and continue to learn and fine-tune their values and abilities interactively.

To support infinite tasks, they adopt a compositional graphical model as the basic form of knowledge representation, one that parses the spatial, temporal, and causal relations of any given scene. They define a fluent space over the time-varying variables (fluents); this space covers every scene configuration that can be represented within a continuous DEPSI environment.
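To make the idea of a fluent space more concrete, here is a minimal sketch, not the researchers' implementation. It assumes a discrete toy scene in place of the continuous DEPSI space, and the names `Fluent`, `fluent_space`, and `generate_tasks` are hypothetical. Spatial, temporal, and causal relations are left out for brevity. The point it illustrates is that composing a handful of time-varying variables already yields many scene configurations, and pairing configurations yields many more candidate tasks.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Fluent:
    """A time-varying variable of a scene entity (hypothetical name)."""
    entity: str
    name: str
    values: tuple  # discrete stand-in for what would be a continuous range


def fluent_space(fluents):
    """Yield every scene configuration as a dict keyed by (entity, fluent name)."""
    keys = [(f.entity, f.name) for f in fluents]
    for combo in product(*(f.values for f in fluents)):
        yield dict(zip(keys, combo))


def generate_tasks(fluents):
    """Pair each start configuration with every different goal configuration."""
    configs = list(fluent_space(fluents))
    return [(start, goal) for start in configs for goal in configs if start != goal]


# Toy scene: three fluents with two values each (all assumed for illustration).
fluents = [
    Fluent("cup", "state", ("empty", "full")),
    Fluent("cup", "location", ("table", "shelf")),
    Fluent("light", "power", ("off", "on")),
]

configs = list(fluent_space(fluents))
tasks = generate_tasks(fluents)
print(len(configs))  # 2 * 2 * 2 = 8 configurations
print(len(tasks))    # 8 * 7 = 56 start/goal pairs
```

With continuous fluents the same composition gives an unbounded number of configurations, which is one way to read the "infinite tasks" property.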

The Tong Test spans two domains, together called the U–V dual system. The U-system describes the agent’s understanding of extrinsic physical or social rules. In contrast, the V-system comprises the agent’s intrinsic values, defined as a set of value functions upon which the self-driven behaviors of the agent are built. The Tong Test platform has modules for intermediate data visualization and a panel that displays how well the tested model performed.
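One way to picture how the two systems interact is the toy sketch below. It is an illustration under stated assumptions, not the method from the paper: `u_predict` stands in for the U-system (the agent's model of how actions change the world), `VALUES` and `WEIGHTS` stand in for the V-system (a set of weighted value functions), and every name, rule, and number is hypothetical. The agent's behavior is self-driven in the sense that it picks whichever action its own values rate highest.

```python
from typing import Callable, Dict

State = Dict[str, float]


def u_predict(state: State, action: str) -> State:
    """U-system stand-in: predict the next state from assumed world rules."""
    nxt = dict(state)
    if action == "eat":
        nxt["hunger"] = max(0.0, state["hunger"] - 0.5)
    elif action == "tidy":
        nxt["mess"] = max(0.0, state["mess"] - 0.5)
    return nxt  # any other action, such as "wait", leaves the state unchanged


# V-system stand-in: intrinsic value functions and their assumed weights.
VALUES: Dict[str, Callable[[State], float]] = {
    "satiety": lambda s: 1.0 - s["hunger"],
    "order": lambda s: 1.0 - s["mess"],
}
WEIGHTS = {"satiety": 0.7, "order": 0.3}


def total_value(state: State) -> float:
    """Weighted sum of all value functions for a state."""
    return sum(WEIGHTS[name] * fn(state) for name, fn in VALUES.items())


def choose_action(state: State, actions=("eat", "tidy", "wait")) -> str:
    """Pick the action whose predicted outcome the V-system values most."""
    return max(actions, key=lambda a: total_value(u_predict(state, a)))


state = {"hunger": 0.8, "mess": 0.6}
print(choose_action(state))  # eat
```

Changing the weights or the starting state changes the chosen action without any external task being specified, which is the sense in which value functions can drive task generation.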

Thus, the proposed Tong Test, built on DEPSI, defines five multidimensional levels of values and abilities and provides a practical pathway and theoretical guidance for developing AI algorithms.


Based on the meeting notes, the action items and their assigned person are as follows:

1. Research and explore the Tong Test: All attendees
2. Investigate the National Key Laboratory of General Artificial Intelligence’s proposed evaluation method: All attendees
3. Determine the criteria for evaluating AGI based on the Tong Test: All attendees
4. Assess the feasibility of incorporating DEPSI into AGI evaluations: All attendees
5. Explore the application of the Tong test for evaluating embodied AI agents: All attendees
6. Investigate the compositional graphical model for knowledge representation: All attendees
7. Understand the U–V dual system of the Tong test and its implications for AGI evaluation: All attendees
8. Examine the modules for data visualization and performance display in the Tong test platform: All attendees
9. Analyze the five multidimensional levels of values and abilities defined by the Tong test: All attendees
