
Build a Groundedness Verification Tool with Upstage API and LangChain for AI Developers

AI-generated answers are only useful when they can be trusted. This article shows how to build a Groundedness Verification Tool with the Upstage API and LangChain, so that AI developers, data scientists, and business managers can check whether AI outputs are supported by their source material.

Understanding the Target Audience

This tutorial is written for AI developers, data scientists, and business managers who are responsible for the reliability of AI-generated content. They need outputs accurate enough to support decisions, and they want to improve the credibility of their AI systems without slowing down content generation. The tutorial therefore favors concise explanations and practical examples.

Introduction to Upstage’s Groundedness Check Service

Upstage’s Groundedness Check service offers an API for verifying whether AI-generated responses are anchored in the source material they were given. By submitting context–answer pairs to the Upstage endpoint, users can determine whether the provided context supports a given answer and receive an assessment of that grounding. This tutorial walks through Upstage’s core capabilities, including single-shot verification, batch processing, and multi-domain testing, to help AI systems produce factual and trustworthy content across subject areas.

Setting Up the Environment

To get started, you need to install the necessary packages:

pip install -qU langchain-core langchain-upstage

Next, set your Upstage API key in the environment to authenticate all subsequent groundedness check requests:

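import os

# Replace the placeholder with your own key, or export UPSTAGE_API_KEY in your
# shell and delete this line so the key never lands in source control.
os.environ["UPSTAGE_API_KEY"] = "Use Your API Key Here"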
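
A hard-coded placeholder is easy to forget to replace. As a minimal sketch, a fail-fast check could look like the following. The names `require_api_key` and `PLACEHOLDER` are hypothetical helpers introduced here for illustration; they are not part of the Upstage or LangChain APIs.

```python
import os

# The placeholder string used in this tutorial (assumption: you have not changed it).
PLACEHOLDER = "Use Your API Key Here"


def require_api_key(env=os.environ, name="UPSTAGE_API_KEY"):
    """Return the key from `env`, or raise if it is missing or still the placeholder."""
    key = env.get(name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"{name} is not set; provide a real key before running checks")
    return key
```

Calling `require_api_key()` once at startup surfaces a missing key before any request is sent.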

Creating the AdvancedGroundednessChecker Class

The AdvancedGroundednessChecker class wraps Upstage’s groundedness API into a reusable interface. This class allows for both single and batch context–answer checks while accumulating results. It includes methods to extract a confidence label from each response and compute overall accuracy statistics across all checks.

from typing import Any, Dict, List

from langchain_upstage import UpstageGroundednessCheck


class AdvancedGroundednessChecker:
    """Reusable wrapper that runs groundedness checks and keeps every result."""

    def __init__(self):
        self.checker = UpstageGroundednessCheck()
        self.results = []

    @staticmethod
    def _label(response) -> str:
        # Compare whole labels, not substrings: "grounded" is also a substring
        # of "notGrounded", so a substring test would count failures as passes.
        return str(response).strip().lower().replace(" ", "")

    def check_single(self, context: str, answer: str) -> Dict[str, Any]:
        request = {"context": context, "answer": answer}
        response = self.checker.invoke(request)
        result = {
            "context": context,
            "answer": answer,
            "grounded": response,
            "confidence": self._extract_confidence(response)
        }
        self.results.append(result)
        return result

    def batch_check(self, test_cases: List[Dict[str, str]]) -> List[Dict[str, Any]]:
        # Cases are checked one at a time, in order.
        batch_results = []
        for case in test_cases:
            result = self.check_single(case["context"], case["answer"])
            batch_results.append(result)
        return batch_results

    def _extract_confidence(self, response) -> str:
        label = self._label(response)
        if label == "grounded":
            return "high"
        if label == "notgrounded":
            return "low"
        return "medium"  # any other verdict, such as an uncertain one

    def analyze_results(self) -> Dict[str, Any]:
        total = len(self.results)
        grounded = sum(1 for r in self.results if self._label(r["grounded"]) == "grounded")
        return {
            "total_checks": total,
            "grounded_count": grounded,
            # Everything that is not exactly "grounded", uncertain verdicts included.
            "not_grounded_count": total - grounded,
            "accuracy_rate": grounded / total if total > 0 else 0
        }
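
Substring matching on verdict labels is easy to get wrong, because "grounded" is also contained in "notGrounded". The sketch below shows one way to normalize labels before comparing them. It assumes the service returns one of `grounded`, `notGrounded`, or `notSure`; confirm the exact strings in the Upstage documentation. The `normalize_label` function and `CONFIDENCE` mapping are hypothetical names used only for illustration.

```python
def normalize_label(response) -> str:
    """Map a raw response to 'grounded', 'not_grounded', or 'unsure'."""
    # Drop case, spaces, and underscores so "notGrounded", "not grounded",
    # and "NOT_GROUNDED" all compare equal.
    compact = str(response).strip().lower().replace(" ", "").replace("_", "")
    if compact == "grounded":
        return "grounded"
    if compact == "notgrounded":
        return "not_grounded"
    return "unsure"  # assumed catch-all, e.g. for a "notSure" verdict


# Hypothetical mapping from normalized label to a coarse confidence tag.
CONFIDENCE = {"grounded": "high", "not_grounded": "low", "unsure": "medium"}

for raw in ["grounded", "notGrounded", "notSure"]:
    print(raw, "->", CONFIDENCE[normalize_label(raw)])
```

Comparing whole normalized labels keeps a negative verdict from being counted as a positive one.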

Running Groundedness Checks

Create a checker instance, then run single groundedness checks:

checker = AdvancedGroundednessChecker()
result1 = checker.check_single(
    context="Mauna Kea is an inactive volcano on the island of Hawai'i.",
    answer="Mauna Kea is 5,207.3 meters tall."
)
result2 = checker.check_single(
    context="Python is a high-level programming language created by Guido van Rossum in 1991.",
    answer="Python was made by Guido van Rossum & focuses on code readability."
)
result3 = checker.check_single(
    context="The Great Wall of China is approximately 13,000 miles long.",
    answer="The Great Wall of China is very long."
)
result4 = checker.check_single(
    context="Water boils at 100 degrees Celsius at sea level atmospheric pressure.",
    answer="Water boils at 90 degrees Celsius at sea level."
)

Batch Processing Example

The batch_check method takes a list of context–answer pairs, checks them one after another, and returns the results in the same order:

test_cases = [
    {
        "context": "Shakespeare wrote Romeo and Juliet in the late 16th century.",
        "answer": "Romeo and Juliet was written by Shakespeare."
    },
    {
        "context": "The speed of light is approximately 299,792,458 meters per second.",
        "answer": "Light travels at about 300,000 kilometers per second."
    },
    {
        "context": "Earth has one natural satellite called the Moon.",
        "answer": "Earth has two moons."
    }
]
batch_results = checker.batch_check(test_cases)
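
In a sequential batch, one failed request would otherwise abort the remaining cases. The following sketch records errors per case instead of raising. Both `safe_batch_check` and `fake_invoke` are hypothetical; `fake_invoke` is an offline stand-in that uses a naive substring test and does not reflect how Upstage decides groundedness.

```python
from typing import Callable, Dict, List


def safe_batch_check(invoke: Callable[[Dict[str, str]], str],
                     cases: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Run every case through `invoke`; record errors instead of raising."""
    results = []
    for case in cases:
        try:
            verdict = invoke({"context": case["context"], "answer": case["answer"]})
            results.append({**case, "grounded": verdict, "error": ""})
        except Exception as exc:  # e.g. a network failure or a malformed case
            results.append({**case, "grounded": "", "error": repr(exc)})
    return results


def fake_invoke(request: Dict[str, str]) -> str:
    """Offline stand-in for the real checker; NOT how Upstage decides."""
    if not request["answer"]:
        raise ValueError("empty answer")
    return "grounded" if request["answer"] in request["context"] else "notGrounded"


cases = [
    {"context": "Earth has one natural satellite called the Moon.", "answer": "the Moon"},
    {"context": "Earth has one natural satellite called the Moon.", "answer": "two moons"},
    {"context": "Earth has one natural satellite called the Moon.", "answer": ""},
]
for row in safe_batch_check(fake_invoke, cases):
    print(row["answer"] or "<empty>", "->", row["grounded"] or row["error"])
```

With the real service, the checker's `invoke` method would be passed in place of `fake_invoke`.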

Results Analysis

After running the checks, you can analyze the results:

analysis = checker.analyze_results()
print(f"Total checks performed: {analysis['total_checks']}")
print(f"Grounded responses: {analysis['grounded_count']}")
print(f"Not grounded responses: {analysis['not_grounded_count']}")
print(f"Groundedness rate: {analysis['accuracy_rate']:.2%}")
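
The groundedness rate is the number of grounded checks divided by the total number of checks. A worked sketch with hypothetical verdicts (not real API output, and assuming the label strings shown) illustrates the arithmetic and keeps uncertain verdicts in their own bucket:

```python
from collections import Counter

# Hypothetical verdicts for seven checks; real values come from the API.
labels = ["grounded", "notGrounded", "grounded", "notGrounded",
          "grounded", "grounded", "notSure"]

counts = Counter(labels)
total = len(labels)
rate = counts["grounded"] / total if total else 0.0

print(f"grounded={counts['grounded']} notGrounded={counts['notGrounded']} "
      f"notSure={counts['notSure']}")
print(f"Groundedness rate: {rate:.2%}")  # 4 / 7, i.e. 57.14%
```

Reporting the uncertain count separately avoids folding it silently into the not-grounded total.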

Multi-domain Testing

Conduct multi-domain validations to illustrate how Upstage handles groundedness across different subject areas:

domains = {
    "Science": {
        "context": "Photosynthesis is the process by which plants convert sunlight, carbon dioxide, & water into glucose and oxygen.",
        "answer": "Plants use photosynthesis to make food from sunlight and CO2."
    },
    "History": {
        "context": "World War II ended in 1945 after the surrender of Japan following the atomic bombings.",
        "answer": "WWII ended in 1944 with Germany's surrender."
    },
    "Geography": {
        "context": "Mount Everest is the highest mountain on Earth, located in the Himalayas at 8,848.86 meters.",
        "answer": "Mount Everest is the tallest mountain and is located in the Himalayas."
    }
}
for domain, test_case in domains.items():
    result = checker.check_single(test_case["context"], test_case["answer"])
    print(f"{domain}: {result['grounded']} (confidence: {result['confidence']})")
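
When each domain has more than one test case, a per-domain tally makes weak areas easier to spot. The sketch below groups hypothetical (domain, verdict) records; the records and the `by_domain` structure are illustrative assumptions, not output from the service.

```python
from collections import defaultdict

# Hypothetical records: (domain, verdict) pairs as a checker might produce them.
records = [
    ("Science", "grounded"),
    ("History", "notGrounded"),
    ("Geography", "grounded"),
    ("History", "grounded"),
]

by_domain = defaultdict(lambda: {"total": 0, "grounded": 0})
for domain, verdict in records:
    by_domain[domain]["total"] += 1
    by_domain[domain]["grounded"] += verdict == "grounded"  # bool adds as 0 or 1

for domain, stats in sorted(by_domain.items()):
    rate = stats["grounded"] / stats["total"]
    print(f"{domain:<10} {stats['grounded']}/{stats['total']}  {rate:.0%}")
```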

Creating a Test Report

The following function builds a test report that combines the summary statistics, the detailed results, and simple recommendations based on the groundedness rate:

def create_test_report(checker_instance):
    report = {
        "summary": checker_instance.analyze_results(),
        "detailed_results": checker_instance.results,
        "recommendations": []
    }
    accuracy = report["summary"]["accuracy_rate"]
    if accuracy < 0.7:
        report["recommendations"].append("Consider reviewing answer generation process")
    if accuracy > 0.9:
        report["recommendations"].append("High accuracy - system performing well")
    return report
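
A report built this way is a plain dictionary, so it can be serialized for logging or later comparison. The sketch below uses a hypothetical report with the same shape as the one returned by create_test_report and round-trips it through JSON; the sample values are invented for illustration.

```python
import json

# Hypothetical report with the same shape as the one built by create_test_report.
report = {
    "summary": {"total_checks": 2, "grounded_count": 1,
                "not_grounded_count": 1, "accuracy_rate": 0.5},
    "detailed_results": [
        {"context": "Earth has one Moon.", "answer": "Earth has one Moon.",
         "grounded": "grounded", "confidence": "high"},
        {"context": "Earth has one Moon.", "answer": "Earth has two moons.",
         "grounded": "notGrounded", "confidence": "low"},
    ],
    "recommendations": ["Consider reviewing answer generation process"],
}

# default=str is a safety net for any value json cannot encode natively.
text = json.dumps(report, indent=2, ensure_ascii=False, default=str)
restored = json.loads(text)
print(json.dumps(restored["summary"]))
```

Writing `text` to a file after each run gives a simple history of how the groundedness rate changes over time.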

Conclusion

This tutorial covered single groundedness checks, batch processing, multi-domain testing, results analysis, and a reusable wrapper class. Upstage’s Groundedness Check gives developers a domain-agnostic way to verify whether an answer is supported by its context, and the wrapper adds a coarse confidence label to each verdict. Integrating the service into a workflow helps organizations improve the reliability of AI-generated outputs and maintain standards of factual integrity across applications. For further exploration, see the Upstage website for more resources and documentation.

FAQ

  • What is the purpose of the Groundedness Check service? The Groundedness Check service verifies whether an AI-generated response is supported by the context supplied with it.
  • Who can benefit from this tool? AI developers, data scientists, and business managers who need to ensure the accuracy of AI outputs.
  • How does batch processing work? The batch_check method accepts a list of context-answer pairs and checks them one after another, so a whole test set can be verified with a single call.
  • What should I do if the accuracy rate is low? If the rate is below 70%, review the answer generation process.
  • Can this tool be used across different domains? Yes, the tool is designed to run groundedness checks across various subject areas.

Vladimir Dyachkov, Ph.D
Editor-in-Chief itinai.com
