Veriff: Streamlining Model Deployment with Amazon SageMaker
Veriff is an identity verification platform trusted by leading organizations in finance, gaming, and more. They combine AI-powered automation with human expertise to ensure trust in user identities throughout the customer journey.
Infrastructure and Development Challenges
Veriff faced challenges deploying and managing their machine learning (ML) models, which ranged from lightweight models to complex computer vision models. Their existing solution required manually provisioning GPU instances and building a REST API wrapper for each model, resulting in operational overhead and a suboptimal cost profile.
Solution Overview
To address these challenges, Veriff adopted Amazon SageMaker multi-model endpoints (MMEs) together with NVIDIA’s Triton Inference Server. MMEs let them host a large number of models on shared infrastructure, reducing hosting costs and deployment overhead, while Triton Inference Server removed the need to hand-build a REST API wrapper for each model and made it possible to deploy model ensembles.
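The case study does not show Veriff’s exact setup, but a Triton-backed multi-model endpoint can be created along these lines with boto3; the image URI, IAM role, S3 prefix, instance type, and resource names below are placeholders rather than Veriff’s actual configuration.

```python
import boto3

sm = boto3.client("sagemaker")

# Placeholders: substitute your own IAM role, the region-specific SageMaker
# Triton image, and the S3 prefix that holds your packaged models.
role_arn = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
triton_image = "<account>.dkr.ecr.<region>.amazonaws.com/sagemaker-tritonserver:<tag>"
model_data_prefix = "s3://example-bucket/triton-models/"  # contains many model.tar.gz packages

# One SageMaker model definition backed by the Triton container, in multi-model mode.
sm.create_model(
    ModelName="triton-mme",
    ExecutionRoleArn=role_arn,
    PrimaryContainer={
        "Image": triton_image,
        "Mode": "MultiModel",            # serve many models from the same container
        "ModelDataUrl": model_data_prefix,
    },
)

# A single GPU-backed variant hosts all models behind one endpoint.
sm.create_endpoint_config(
    EndpointConfigName="triton-mme-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "triton-mme",
        "InstanceType": "ml.g4dn.xlarge",
        "InitialInstanceCount": 1,
    }],
)

sm.create_endpoint(EndpointName="triton-mme", EndpointConfigName="triton-mme-config")
```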
Model Versioning and Continuous Deployment
Veriff manages its models in a monorepo, using Pants as the build system and enforcing code quality checks and unit tests. The monorepo is wired into a continuous integration (CI) pipeline that automates packaging and deployment, ensuring model quality and consistent versioning.
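As a rough illustration of what a CI-driven deployment step can look like with MMEs, the sketch below uploads a packaged Triton model to the endpoint’s S3 prefix and then invokes it by name; the bucket, package name, and input tensor are hypothetical and depend on each model’s Triton configuration.

```python
import json
import boto3

s3 = boto3.client("s3")
runtime = boto3.client("sagemaker-runtime")

# With MMEs, shipping a new model (or version) amounts to uploading a packaged
# Triton model repository under the endpoint's S3 prefix. Names are placeholders.
package = "face-detector-v2.tar.gz"
s3.upload_file(
    Filename=f"dist/{package}",
    Bucket="example-bucket",
    Key=f"triton-models/{package}",
)

# SageMaker loads the model into the Triton container on first invocation.
payload = {
    "inputs": [{
        "name": "INPUT__0",        # input name/shape/type come from the model's config.pbtxt
        "shape": [1, 3],
        "datatype": "FP32",
        "data": [0.1, 0.2, 0.3],
    }]
}
response = runtime.invoke_endpoint(
    EndpointName="triton-mme",
    TargetModel=package,           # selects which packaged model the MME should serve
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(response["Body"].read())
```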
Cost and Deployment Speed Benefits
By leveraging SageMaker MMEs, Veriff reduced model deployment time from 10 days to an average of 2 days and cut GPU model-serving costs by 75% compared to their previous Kubernetes-based solution. SageMaker’s auto scaling features let them match capacity to traffic patterns and optimize costs further.
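SageMaker endpoint auto scaling of this kind is configured through Application Auto Scaling; a minimal sketch follows, with the capacity bounds and invocation target chosen purely for illustration rather than taken from Veriff’s setup.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Matches the endpoint and variant created earlier (placeholder names).
resource_id = "endpoint/triton-mme/variant/AllTraffic"

# Allow SageMaker to scale this variant between 1 and 4 instances.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Target-tracking policy on invocations per instance; thresholds are illustrative.
autoscaling.put_scaling_policy(
    PolicyName="triton-mme-invocations-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```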
Conclusion
Veriff’s adoption of Amazon SageMaker MMEs streamlined their model deployment workflow, reducing costs, improving efficiency, and maintaining performance. Their CI/CD pipeline and model versioning mechanism serve as a reference implementation for combining software development best practices with SageMaker MMEs.
Unlock the Power of AI for Your Business
Discover how AI can transform your company and keep it competitive. Identify automation opportunities, define measurable KPIs, select the right AI solution, and implement gradually. For advice on AI KPI management, contact us at hello@itinai.com. Stay updated on AI insights through our Telegram channel t.me/itinainews and on Twitter @itinaicom.
Spotlight on a Practical AI Solution: AI Sales Bot
Explore itinai.com/aisalesbot, an AI-powered sales bot designed to automate customer engagement and manage interactions throughout the customer journey. Discover how AI can redefine your sales processes and customer engagement.