How to Keep Foundation Models Up to Date with the Latest Data? Researchers from Apple and CMU Introduce the First Web-Scale Time-Continual (TiC) Benchmark with 12.7B Timestamped Img-Text Pairs for Continual Training of VLMs

Researchers from Apple and Carnegie Mellon University have developed a benchmark called TiC-DataComp for continually training foundation models such as OpenAI’s CLIP. They found that starting training from the most recent checkpoint and replaying historical data delivers performance on par with an Oracle while being 2.7 times more computationally efficient. The findings highlight the need to train foundation models continually so that they adapt to shifting data distributions. The benchmark and research will be made public for wider use.

In the world of AI, there has been a paradigm shift in multimodal learning thanks to models like CLIP, Flamingo, and Stable Diffusion. These models have greatly improved image generation and zero-shot generalization. However, there is a challenge when it comes to keeping these models up to date with the latest data.

Researchers from Apple and Carnegie Mellon University compared the robustness of OpenAI’s CLIP models with that of models from the OpenCLIP repository. They found that while the OpenCLIP models maintain their performance, the OpenAI models perform worse at retrieval on recent data than on older data. This gap highlights the need for models to adapt as data distributions shift.

One way to accommodate changing data is to train a new CLIP model from scratch whenever fresh image-text data becomes available. However, this approach is impractical because each retraining is expensive and time-consuming. Recent efforts have focused on continual learning techniques, but current benchmarks are limited in scope.

To address this issue, the researchers have introduced the Time-Continual (TiC) benchmark for continually training CLIP models. They added “crawl time” information to the CommonPool dataset and curated new datasets from the internet. They also tested a range of continual learning approaches, including replay buffers and learning rate schedules.
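To make the idea of a time-continual benchmark concrete, here is a minimal sketch of how timestamped image-text pairs could be grouped into ordered training steps. The `Pair` class, the `split_by_year` helper, and the choice of one step per calendar year are all assumptions made for illustration; they are not the benchmark's actual data format or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Pair:
    """Hypothetical record for one image-text pair with its crawl time."""
    url: str
    caption: str
    crawl_date: date


def split_by_year(pairs):
    """Group pairs by crawl year and return (year, pairs) tuples in time order.

    Yearly granularity is an assumption; any time unit would work the same way.
    """
    buckets = defaultdict(list)
    for pair in pairs:
        buckets[pair.crawl_date.year].append(pair)
    return [(year, buckets[year]) for year in sorted(buckets)]


if __name__ == "__main__":
    demo = [
        Pair("img1.jpg", "a red bicycle", date(2016, 5, 1)),
        Pair("img2.jpg", "a city skyline", date(2014, 1, 1)),
        Pair("img3.jpg", "a plate of pasta", date(2016, 1, 2)),
    ]
    for year, step in split_by_year(demo):
        print(year, [p.url for p in step])
```

A model trained continually would then see these steps one after another, oldest first, rather than as a single shuffled pool.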

The researchers found that the cumulative technique, which starts training from the most recent checkpoint and replays all historical data, delivers performance on par with an Oracle (a model retrained from scratch on all the data) while being 2.7 times more computationally efficient. They also gained insights into learning rate schedules and buffer sizes for optimal performance.
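The warm-start-plus-replay idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers' implementation: `build_training_pool`, `continual_train`, and the `buffer_size` parameter are hypothetical names, and `train_fn` stands in for a real CLIP training run. With `buffer_size=None` every earlier sample is replayed, matching the cumulative setting described above; an integer caps how many old samples are replayed.

```python
import random


def build_training_pool(steps_so_far, buffer_size=None, seed=0):
    """Combine the newest step's data with replayed data from earlier steps.

    steps_so_far must contain at least one step; the last one is the newest.
    """
    *old_steps, current = steps_so_far
    old = [sample for step in old_steps for sample in step]
    if buffer_size is not None and len(old) > buffer_size:
        # Uniform subsampling of old data is an assumed replay policy.
        old = random.Random(seed).sample(old, buffer_size)
    return old + list(current)


def continual_train(steps, train_fn, init_state, buffer_size=None):
    """Train step by step, always resuming from the latest checkpoint."""
    state = init_state
    checkpoints = []
    for t in range(1, len(steps) + 1):
        pool = build_training_pool(steps[:t], buffer_size)
        state = train_fn(state, pool)  # warm start: no reset to init_state
        checkpoints.append(state)
    return checkpoints


if __name__ == "__main__":
    # Toy stand-in for training: the "model state" just counts samples seen.
    def count_samples(state, pool):
        return state + len(pool)

    toy_steps = [["a", "b"], ["c"], ["d", "e", "f"]]
    print(continual_train(toy_steps, count_samples, init_state=0))
    print(continual_train(toy_steps, count_samples, init_state=0, buffer_size=1))
```

By contrast, a from-scratch baseline would reset `state` to `init_state` at every step, which is the extra cost that resuming from the latest checkpoint avoids.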

The code and timing data collected will be made public for the wider community to use. This research paves the way for continuous training of foundation models.
