Meet CommonCanvas: An Open Diffusion Model That Has Been Trained Using Creative-Commons Images

Researchers have proposed building an image dataset under a Creative Commons license to overcome obstacles in text-to-image generation. They have used transfer learning to generate captions for CC photos and created a dataset called CommonCatalog to train Latent Diffusion Models (LDM). The CommonCanvas models perform competitively compared to the SD2-base baseline. The team has made their models and dataset freely available on GitHub.

Artificial intelligence has made significant advancements in text-to-image generation. This technology has various practical applications, from content creation and storytelling to assisting the visually impaired. However, progress has been held back by a lack of high-quality data and by copyright concerns surrounding internet-scraped datasets.

Obstacle 1: Absence of Captions

Although high-resolution Creative Commons (CC) photos are open-licensed, they often lack the textual descriptions (captions) needed to train text-to-image generative models. Without image-caption pairs, a model cannot learn to produce visuals from textual input.

Obstacle 2: Scarcity of CC Photos

CC photos are a valuable resource, but they are far scarcer than the images in larger proprietary datasets. This scarcity raises the question of whether there is enough data to train high-quality models.

To address the missing captions, a team of researchers has used transfer learning to create synthetic captions for CC photos. They have compiled a dataset of photos paired with these synthetic captions, which can be used to train generative models that turn text into images.
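
The captioning step described above can be pictured as a small pipeline: take each uncaptioned CC photo, run it through a pretrained captioning model, and store the photo together with the generated caption. The following is only a minimal sketch of that idea, not the team's actual code. The `CaptionedImage` record, the `build_captioned_dataset` function, and the `stub_captioner` placeholder are all hypothetical names introduced for illustration; in practice the captioner would wrap a real pretrained image-captioning model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class CaptionedImage:
    """One training example: a CC photo plus its synthetic caption (hypothetical record)."""
    path: str      # location of the CC-licensed photo
    license: str   # license tag carried along with the photo, e.g. "CC-BY"
    caption: str   # caption generated by the captioning model


def build_captioned_dataset(
    images: Iterable[Tuple[str, str]],
    captioner: Callable[[str], str],
) -> List[CaptionedImage]:
    """Pair each (path, license) record with a caption from `captioner`."""
    dataset = []
    for path, license_name in images:
        caption = captioner(path).strip()
        if caption:  # skip photos for which no usable caption was produced
            dataset.append(CaptionedImage(path, license_name, caption))
    return dataset


def stub_captioner(path: str) -> str:
    # Placeholder standing in for a pretrained captioning model (assumption).
    return f"a photo from {path}"


records = [("a.jpg", "CC-BY"), ("b.jpg", "CC0")]
dataset = build_captioned_dataset(records, stub_captioner)
```

Keeping the captioner as a plain function argument means the same loop works whichever pretrained model supplies the captions.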

The team has also developed a compute- and data-efficient training recipe to tackle the scarcity of data. They report that as little as 3% of the data used to train previous models is enough to reach comparable quality, which suggests that there are enough CC photos available to train high-quality models.

The team has trained several text-to-image models using this data and training recipe. Together, these models form the CommonCanvas family, which generates images of comparable quality to existing models.

The largest model in the CommonCanvas family, trained on a CC dataset less than 3% the size of the LAION dataset, performs similarly to existing models in human evaluations. Despite the smaller dataset and the use of synthetic captions, the method produces high-quality results.

The team’s primary contributions are:

  • Using transfer learning to produce captions for CC photos that originally had none
  • Providing the CommonCatalog dataset, which includes around 70 million CC photos released under an open license
  • Training a series of Latent Diffusion Models (LDM) using the CommonCatalog dataset, collectively known as CommonCanvas, which perform competitively compared to existing models
  • Applying training optimizations that make the models train almost three times faster
  • Making the trained CommonCanvas models, CC photos, synthetic captions, and the CommonCatalog dataset freely available on GitHub to encourage collaboration and further research
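
To put the two figures quoted in this article side by side, a quick back-of-the-envelope calculation helps: CommonCatalog holds around 70 million photos, and that is described as less than 3% of the reference dataset. The sketch below only rearranges those two numbers; the function name `implied_reference_size` is made up for illustration, and the result is an implied lower bound, not a reported dataset size.

```python
def implied_reference_size(subset_size: int, fraction: float) -> float:
    """Size a reference dataset must have for `subset_size` to equal `fraction` of it."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return subset_size / fraction


common_catalog_images = 70_000_000  # "around 70 million" CC photos, per the article
share = 0.03                        # the "3%" figure quoted in the article

reference = implied_reference_size(common_catalog_images, share)
print(f"{reference / 1e9:.2f} billion")  # prints "2.33 billion"
```

In other words, for 70 million photos to be under 3% of the reference data, that reference set would have to contain more than roughly 2.33 billion images.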

For more information, check out the original post on MarkTechPost.
