Large language models (LLMs) like GPT-4 and PaLM 2 struggle with mathematical problem-solving, which demands imagination, reasoning, and computation. However, when allowed multiple attempts, LLMs show potential for improvement. Fine-tuning techniques such as supervised step-by-step solution fine-tuning, solution-cluster reranking, and sequential multi-tasking fine-tuning can enhance LLMs’ ability to generate and evaluate solutions. The…
The Danish juice-bar chain Joe & The Juice has expanded to over 250 European locations and is now making its mark in the US and the Middle East. It turned to Pixis, an AI solution, to streamline its marketing efforts and target audiences effectively. By leveraging Pixis’ AI infrastructure, Joe & The Juice achieved a 14%…
A recent collaborative study by IBM Research, Princeton University, and Virginia Tech highlights the security risks associated with fine-tuning large language models (LLMs). The research reveals that even a small number of harmful entries in a seemingly benign dataset can compromise the security of LLMs. The study emphasizes the need for developers to balance customization…
According to Bill Gates, Generative AI like ChatGPT has reached its peak and may not see significant improvements, even with the release of GPT-5. However, Gates acknowledges that he could be wrong. He believes AI will become more accurate and affordable in the next 2-5 years, benefiting people in poor countries. Gates is particularly excited…
Generative models are advancing rapidly in the field of Artificial Intelligence (AI). Intelligent interaction with the physical environment requires planning at both low and high levels. A research team from Google DeepMind, MIT, and UC Berkeley has proposed Video Language Planning (VLP), which combines text-to-video and vision-language models. VLP aims to facilitate visual planning…
Researchers have developed a few-shot tuning framework called LAMP for text-to-video (T2V) generation. Existing T2V methods either require extensive training data or produce results that closely follow template videos. LAMP addresses this challenge with a few-shot approach that lets a text-to-image diffusion model learn motion patterns. It significantly improves video quality and generation freedom. LAMP…
Woodpecker is a new approach that aims to fix hallucinations in Multimodal Large Language Models (MLLMs), such as GPT-4V. By connecting the MLLM to the internet, Woodpecker lets the model validate its generated descriptions against relevant internet data, enabling self-correction. It builds a visual knowledge base from the image and uses it to…
This article discusses the evolution of Data/ML platforms and their support for complex MLOps practices. It explains how data infrastructures have evolved from simple systems like online services and OLTP/OLAP databases to more sophisticated setups like data lakes and real-time data/ML infrastructures. The challenges and solutions at each stage are described, as well as the…
Entropy regularization is a technique used in reinforcement learning (RL) to encourage exploration. By adding an entropy bonus to the reward function, RL algorithms strive to maximize the entropy or randomness of the actions taken. This helps to explore new possibilities and avoid premature convergence to suboptimal actions. Entropy regularization offers benefits such as improved…
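The entropy bonus described above can be sketched in a few lines. This is a minimal illustration, not any specific RL library's API: `beta` is an assumed hyperparameter weighting the bonus, and the policy is a simple softmax over action logits. A more uniform (higher-entropy) policy earns a larger bonus, which is what discourages premature convergence.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over action logits.
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def entropy(probs):
    # H(pi) = -sum_a pi(a) * log pi(a); small epsilon avoids log(0).
    return -np.sum(probs * np.log(probs + 1e-12))

def regularized_objective(reward, probs, beta=0.01):
    # Entropy-regularized objective: reward plus a scaled entropy bonus.
    # beta trades off exploration (high entropy) vs. exploitation.
    return reward + beta * entropy(probs)

# A peaked policy earns a smaller bonus than a uniform one.
peaked = softmax(np.array([5.0, 0.0, 0.0]))
uniform = softmax(np.array([0.0, 0.0, 0.0]))
assert entropy(uniform) > entropy(peaked)
```

In practice the bonus is usually added to the policy-gradient loss rather than the raw reward, but the effect is the same: actions stay stochastic until the reward signal clearly favors one.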
CLIN (Continually Learning Language Agent) is an innovative architecture that allows language agents to adapt and improve their performance over time. It introduces a dynamic textual memory system that focuses on causal abstractions and enables the agent to learn and refine its performance. CLIN exhibits rapid adaptation and efficient generalization across diverse tasks and environments,…
Researchers from Google Research and the University of Toronto have developed a zero-shot agent for autonomous learning and task execution in live computer environments. The agent, built on top of PaLM 2, a large language model, uses a single set of instruction prompts for all activities and demonstrates high task-completion rates on the MiniWoB++ benchmark.…
A new technique called Meta-learning for Compositionality improves the capability of tools like ChatGPT to make compositional generalizations. It surpasses current methods and even matches or exceeds human performance in some cases.
This post describes the implementation of text-to-image search and image-to-image search using a pre-trained model called uform, which is inspired by Contrastive Language Image Pre-Training (CLIP). The post provides code snippets for implementing these search functions and explains how cosine similarity is used to calculate similarity between text and images. The results of the searches…
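The cosine-similarity ranking at the heart of such a search can be sketched as follows. This is a generic illustration, not the post's actual uform code: the embeddings here are random stand-ins, whereas in the post they would come from the model's text and image encoders.

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||); 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_emb, image_embs, top_k=3):
    # Rank gallery embeddings by cosine similarity to the query embedding.
    scores = [cosine_similarity(query_emb, e) for e in image_embs]
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), scores[i]) for i in order]

rng = np.random.default_rng(0)
query = rng.normal(size=256)          # stand-in for a text embedding
gallery = rng.normal(size=(10, 256))  # stand-ins for image embeddings
print(search(query, gallery))
```

Because CLIP-style models place text and images in a shared embedding space, the same `search` function serves both text-to-image and image-to-image queries; only the encoder producing `query_emb` changes.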
Nvidia has been instructed by the US government to halt its sales of AI computer chips to China. The ban, which was expected in November, will take immediate effect. Nvidia, however, claims that it does not anticipate a significant impact on its financial results due to the strong global demand for its products. The new…
The Internet Watch Foundation (IWF) has warned of the alarming rate at which AI is being used to create child sexual abuse images, posing a significant threat to internet safety. The UK-based watchdog has identified nearly 3,000 AI-generated images violating UK laws, including images of actual abuse victims and underage celebrities. The use of AI…
OmniMotion, a new motion estimation method, extracts long-term motion trajectories for every pixel, even in fast movements and complex scenes. The article explores this exciting technology and discusses the future of motion analysis.
Anthropic, Google, Microsoft, and OpenAI have established the Frontier Model Forum, with goals to set AI safety standards, evaluate frontier models, and ensure responsible development. Chris Meserole, the former Director of the Artificial Intelligence and Emerging Technology Initiative at the Brookings Institution, has been appointed as the Executive Director. The Forum aims to advance AI…
Pre-trained language models (PLMs) have transformed Natural Language Processing, but their computational and memory needs pose challenges. The authors propose LoftQ, a quantization framework for pre-trained models that combines low-rank approximation and quantization to approximate the original high-precision weights. Results show LoftQ outperforms QLoRA across tasks, with improved ROUGE-1 scores on XSum and CNN/DailyMail using 4-bit quantization. Further…
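The idea of combining quantization with a low-rank correction can be sketched as an alternating loop: quantize the weights, then fit a low-rank matrix to whatever the quantizer missed. This is a toy sketch with an assumed uniform quantizer (the paper uses NF4-style quantization) and NumPy SVD, not the actual LoftQ implementation; `rank`, `bits`, and `iters` are illustrative parameters.

```python
import numpy as np

def quantize(W, bits=4):
    # Toy uniform quantizer: snap each weight to one of 2^bits levels.
    lo, hi = W.min(), W.max()
    scale = (hi - lo) / (2 ** bits - 1)
    return np.round((W - lo) / scale) * scale + lo

def svd_lowrank(W, rank):
    # Best rank-r approximation of W via truncated SVD.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vt[:rank]

def loftq_init(W, rank=8, bits=4, iters=5):
    # Alternate: quantize the residual, then low-rank-fit the quantization error,
    # so that Q + A approximates the original high-precision W.
    A = np.zeros_like(W)
    for _ in range(iters):
        Q = quantize(W - A, bits)
        A = svd_lowrank(W - Q, rank)
    return Q, A

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
Q, A = loftq_init(W)
err_plain = np.linalg.norm(W - quantize(W))   # quantization alone
err_loftq = np.linalg.norm(W - (Q + A))       # quantization + low-rank fix
```

The low-rank term absorbs part of the quantization error, so `Q + A` tracks `W` more closely than `quantize(W)` alone; in LoftQ, the low-rank factors then double as the LoRA adapters for fine-tuning.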
YouTube Music has introduced a new feature that enables users to create custom cover art for their playlists using AI. Users can select from different categories, such as animals and nature, and ask the AI to create artwork based on specific prompts. The feature is currently only available to users in the US, but YouTube…
Generative AI systems are becoming more common and are being used in various fields. There is a growing need to assess the potential risks associated with their use, particularly in terms of public safety. Google DeepMind researchers have developed a framework to evaluate social and ethical hazards of AI systems. This framework considers the system’s…