Google AR & VR and the University of Central Florida collaborated on a study validating VALID, a virtual avatar library of 210 fully rigged avatars representing seven races. The study, which drew on a global participant pool, found consistent recognition for some races along with own-race bias effects. The team highlighted implications for virtual avatar applications and…
The study introduces “Vary,” a method to expand the vision vocabulary in Large Vision-Language Models (LVLMs) for enhanced perception tasks. This method aims to improve fine-grained perception, particularly in document-level OCR and chart understanding. Experimental results demonstrate Vary’s effectiveness, outperforming other LVLMs in certain tasks. For more information, visit the Paper and Project.
Meta’s introduction of Emu as a generative AI for movies signifies a pivotal moment where technology and culture merge. Emu promises to revolutionize access to information and entertainment, offering unprecedented personalization. However, the potential drawbacks of oversimplification and reinforcement of biases call for a vigilant and balanced approach to utilizing this powerful tool.
LLM360 is a groundbreaking initiative promoting comprehensive open-sourcing of Large Language Models. It releases two 7B parameter LLMs, AMBER and CRYSTALCODER, with full training code, data, model checkpoints, and analyses. The project aims to enhance transparency and reproducibility in the field by making the entire LLM training process openly available to the community.
Numerical simulations used for climate policy face limitations in accurately representing cloud physics and heavy precipitation due to computational constraints. Integrating machine learning (ML) can potentially enhance climate simulations by effectively modeling small-scale physics. Challenges include obtaining sufficient training data and addressing code complexity. ClimSim, a comprehensive dataset, aims to bridge this gap by facilitating…
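To make the idea concrete, here is a minimal sketch of the kind of emulator such a dataset enables: a small neural network trained to map coarse-grid atmospheric state to subgrid tendencies. The shapes and variable names below are stand-ins, not the actual ClimSim schema.

```python
# Hypothetical sketch: learning small-scale physics from paired coarse-grid
# inputs and high-resolution tendencies, as datasets like ClimSim enable.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 124))   # per-column atmospheric state (stand-in)
y = rng.normal(size=(10_000, 128))   # subgrid tendencies to emulate (stand-in)

emulator = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=20)
emulator.fit(X, y)                   # learn the small-scale physics mapping
pred = emulator.predict(X[:8])       # stands in for the costly parameterization
```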
The MIT-Pillar AI Collective has selected three fellows for fall 2023. They are pursuing research in AI, machine learning, and data science, with the goal of commercializing their innovations. The Fellows include Alexander Andonian, Daniel Magley, and Madhumitha Ravichandra, each working on innovative projects in their respective fields as part of the program’s mission to…
LLMLingua is a novel compression technique from Microsoft AI that addresses the challenge of processing lengthy prompts for Large Language Models (LLMs). It combines strategies such as a dynamic budget controller, token-level iterative compression, and an instruction-tuning-based method to significantly reduce prompt sizes, proving both effective and affordable for LLM applications. For more details, refer…
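For a sense of how this is used in practice, here is a hedged sketch based on the open-source `llmlingua` package; the argument names follow its published examples and may differ across versions, and `context_documents` is an assumed variable.

```python
# Hedged sketch using the `llmlingua` package; check the repo for the
# current API before relying on these signatures.
from llmlingua import PromptCompressor

compressor = PromptCompressor()       # defaults to a small LM for token scoring
context_documents = ["...long retrieved passages..."]  # assumed long context

result = compressor.compress_prompt(
    context_documents,
    instruction="Answer from the context.",
    question="What were Q3 revenues?",
    target_token=200,                 # budget for the compressed prompt
)
print(result["compressed_prompt"])    # send this shorter prompt to the LLM
```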
Differential privacy (DP) in machine learning safeguards individuals’ data privacy by ensuring that model outputs are not unduly influenced by any single individual’s data. Google researchers introduced an auditing scheme for assessing privacy guarantees, emphasizing the connection between DP and statistical generalization. The scheme offers quantifiable privacy guarantees at reduced computational cost and is suitable for a range of DP algorithms.
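As a toy illustration of what auditing means here (not Google’s scheme), one can empirically lower-bound epsilon by running a mechanism on two neighboring datasets and measuring how well a simple attack distinguishes the outputs:

```python
# Illustrative-only audit of an epsilon-DP Laplace mechanism on a bounded
# sum; a real auditing scheme would add confidence intervals and stronger
# attacks.
import numpy as np

rng = np.random.default_rng(0)

def mechanism(data, eps=1.0, sensitivity=1.0):
    # epsilon-DP Laplace mechanism on a bounded-sum query
    return data.sum() + rng.laplace(scale=sensitivity / eps)

D = rng.uniform(0, 1, size=100)          # records bounded in [0, 1]
Dp = D.copy(); Dp[0] = 0.0               # neighboring dataset: one record changed

outs_D  = np.array([mechanism(D)  for _ in range(20_000)])
outs_Dp = np.array([mechanism(Dp) for _ in range(20_000)])

# Distinguishing attack: threshold halfway between the two true sums.
thresh = (D.sum() + Dp.sum()) / 2
tpr = (outs_D  > thresh).mean()          # attack fires on the larger-sum dataset
fpr = (outs_Dp > thresh).mean()
print("empirical eps lower bound ~", np.log(tpr / fpr))  # should stay below 1.0
```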
MIT researchers have found that modern computational models derived from machine learning are approaching the goal of mimicking the human auditory system. The study, led by Josh McDermott, emphasizes the importance of training these models with auditory input, including background noise, to closely match the activation patterns of the human auditory cortex. The research aims…
Oxford University encourages Economics and Management students to use AI tools like ChatGPT for essay drafting, emphasizing the need for critical thinking and fact-checking. Educators express concerns about AI’s potential influence and students’ tendency to use it regardless of guidelines. The university cautiously embraces AI, recognizing its growing relevance while also setting clear boundaries for…
Mistral AI introduces the Mixtral 8x7b language model, revolutionizing the domain with its unique architecture featuring a sparse Mixture of Experts (MoE) layer. With eight expert networks within a single framework, it demonstrates exceptional performance and a remarkable context capacity of 32,000 tokens. Mixtral 8x7b’s versatile multilingual fluency, extensive parameter count, and performance across diverse…
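To illustrate the architecture, below is a toy top-2 sparse MoE layer in PyTorch; it mirrors the routing idea described above but is not Mistral’s implementation, and all sizes are illustrative.

```python
# Toy top-2 sparse Mixture-of-Experts layer: each token is routed to 2 of
# 8 expert MLPs, so only a fraction of parameters is active per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim=512, n_experts=8, hidden=2048, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)          # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                # x: (tokens, dim)
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)   # pick 2 of 8 experts
        weights = F.softmax(weights, dim=-1)             # renormalize over chosen
        out = torch.zeros_like(x)
        for k in range(self.top_k):                      # only chosen experts run
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

moe = SparseMoE()
print(moe(torch.randn(16, 512)).shape)  # torch.Size([16, 512])
```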
Large Language Models (LLMs) are powerful at language tasks but depend on scarce, costly high-quality human data. A study proposes a self-training technique, ReST^EM, that uses model-generated synthetic data to enhance language models’ performance. ReST^EM significantly improves math and code-generation skills, surpassing the effectiveness of human-provided data, but risks overfitting after multiple cycles. The study is credited…
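The loop itself is simple to sketch. The schematic below shows the expectation-maximization flavor of self-training described above, with the model, verifier, and fine-tuning step stubbed out; none of these stubs is the paper’s actual code.

```python
# Schematic ReST-style self-training loop. In practice the stubs would wrap
# a real LLM, a math/code verifier, and a training job.
import random

def generate(model, problem, n):          # stub: sample n candidate solutions
    return [f"{problem}-sol{i}" for i in range(n)]

def is_correct(problem, solution):        # stub: binary reward (e.g. unit tests)
    return random.random() < 0.2

def finetune(base_model, dataset):        # stub: supervised fine-tune on pairs
    return f"{base_model}+ft({len(dataset)} examples)"

def rest_em(base_model, problems, n_iters=3, n_samples=32):
    model = base_model
    for _ in range(n_iters):                        # outer EM iterations
        dataset = [(p, s) for p in problems
                   for s in generate(model, p, n_samples)
                   if is_correct(p, s)]             # E-step: keep verified samples
        model = finetune(base_model, dataset)       # M-step: retrain from the base
    return model                                    # fresh starts limit overfitting

print(rest_em("base-llm", ["prob1", "prob2"]))
```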
Google recently unveiled Duet AI for Developers, an AI-powered coding tool, and AI Studio for Gemini API development. Duet AI streamlines coding and integrates with Google’s services, facilitating a smoother coding experience. Additionally, AI Studio offers a user-friendly platform for developing apps and chatbots with Gemini model APIs. Both tools demonstrate Google’s commitment to AI…
This post outlines a solution for using Amazon Transcribe and Amazon Bedrock to automatically generate concise summaries of video or audio recordings. By leveraging a combination of speech-to-text capability and generative AI models, the solution aims to simplify and automate the note-taking process, enhancing collaboration and saving time. The post provides instructions for deploying, running,…
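A hedged sketch of that flow with boto3 is below; bucket names, job names, and the model ID are placeholders, and the polling and transcript-parsing glue is elided.

```python
# Transcribe a recording with Amazon Transcribe, then summarize the
# transcript via Amazon Bedrock. All resource names are placeholders.
import json
import boto3

transcribe = boto3.client("transcribe")
transcribe.start_transcription_job(
    TranscriptionJobName="meeting-notes-demo",
    Media={"MediaFileUri": "s3://my-bucket/recording.mp4"},  # placeholder URI
    MediaFormat="mp4",
    LanguageCode="en-US",
    OutputBucketName="my-bucket",
)
# ...poll get_transcription_job until COMPLETE, then load the transcript JSON...
transcript_text = "..."  # text extracted from the Transcribe output file

bedrock = boto3.client("bedrock-runtime")
resp = bedrock.invoke_model(
    modelId="anthropic.claude-v2",  # any Bedrock text model could be used
    body=json.dumps({
        "prompt": f"\n\nHuman: Summarize this meeting:\n{transcript_text}\n\nAssistant:",
        "max_tokens_to_sample": 500,
    }),
)
print(json.loads(resp["body"].read())["completion"])
```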
This post showcases fine-tuning a large language model (LLM) using Parameter-Efficient Fine-Tuning (PEFT) and deploying the fine-tuned model on AWS Inferentia2. It discusses using the AWS Neuron SDK to access the device and deploying the model with DJLServing. It also details the necessary steps, including prerequisites, a walkthrough for fine-tuning the LLM, and hosting it…
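The fine-tuning half can be sketched with the Hugging Face `peft` library; the model name and target modules below are illustrative, and the Inferentia2/DJLServing deployment step is not shown.

```python
# Minimal LoRA-style PEFT sketch: freeze the base model and train small
# low-rank adapters on a subset of projection layers.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # illustrative
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # adapt only attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)        # freezes base weights, adds adapters
model.print_trainable_parameters()         # typically <1% of parameters train
# ...train with transformers.Trainer, then model.save_pretrained("adapter/")...
```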
This post describes the importance of Machine Learning Operations (MLOps) for integrating ML models into production systems. It explains Amazon SageMaker MLOps features such as Projects, Pipelines, and the Model Registry, then details how to create a custom project template for CI/CD pipelines using AWS services and GitHub, closing with a summary of the implementation.
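As a rough sketch of the kind of pipeline such a template provisions (API details vary by SDK version; the role, image, and S3 paths are placeholders):

```python
# Minimal SageMaker Pipeline with a single training step; a project
# template's CI/CD would upsert and start this on each commit.
from sagemaker.estimator import Estimator
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder
estimator = Estimator(
    image_uri="<training-image-uri>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/models/",
)
step_train = TrainingStep(name="TrainModel", estimator=estimator,
                          inputs={"train": "s3://my-bucket/data/train/"})

pipeline = Pipeline(name="my-mlops-pipeline", steps=[step_train])
pipeline.upsert(role_arn=role)   # created/updated by the CI/CD workflow
pipeline.start()
```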
Axel Springer is the first global publishing house to collaborate with OpenAI on deepening the integration of journalism into AI technologies.
Snapchat has introduced a new feature for its Plus subscribers, allowing them to create AI-generated snaps. This update, available to $3.99 plan users, offers innovative ways to generate and edit images. Additionally, subscribers can access AI selfie features and extend photo backgrounds. These enhancements demonstrate Snapchat’s commitment to integrating AI into its platform.
LimeWire, known for music piracy in the early 2000s, shut down in 2010 due to copyright violations. Now, it’s returned as an AI music generation platform. It allows users to create music and images and enables them to share in the ad revenue in $LMWR crypto token. However, its controversial history raises concerns about its…
Diffusion models have proven successful for text-to-image generation, with unCLIP models gaining particular attention. While unCLIP models surpass others on composition benchmarks, they require more parameters and training data. Arizona State University introduces ECLIPSE, a contrastive learning technique that enables efficient training with far fewer parameters, improving text-to-image models. This approach shows promising results.
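For intuition, here is a generic InfoNCE-style contrastive objective for aligning text and image embeddings; it illustrates the family of losses ECLIPSE builds on, not the paper’s exact formulation.

```python
# Generic symmetric contrastive loss: matched text/image pairs sit on the
# diagonal of the similarity matrix and are pulled together; mismatched
# pairs are pushed apart.
import torch
import torch.nn.functional as F

def contrastive_loss(text_emb, img_emb, temperature=0.07):
    text_emb = F.normalize(text_emb, dim=-1)
    img_emb = F.normalize(img_emb, dim=-1)
    logits = text_emb @ img_emb.t() / temperature   # pairwise similarities
    targets = torch.arange(len(logits))             # matched pairs on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```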