Researchers from Washington University in St. Louis’s McKelvey School of Engineering have developed the Visual Active Search (VAS) framework, leveraging computer vision and adaptive learning to enhance geospatial exploration for combating illegal poaching and human trafficking. The framework has shown superior capabilities in detection and offers promise for broader applications in various domains.
“VMamba” is a new visual representation learning architecture developed by a team of researchers at UCAS, Huawei Inc., and Pengcheng Lab. It addresses the limitations of Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs) by combining their strengths without inheriting their computational and representational inefficiencies. The model’s innovative Cross-Scan Module (CSM) and selective scan mechanism…
Zhipu AI unveiled GLM-4 in Beijing, a new model addressing challenges in Large Language Models. It supports a 128k token context length, achieving nearly 100% accuracy with long inputs and introducing the GLM-4 All Tools for autonomous complex task execution. Its multimodal capabilities and versatility make it a competitive choice for businesses, challenging existing models…
The rise of AI-generated deep fakes is troubling because of its impact on politics, society, and individuals. It also feeds the “liar’s dividend,” in which genuine content can be dismissed as fake. Deep fakes can distort truth and manipulate public perception, and experts struggle to reliably tell real content from fake. Efforts to curb deep fakes have been ineffective, raising concerns about the destabilization of truth.
CognoSpeak, developed by the University of Sheffield, is an AI tool for faster dementia and Alzheimer’s diagnosis. It analyzes speech patterns and cognitive tests, demonstrating accuracy comparable to traditional assessments. The tool is undergoing broader trials in UK memory clinics and shows potential to reduce waiting times and provide early treatment. AI supports neurological disorders…
MathVista is a comprehensive benchmark for mathematical reasoning in visual contexts. It combines challenges from various multimodal datasets, aiming to refine mathematical reasoning in AI systems. Researchers from UCLA, the University of Washington, and Microsoft extensively evaluate foundation models and find that GPT-4V performs best, with a state-of-the-art accuracy of 49.9%.
Large language models (LLMs) have advanced language modeling, but optimizing them for distributed training remains challenging. The work summarized here introduces an asynchronous method that combines delayed Nesterov momentum updates with dynamic local updates and reports significant improvements in training efficiency for language models.
New York City enacted Law 144, regulating automated employment decision tools (AEDTs) to combat bias in hiring. The law requires bias audits and transparency notices and sets fines for non-compliance. However, researchers from Cornell University found low compliance due to vague definitions and employer discretion. This raises questions about its effectiveness in addressing bias in…
The promise of robotaxis seemed imminent in 2023, but it came crashing down after accidents involving Cruise led to the suspension of its operations in California. While other companies like Waymo and Baidu continue their robotaxi services, challenges such as high costs, scalability issues, and safety concerns persist. The industry is poised for significant changes in 2024, but…
Google DeepMind recently created AlphaGeometry, an AI system combining a language model and a symbolic engine to solve complex geometry problems, demonstrating progress in AI reasoning skills. However, human understanding of technology is crucial to harness AI’s potential, as argued by Conrad Wolfram. AI is also being deployed to address racial segregation in South Africa…
The FDA approved DermaSensor’s AI-powered handheld skin cancer detector for US sale. Skin cancer, a common and potentially fatal disease, often goes undetected. DermaSensor’s non-invasive device uses elastic scattering spectroscopy (ESS) to detect skin cancer with 96% accuracy and will be available through a subscription model. It aims to aid primary care physicians (PCPs) in making referrals to dermatologists and reduce unnecessary…
Model Predictive Control (MPC) is widely used in fields such as power systems and robotics. A recent study from Carnegie Mellon University focused on the convergence characteristics of a sampling-based MPC technique called Model Predictive Path Integral Control (MPPI). The research led to the development of a new method called CoVariance-Optimal MPC (CoVO-MPC), which outperformed…
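For readers unfamiliar with sampling-based MPC, the following is a minimal sketch of one standard MPPI update. It is not CoVO-MPC or the study's code. The 1D point-mass model, the cost terms, and every name and parameter value (`rollout_cost`, `softmin_weights`, `mppi_step`, `sigma`, `lam`) are assumptions chosen for illustration.

```python
import math
import random


def rollout_cost(x, v, controls, dt=0.1, target=1.0):
    """Roll a toy 1D point mass forward and sum a quadratic cost."""
    cost = 0.0
    for u in controls:
        v += u * dt  # acceleration command changes velocity
        x += v * dt  # velocity changes position
        cost += (x - target) ** 2 + 0.1 * v ** 2 + 0.001 * u ** 2
    return cost


def softmin_weights(costs, lam):
    """Path-integral weights: exp(-cost / lam), normalized to sum to 1."""
    best = min(costs)  # subtract the minimum for numerical stability
    raw = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(raw)
    return [r / total for r in raw]


def mppi_step(x, v, nominal, rng, num_samples=256, sigma=1.0, lam=1.0):
    """One MPPI update of the nominal control sequence (illustrative)."""
    horizon = len(nominal)
    # Sample Gaussian perturbations with a fixed standard deviation sigma.
    noise = [[rng.gauss(0.0, sigma) for _ in range(horizon)]
             for _ in range(num_samples)]
    # Evaluate each perturbed control sequence by simulating it.
    costs = [rollout_cost(x, v, [u + e for u, e in zip(nominal, eps)])
             for eps in noise]
    weights = softmin_weights(costs, lam)
    # New plan = nominal plus the cost-weighted average perturbation.
    return [u + sum(w * eps[t] for w, eps in zip(weights, noise))
            for t, u in enumerate(nominal)]


rng = random.Random(0)  # fixed seed so the run is repeatable
plan = mppi_step(0.0, 0.0, [0.0] * 20, rng)
```

In a receding-horizon loop, the first element of `plan` would be applied, the state re-measured, and the rest of the plan shifted forward as the next nominal sequence. The sketch keeps the sampling variance fixed.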
Researchers from Meta and NYU introduce Self-Rewarding Language Models, addressing limitations in traditional reward models by training a self-improving reward model. Utilizing LLM-as-a-Judge prompting and Iterative DPO, the model iteratively improves instruction-following and reward-modeling abilities, outperforming existing models. This novel approach signifies promising progress in language model training beyond human-preference-based reward models.
Researchers from Google, Carnegie Mellon University, and Bosch Center for AI have developed a method to enhance the adversarial robustness of deep learning models. The approach achieves top-tier adversarial robustness using pretrained models, without the need for complex fine-tuning. The research has implications for various domains, including autonomous vehicles, cybersecurity, healthcare, and…
Image annotation categorizes and labels images for object identification, which is crucial for computer vision, robotics, and autonomous driving. After humans annotate a set of pictures, a machine-learning model learns from the tagged images to replicate those annotations automatically, aiming to meet defined standards. Notable image annotation tools for 2024 include Markup Hero, Keylabs, Labelbox, Scale, Supervisely, and others, each offering unique features…
AI and ML have advanced in various fields, including chemistry. However, challenges persist for smaller datasets. PythiaCHEM, an ML toolkit, addresses this with tailored tools for predictive models in chemistry. It is implemented in Python, organized into modules, and integrates with other toolkits. Researchers showcased its effectiveness in classifying anion transporters and predicting enantioselectivity, highlighting its flexibility…
Researchers have developed PEPSI (Protein Expression Polarity Subtyping in Immunostains) to analyze subcellular protein localization in tumor microenvironments, crucial for understanding immune responses in cancer. It identifies distinct immune cell states by computing cell surface biomarker polarity from immunofluorescence imaging data and has shown potential for predicting patient survival outcomes, revolutionizing precision medicine.
State space models (SSMs) are attracting growing interest because recent variants can be trained in parallel while capturing long-range dependencies. Vision Mamba (Vim) aims to overcome obstacles in visual backbone design. It combines position embeddings and bidirectional SSMs for global context modeling. Vim shows promise for image modeling and dense prediction with efficient computation. For more…
The New Hampshire attorney general’s office is investigating an AI-generated robocall impersonating President Biden, aiming to dissuade voter participation in the primary election. The incident is described as illegal, with concerns about AI being weaponized in elections. This follows Pennsylvania candidate Shamaine Daniels’s use of AI-driven robocalls, indicating the growing concern about AI in election…
A study by MIT’s Computer Science and Artificial Intelligence Laboratory assessed AI’s potential to replace human jobs, focusing on computer vision. It found that tasks accounting for 1.6% of US worker wages are exposed to automation by computer vision, but that only 23% of those wages would be cost-effective to automate. Customizing AI for specific tasks is costly, while language models like GPT-4 may have broader economic adoption potential. AI’s…
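The two percentages in the MIT item can be combined into one figure. The short sketch below assumes the 23% applies to the 1.6% of wages exposed to computer vision, so the two shares multiply. The function name `attractive_share` is hypothetical.

```python
def attractive_share(exposed_share: float, cost_effective_fraction: float) -> float:
    """Share of all wages that is both exposed and cost-effective to automate."""
    return exposed_share * cost_effective_fraction


# 1.6% of wages exposed; 23% of that assumed cost-effective to automate.
share = attractive_share(0.016, 0.23)
print(f"{share:.2%}")  # prints 0.37%
```

Under that reading, roughly 0.4% of US worker wages would be economically attractive to automate with computer vision.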