-
This AI Research from China Provides an Exhaustive Evaluation of the Latest SOTA Visual Language Model GPT-4V(ision) and Its Application in Autonomous Driving Scenarios
Researchers from Shanghai Artificial Intelligence Laboratory, GigaAI, East China Normal University, and The Chinese University of Hong Kong evaluated GPT-4V(ision), a Visual Language Model, in autonomous driving scenarios. GPT-4V demonstrates superior performance in scene understanding and causal reasoning, but challenges remain in direction discernment and traffic light recognition. Further research and development are needed. Source:…
-
Can Language Models Reason Beyond Words? Exploring Implicit Reasoning in Multi-Layer Hidden States for Complex Tasks
Large Language Models (LLMs) have shown impressive capabilities in language understanding and reasoning. To enhance their proficiency, researchers have employed the chain-of-thought (CoT) technique, but it delays the generation of the desired answer. In this paper, the authors propose an implicit CoT reasoning approach that allows the model to produce the final answer…
-
This 3D printer can watch itself fabricate objects
Engineers have created a fast and precise 3D inkjet printer that uses computer vision to regulate material deposition in real time. The printer can handle multiple materials, allowing for a diverse range of fabrication possibilities.
-
Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights
Principal, a global investment management leader, is using AWS CCI Post Call Analytics to gain insights into their contact center interactions and enhance the customer experience. They are leveraging AI capabilities to transcribe voice calls, analyze interactions, and identify call drivers. Principal has successfully deployed the PCA solution, processed over 1 million customer calls, and…
-
Are we heading towards an algocracy?
The concept of algocracy, or governance by algorithm, is becoming increasingly prevalent as algorithmic and machine learning systems are implemented in government and public sectors. This form of governance utilizes AI, blockchain, and algorithms to make decisions. While there are potential benefits in terms of efficiency, there are also concerns regarding transparency, bias, and the…
-
This 3D printer can watch itself fabricate objects
Researchers from MIT, the MIT spinout Inkbit, and ETH Zurich have developed a new 3D inkjet printing system that uses computer vision to adjust the amount of resin each nozzle deposits in real time. This contactless system allows for the use of materials that cure more slowly than traditional acrylates, enabling the fabrication of complex devices…
-
Behind Microsoft CEO Satya Nadella’s push to get AI tools in developers’ hands
Microsoft CEO Satya Nadella recently made surprise appearances at two developer conferences in San Francisco to showcase new AI-powered tools. He emphasized the company’s focus on developers and its aim to make AI tools more accessible and promote creativity. Nadella discussed the partnership with OpenAI and the platform shift to natural language AI tools. He…
-
a16z invests in AI startup linked to nonconsensual porn
Venture capital firm Andreessen Horowitz, or a16z, has invested in the generative AI platform Civitai, which allows users to share AI-generated art and resources. However, some resources on Civitai are being used to create nonconsensual porn. The platform’s “Bounties” option has led to requests for AI models to generate explicit images of celebrities and private…
-
The stories of underage workers in the AI and data services industry
The AI industry has a history of labor exploitation, with young individuals from impoverished backgrounds being drawn to online platforms for flexible work and higher wages. However, this exposes them to harmful content, leading to mental health problems. Underage workers have been found in Pakistan and Kenya joining platforms like Appen under false pretenses, highlighting…
-
Google AI Proposes Easy End-to-End Diffusion-based Text to Speech E3-TTS: A Simple and Efficient End-to-End Text-to-Speech Model Based on Diffusion
The E3-TTS model developed by Google utilizes diffusion models to generate high-quality audio waveforms directly from plain text input. It eliminates the need for sequential processing and intermediate features, improving upon traditional text-to-speech (TTS) systems. The model combines a pre-trained BERT model for text extraction and a diffusion UNet model for waveform refinement, resulting…