Large language model
The UK AI Safety Summit and Biden’s executive order have brought AI regulation into focus, but questions remain about the specifics. The Bletchley Declaration, endorsed by 28 countries, emphasizes international consensus on AI oversight. The US and EU have proposed their own regulations, while other countries consider their own initiatives. The implementation of regulations across…
ChipNeMo explores the use of domain adaptation techniques to improve the performance of large language models (LLMs) in chip design. The study evaluates three LLM applications in chip design and highlights the potential for further refinement in domain-adapted LLM approaches. The goal is to enhance LLM performance and reduce model size while maintaining or improving performance…
Atom is a new low-bit quantisation technique developed by researchers to increase the serving throughput of Large Language Models (LLMs). By using low-bit operators and quantisation, Atom reduces memory usage without sacrificing accuracy, improving end-to-end throughput by up to 7.73 times compared to existing approaches. Atom addresses the need for more efficient LLM…
US export restrictions on Nvidia have created a growing market in China for Huawei’s new AI chips, specifically the Ascend 910B. Chinese AI companies are turning to Huawei’s chip as a viable alternative to Nvidia’s high-end chips. The export controls, intended to slow Chinese AI innovation, may have inadvertently accelerated China’s path to self-reliance. As…
This article discusses various methods to style plots using Matplotlib. It covers topics such as changing runtime configuration parameters, creating and using style files, applying style sheets, and limiting styling to code blocks. These techniques allow for customization and consistency in plotting styles.
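The four techniques named above can be sketched in a few lines. This is a minimal illustration rather than the article's own code: the style file name `demo.mplstyle` and the parameter values are assumptions chosen for the example, and `ggplot` is one of Matplotlib's built-in style sheets.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# 1. Change a runtime configuration parameter for the whole session.
plt.rcParams["lines.linewidth"] = 2.5

# 2. Create a style file and apply it (file name and values are hypothetical).
style_path = os.path.join(tempfile.mkdtemp(), "demo.mplstyle")
with open(style_path, "w") as f:
    f.write("axes.grid: True\nfont.size: 14\n")
plt.style.use(style_path)

# 3. Limit a built-in style sheet to one code block.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
plt.close(fig)

# 4. Limit individual parameter overrides to one code block.
with plt.rc_context({"lines.linewidth": 4}):
    width_inside = plt.rcParams["lines.linewidth"]
width_after = plt.rcParams["lines.linewidth"]
```

Settings made through `plt.rcParams` or `plt.style.use` persist for the session, while the two context managers restore the previous values on exit, which helps when only one figure should differ from the rest.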
Chinese researchers have developed a deep learning model called circ2CBA that can predict binding sites between circular RNAs and RNA-binding proteins. This has significant implications for understanding diseases, particularly cancer. The model uses sequence information and a unique process to accurately identify these critical interactions, surpassing existing methods. The results validate the effectiveness of circ2CBA…
Researchers at the University of Oxford have introduced DynPoint, an artificial intelligence algorithm that enables the rapid synthesis of novel views for unconstrained monocular videos. DynPoint employs explicit estimation of consistent depth and scene flow for surface points, creating a hierarchical neural point cloud to generate views of the target frame. The proposed model demonstrates…
This research paper introduces a method called “codebook features” that aims to enhance the interpretability and control of neural networks. By leveraging vector quantization, the method transforms the dense and continuous computations of neural networks into a more interpretable form by discretizing the network’s hidden states. The experiments conducted demonstrate the effectiveness of codebook features…
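The general idea of vector quantization, replacing each continuous hidden state with an entry from a finite codebook, can be sketched as follows. This is a generic nearest-neighbour lookup under assumed sizes (8 codes of dimension 4) and random values; the excerpt does not describe the paper's exact selection rule, so treat it as an illustration of the concept only.

```python
import numpy as np

def quantize(hidden, codebook):
    """Map each hidden vector to its nearest codebook entry (Euclidean distance)."""
    # dists[i, j] = squared distance between hidden[i] and codebook[j]
    dists = ((hidden[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=1)    # discrete code index per hidden vector
    return codes, codebook[codes]   # indices and the quantized vectors

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # hypothetical: 8 codes, dimension 4
hidden = rng.normal(size=(3, 4))    # hypothetical hidden states

codes, quantized = quantize(hidden, codebook)
```

The integer `codes` are what make the representation easier to inspect: each hidden state is summarised by a discrete index instead of a dense vector.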
Researchers from the University of Tokyo have developed a deep learning model called 3D-Memory In Memory (3D-MIM) to accurately predict the expansion of supernova (SN) shells in galaxy simulations. By combining the model with the Hamiltonian splitting method, the researchers can integrate SN-affected particles separately. The 3D-MIM model shows strong generalization capabilities and offers a…
Large language models (LLMs) are becoming skilled in programming and refactoring code to create libraries for software developers. Researchers from MIT CSAIL, MIT Brain and Cognitive Sciences, and Harvey Mudd College present LILO, a neurosymbolic framework that integrates LLMs with automatic refactoring to learn libraries of reusable function abstractions. LILO demonstrates improved performance compared to…
The ControlLLM framework, developed by researchers from The Hong Kong University of Science and Technology, OpenGVLab, Shanghai AI Laboratory, Tsinghua University, and SenseTime, enables large language models (LLMs) to utilize multi-modal tools for solving complex real-world tasks. ControlLLM excels in accuracy, efficiency, and versatility, surpassing existing methods in various tasks involving image, audio, and video…
Cohere’s Embed v3 model is a valuable solution for finding relevant and informative content in text data. It outperforms other models in benchmark tests and offers efficient navigation through vast amounts of information. Supporting over 100 languages, Embed v3 enhances search applications and retrieval-augmented generative AI systems.
AI-generated fake nude images of students from Westfield High School in New Jersey, US, were distributed among peers. The school has not disclosed specific details or taken disciplinary action, citing confidentiality concerns. Similar incidents have occurred in Spain, and others have targeted public figures and online influencers. New Jersey lacks legislation to penalize the creation and distribution of…
Hollywood’s Screen Actors Guild-American Federation of Television and Radio Artists (SAG-AFTRA) is dissatisfied with the latest proposal from the Alliance of Motion Picture and Television Producers (AMPTP) in ongoing labor discussions. The sticking point is the use of AI in the industry. SAG-AFTRA remains uncertain if AMPTP will re-enter negotiations or cease discussions entirely. The…
Researchers have developed a framework called Language Models for Motion Control (LaMo) that incorporates Large Language Models (LLMs) for offline reinforcement learning. LaMo combines pre-trained LLMs with Decision Transformers (DT) and introduces innovations like LoRA fine-tuning and auxiliary language loss. It outperforms existing methods in sparse-reward tasks and narrows the gap between value-based offline RL…
Recommender systems are crucial in helping users navigate the vast amount of choices available on the internet. However, accurately predicting user preferences and providing personalized recommendations remains challenging. One emerging approach is the use of knowledge graphs, which encode diverse contextual information and relationships between entities. Language models like GPT-3 can augment knowledge graphs by…
Connectomics, the mapping of animal brains, is experiencing significant growth. Researchers from MIT and Harvard have developed SmartEM, an electron microscopy technique that utilizes machine learning to analyze brain synapses and neurons at nanometer precision. This integration of hardware and software allows for rapid understanding of complex brain images. SmartEM has the potential to…
Researchers from the University of Surrey have developed an AI-driven model to optimize the allocation of computing power in Open Radio Access Networks (O-RANs). By minimizing VNF computational costs and reducing overhead associated with reconfigurations, the model has the potential to significantly enhance bandwidth utilization efficiency. The study showcased up to a 76% reduction in…
Diffusion models are powerful and versatile models used in various generation tasks such as image, speech, video, and music generation. They employ a Markov chain to gradually add random noise to images, then learn to reverse the process to generate high-quality images. This article introduces a new framework called DiffEnc that increases the flexibility of…
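The forward half of that process, gradually adding random noise, can be sketched as below. This assumes a standard DDPM-style linear noise schedule with 1000 steps, not the framework introduced in the article; the function names and schedule values are illustrative assumptions.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bars, rng):
    """Sample the noised image at step t directly from the clean image x0."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bars[t]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
x0 = np.ones((8, 8))                                  # stand-in for an image
x_early, _ = add_noise(x0, 0, alpha_bars, rng)        # almost unchanged
x_late, _ = add_noise(x0, 999, alpha_bars, rng)       # almost pure noise
```

A model trained to predict `eps` from `x_t` and `t` is what lets the chain be run in reverse to generate new samples.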
Researchers at Carnegie Mellon University’s College of Engineering built a soft robot replica of a pleurocystitid, working from fossil evidence. The marine organism, which lived 450 million years ago, was one of the earliest echinoderms able to move using a muscular stem.