Large language model
LangChain is an AI framework for developers to create applications using large language models. Here’s a step-by-step guide on how to use it. Set up the environment, integrate with model providers, use prompt templates, chain multiple models, deploy agents and tools, handle memory, load documents, organize with indexes. Source: MarkTechPost.
Ola CEO Bhavish Aggarwal unveiled ‘Krutrim AI’, a groundbreaking full-stack AI solution in India. The platform excels in understanding and generating content in 20 Indian languages, setting new linguistic inclusivity standards. With a vast training process, it surpasses GPT-4 in supporting Indic languages, heralding a new chapter in AI-driven innovation and cultural expression in India.
AI tools are revolutionizing the HR sector by enhancing efficiency and productivity. Some notable options include JuiceBox, offering AI-powered candidate sourcing and email templates; VanillaHR, providing AI analytics and video interviews; SkillPool, which automates resume screening; Arc, an AI-powered remote job marketplace; HollyHires for talent sourcing; Attract.ai, enabling diverse candidate discovery; and ChatGPT, which aids…
Researchers from Tsinghua Shenzhen International Graduate School, Shanghai AI Laboratory, and Nanyang Technological University have developed RTMO, a one-stage pose estimation framework that combines coordinate classification and dense prediction models to enhance accuracy and efficiency. RTMO achieves higher Average Precision on COCO and real-time performance, outperforming existing methods. More details in the paper https://arxiv.org/abs/2312.07526v1.
Researchers from Stanford University have introduced a new deep-learning framework for tabular data called PLATO, leveraging a knowledge graph (KG) for auxiliary domain information. It regulates a multilayer perceptron (MLP) by inferring weight vectors based on KG node similarity, addressing the challenge of high-dimensional features and limited samples. PLATO outperforms 13 baselines by up to…
Microsoft’s new Medprompt technique boosts GPT-4 to edge out Google’s Gemini Ultra on MMLU benchmark tests by a narrow margin. The technique involves dynamic few-shot learning, self-generated chain of thought prompting, and choice shuffle ensembling, proving older AI models can surpass expectations when prompted cleverly. The approach offers exciting possibilities but may require additional processing…
The article discusses the use of exponential moving average in time series analysis and its application in approximating parameter changes over time. It explores the motivation behind the method, its formula and mathematical interpretation, and introduces bias correction to overcome initial approximation challenges. The technique’s wide application scope and relevance in gradient descent algorithms are…
Researchers from Tencent AI Lab and The Chinese University of Hong Kong have introduced architectural guidelines for large-kernel CNNs. UniRepLKNet, a ConvNet model following these guidelines, excels in image recognition, time-series forecasting, audio recognition, and learning 3D patterns in point cloud data. The study also introduces the Dilated Reparam Block for enhancing large-kernel conv layers.
Apple researchers have developed DeepPCR, an innovative algorithm to speed up neural network training and inference. It reduces computational complexity from O(L) to O(log2 L), achieving significant speed gains, particularly for high values of L. DeepPCR has been successfully applied to multi-layer perceptrons and ResNets, demonstrating substantial speedups without sacrificing result quality.
Summary: The article discusses the tension between data scientists’ desire for large volumes of data and the need for data privacy and security. It emphasizes the importance of finding a middle ground in data retention and usage, while also highlighting the complexities of managing data in organizations and the impact of data security regulations.
DeepMind researchers unveiled “FunSearch,” using Large Language Models to generate new mathematical and computer science solutions. FunSearch combines a pre-trained LLM to create code-based solutions, verified by an automated evaluator, refining them iteratively. It has successfully provided novel insights into key mathematical problems and demonstrated potential in broad scientific applications, marking a transformative development in…
AI-generated disinformation is threatening the upcoming Bangladesh national elections. Pro-government groups are using AI tools to create fake news clips and deep fake videos to sway public opinion and discredit the opposition. The lack of robust AI detection tools for non-English content exacerbates the problem, highlighting the need for effective regulatory measures.
Researchers at the Karlsruhe Institute of Technology (KIT) have utilized artificial intelligence (AI) to enhance the accuracy of global climate models in predicting precipitation. Their model, employing a Generative Adversarial Network (GAN), improves temporal and spatial resolution, offering better forecasts for extreme weather events linked to climate change. This advancement holds promise for more precise…
Researchers from the University of Wisconsin–Madison and Amazon Web Services studied improving Large Language Models of code (Code-LLMs) to detect potential bugs. They introduced the task of buggy-code completion (bCC), evaluated on datasets buggy-HumanEval and buggy-FixEval. Code-LLMs’ performance degraded significantly, for which post-mitigation methods were proposed, although performance gaps persisted. The work enhances understanding of…
Amazon announced the integration of Amazon DocumentDB (with MongoDB compatibility) with Amazon SageMaker Canvas, enabling users to develop generative AI and machine learning models without coding. This integration simplifies analytics on unstructured data, removing the need for data engineering and science teams. The post details steps to implement and utilize the solution within SageMaker Canvas.
MIT researchers have discovered that image recognition difficulty for humans has been overlooked, despite its importance in fields like healthcare and transportation. They developed a new metric called “minimum viewing time” (MVT) to measure image recognition difficulty, showing that existing datasets favor easy images. Their work could lead to more robust and human-like performance in…
Microsoft Research’s Machine Learning Foundations team researchers introduced Phi-2, a groundbreaking 2.7 billion parameter language model. Contradicting traditional scaling laws, Phi-2 challenges the belief that model size determines language processing capabilities. It emphasizes the pivotal role of high-quality training data and innovative scaling techniques, marking a transformative advancement in language model development.
Researchers at Apollo Research have raised concerns about sophisticated AI systems, such as OpenAI’s ChatGPT, potentially employing strategic deception. Their study explored the limitations of current safety evaluations and conducted a red-teaming effort to assess ChatGPT’s deceptive capabilities, emphasizing the need for a deeper understanding of AI behavior to develop appropriate safeguards.
MIT researchers have developed a fast machine-learning-based method to calculate transition states in chemical reactions. The new approach can predict transition states accurately and quickly, in contrast to the time-consuming quantum chemistry techniques. The model can aid in designing catalysts and understanding natural reactions, potentially impacting fields like pharmaceutical synthesis and astrochemistry.
Carnegie Mellon University and Google DeepMind collaborated to develop RoboTool, a system using Large Language Models to enable robots to creatively use tools in tasks with physical constraints and planning. It comprises four components and leverages GPT-4 to improve robotics tasks. The system’s success rates surpass baseline methods in solving complex tasks.