Reinforcement Learning Agent Learns to Retrieve Long-Term Memories for Better LLM Reasoning
Researchers have developed a reinforcement-learning-driven agent that improves how language models access relevant information from long-term memory banks. Rather than relying solely on embedding-similarity search, the agent uses the PPO algorithm to learn retrieval policies that outperform baseline approaches. The system was tested on a synthetic memory dataset spanning multiple domains, showing improved accuracy in retrieving the facts needed for question answering.
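The retrieval-policy idea can be sketched with a toy policy-gradient loop. Plain REINFORCE stands in here for PPO, and the memory bank, dimensions, and reward are illustrative, not the paper's setup:

```python
import numpy as np

# Toy sketch: a policy scores each memory slot given a query embedding and
# is rewarded when it retrieves the slot that actually answers the query,
# rather than simply the nearest embedding. (REINFORCE stand-in for PPO;
# all names, sizes, and the synthetic task are hypothetical.)

rng = np.random.default_rng(1)
D, N_MEM = 4, 6
memories = rng.normal(size=(N_MEM, D))   # fixed memory-bank embeddings
theta = np.zeros((D, N_MEM))             # retrieval-policy parameters

def policy(query):
    """Softmax distribution over memory slots for a query embedding."""
    logits = query @ theta
    p = np.exp(logits - logits.max())
    return p / p.sum()

for _ in range(2000):
    target = int(rng.integers(N_MEM))    # slot that answers this query
    query = memories[target] + rng.normal(scale=0.5, size=D)  # noisy query
    probs = policy(query)
    action = int(rng.choice(N_MEM, p=probs))
    reward = 1.0 if action == target else 0.0
    # REINFORCE: grad of log pi(action) w.r.t. logits is one_hot(action) - probs
    g = (np.eye(N_MEM)[action] - probs) * reward
    theta += 0.1 * np.outer(query, g)
```

After training, the policy concentrates probability mass on the slot whose memory matches the query, which is the behavior a learned retrieval policy adds over a fixed similarity search.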
Talkie-1930: 13B Language Model Trained Exclusively on Pre-1931 English Text
A new “vintage language model” called Talkie has been released, trained on 260 billion tokens of exclusively pre-1931 English text. The model serves as a contamination-free testbed for studying generalization, since it has never seen modern concepts such as the internet or World War II. Researchers found it struggles with modern tasks but improves slowly with scale, and have released both base and instruction-tuned versions under the Apache 2.0 license.
Lightweight Vision-Language-Action Agent Built from Scratch in NumPy and PyTorch
Researchers have created a fully transparent vision-language-action-inspired embodied agent using only NumPy and PyTorch, without external rendering libraries. The agent learns to perceive, plan, predict, and replan directly from pixel observations in a grid world environment. By training a lightweight world model in latent space and using model predictive control, the system demonstrates how perception and decision-making can be tightly integrated without relying on black-box components.
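The latent-space planning loop described above can be sketched as random-shooting model predictive control over a stand-in world model. All names, dimensions, and the toy dynamics are illustrative, not the authors' implementation:

```python
import numpy as np

# Toy sketch of MPC in a learned latent space: a small world model f(z, a)
# predicts the next latent state; the planner samples random action
# sequences, rolls them out entirely in latent space, and executes the
# first action of the lowest-cost sequence. (All values are stand-ins.)

rng = np.random.default_rng(0)
LATENT_DIM, N_ACTIONS, HORIZON, N_CANDIDATES = 8, 4, 5, 64

# Stand-in "learned" dynamics: one linear map per discrete action.
W = rng.normal(scale=0.1, size=(N_ACTIONS, LATENT_DIM, LATENT_DIM))
goal = np.ones(LATENT_DIM)  # latent encoding of the goal state

def world_model(z, a):
    """Predict the next latent state for discrete action a."""
    return z + W[a] @ z

def plan(z0):
    """Random-shooting MPC: return the first action of the best rollout."""
    best_cost, best_action = np.inf, 0
    for _ in range(N_CANDIDATES):
        actions = rng.integers(0, N_ACTIONS, size=HORIZON)
        z = z0
        for a in actions:
            z = world_model(z, a)
        cost = np.linalg.norm(z - goal)  # distance to goal in latent space
        if cost < best_cost:
            best_cost, best_action = cost, int(actions[0])
    return best_action

action = plan(rng.normal(size=LATENT_DIM))
```

Replanning at every step (executing only the first action, then re-running `plan` from the new latent state) is what closes the perceive-plan-predict-replan loop the summary describes.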
Meta AI Releases Sapiens2: High-Resolution Human-Centric Vision Model Trained on 1 Billion Images
Meta AI has introduced Sapiens2, a second-generation foundation model for human-centric vision trained on 1 billion carefully curated human images. The model combines masked image reconstruction with global contrastive learning to avoid representation drift, and achieves significant improvements across pose estimation, body-part segmentation, pointmap estimation, normal estimation, and albedo estimation tasks. The 5B parameter variant achieves 82.3 mAP on pose estimation, a 4-point improvement over its predecessor.
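The pairing of masked image reconstruction with a global contrastive objective can be illustrated with a toy loss computation. This is a simplified stand-in (MAE-style masked MSE plus an InfoNCE term), not Meta's actual training code:

```python
import numpy as np

# Toy sketch: a masked-reconstruction loss on patch features combined with
# a contrastive loss on global embeddings of two views of the same batch.
# (Shapes, masking ratio, and temperature are illustrative assumptions.)

rng = np.random.default_rng(0)

def masked_reconstruction_loss(patches, recon, mask):
    """MSE computed on masked patches only (MAE-style)."""
    diff = (patches - recon) ** 2
    return float((diff * mask[:, None]).sum() / max(mask.sum(), 1))

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive loss pulling matched global embeddings together."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

patches = rng.normal(size=(16, 32))          # 16 patches, 32-dim features
recon = patches + rng.normal(scale=0.1, size=(16, 32))
mask = rng.random(16) < 0.75                 # ~75% of patches masked

z1 = rng.normal(size=(8, 64))                # global embeddings, view 1
z2 = z1 + rng.normal(scale=0.05, size=(8, 64))  # view 2 of same images

loss = masked_reconstruction_loss(patches, recon, mask) + info_nce_loss(z1, z2)
```

The contrastive term anchors the global representation while the reconstruction term shapes local features, which is the mechanism the summary credits for avoiding representation drift.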
OpenMOSS Releases MOSS-Audio: Open-Source Foundation Model for Unified Audio Understanding
The OpenMOSS team has released MOSS-Audio, an open-source foundation model designed to unify speech understanding, environmental sound analysis, music understanding, audio captioning, and time-aware question answering in a single system. Four variants were released at launch (4B and 8B parameter sizes, each with Instruct and Thinking versions), with the model capable of performing complex multi-hop reasoning over audio content through chain-of-thought training and reinforcement learning.
Researchers Identify Fundamental Flaw in LoRA Assumption for Factual Knowledge Fine-Tuning
A new analysis reveals that LoRA’s effectiveness breaks down when fine-tuning models for factual knowledge rather than stylistic changes. The issue stems from LoRA’s assumption that weight updates are intrinsically low-rank, when in fact factual knowledge requires high-rank updates that low-rank approximations cannot capture. The researchers propose RS-LoRA as a solution, which changes the scaling factor from α/r to α/√r to stabilize learning at the higher ranks needed for complex knowledge integration.
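The scaling change is small enough to sketch directly. A minimal illustration follows; `lora_delta` and the tensor shapes are hypothetical, not the paper's code:

```python
import math
import numpy as np

# Standard LoRA scales the low-rank update B @ A by alpha / r; the
# rank-stabilized variant scales by alpha / sqrt(r), so the update
# magnitude stays stable as the adapter rank r grows.

def lora_delta(A, B, alpha, rank_stabilized=False):
    """Low-rank weight update with the chosen scaling rule."""
    r = A.shape[0]  # adapter rank
    scale = alpha / math.sqrt(r) if rank_stabilized else alpha / r
    return scale * (B @ A)

rng = np.random.default_rng(0)
r, d_in, d_out, alpha = 64, 128, 256, 16
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))

std = lora_delta(A, B, alpha)                       # alpha / r scaling
rs = lora_delta(A, B, alpha, rank_stabilized=True)  # alpha / sqrt(r) scaling
```

At rank 64 the rank-stabilized update is sqrt(64) = 8 times larger than the standard one, which is why the α/r rule effectively suppresses learning at the high ranks that factual fine-tuning needs.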
Tutorial Shows How to Build Fully Searchable AI Knowledge Base Using Free Llama Model via OpenRouter
A step-by-step guide demonstrates how to create a local, wiki-style knowledge base using OpenKB and the free Llama 3.3 70B instruct model via OpenRouter. The tutorial covers secure API key setup, document ingestion, automatic summary generation, concept extraction, and querying capabilities. Users can build interconnected knowledge graphs from raw markdown documents without hardcoding secrets or requiring paid API access.
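A query against OpenRouter's OpenAI-compatible chat-completions endpoint might look like the following sketch. The model id is an assumption based on OpenRouter's naming convention, and the key is read from an environment variable rather than hardcoded, in line with the tutorial's no-hardcoded-secrets advice:

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "meta-llama/llama-3.3-70b-instruct:free"  # assumed model id

def build_payload(question: str) -> dict:
    """Build an OpenAI-compatible chat request for the free Llama model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": question}],
    }

def ask(question: str) -> str:
    """Send one question to OpenRouter and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(question)).encode(),
        headers={
            # Key comes from the environment, never from source code.
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

With `OPENROUTER_API_KEY` exported in the shell, `ask("Summarize this note")` would return the model's answer, which a knowledge-base tool like the one in the tutorial could then store alongside the source document.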
Digest generated on 2026-04-28 09:04 AM Moscow Time