LM-Guided CoT: A Novel Machine Learning Framework

Introduction

Chain-of-thought (CoT) prompting improves language models’ reasoning by having them write out intermediate steps before giving a final answer. However, it has limitations, and it tends to be less effective with smaller models. Recent research proposes LM-guided CoT, a framework that splits CoT prompting into two steps, rationale generation and answer prediction, and optimizes them with reinforcement learning (RL).

Practical…