Authors: || Published: 2025-01-18T20:44:00 || Updated: 2025-01-18T20:44:00
Recent AI Reading [18 January 2025]
Papers
Agentic AI
AI Alignment with Human Feedback and Preferences, and Other Methods
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
- RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning
- From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
- Alignment faking in large language models
Retrieval-Augmented Generation
- Don’t Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks
- GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Synthetic Data
Miscellaneous
- Training Large Language Models to Reason in a Continuous Latent Space
- Titans: Learning to Memorize at Test Time
- Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
- Superhuman performance of a large language model on the reasoning tasks of a physician
- Byte Latent Transformer: Patches Scale Better Than Tokens
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
- LLM Pruning and Distillation in Practice: The Minitron Approach
- Compact Language Models via Pruning and Knowledge Distillation
- Schrodinger’s Memory: Large Language Models
- xLSTM: Extended Long Short-Term Memory [xLSTMs]
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces [Mamba SSM]
- RWKV: Reinventing RNNs for the Transformer Era [RWKV]
- Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
- BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
- Model Collapse Demystified: The Case of Regression