Published: 2024-09-07T20:06:00 || Updated: 2024-09-07T20:06:00 || 2 min read
Recent AI Reading [07 September 2024]
Papers
Synthetic Data
- Best Practices and Lessons Learned on Synthetic Data
- On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Alignment with Human Feedback / Preferences
- RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
- Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
- A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
- MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization
- Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- InstructVideo: Instructing Video Diffusion Models with Human Feedback
- VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
- RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
Retrieval-Augmented Generation
- Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata
- HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
Other
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models
- How Far Are We From AGI
- Consent in Crisis: The Rapid Decline of the AI Data Commons
- Cognitively Inspired Energy-Based World Models
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
- The Need for a Leaderboard: A Survey of LLM as a Judge in NLP
Articles and Blog Posts
- Prompt caching with Claude
- Delving into “delve”
- Gorilla: Large Language Model Connected with Massive APIs
- Direct Preference Optimization with Synthetic Data on Anyscale
- Using LLM-as-a-judge 🧑‍⚖️ for an automated and versatile evaluation
- Databricks announces significant improvements to the built-in LLM judges in Agent Evaluation