Published: 2025-08-31T10:48:00 || Updated: 2025-08-31T10:48:00
Post-format: link
Recent AI Reading [31 August 2025]
Papers
Agentic AI
- AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
- MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
- ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents
AI Alignment with Human Feedback, Preferences, and Other Methods
- Reinforcement Learning with Rubric Anchors
- Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
- Checklists Are Better Than Reward Models For Aligning Language Models
- LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
- MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
- MCPBench: an evaluation benchmark for MCP servers
- MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models
- MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers