LLMs&Agents - a RandomHakkaDude Collection

LLMs&Agents

Updated May 30, 2025

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13, 2025 • 148
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

Paper • 2502.18890 • Published Feb 26, 2025 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization

Paper • 2503.02682 • Published Mar 4, 2025 • 28
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26, 2025 • 92
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published May 25, 2025 • 32
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Paper • 2505.19075 • Published May 25, 2025 • 21
Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28, 2025 • 8
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26, 2025 • 104
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27, 2025 • 109
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms

Paper • 2505.20322 • Published May 23, 2025 • 14
VideoGameBench: Can Vision-Language Models complete popular video games?

Paper • 2505.18134 • Published May 23, 2025 • 6
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published May 26, 2025 • 8
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26, 2025 • 45
Flex-Judge: Think Once, Judge Anywhere

Paper • 2505.18601 • Published May 24, 2025 • 27
Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26, 2025 • 23
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26, 2025 • 15
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

Paper • 2505.10887 • Published May 16, 2025 • 10