RandomHakkaDude's Collections
LLMs&Agents
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
Paper • 2502.08910 • 148
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper • 2502.18890 • 30
MPO: Boosting LLM Agents with Meta Plan Optimization
Paper • 2503.02682 • 28
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
Paper • 2505.20411 • 92
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research
Paper • 2505.19253 • 32
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs
Paper • 2505.19075 • 21
Text2Grad: Reinforcement Learning from Natural Language Feedback
Paper • 2505.22338 • 8
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
Paper • 2505.19897 • 104
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
Paper • 2505.21497 • 109
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
Paper • 2505.20322 • 14
VideoGameBench: Can Vision-Language Models complete popular video games?
Paper • 2505.18134 • 6
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
Paper • 2505.20286 • 8
ARM: Adaptive Reasoning Model
Paper • 2505.20258 • 45
Flex-Judge: Think Once, Judge Anywhere
Paper • 2505.18601 • 27
Lifelong Safety Alignment for Language Models
Paper • 2505.20259 • 23
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
Paper • 2505.19443 • 15
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
Paper • 2505.10887 • 10