AIRobotZ

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 4 days ago • 59

liked a dataset 4 months ago

Solaris99/AgentBank

Viewer • Updated Oct 10, 2024 • 53.2k • 1.73k • 19

liked 3 datasets 5 months ago

r2e-edits/sonnet-end2end-r2e-dev_100pr_combined-maxstep40_context32k-sft_exitreason-agent-v1

Viewer • Updated Apr 13 • 6.44k • 18 • 1

PersonalAILab/AFM-MHQA-Agent-SFT-Dataset

Viewer • Updated Aug 20 • 8.83k • 84 • 4

agent-eto/eto-sft-trajectory

Preview • Updated Apr 9, 2024 • 58 • 16

liked a model 7 months ago

Qwen/WorldPM-72B

Text Classification • 73B • Updated May 17 • 111 • 80

upvoted a paper 8 months ago

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Paper • 2505.02156 • Published May 4 • 18

upvoted a paper 9 months ago

S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models

Paper • 2504.10368 • Published Apr 14 • 21

commented 2 papers 9 months ago

S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models

Paper • 2504.10368 • Published Apr 14 • 21

IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization

Paper • 2411.06208 • Published Nov 9, 2024 • 21

upvoted 2 papers 10 months ago

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

Paper • 2502.20730 • Published Feb 28 • 38

SoFA: Shielded On-the-fly Alignment via Priority Rule Following

Paper • 2402.17358 • Published Feb 27, 2024 • 1

liked a dataset 12 months ago

RZ412/PokerBench

Viewer • Updated Feb 15 • 574k • 892 • 27

upvoted a paper about 1 year ago

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

Paper • 2412.04905 • Published Dec 6, 2024 • 9

liked 6 datasets about 1 year ago
