SEGAgentRL

non-profit

AI & ML interests

We aim to improve agent reinforcement learning along three axes: stability (S), efficiency (E), and generalization (G).

Models (8)

SEGAgentRL/LLDS-R-GRPO-Qwen2.5-3B-Base

Reinforcement Learning • 3B • Updated about 3 hours ago

SEGAgentRL/LLDS-A-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated about 3 hours ago

SEGAgentRL/LLDS-R-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated about 3 hours ago

SEGAgentRL/LLDS-R-GRPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated about 3 hours ago

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-3B-Base-MA

Reinforcement Learning • 3B • Updated about 4 hours ago

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-3B-Base

Reinforcement Learning • 3B • Updated about 4 hours ago

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Base

Reinforcement Learning • 8B • Updated about 6 hours ago

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Ins

Reinforcement Learning • 8B • Updated about 7 hours ago

Datasets (0)

None public yet