M Saad Salman

MSS444

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 hours ago

Convergent Evolution: How Different Language Models Learn Similar Number Representations

upvoted a paper about 2 hours ago

Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL

upvoted a paper about 2 hours ago

Scaling Test-Time Compute for Agentic Coding

Organizations

None yet

upvoted 6 papers about 2 hours ago

Convergent Evolution: How Different Language Models Learn Similar Number Representations

Paper • 2604.20817 • Published 6 days ago • 7

Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL

Paper • 2604.17073 • Published 10 days ago • 9

Scaling Test-Time Compute for Agentic Coding

Paper • 2604.16529 • Published 12 days ago • 10

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts

Paper • 2604.19835 • Published 7 days ago • 17

Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges

Paper • 2604.13602 • Published 13 days ago • 29

Near-Future Policy Optimization

Paper • 2604.20733 • Published 6 days ago • 67

upvoted 12 papers 6 days ago

DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

Paper • 2604.13902 • Published 13 days ago • 61

Where does output diversity collapse in post-training?

Paper • 2604.16027 • Published 11 days ago • 22

QuantCode-Bench: A Benchmark for Evaluating the Ability of Large Language Models to Generate Executable Algorithmic Trading Strategies

Paper • 2604.15151 • Published 12 days ago • 15

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

Paper • 2604.16029 • Published 11 days ago • 23

Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips

Paper • 2502.07408 • Published 12 days ago • 57

MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval

Paper • 2604.18584 • Published 8 days ago • 14

Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration

Paper • 2604.18131 • Published 8 days ago • 9

Stratagem: Learning Transferable Reasoning via Trajectory-Modulated Game Self-Play

Paper • 2604.17696 • Published 8 days ago • 6

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 8 days ago • 24

SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

Paper • 2604.17308 • Published 9 days ago • 22

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

Paper • 2604.14258 • Published 13 days ago • 23

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published 8 days ago • 80

upvoted 2 papers 10 days ago

What do Language Models Learn and When? The Implicit Curriculum Hypothesis

Paper • 2604.08510 • Published 19 days ago • 4

Self-Sovereign Agent

Paper • 2604.08551 • Published Mar 4 • 5
