McAuley-Lab/Reddit2Deezer
We're the McAuley Lab at UC San Diego with PI Prof. Julian McAuley, focusing on cool machine learning and natural language processing applications!
F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking
Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning