RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling Paper • 2105.06597 • Published May 14, 2021
Dialogue Response Ranking Training with Large-Scale Human Feedback Data Paper • 2009.06978 • Published Sep 15, 2020
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation Paper • 1911.00536 • Published Nov 1, 2019
Customizing Language Model Responses with Contrastive In-Context Learning Paper • 2401.17390 • Published Jan 30, 2024
Supervised Learning-enhanced Multi-Group Actor Critic for Live Stream Allocation in Feed Paper • 2412.10381 • Published Nov 28, 2024
TADT-CSA: Temporal Advantage Decision Transformer with Contrastive State Abstraction for Generative Recommendation Paper • 2507.20327 • Published Jul 27, 2025
DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains Paper • 2511.10984 • Published Nov 14, 2025 • 4
Reinforce Lifelong Interaction Value of User-Author Pairs for Large-Scale Recommendation Systems Paper • 2507.16253 • Published Jul 22, 2025
NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents Paper • 2512.12730 • Published 17 days ago • 43
OccludeNeRF: Geometric-aware 3D Scene Inpainting with Collaborative Score Distillation in NeRF Paper • 2504.02007 • Published Apr 1, 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging Paper • 2503.01874 • Published Feb 26, 2025
Use Property-Based Testing to Bridge LLM Code Generation and Validation Paper • 2506.18315 • Published Jun 23, 2025 • 11
SafeGenBench: A Benchmark Framework for Security Vulnerability Detection in LLM-Generated Code Paper • 2506.05692 • Published Jun 6, 2025
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published Aug 16, 2025 • 71
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published Sep 4, 2025 • 57
FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning Paper • 2509.13160 • Published Sep 16, 2025 • 29