Rajdeep Borgohain
rbgo
AI & ML interests
Solving language barriers.
Organizations
LLM-Alignment Papers
- Concrete Problems in AI Safety
  Paper • 1606.06565 • Published • 1
- The Off-Switch Game
  Paper • 1611.08219 • Published • 1
- Learning to summarize from human feedback
  Paper • 2009.01325 • Published • 4
- Truthful AI: Developing and governing AI that does not lie
  Paper • 2110.06674 • Published • 1
Finetuning
Models 8
rbgo/Qwen3-gsm8k-GRPO
Text Generation • 4B • Updated • 1
rbgo/SmolLM2-1.7B-R1-Distilled-GRPO
Text Generation • 2B • Updated • 6
rbgo/SmolLM2-1.7B-R1-Distilled
Text Generation • 2B • Updated • 8
rbgo/SmolLM2-1-7B-Distill
Updated
rbgo/inferless-Llama-3-8B
Text Generation • 8B • Updated • 9 • 2
rbgo/infer-Llama-3-8B
Text Generation • 8B • Updated • 4
rbgo/gemma
Text Generation • 9B • Updated • 7
rbgo/Super-phi-2-dpo
Text Generation • 3B • Updated • 6 • 1