Apply for community grant: Personal project (GPU and storage)

#1
by Hankbeasley - opened

I would like to train the R1-distilled Qwen 32B model to be better at coding, using reward-based training. My goal is to get the Qwen 32B model to outperform DeepSeek R1 on coding tests via reinforcement learning. I have gathered hundreds of accept/reject samples through automated testing of the coding exercises at https://github.com/exercism/. I estimate I will need around 150 GB of VRAM to complete the training.
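As a rough sanity check on the VRAM figure, here is a back-of-the-envelope sketch. It assumes roughly 32 billion parameters stored in bf16 (2 bytes each) and counts only the model weights. The function name and constants are illustrative, and gradients, optimizer state, activations, and generation caches are not modeled.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the model weights alone, in decimal GB."""
    return num_params * bytes_per_param / 1e9


# Assumptions (not measured): ~32e9 parameters, bf16 storage.
NUM_PARAMS = 32e9
BYTES_PER_PARAM_BF16 = 2

# Figure requested in this post.
REQUESTED_GB = 150

weights_gb = weight_memory_gb(NUM_PARAMS, BYTES_PER_PARAM_BF16)

# Whatever is left after the weights has to cover everything else
# (gradients, optimizer state, activations, sampling during RL).
headroom_gb = REQUESTED_GB - weights_gb

print(f"weights:  {weights_gb:.0f} GB")
print(f"headroom: {headroom_gb:.0f} GB")
```

Under these assumptions the weights take about 64 GB, which leaves about 86 GB of the requested 150 GB for the rest of the training state.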
