polycrest-coder

Apply for community grant: Personal project (gpu and storage)

by Hankbeasley - opened Feb 12, 2025

Owner Feb 12, 2025

I would like to attempt to train the Qwen32B-r1 distilled model to be better at coding, using reward-based training. My goal is to get the Qwen32B model to outperform DeepSeek R1 on coding tests via reinforcement learning. I have gathered hundreds of accept/reject samples through automated testing of the coding exercises at https://github.com/exercism/. I estimate that I will need around 150 GB of VRAM to complete the training.
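As a rough illustration of how accept/reject samples like these could be arranged for reward-based training, here is a minimal sketch that groups them into preference pairs per exercise. The field names (`exercise`, `completion`, `passed`) and the `build_preference_pairs` helper are assumptions made for the example, not the actual layout of the collected data.

```python
from collections import defaultdict
from itertools import product


def build_preference_pairs(samples):
    """Pair every accepted completion with every rejected one for the same exercise.

    Each sample is assumed (hypothetically) to be a dict with the keys
    "exercise", "completion", and "passed" (True if the automated tests passed).
    """
    accepted = defaultdict(list)
    rejected = defaultdict(list)
    for sample in samples:
        bucket = accepted if sample["passed"] else rejected
        bucket[sample["exercise"]].append(sample["completion"])

    pairs = []
    # Exercises with only accepted or only rejected samples yield no pairs.
    for exercise in sorted(accepted):
        for chosen, bad in product(accepted[exercise], rejected.get(exercise, [])):
            pairs.append({"prompt": exercise, "chosen": chosen, "rejected": bad})
    return pairs


if __name__ == "__main__":
    # Toy data, invented purely to show the shape of the output.
    demo = [
        {"exercise": "two-fer", "completion": "solution A", "passed": True},
        {"exercise": "two-fer", "completion": "solution B", "passed": False},
    ]
    for pair in build_preference_pairs(demo):
        print(pair)
```

Pairing within a single exercise keeps the prompt identical for the chosen and rejected completions, which is the shape most preference-based trainers expect.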
