tm23
tm23hgf
AI & ML interests
None yet
Recent Activity
updated a model 10 days ago
tm23hgf/anime-sdxl-lora
published a model 10 days ago
tm23hgf/anime-sdxl-lora
commented on an article 12 days ago
Strand-Rust-Coder-v1: Rust Coding Model Fine-Tuned on Peer-Ranked Synthetic Data
Organizations
None yet
commented on Strand-Rust-Coder-v1: Rust Coding Model Fine-Tuned on Peer-Ranked Synthetic Data 12 days ago
Awesome work! I am going to start some research on reasoning SLMs for Rust and wanted to know: is the dataset publicly released?
Not a good dataset
2
#2 opened 5 months ago by tm23hgf
commented on Mixture of Experts Explained 6 months ago
The Chinchilla paper actually shows that, for a fixed compute budget, it is better to train a smaller model on more data than to train a larger model for fewer steps.
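To make the trade-off in this comment concrete, here is a minimal sketch of how a fixed compute budget could be split between parameters and tokens. It relies on two commonly quoted approximations that are assumptions here, not something stated in the comment: training cost C ≈ 6·N·D FLOPs for N parameters and D tokens, and a compute-optimal ratio of roughly 20 tokens per parameter. The function name `compute_optimal_split` is hypothetical.

```python
import math

# Assumption: training cost is roughly 6 FLOPs per parameter per token (C ~ 6 * N * D).
FLOPS_PER_PARAM_TOKEN = 6.0


def compute_optimal_split(compute_flops, tokens_per_param=20.0):
    """Split a FLOP budget into (n_params, n_tokens).

    Assumes C = 6 * N * D and D = tokens_per_param * N, so
    N = sqrt(C / (6 * tokens_per_param)). The default ratio of 20 is a
    rule of thumb, not an exact constant.
    """
    n_params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Illustrative budget: 6 * 70e9 params * 1.4e12 tokens.
    budget = 6 * 70e9 * 1.4e12
    n, d = compute_optimal_split(budget)
    print(f"params ~ {n:.3g}, tokens ~ {d:.3g}")
```

With the illustrative budget above, the split comes out at about 70B parameters and 1.4T tokens, which is the configuration reported for Chinchilla. Under these assumptions, quadrupling the budget doubles both the parameter count and the token count, rather than putting all the extra compute into a larger model.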
upvoted an article 6 months ago
Article
Mixture of Experts Explained
osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq, +4 more • 1.13k