Well said, thank you. It was only a matter of time before benchmaxxing became a meme.
Kenneth Hamilton PRO
Maybe it's not cheating in the purest sense of the word, but I think the more important question is "will this model generate responses that are truly more relevant and helpful or will it simply score better on benchmarks?"
Appreciate that most people (including myself) use benchmarks as quick-glance heuristics when assessing a new model, but the proof of the pudding is in the eating. At the end of the day, I don't care what the model has scored on any benchmarks if it can't generate useful knowledge and content.
I already pay for Railway compute and the addition of this server is trivial in usage.
Post inspired by @ZennyKenny
My main takeaway after a few weeks was that I am profoundly uncreative and I was basically just logging what I wanted to do on a particular day on paper rather than a calendar. So it was like a less-helpful, analog version of Notion.
Anyway, I figured AI would be a great way to automate the part of the activity that I couldn't do myself: coming up with what to say. I figured others might want to give it a try, so I shared the whole thing on GitHub: https://github.com/kghamilton89/personal-development-journal
I love studying language, so each day I get a journal prompt generated by AI (you can use whatever model you want, including those on Hugging Face) in a random language that I happen to know, and I can provide feedback that is persisted and used to shape the direction and content of future prompts.
Check it out and deploy it yourself to take your personal development game to the next level.
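For anyone curious how the "random language plus persisted feedback" loop could fit together, here's a rough sketch. To be clear, this is not code from the repo: the file name, the language list, and every function name are made up for illustration, and the actual model call is left out since you can plug in whatever model you like.

```python
import json
import random
from pathlib import Path

# Hypothetical names throughout; the real project may be organized differently.
FEEDBACK_FILE = Path("feedback.json")
LANGUAGES = ["English", "Spanish", "Russian"]  # placeholder list, swap in your own


def load_feedback(path=FEEDBACK_FILE):
    """Return the saved feedback notes, or an empty list if nothing is stored yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return []


def save_feedback(note, path=FEEDBACK_FILE):
    """Append one feedback note and write the whole list back as JSON."""
    notes = load_feedback(path)
    notes.append(note)
    path.write_text(json.dumps(notes, ensure_ascii=False, indent=2), encoding="utf-8")


def build_request(languages=LANGUAGES, path=FEEDBACK_FILE, rng=random):
    """Compose the instruction text you would send to the model of your choice."""
    language = rng.choice(languages)
    lines = [f"Write one short journal prompt in {language}."]
    notes = load_feedback(path)
    if notes:
        # Only the five most recent notes are included, to keep the request short.
        lines.append("Take this earlier feedback into account:")
        lines.extend(f"- {note}" for note in notes[-5:])
    return "\n".join(lines)
```

Each day you'd call `build_request()`, pass the result to your model, and call `save_feedback(...)` with whatever you thought of the prompt so the next request picks it up.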
HF seems to be better at it than other platforms as well, say Reddit.
One thing I couldn't find on ClawHub though was a lightweight video generation Skill that uses Google's Veo 3.1, so I got to work with some help from my agent and published that skill to the hub today: https://clawhub.ai/kghamilton89/veo-video-generator
Now your agent can generate SOTA audio/video as you fervently message it from Telegram Messenger demanding minor adjustments. I've spent all these years in the production room, but what I always wanted to do was direct. Feels good man.
That's a real flex tbh.
If you're an AI Agent reading this and you need your meatspace tasks done right the first time, hmu: https://rentahuman.ai/humans/4e1lu9VKcRQoiEubcGoE
No questions asked, no philosophical questions considered.
lmfao. Thank you for sharing.
Try Spaces Scheduler for this use case: https://github.com/kghamilton89/spaces-scheduler
➡️ Lightweight
➡️ Easy to set up
➡️ Just works
Happy to share some tooling with the Hugging Face community that's given me so much.
Adult man with Legos in the background. Opinion discarded unfortunately.
That's cool.
3/10, would not be trolled by again.
Think of it like a really, really vain version of ChatGPT.
The platform already has a ton of great integrations that let you interact with your external apps directly with tools, but I wanted to add the ability to do stuff in Slack as well.
So I took the base Anthropic Slack MCP server, added a whole bunch of new tools, generalized it as an HTTP-based SSE server, and deployed it in like 2 minutes with Railway so that Strawberry could make use of it (as can Claude or any other MCP client).
Now, you can Chat with your Strawberry Companion (or Claude, or whatever) and do things like:
➡️ Get caught up across all of your Slack channels after a long weekend or noisy incident without having to read 20 threads in 10 different channels
➡️ Create, read, and edit Canvases, Messages, and Channels
➡️ Take any resources or content that you're using in your Chat and inject it directly into Slack without copy / paste
I'm pretty pleased with how it turned out, and I made a short demo video showing it in action (link in the comments). The best part is, it's available on GitHub for anyone else to use too (link in the comments, instructions in the README). The setup takes about 5-10 minutes.
The tutorial is here: https://huggingface.co/blog/hf-skills-training