When will this leaderboard update the evaluations of the new models?

#4
by JaxonJP - opened

Thank you.

HPAI@BSC (High Performance Artificial Intelligence at Barcelona Supercomputing Center) org

Hi @JaxonJP

We try to update the leaderboard on a monthly basis. Which new models are you interested in seeing added, and on which benchmarks?

Thank you for maintaining this project. I’ve been following it closely, especially because I’m very interested in the application of SOTA models to IC design.

For future updates, I’d especially like to see evaluations of the following models:

Open-source models:

  • Kimi 2.5
  • MiniMax 2.5
  • GLM 5
  • Step-3.5-Flash

Closed-source models:

  • GPT-5.4 / 5.3 / 5.2

It would be great to compare their performance on the current benchmark suite in the leaderboard.

This month is almost over; will the usual monthly update still proceed as scheduled?

HPAI@BSC (High Performance Artificial Intelligence at Barcelona Supercomputing Center) org

FYI @JaxonJP

We just added the following models:

  • Kimi-K2.5
  • GLM-5-FP8
  • IndustrialCoder 32B
  • IndustrialCoder 32B Thinking

on the following benchmarks:

  • Spec-to-RTL task:
    • VerilogEval 2.0
    • RTLLM 2.0
  • Code Completion task:
    • VerilogEval 2.0
    • VeriGen
  • Line Completion task:
    • RTL-Repo

Both Kimi-K2.5 and GLM-5 are SOTA with respect to the models we had previously, finally surpassing DeepSeek-R1-0528. IndustrialCoder 32B is the new best RTL-specialized model.
All models were run on the latest stable vLLM release (0.18).
We are working to add all of the above models to the "NotSoTiny" benchmark; after that, we will close out this release.

Thanks for noticing our work,

Thanks for the update; this is great progress.

One quick question: when do you expect testing of Gemma 4 31B-it to land on the roadmap?

It was just released on April 2, and based on the official information so far, its performance looks very strong, potentially competitive with Kimi 2.5. I'm especially eager to see how it performs on IC design tasks.
