[NEW] HuggingChat Omni
#764
by victor (HuggingChat org) • edited Oct 16, 2025

Introducing: HuggingChat Omni πŸ’«

HuggingChat returns and it's smarter and faster than ever πŸš€

Stop picking models. Start chatting.

Available now for all Hugging Face users. Free users can use their inference credits; PRO users get 20x more credits.

🧭 Omni: the new default routing model

When you send a message, Omni analyzes what you need and routes you to the best model for that specific task.
Each route uses the best model for its task. You see which model handled your request while it streams.

πŸ“Š Examples

| What you ask | Route | Model |
|---|---|---|
| "Help me decide between two job offers. One pays 20% more but requires relocation." | decision_support | deepseek-ai/DeepSeek-R1-0528 |
| "Create a React component for an image carousel with lazy loading" | code_generation | Qwen/Qwen3-Coder-480B-A35B-Instruct |
| "Write a short mystery story set in a lighthouse during a storm" | creative_writing | moonshotai/Kimi-K2-Instruct-0905 |
| "Translate this to French: The meeting has been rescheduled to next Tuesday" | translation | CohereLabs/command-a-translate-08-2025 |

βš™οΈ Under the hood

Omni uses a policy-based routing system. Each route has:

  • A clear description of what it handles
  • A primary model best suited for that task
  • Fallback models if the primary is unavailable

The router model analyzes your conversation and picks the matching route. It is fast (10-second timeout) and runs on every message. Credits to Katanemo for their routing model: katanemo/Arch-Router-1.5B
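The route-plus-fallback policy described above can be sketched in a few lines. Everything here (the `Route` interface, `pickModel`, the availability check) is illustrative, not the actual chat-ui code; the route and model names mirror the examples in this post:

```typescript
// Illustrative sketch of policy-based routing with fallbacks.
// Names and shapes are assumptions; the real chat-ui code may differ.
interface Route {
  name: string;
  description: string;
  primaryModel: string;
  fallbackModels: string[];
}

const routes: Route[] = [
  {
    name: "code_generation",
    description: "Writing, reviewing, or debugging source code",
    primaryModel: "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    fallbackModels: ["moonshotai/Kimi-K2-Instruct-0905"],
  },
  {
    name: "translation",
    description: "Translating text between languages",
    primaryModel: "CohereLabs/command-a-translate-08-2025",
    fallbackModels: [],
  },
];

// Given the route name chosen by the router model, return the first
// available model: the primary if it is up, otherwise a fallback.
function pickModel(
  routeName: string,
  isAvailable: (model: string) => boolean,
): string | undefined {
  const route = routes.find((r) => r.name === routeName);
  if (!route) return undefined;
  return [route.primaryModel, ...route.fallbackModels].find(isAvailable);
}
```

The router model only has to emit a route name; model selection and fallback stay a plain lookup, which is why a small 1.5B router can front much larger models.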

✨ What else is new

  • Background generation tracking: Multiple conversations can generate at the same time. Switch between tabs and the app tracks what's still generating. Updates appear automatically when responses finish.
  • Better streaming: Text renders faster and smoother. The app only updates what changed instead of re-rendering everything. Less flickering, especially in long responses with code blocks.
  • Better UX: refined throughout the app, with fewer bugs and rough edges. Code previews, polished streaming, and more attention to detail everywhere.
  • Speed optimizations: Sessions stay active longer with automatic token refresh. Response times improved across the board. The whole app feels faster.

πŸ› οΈ Run it yourself

HuggingChat is of course still 100% open source. It has never been easier to self-host your own instance.

Quick setup:

```shell
git clone https://github.com/huggingface/chat-ui
cd chat-ui
npm install
npm run dev
```

Only three environment variables need to be set in .env to get it working:

  • MONGODB_URL - Your MongoDB connection
  • OPENAI_API_KEY - Your API key
  • OPENAI_BASE_URL - Your endpoint URL
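For example (placeholder values; OPENAI_BASE_URL can point at any OpenAI-compatible endpoint):

```shell
# .env — placeholder values, adjust for your own setup
MONGODB_URL=mongodb://localhost:27017
OPENAI_API_KEY=hf_xxxxxxxxxxxx
OPENAI_BASE_URL=https://router.huggingface.co/v1
```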

You can also configure your own routes in a JSON file. Each route defines which models to use for specific tasks.
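The exact schema is documented in the repo; as a rough sketch, such a routes file could look like the following (field names here are assumptions, so check the repo for the real format):

```json
{
  "routes": [
    {
      "name": "code_generation",
      "description": "Writing, reviewing, or debugging source code",
      "primary_model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
      "fallback_models": ["moonshotai/Kimi-K2-Instruct-0905"]
    },
    {
      "name": "translation",
      "description": "Translating text between languages",
      "primary_model": "CohereLabs/command-a-translate-08-2025",
      "fallback_models": []
    }
  ]
}
```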

Check out the repo: github.com/huggingface/chat-ui

Hope you are as excited as we are about HuggingChat Omni! Please share your feedback and ideas in this thread πŸ€—


Is it possible to import my conversations from the previous version of HuggingChat?

Yeah, this dumbing down of the system was totally worth nuking everyone's logs and assistants...? The performance improvements are nice if true, but how can you call this a better UX when so many basic features from the last version are missing? Even simple settings are gone: no options to delete or edit output, no way to tweak temperature or repetition penalty settings, or give different chats different system prompts??

wow, I'm kind of surprised it's back. feels like a bit of a downgrade, but I'm assuming it was a complete rework? hoping that more QoL features will be reintroduced.


we're so back

edit:
never mind, can't delete the conversation branch like before 😢

edit 2:
and it now has a limit. It's been over six hours and I still can't continue the conversation 😭

Thanks for getting this running

❌ Can't use assistants
❌ Can't generate images
❌ Can't edit conversations
❌ Can't search the web
❌ Can't change temperature
❌ Can't import your old conversations
βœ… You now have to pay to use it πŸ˜‚

What determines the listing order of the models? It would be nice to see what models are newest without checking their individual pages.

Most trending models on the hub should appear first

shouldn't that list be filtered based on the providers chosen in settings?
it seems rather intuitive to be ...

for those seeking a workaround to this rather counter-intuitive UI: pick a specific provider for each of the models available from your chosen providers. That way the provider's logo will appear in the list, making selection easier, ending with something like this:
[screenshot of the model list with provider logos]

Just answer me this one question: could you add a /chat/legacy page?

We gonna get any answers to our questions, orrr?

Will there be a way to store most used models in some kind of favorites area, and create more than one instance of them with different system prompts for different tasks?

Hi, I'm an active HuggingChat user. I originally subscribed mainly for this product because, when it launched, I really liked the workflow the models follow to formulate their responses through the chat's design: a sequential hybrid flow between chat and agent. However, I now feel it's falling behind on basic features, like the option to organize chats into folders that contain the context of an entire project. There are a few ideas that I believe, once implemented, would make this chat interface one of the best:

Chat organization into projects or folders
Automatic organization mode (voluntary only) with suggestions that users can accept or reject, powered by vector similarity based on each chat's content. Plus, a separate graphical visualization mode like Obsidian offers.

More ambitious approaches that would likely be delegated to other organizations to maintain their model development merits and business model options:

Optional integration with Cline (HF's own CLI) with security sandbox

Ability to integrate local model usage into the chat via the HF inference ecosystem, like Transformers.js and WebGPU

Integration with smolagents in local or remote mode

An OpenRouter-style payment management ecosystem and credit administration so that organizations developing models and offering hosting can have a revenue source, making the open model a commercially solid system.

Why is it telling me a model isn't available anymore when I can still see it and pick it on the "all models available" page?

Also if a model has been removed is there any way of picking a new one without having to start a whole new chat?

HuggingChat org

Why is it telling me a model isn't available anymore when I can still see it and pick it on the "all models available" page?

example? @MadderHatterMax

Dear Team,

To ensure our server environment remains clean, stable, and that usage quotas are maintained consistently, please implement the following measures:

  • Block and delete all account-washing bots.
  • Block accounts using virtual credit cards or prepaid cards.
  • Block accounts with irregular IP hopping.
  • Block accounts running automated scripts.
  • Block accounts operating 24/7 without human activity.

All such accounts must be blocked, deleted, and disabled to safeguard the platform’s proper operation.

Thank you for your cooperation! πŸ™

Best regards,
