Does this work in Linux?
#1
by MLDataScientist - opened
Hi,
Thanks for uploading this. I see the model files are in safetensors format. Can we run the model on Linux with Nvidia GPUs? Can we use the same script you provided to run it?
Let me know.
Thanks!
I don't think it will work. The weights were converted for MLX, so loading them with transformers will most likely raise an error. I think the best way would be to use load_in_4bit in transformers via bitsandbytes. Also, you will need about 70GB of VRAM to load the 4-bit model.
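As a rough sanity check on the 70GB figure, here is a back-of-the-envelope sketch. It assumes DBRX's publicly stated total of 132B parameters (a number that is not given in this thread) and counts only the quantized weights, so it is a lower bound rather than a measurement.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: DBRX total parameter count (all experts), from the public release.
DBRX_TOTAL_PARAMS = 132e9

weights_gb = quantized_weight_gb(DBRX_TOTAL_PARAMS, 4)
print(f"4-bit weights alone: about {weights_gb:.0f} GB")
```

Quantization metadata, activations, and the KV cache come on top of the raw weights, which fits with the roughly 70GB quoted above.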
I think the best way would be to follow the discussions here: https://huggingface.co/databricks/dbrx-instruct/discussions/10
You can probably try it already; it might work as is with load_in_4bit=True.
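For reference, a minimal sketch of what that attempt could look like on a Linux machine with Nvidia GPUs. It is untested against DBRX and rests on several assumptions: that you load the original databricks/dbrx-instruct repo rather than this MLX conversion, that torch, transformers, accelerate, and bitsandbytes are installed, that you have access to the repo, and that your transformers version supports the DBRX architecture (older versions may need trust_remote_code=True). The function name load_dbrx_4bit is made up for this example.

```python
MODEL_ID = "databricks/dbrx-instruct"  # assumption: the original weights, not the MLX conversion


def load_dbrx_4bit(model_id: str = MODEL_ID):
    """Hypothetical helper: load a causal LM in 4-bit through bitsandbytes."""
    # Heavy imports are deferred so that defining the helper needs no GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Equivalent in spirit to load_in_4bit=True, expressed as a quantization config.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate spread layers across the available GPUs
    )
    return tokenizer, model
```

Calling load_dbrx_4bit() downloads the full checkpoint and quantizes it at load time, so expect a long first run. Whether it actually works for DBRX is exactly what the linked discussion is about.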
eek changed discussion status to closed