Does this work in Linux?

by MLDataScientist - opened Mar 29, 2024

Mar 29, 2024

Hi,
Thanks for uploading this. I see the model files are in safetensors format. Can we run the model on Linux with Nvidia GPUs? Can we use the same script you provided to run it?
Let me know.
Thanks!

eek

MLX Community org Mar 29, 2024

edited Mar 29, 2024

I don't think it will work. This model is converted for MLX usage, so it will most likely error when you try to load it with transformers. I think the best way would be to use load_in_4bit in transformers via bitsandbytes. Also, you will need 70GB of VRAM to load the 4-bit model.
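For a sense of where a number like 70GB comes from, here is a back-of-the-envelope sketch. It assumes DBRX's published total of 132B parameters and counts only the raw 4-bit weights; quantization scales, activations, and the KV cache are not included, so the real requirement will be somewhat higher. The function name is just made up for illustration.

```python
def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9


# Assumption: 132B total parameters for DBRX, stored at 4 bits each.
DBRX_TOTAL_PARAMS = 132e9
weights_gb = estimate_weight_gb(DBRX_TOTAL_PARAMS, 4)
print(f"4-bit weights only: ~{weights_gb:.0f} GB")
```

That gives roughly 66 GB for the weights by themselves, which is in line with the 70GB figure once some overhead is added on top.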

For the details, I think the best way would be to follow the discussion here: https://huggingface.co/databricks/dbrx-instruct/discussions/10

And I think you can already try it. It might already work with load_in_4bit=True.
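A minimal sketch of what that attempt could look like on Linux with Nvidia GPUs, assuming transformers, accelerate, and bitsandbytes are installed and that you have access to the original databricks/dbrx-instruct repo (not this MLX conversion). This is untested, and whether 4-bit loading actually works for DBRX is exactly what the linked discussion is about. The function name is hypothetical.

```python
def load_dbrx_4bit(model_id: str = "databricks/dbrx-instruct"):
    """Try to load the original (non-MLX) checkpoint in 4-bit via bitsandbytes."""
    # Imports live inside the function so the file can be read or imported
    # on a machine without a GPU stack installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Equivalent in spirit to load_in_4bit=True, expressed as a config object.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # spread layers across the available GPUs
        trust_remote_code=True,
    )
    return tokenizer, model
```

Calling `load_dbrx_4bit()` downloads the full checkpoint, so expect it to take a while. If it errors or runs out of memory, the linked discussion is the place to check for the current status.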

eek changed discussion status to closed Apr 9, 2024
