Instructions to use martossien/mistrall-24b-css_Instruct-2501_GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="martossien/mistrall-24b-css_Instruct-2501_GGUF", filename="mistral-24b-css-q4_k_m.gguf", )
llm.create_chat_completion( messages = "\"I like you. I love you\"" )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
Use Docker
docker model run hf.co/martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with Ollama:
ollama run hf.co/martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
- Unsloth Studio new
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for martossien/mistrall-24b-css_Instruct-2501_GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for martossien/mistrall-24b-css_Instruct-2501_GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for martossien/mistrall-24b-css_Instruct-2501_GGUF to start chatting
- Docker Model Runner
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with Docker Model Runner:
docker model run hf.co/martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
- Lemonade
How to use martossien/mistrall-24b-css_Instruct-2501_GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull martossien/mistrall-24b-css_Instruct-2501_GGUF:Q4_K_M
Run and chat with the model
lemonade run user.mistrall-24b-css_Instruct-2501_GGUF-Q4_K_M
List all available models
lemonade list
Model Card for Mistral-Small-24B-Instruct-2501
Mistral Small 3 ( 2501 ) sets a new benchmark in the "small" Large Language Models category below 70B, boasting 24B parameters and achieving state-of-the-art capabilities comparable to larger models! This model is an instruction-fine-tuned version of the base model: Mistral-Small-24B-Base-2501.
Mistral Small can be deployed locally and is exceptionally "knowledge-dense", fitting in a single RTX 4090 or a 32GB RAM MacBook once quantized. Perfect for:
Fast response conversational agents.
Low latency function calling.
Subject matter experts via fine-tuning.
Local inference for hobbyists and organizations handling sensitive data.
For enterprises that need specialized capabilities (increased context, particular modalities, domain specific knowledge, etc.), we will be releasing commercial models beyond what Mistral AI contributes to the community.
This release demonstrates our commitment to open source, serving as a strong base model.
Learn more about Mistral Small in our blog post.
Model developper: Mistral AI Team
this version have a finetuning of dataset : louisbrulenaudet / code-securite-sociale
It's my firts version
- Downloads last month
- 40
4-bit
8-bit
Model tree for martossien/mistrall-24b-css_Instruct-2501_GGUF
Base model
mistralai/Mistral-Small-24B-Base-2501