🚨 IMPORTANT: RE-DOWNLOAD tokenizer_config.json IF YOU GOT INFINITE LOOPS

Fixed 2026-04-13. Earlier versions of this repo had a tokenizer bug that caused the model to loop forever in stock mlx_lm and other loaders.

Gemma-4 emits <end_of_turn> (id 106) at the end of an assistant turn, but the original tokenizer_config.json listed only <eos> (id 1) as the stop token. Stock loaders never detected the actual end-of-turn marker, so generation looped forever.
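To see why this hangs rather than erroring, here is a minimal decode-loop sketch (hypothetical loader logic, not mlx_lm's actual code): if the stop set contains only id 1, the token the model actually emits at end of turn (106) never triggers the break.

```python
# Hypothetical sketch of a loader's decode loop -- not mlx_lm's actual code.
# With only <eos> (id 1) in the stop set, <end_of_turn> (id 106) never
# matches, so the loop runs until max_tokens (or forever in a real server).

def decode(next_token, stop_token_ids, max_tokens=8):
    out = []
    for _ in range(max_tokens):
        tok = next_token()
        if tok in stop_token_ids:  # only triggers for ids in the stop set
            break
        out.append(tok)
    return out

# The model keeps emitting <end_of_turn> (106) after its answer:
stream = iter([42, 7, 106, 106, 106, 106, 106, 106])
broken = decode(lambda: next(stream), stop_token_ids={1})      # never stops
print(broken)  # [42, 7, 106, 106, 106, 106, 106, 106]

stream = iter([42, 7, 106, 106])
fixed = decode(lambda: next(stream), stop_token_ids={1, 106})  # stops at 106
print(fixed)   # [42, 7]
```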

How to fix:

Option A: Re-download just the tokenizer config (fastest)

huggingface-cli download JANGQ-AI/Gemma-4-31B-it-JANG_4M tokenizer_config.json --local-dir ./your-model-dir

Option B: Re-download the whole repo

huggingface-cli download JANGQ-AI/Gemma-4-31B-it-JANG_4M --local-dir ./your-model-dir

Option C: Pass the stop tokens manually

stop_token_ids = [1, 106, 50]  # <eos>, <end_of_turn>, <end_of_image>
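If your loader can't take stop ids at runtime, you can instead patch the local config yourself. A minimal sketch, assuming the standard Hugging Face tokenizer_config.json layout (the eos_token field name is a convention, not taken from this repo's file):

```python
import json
from pathlib import Path

def set_stop_token(config_path, stop_token="<end_of_turn>"):
    """Point eos_token at the end-of-turn marker so stock loaders stop on it.

    Assumes the standard HF tokenizer_config.json "eos_token" field.
    """
    path = Path(config_path)
    cfg = json.loads(path.read_text())
    cfg["eos_token"] = stop_token  # assumed field name (HF convention)
    path.write_text(json.dumps(cfg, indent=2))
    return cfg

# Example (path is a placeholder for your local copy):
# set_stop_token("./your-model-dir/tokenizer_config.json")
```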

The model weights are unchanged; you only need to update tokenizer_config.json.
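After updating, you can sanity-check which stop token your local config declares. A hedged check: the field name follows the usual tokenizer_config.json convention, and the maintainers' fix may have changed a different field than the one read here.

```python
import json

def declared_eos(config_path):
    # Return the eos_token the config advertises to stock loaders.
    # Assumes the conventional "eos_token" field in tokenizer_config.json.
    with open(config_path) as f:
        return json.load(f).get("eos_token")

# Example (path is a placeholder for your local copy):
# declared_eos("./your-model-dir/tokenizer_config.json")
# A config still reporting plain <eos> likely predates the fix.
```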


Gemma-4-31B-it-JANG_4M

JANG quantized Gemma-4 MoE for Apple Silicon. See JANGQ-AI for the full collection.

Downloads last month: 2,247
Model size: 6B params (Safetensors)
Tensor types: U32 · F16
Format: MLX