I'm currently trying to load a model with MLX, but when I load the model and use the tokenizer, I get the following warning:
with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
But I don't understand how to set that flag in my code:
from mlx_lm import load, generate

out = load("mlx-community/translategemma-12b-it-4bit")
if len(out) == 2:
    model, tokenizer = out
else:
    model, tokenizer, struct = out

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True
)
text = generate(model, tokenizer, prompt=prompt, verbose=True)