On the mtbench settings
Hi,
I am trying to get the per-category mtbench results for this model, since they were not published in this repo, but I ran into an issue:
When I run mtbench on this model, I guess something is wrong, because the model mostly answers in English on the Finnish mtbench. Did you modify some settings?
My command, from the latest LumiOpen/FastChat:
!python gen_model_answer.py --model-path /home/jovyan/work/FastChat/Llama-Poro-2-8B-Instruct --model-id LumiOpen/Llama-Poro-2-8B-Instruct --lang fi --max-new-token 2048
Testing starts like this:
CONV:
Conversation(name='poro', system_template='<|im_start|>system\n{system_message}', system_message='', system_message_vision='', roles=('<|im_start|>user', '<|im_start|>assistant'), messages=[], offset=0, sep_style=<SeparatorStyle.CHATML: 10>, sep='<|im_end|>', sep2=None, stop_str=None, stop_token_ids=[4, 5], max_image_size_mb=None)
PROMPT:
<|im_start|>system
<|im_end|>
<|im_start|>user
Miten Alpit ja Rein-joki ovat vaikuttaneet Länsi-Euroopan asutukseen ja maatalouteen? Luettele kolme vaikutusta.<|im_end|>
<|im_start|>assistant
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
OUTPUT:
The Alps and the Rhine River have had a significant impact on the settlement and agriculture of Western Europe. Here are three key effects:
1 Natural Barrier and Corridor: The Alps acted as a natural barrier, separating Western Europe from Eastern Europe and the rest of Central Europe. However, they also created a corridor for trade and migration along the Rhine River, which flows from the Swiss Alps through Germany and into the North Sea. This corridor facilitated the movement of people, goods, and ideas, leading to the development of trade networks and cultural exchange.
2 Agricultural Limitations and Opportunities: The Alps presented significant agricultural challenges due to their rugged terrain and high altitudes. However, the valleys and plains along the Rhine River offered fertile land suitable for farming. This led to the development of settled agriculture and the establishment of communities in these areas. The availability of arable land and water resources supported the growth of populations and the emergence of complex societies.
3 Settlement Patterns and Cultural Diversity: The Alps and the Rhine River influenced settlement patterns in Western Europe, leading to a diverse range of cultural traditions. The mountainous regions were often inhabited by isolated communities with distinct cultures, while the river valleys attracted larger populations and facilitated cultural exchange between different groups. This diversity contributed to the rich cultural heritage of Western Europe.
Hi,
Poro2 uses the Llama-3 chat template, which is different from the ChatML template we used for the original Poro. You can use the llama-3 template specified in conversation.py:
https://github.com/LumiOpen/FastChat/blob/main/fastchat/conversation.py#L1592
https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py#L1752
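To make the difference concrete, here is a minimal sketch of how the two prompt formats are laid out. The helper names `build_chatml_prompt` and `build_llama3_prompt` are hypothetical and are not part of FastChat. The ChatML layout mirrors the PROMPT dump earlier in this thread, and the Llama-3 layout follows the publicly documented Llama 3 special tokens. The chat template shipped with the model's tokenizer remains the authoritative version.

```python
# Illustrative only: hypothetical helpers, not FastChat or Transformers APIs.

def build_chatml_prompt(messages):
    """ChatML-style layout, as used by the original Poro template."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open the assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def build_llama3_prompt(messages):
    """Llama-3-style layout (assumed from the public Llama 3 prompt format)."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    msgs = [{"role": "user", "content": "Kuka sinä olet?"}]
    print(build_chatml_prompt(msgs))
    print(build_llama3_prompt(msgs))
```

A model trained on the second layout sees the `<|im_start|>` markers of the first as ordinary text rather than turn delimiters, which would be consistent with the off-language answers reported above.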
We will update the Poro2 model cards for the SFT and Instruct models with the per-category and per-turn MTBench scores. Thank you for pointing this out!