Google TPUs documentation
Supported Models
Inference
The following LLMs have been tested and validated for text-generation inference on TPU v5e and v6e:
- 🦙 LLaMA Family
  - LLaMA-2 7B
  - LLaMA-3 8B, 70B
  - LLaMA-3.1 8B, 70B
  - LLaMA-3.2 1B, 3B (text-only models)
  - LLaMA-3.3 70B
- 💎 Gemma Family
  - Gemma 2B, 7B
- 💨 Mistral Family
  - Mistral 7B
  - Mixtral 8x7B
Fine-tuning
The following models have been tested and validated for fine-tuning on TPU v5e and v6e:
- 🦙 LLaMA Family
  - LLaMA-2 7B
  - LLaMA-3 8B
  - LLaMA-3.2 1B
- 💎 Gemma Family
  - Gemma 2B
  - Gemma 7B
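
As a quick way to check a model against the two lists above, the support matrix can be transcribed into a small lookup table. This is only an illustrative sketch: `VALIDATED` and `is_validated` are hypothetical names and are not part of Optimum-TPU, the keys use a normalized `LLaMA-x.y` spelling, and the data simply mirrors the lists on this page for TPU v5e and v6e.

```python
# Hypothetical lookup table transcribed from the lists on this page.
# It is NOT an Optimum-TPU API; the names here are illustrative assumptions.
VALIDATED = {
    "inference": {
        "LLaMA-2": {"7B"},
        "LLaMA-3": {"8B", "70B"},
        "LLaMA-3.1": {"8B", "70B"},
        "LLaMA-3.2": {"1B", "3B"},  # text-only models
        "LLaMA-3.3": {"70B"},
        "Gemma": {"2B", "7B"},
        "Mistral": {"7B"},
        "Mixtral": {"8x7B"},
    },
    "fine-tuning": {
        "LLaMA-2": {"7B"},
        "LLaMA-3": {"8B"},
        "LLaMA-3.2": {"1B"},
        "Gemma": {"2B", "7B"},
    },
}


def is_validated(task: str, model: str, size: str) -> bool:
    """Return True if (model, size) appears in the list for the given task."""
    return size in VALIDATED.get(task, {}).get(model, set())


if __name__ == "__main__":
    # Mixtral 8x7B is listed for inference but not for fine-tuning.
    print(is_validated("inference", "Mixtral", "8x7B"))
    print(is_validated("fine-tuning", "Mixtral", "8x7B"))
```

A model missing from the table has not necessarily been shown to fail; it only means it does not appear in the validated lists above.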