The checkpoint open-sourced here was trained by Papercup using the open-source PL-BERT model (https://github.com/yl4579/PL-BERT). It was trained for use with StyleTTS2 (https://github.com/yl4579/StyleTTS2). The model card lists the languages it was trained on; they correspond to the languages in the crowdsourced dataset at https://huggingface.co/datasets/styletts2-community/multilingual-phonemes-10k-alpha.
Notable differences compared to the default PL-BERT checkpoint and config available here:
- The tokenizer is now bert-base-multilingual-cased.
- The model was trained on styletts2-community/multilingual-phonemes-10k-alpha for 1.1M iterations.
- The token_maps.pkl file has changed (also open-sourced here).
- The util.py file has changed to deal with an error when loading new_state_dict["embeddings.position_ids"].

To train StyleTTS2 with this PL-BERT checkpoint:
- Create a new folder under Utils in your StyleTTS2 repository. Call it, for example, PLBERT_all_languages.
- Copy config.yml, step_1100000.t7 and util.py into it.
- In your StyleTTS2 config, change PLBERT_dir to Utils/PLBERT_all_languages. You will also need to change your import as follows:
  - Before: from Utils.PLBERT.util import load_plbert
  - After: from Utils.PLBERT_all_languages.util import load_plbert
- Alternatively, you can replace the relevant files in Utils/PLBERT and not have to change any code.
- Use espeak to create a file in the same format as the ones that exist in the Data folder of the StyleTTS2 repository. Careful! You will need to change the language argument to phonemise your text if it is not in English. You can find the correct language codes here. For example, Latin American Spanish is es-419.

Voila, you can now train a multilingual StyleTTS2 model!
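The phonemisation step above can be sketched roughly as follows. This is a minimal illustration, not the script used by Papercup: it assumes the espeak-ng command-line tool is installed (the es-419 code is an espeak-ng voice name), and it assumes the files in the StyleTTS2 Data folder use one `wav_path|phonemes|speaker_id` entry per line, which you should check against your copy of the repository. The helper names `espeak_command`, `phonemize_with_espeak` and `build_list_lines` are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def espeak_command(text: str, voice: str) -> List[str]:
    """Build an espeak-ng call that prints IPA phonemes without playing audio."""
    # -v selects the language/voice, -q suppresses audio, --ipa prints IPA.
    return ["espeak-ng", "-v", voice, "-q", "--ipa", text]


def phonemize_with_espeak(text: str, voice: str) -> str:
    """Run espeak-ng (must be on PATH) and return the phonemes on one line."""
    result = subprocess.run(
        espeak_command(text, voice), capture_output=True, text=True, check=True
    )
    # espeak-ng may print several lines; collapse all whitespace to single spaces.
    return " ".join(result.stdout.split())


def build_list_lines(
    rows: Iterable[Tuple[str, str, int]],
    voice: str,
    phonemize: Callable[[str, str], str] = phonemize_with_espeak,
) -> List[str]:
    """Turn (wav_path, text, speaker_id) rows into 'wav|phonemes|speaker' lines.

    The three-field, pipe-separated layout is an assumption about the
    StyleTTS2 Data folder format; adjust it if your files differ.
    """
    return [
        f"{wav}|{phonemize(text, voice)}|{speaker}" for wav, text, speaker in rows
    ]
```

For Latin American Spanish you would call `build_list_lines(rows, voice="es-419")` and write the returned lines to your training and validation list files. The `phonemize` argument is injectable so that the formatting can be checked without espeak-ng installed.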
Thank you to Aaron (Yinghao) Li for these contributions.