Want to share my training set

by havok2 - opened Jun 30, 2025

Jun 30, 2025

Hey,
I am from Leipzig and I created around 12,000 German samples with ElevenLabs. I would like to share this dataset (wav+txt) with you. I am also able to create more with specific texts, such as mathematical formulas and so on. Furthermore, I trained my model on this set on my RTX 5090 for around 5 days. It still had some problems with word order, but the words themselves sound German enough to me. I then merged it with your model and was satisfied :D

SebastianBodza

Owner Jul 2, 2025

Hey, that would be super helpful 😊. I am currently preparing a bigger dataset.

How long are the audio files? And how could you share them?

havok2

Jul 2, 2025

Hello,
I have contacted you via LinkedIn. The samples vary in length: I generated samples ranging from a single word like “Hello!” to 2 minutes in some cases. I wasn't sure at first how long the samples should be. I trained my model for almost 7 days on my 5090, around 1 million passes. I don't know if the weights are of any use to you, but I can send them to you if you like. I found the merge of my model with yours at 65% quite good. Best regards.
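
In case it helps with the length question, here is a minimal sketch for summarizing clip durations in a wav+txt set. It assumes a flat folder where each `clip.wav` has a matching `clip.txt` transcript and the files are uncompressed PCM WAV; that layout and the function names are assumptions for illustration, not how the dataset is actually organized.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def summarize_durations(dataset_dir):
    """Summarize durations of every .wav that has a matching .txt transcript.

    Assumed (hypothetical) layout: dataset_dir/clip.wav + dataset_dir/clip.txt.
    """
    durations = []
    for wav_path in sorted(Path(dataset_dir).glob("*.wav")):
        # Skip audio files without a transcript; they cannot be used as pairs.
        if wav_path.with_suffix(".txt").exists():
            durations.append(wav_duration_seconds(wav_path))
    if not durations:
        return {"count": 0, "min_s": 0.0, "max_s": 0.0, "total_h": 0.0}
    return {
        "count": len(durations),
        "min_s": min(durations),
        "max_s": max(durations),
        "total_h": sum(durations) / 3600,
    }
```

Calling `summarize_durations("path/to/dataset")` returns the number of paired clips, the shortest and longest clip in seconds, and the total audio in hours, which is usually enough to decide whether very short or very long samples need filtering.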

cmp-nct

Jul 8, 2025

Multilingual:
https://www.openslr.org/94/ (audiobook based libritts)
https://github.com/freds0/CML-TTS-Dataset (more than 3000 hours, CC licensed)
German: TTS dataset from a university (high quality, 6 main speakers, I think 40-50 hours of studio quality recordings)
https://opendata.iisys.de/dataset/hui-audio-corpus-german/ (https://github.com/iisys-hof/HUI-Audio-Corpus-German)
https://github.com/thorstenMueller/Thorsten-Voice (11 hours, one person)

There is a lot of German speech data available; maybe some of it is useful.
