Voice Activity Detection
pyannote.audio
PyTorch
pyannote
pyannote-audio-model
audio
voice
speech
speaker
speaker-diarization
speaker-change-detection
speaker-segmentation
overlapped-speech-detection
resegmentation
How to use objects76/speaker-diarization-v1 with pyannote.audio:
```python
from pyannote.audio import Model, Inference
from pyannote.core import Segment

model = Model.from_pretrained("objects76/speaker-diarization-v1")
inference = Inference(model)

# inference on the whole file
inference("file.wav")

# inference on an excerpt
excerpt = Segment(start=2.0, end=5.0)
inference.crop("file.wav", excerpt)
```
```yaml
task:
  _target_: pyannote.audio.tasks.SpeakerDiarization
  duration: 10.0
  max_speakers_per_chunk: 3
  max_speakers_per_frame: 2
model:
  _target_: pyannote.audio.models.segmentation.PyanNet
  sample_rate: 16000
  num_channels: 1
  sincnet:
    stride: 10
  lstm:
    hidden_size: 128
    num_layers: 4
    bidirectional: true
    monolithic: true
  linear:
    hidden_size: 128
    num_layers: 2
```
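
The configuration values above imply a few derived quantities that are easy to check by hand. The sketch below uses only the Python standard library and does not call pyannote.audio. The helper names (`samples_per_chunk`, `powerset_classes`, `lstm_output_size`) are hypothetical and introduced only for illustration. It also assumes that setting `max_speakers_per_frame` makes the task use a powerset encoding, with one class per set of at most that many simultaneously active speakers; treat that as an assumption about this checkpoint rather than a confirmed property.

```python
from itertools import combinations

# Values copied from the configuration above.
DURATION_S = 10.0
SAMPLE_RATE = 16000
MAX_SPEAKERS_PER_CHUNK = 3
MAX_SPEAKERS_PER_FRAME = 2
LSTM_HIDDEN_SIZE = 128
LSTM_BIDIRECTIONAL = True


def samples_per_chunk(duration_s, sample_rate):
    """Number of audio samples in one training chunk."""
    return int(round(duration_s * sample_rate))


def powerset_classes(num_speakers, max_set_size):
    """Enumerate speaker subsets of size 0..max_set_size (assumed powerset encoding)."""
    classes = []
    for size in range(max_set_size + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


def lstm_output_size(hidden_size, bidirectional):
    """Feature size after the LSTM: a bidirectional LSTM concatenates both directions."""
    return hidden_size * (2 if bidirectional else 1)


chunk_samples = samples_per_chunk(DURATION_S, SAMPLE_RATE)
classes = powerset_classes(MAX_SPEAKERS_PER_CHUNK, MAX_SPEAKERS_PER_FRAME)
lstm_features = lstm_output_size(LSTM_HIDDEN_SIZE, LSTM_BIDIRECTIONAL)

print(chunk_samples)   # samples in a 10 s chunk at 16 kHz
print(len(classes))    # number of powerset classes under the assumption above
print(lstm_features)   # features passed from the LSTM to the linear layers
```

Under these assumptions, a 10-second mono chunk holds 160000 samples, the empty set plus three single speakers plus three speaker pairs gives 7 classes, and the bidirectional LSTM passes 256 features to the first linear layer.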