Refactor audio processing: Simplified the handling of audio chunks in prepare_and_save_audio_for_model and updated preprocess_from_task_json to support both single and multiple chunk tasks, enhancing flexibility in audio preparation.
6c3a671
liuyang committed on
fix field
64397b6
liuyang committed on
fix field key
9e14752
liuyang committed on
Refactor transcription methods to return results: Updated the transcribe_chunk and transcribe_segments methods to return their results instead of processing them directly, improving the flow of data handling in the WhisperTranscriber class.
646c8e8
liuyang committed on
update params
ba746a9
liuyang committed on
add fields
9b80850
liuyang committed on
fix typo
3dae8f9
liuyang committed on
add fields
5d0a1ef
liuyang committed on
fix bug
d45c437
liuyang committed on
fix bug
9fc1e97
liuyang committed on
download all models on startup
d29acc5
liuyang committed on
fix value
6bea290
liuyang committed on
Add audio diarization task to Gradio interface: Introduced a new button and function for audio diarization, allowing users to process audio with speaker separation. Updated existing button labels for clarity.
e79159f
liuyang committed on
Refactor model management and transcription process: Introduced a model registry for easier management of Whisper models, added functionality to download models on startup, and streamlined the audio processing pipeline to support both chunk and segment transcriptions with improved error handling and cleanup.
e3d9c9e
liuyang committed on
unmatched_diarization_segments
a4568c6
liuyang committed on
disable clip_timestamps
b68d580
liuyang committed on
disable unmatched_diarization_segments
f425ecd
liuyang committed on
update threshold
78d61ea
liuyang committed on
try use diarization as clip_timestamp
0b6cc7c
liuyang committed on
try use diarization as clip_timestamp
6475331
liuyang committed on
update threshold
c8b690c
liuyang committed on
fix bug
a4d86b2
liuyang committed on
Enhance transcription segment coverage calculation: Updated the overlap check to consider total coverage from all transcription segments, ensuring segments are re-transcribed if less than 85% of their duration is covered. This improves accuracy in identifying segments needing attention.
3a6e3af
liuyang committed on
Refine interval overlap calculation in transcription: Adjusted the overlap threshold to be based on segment duration, improving accuracy in detecting overlapping segments during speaker assignment.
928c477
liuyang committed on
Refactor speaker assignment logic in transcription: Enhanced the `assign_speakers_to_transcription` method to detect unmatched diarization segments and introduced a second pass for splitting segments with speaker changes. Improved handling of speaker transitions and added functionality to re-process unmatched segments.
7bde45c
liuyang committed on
logs
36812ab
liuyang committed on
Enhance speaker assignment in transcription: Introduced interval overlap calculations and smoothing techniques for improved accuracy in speaker labeling. Added methods for determining dominant speakers and stabilizing segment boundaries.
f800f63
liuyang committed on
enable result printing and comment out text cleanup regex
aa984fe
liuyang committed on
model control
caef0e2
liuyang committed on
remove prompt
744c18a
liuyang committed on
update prompt
ceb7ebf
liuyang committed on
add prompt
c543860
liuyang committed on
add prompt
c59adf8
liuyang committed on
fix bug
2861a47
liuyang committed on
disable batch
6a522dd
liuyang committed on
speech duration
2a0543d
liuyang committed on
less speech duration
6159f83
liuyang committed on
enable vad
51ab2c6
liuyang committed on
Refine VAD parameters and transcription options in WhisperTranscriber for improved audio processing. Adjust max speech duration, min speech duration, and silence duration, and set chunk length to 12 seconds.
5411f5d
liuyang committed on
update log, no vad
9f7c374
liuyang committed on
print log
d947708
liuyang committed on
add log
04482af
liuyang committed on
disable checksum
24dd7ba
liuyang committed on
update
d5d2af9
liuyang committed on
fix upload issue
e84487c
liuyang committed on
upload data
731f4bf
liuyang committed on
Add job_id and task_id handling in WhisperTranscriber to improve metadata management during audio processing. Update file key generation for intermediate uploads.
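
The coverage rule described in commit 3a6e3af (re-transcribe a segment when less than 85% of its duration is covered by all transcription segments combined) can be sketched roughly as follows. This is an illustration only, not the repository's code: the function names `merge_intervals`, `coverage_ratio`, and `segments_to_retranscribe` are hypothetical, segments are assumed to be plain `(start, end)` tuples in seconds, and the segments being checked are assumed to be diarization segments.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds); assumed representation


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals so shared time is not counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage_ratio(segment: Interval, transcribed: List[Interval]) -> float:
    """Fraction of `segment` covered by the union of `transcribed` intervals."""
    seg_start, seg_end = segment
    duration = seg_end - seg_start
    if duration <= 0:
        return 0.0
    covered = 0.0
    for start, end in merge_intervals(transcribed):
        # Length of the intersection; zero when the intervals are disjoint.
        covered += max(0.0, min(seg_end, end) - max(seg_start, start))
    return covered / duration


def segments_to_retranscribe(
    segments: List[Interval],
    transcribed: List[Interval],
    min_coverage: float = 0.85,  # threshold quoted in the commit message
) -> List[Interval]:
    """Return the segments whose total coverage falls below `min_coverage`."""
    return [s for s in segments if coverage_ratio(s, transcribed) < min_coverage]
```

For example, with segments `[(0.0, 10.0), (10.0, 20.0)]` and transcription intervals `[(0.0, 4.0), (3.0, 9.0), (10.0, 12.0)]`, the first segment is 90% covered and is kept, while the second is only 20% covered and would be flagged for re-transcription. Merging the transcription intervals first is a design choice in this sketch so that overlapping transcription segments do not inflate the coverage figure.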