Commit History

modify params
c97acaf

liuyang committed on

Refactor audio processing: Simplified the handling of audio chunks in prepare_and_save_audio_for_model and updated preprocess_from_task_json to support both single and multiple chunk tasks, enhancing flexibility in audio preparation.
6c3a671

liuyang committed on

fix field
64397b6

liuyang committed on

fix field key
9e14752

liuyang committed on

Refactor transcription methods to return results: Updated the transcribe_chunk and transcribe_segments methods to return their results instead of processing them directly, improving the flow of data handling in the WhisperTranscriber class.
646c8e8

liuyang committed on

update params
ba746a9

liuyang committed on

add fields
9b80850

liuyang committed on

fix typo
3dae8f9

liuyang committed on

add fields
5d0a1ef

liuyang committed on

fix bug
d45c437

liuyang committed on

fix bug
9fc1e97

liuyang committed on

download all models on startup
d29acc5

liuyang committed on

fix value
6bea290

liuyang committed on

Add audio diarization task to Gradio interface: Introduced a new button and function for audio diarization, allowing users to process audio with speaker separation. Updated existing button labels for clarity.
e79159f

liuyang committed on

Refactor model management and transcription process: Introduced a model registry for easier management of Whisper models, added functionality to download models on startup, and streamlined the audio processing pipeline to support both chunk and segment transcriptions with improved error handling and cleanup.
e3d9c9e

liuyang committed on
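
The model registry with startup downloads described in e3d9c9e could look roughly like the sketch below. It is an illustration only: `ModelRegistry`, `preload_all`, the `loader` callable, and the model names are hypothetical, and the real repository may organize this differently. The loader is injected so the example runs without any network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ModelRegistry:
    """Maps short model names to sources and caches loaded models (hypothetical)."""

    loader: Callable[[str], object]      # assumed: downloads/loads a model from a source id
    sources: Dict[str, str]              # model name -> source id
    _loaded: Dict[str, object] = field(default_factory=dict, init=False)

    def get(self, name: str) -> object:
        """Return the model for `name`, loading it on first use."""
        if name not in self.sources:
            raise KeyError(f"unknown model: {name}")
        if name not in self._loaded:
            self._loaded[name] = self.loader(self.sources[name])
        return self._loaded[name]

    def preload_all(self) -> None:
        """Load every registered model once, e.g. at application startup."""
        for name in self.sources:
            self.get(name)
```

Calling `preload_all()` once at startup moves the download cost out of the first request; later `get()` calls hit the cache.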

unmatched_diarization_segments
a4568c6

liuyang committed on

disable clip_timestamps
b68d580

liuyang committed on

disable unmatched_diarization_segments
f425ecd

liuyang committed on

update threshold
78d61ea

liuyang committed on

try use diarization as clip_timestamp
0b6cc7c

liuyang committed on

try use diarization as clip_timestamp
6475331

liuyang committed on

update threshold
c8b690c

liuyang committed on

fix bug
a4d86b2

liuyang committed on

Enhance transcription segment coverage calculation: Updated the overlap check to consider total coverage from all transcription segments, ensuring segments are re-transcribed if less than 85% of their duration is covered. This improves accuracy in identifying segments needing attention.
3a6e3af

liuyang committed on
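
The coverage rule from 3a6e3af (re-transcribe a segment when less than 85% of its duration is covered by transcription segments) can be sketched as follows. The function names and the tuple-based interval representation are assumptions made for illustration; only the 85% threshold comes from the commit message.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def covered_fraction(segment: Interval, transcribed: List[Interval]) -> float:
    """Fraction of `segment` covered by the union of `transcribed` intervals."""
    seg_start, seg_end = segment
    duration = seg_end - seg_start
    if duration <= 0:
        return 0.0
    # Clip each transcription interval to the segment and drop non-overlapping ones.
    clipped = sorted(
        (max(s, seg_start), min(e, seg_end))
        for s, e in transcribed
        if min(e, seg_end) > max(s, seg_start)
    )
    covered = 0.0
    cursor = seg_start  # everything before the cursor is already counted
    for s, e in clipped:
        s = max(s, cursor)  # avoid double-counting overlapping intervals
        if e > s:
            covered += e - s
            cursor = e
    return covered / duration


def needs_retranscription(
    segment: Interval, transcribed: List[Interval], min_coverage: float = 0.85
) -> bool:
    """True when total coverage falls below the threshold (85% per the commit)."""
    return covered_fraction(segment, transcribed) < min_coverage
```

Summing over the union of intervals, rather than checking each transcription segment separately, means several short transcription segments can jointly satisfy the threshold.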

Refine interval overlap calculation in transcription: Adjusted the overlap threshold to be based on segment duration, improving accuracy in detecting overlapping segments during speaker assignment.
928c477

liuyang committed on

Refactor speaker assignment logic in transcription: Enhanced the `assign_speakers_to_transcription` method to detect unmatched diarization segments and introduced a second pass for splitting segments with speaker changes. Improved handling of speaker transitions and added functionality to re-process unmatched segments.
7bde45c

liuyang committed on

logs
36812ab

liuyang committed on

Enhance speaker assignment in transcription: Introduced interval overlap calculations and smoothing techniques for improved accuracy in speaker labeling. Added methods for determining dominant speakers and stabilizing segment boundaries.
f800f63

liuyang committed on
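
As a rough illustration of the interval-overlap approach to picking a dominant speaker mentioned in f800f63, the sketch below sums each speaker's overlap with a transcription segment and returns the speaker with the largest total. The function names and the `(start, end, speaker)` turn format are assumptions, and the smoothing and boundary stabilization steps from the commit are not shown.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

Turn = Tuple[float, float, str]  # (start_seconds, end_seconds, speaker_label)


def interval_overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two intervals, or 0.0 if they are disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def dominant_speaker(seg_start: float, seg_end: float, turns: Iterable[Turn]) -> Optional[str]:
    """Speaker whose diarization turns overlap the segment the most, if any."""
    totals = defaultdict(float)
    for t_start, t_end, speaker in turns:
        totals[speaker] += interval_overlap(seg_start, seg_end, t_start, t_end)
    if not totals or max(totals.values()) == 0.0:
        return None  # no diarization turn overlaps this segment
    return max(totals, key=totals.get)
```

Returning `None` for segments with no overlap leaves room for a later pass to handle unmatched segments.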

enable result printing and comment out text cleanup regex
aa984fe

liuyang committed on

model control
caef0e2

liuyang committed on

remove prompt
744c18a

liuyang committed on

update prompt
ceb7ebf

liuyang committed on

add prompt
c543860

liuyang committed on

add prompt
c59adf8

liuyang committed on

fix bug
2861a47

liuyang committed on

disable batch
6a522dd

liuyang committed on

speech duration
2a0543d

liuyang committed on

less speech duration
6159f83

liuyang committed on

enable vad
51ab2c6

liuyang committed on

Refine VAD parameters and transcription options in WhisperTranscriber for improved audio processing. Adjust max speech duration, min speech duration, and silence duration, and set chunk length to 12 seconds.
5411f5d

liuyang committed on

update log, no vad
9f7c374

liuyang committed on

print log
d947708

liuyang committed on

add log
04482af

liuyang committed on

disable checksum
24dd7ba

liuyang committed on

update
d5d2af9

liuyang committed on

fix upload issue
e84487c

liuyang committed on

upload data
731f4bf

liuyang committed on

Add job_id and task_id handling in WhisperTranscriber to improve metadata management during audio processing. Update file key generation for intermediate uploads.
5dddf57

liuyang committed on

fix
4417549

liuyang committed on

add upload
06b904d

liuyang committed on