Commits · nsfwalex/whisper-transcribe-new

modify params

c97acaf

liuyang committed on Oct 7, 2025

Refactor audio processing: Simplified the handling of audio chunks in prepare_and_save_audio_for_model and updated preprocess_from_task_json to support both single and multiple chunk tasks, enhancing flexibility in audio preparation.

6c3a671

liuyang committed on Oct 7, 2025

fix field

64397b6

liuyang committed on Oct 7, 2025

fix field key

9e14752

liuyang committed on Oct 7, 2025

Refactor transcription methods to return results: Updated the transcribe_chunk and transcribe_segments methods to return their results instead of processing them directly, improving the flow of data handling in the WhisperTranscriber class.

646c8e8

liuyang committed on Sep 20, 2025

update params

ba746a9

liuyang committed on Sep 20, 2025

add fields

9b80850

liuyang committed on Sep 20, 2025

fix typo

3dae8f9

liuyang committed on Sep 20, 2025

add fields

5d0a1ef

liuyang committed on Sep 20, 2025

fix bug

d45c437

liuyang committed on Sep 20, 2025

fix bug

9fc1e97

liuyang committed on Sep 20, 2025

download all models on startup

d29acc5

liuyang committed on Sep 19, 2025

fix value

6bea290

liuyang committed on Sep 19, 2025

Add audio diarization task to Gradio interface: Introduced a new button and function for audio diarization, allowing users to process audio with speaker separation. Updated existing button labels for clarity.

e79159f

liuyang committed on Sep 19, 2025

Refactor model management and transcription process: Introduced a model registry for easier management of Whisper models, added functionality to download models on startup, and streamlined the audio processing pipeline to support both chunk and segment transcriptions with improved error handling and cleanup.

e3d9c9e

liuyang committed on Sep 19, 2025

unmatched_diarization_segments

a4568c6

liuyang committed on Sep 16, 2025

disable clip_timestamps

b68d580

liuyang committed on Sep 16, 2025

disable unmatched_diarization_segments

f425ecd

liuyang committed on Sep 16, 2025

update threshold

78d61ea

liuyang committed on Sep 16, 2025

try use diarization as clip_timestamp

0b6cc7c

liuyang committed on Sep 16, 2025

try use diarization as clip_timestamp

6475331

liuyang committed on Sep 16, 2025

update threshold

c8b690c

liuyang committed on Sep 16, 2025

fix bug

a4d86b2

liuyang committed on Sep 16, 2025

Enhance transcription segment coverage calculation: Updated the overlap check to consider total coverage from all transcription segments, ensuring segments are re-transcribed if less than 85% of their duration is covered. This improves accuracy in identifying segments needing attention.

3a6e3af

liuyang committed on Sep 16, 2025
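
The coverage check described in this commit message could look roughly like the sketch below. It is not taken from the repository: the function names, the `(start, end)` tuple layout, and the merging strategy are assumptions for illustration; only the 85% threshold comes from the message itself.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds; assumed layout


def coverage_fraction(segment: Interval, transcribed: List[Interval]) -> float:
    """Fraction of `segment` covered by the union of `transcribed` intervals."""
    seg_start, seg_end = segment
    duration = seg_end - seg_start
    if duration <= 0:
        return 0.0
    # Clip every transcription interval to the segment and drop non-overlapping ones.
    clipped = sorted(
        (max(s, seg_start), min(e, seg_end))
        for s, e in transcribed
        if min(e, seg_end) > max(s, seg_start)
    )
    covered = 0.0
    cursor = seg_start  # everything before the cursor is already counted
    for s, e in clipped:
        s = max(s, cursor)  # avoid double-counting overlapping intervals
        if e > s:
            covered += e - s
            cursor = e
    return covered / duration


def needs_retranscription(
    segment: Interval, transcribed: List[Interval], threshold: float = 0.85
) -> bool:
    """True when less than `threshold` of the segment duration is covered."""
    return coverage_fraction(segment, transcribed) < threshold
```

Summing the union of intervals rather than taking the single best overlap is what lets several short transcription segments jointly cover one long diarization segment.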

Refine interval overlap calculation in transcription: Adjusted the overlap threshold to be based on segment duration, improving accuracy in detecting overlapping segments during speaker assignment.

928c477

liuyang committed on Sep 16, 2025
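
A duration-relative overlap threshold, as this commit message describes, might be sketched as follows. The function name and the `min_ratio` default are hypothetical; the message does not state the ratio actually used.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) in seconds; assumed layout


def overlaps_enough(seg: Interval, other: Interval, min_ratio: float = 0.5) -> bool:
    """True when `other` overlaps at least `min_ratio` of the duration of `seg`.

    `min_ratio` is an illustrative value, not one taken from the repository.
    """
    overlap = max(0.0, min(seg[1], other[1]) - max(seg[0], other[0]))
    duration = seg[1] - seg[0]
    return duration > 0 and overlap >= min_ratio * duration
```

Compared with a fixed threshold in seconds, a relative threshold treats short and long segments consistently: a 0.3 s overlap is decisive for a 0.4 s segment but negligible for a 10 s one.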

Refactor speaker assignment logic in transcription: Enhanced the `assign_speakers_to_transcription` method to detect unmatched diarization segments and introduced a second pass for splitting segments with speaker changes. Improved handling of speaker transitions and added functionality to re-process unmatched segments.

7bde45c

liuyang committed on Sep 16, 2025

logs

36812ab

liuyang committed on Sep 14, 2025

Enhance speaker assignment in transcription: Introduced interval overlap calculations and smoothing techniques for improved accuracy in speaker labeling. Added methods for determining dominant speakers and stabilizing segment boundaries.

f800f63

liuyang committed on Sep 11, 2025
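
One plausible reading of "interval overlap calculations" and "dominant speakers" in this commit message is sketched below. The helper names and the `(start, end, speaker)` tuple layout are assumptions, and the smoothing and boundary-stabilization steps mentioned in the message are not covered.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

Turn = Tuple[float, float, str]  # (start, end, speaker label); assumed layout


def interval_overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def dominant_speaker(seg_start: float, seg_end: float, diarization: Iterable[Turn]) -> Optional[str]:
    """Speaker whose diarization turns overlap the segment the longest, or None."""
    totals = defaultdict(float)
    for d_start, d_end, speaker in diarization:
        totals[speaker] += interval_overlap(seg_start, seg_end, d_start, d_end)
    if not totals or max(totals.values()) == 0.0:
        return None  # no diarization turn overlaps this segment
    return max(totals, key=totals.get)
```

For example, with turns `[(0, 2, "A"), (2, 5, "B"), (5, 6, "A")]`, a transcription segment spanning 1.0 to 5.5 seconds overlaps speaker A for 1.5 s in total and speaker B for 3.0 s, so B is chosen.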