Commit 98dfe8d

vad : revisit timestamp alignment/mapping (#3173)
* vad : revisit timestamp alignment/mapping

  This commit improves the timestamp alignment by introducing a mapping table, adding intermediate reference points for longer segments, and using binary search for lookups.

  The motivation for this change is to address issues with the current solution, where zero-length segments are possible, and also to improve the precision of the VAD timestamps.

  Refs: #3162

* vad : use uint64_t for time mapping

  This commit changes the type of the `processed_time` and `original_time` fields in the `vad_time_mapping` struct from `double` to `uint64_t`.

  The motivation for this change is to improve precision, avoid floating-point inaccuracies, and be consistent with other parts of the code base that use `uint64_t` for time representation.

  This is part of a refactoring where I'm also going to change the `vad_segment_info` struct to use `uint64_t` for the start and end times. This is the reason for the not so pleasant conversions and casts in the code at the moment.

* vad : change vad_segment_info and whisper_vad_segment to use uint64_t

* vad : use int64_t instead of uint64_t for timestamps

  To be consistent with other timestamps in the codebase.

* vad : add centisecond conversion functions

* vad : extract vad processing from whisper_full_with_state

  This commit extracts the VAD processing from the `whisper_full_with_state` function into the `whisper_full` and `whisper_full_parallel` functions.

  The motivation for this is that I did not take into account that when `whisper_full_parallel` is called with `n_processors > 1`, the VAD processing would not be applied correctly. Instead, the VAD processing should be done prior to processing in the case of `whisper_full_parallel`.
* vad : remove filtered_n_samples from whisper_vad

  This commit removes the parameter `filtered_n_samples` from the `whisper_vad` function signature and its usage, as it is no longer needed now that the filtered samples are a vector (previously a `float*`).

  The motivation for this is to simplify the usage of this function.

* vad : remove vad_mapping_table_initialized flag

* vad : fix leaning (none) of pointer/references
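
To illustrate the idea behind the mapping table and the binary-search lookup, here is a minimal sketch. It is not the code from this commit: the struct only mirrors the `processed_time` and `original_time` field names mentioned above, the helpers `samples_to_cs`, `cs_to_samples`, and `map_processed_to_original` are hypothetical names, and the linear interpolation between reference points and the clamping at the table edges are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sketch of one mapping-table entry. Field names follow the commit message;
// the unit (centiseconds, 1 cs = 10 ms) is an assumption for this example.
struct vad_time_mapping {
    int64_t processed_time; // position in the VAD-filtered audio
    int64_t original_time;  // matching position in the original audio
};

// Hypothetical helper: sample count -> centiseconds, rounded to nearest.
static int64_t samples_to_cs(int64_t n_samples, int sample_rate) {
    return (n_samples * 100 + sample_rate / 2) / sample_rate;
}

// Hypothetical helper: centiseconds -> sample count.
static int64_t cs_to_samples(int64_t cs, int sample_rate) {
    return cs * sample_rate / 100;
}

// Map a timestamp in the filtered audio back to the original audio.
// The table must be sorted by processed_time. The lookup is a binary search
// (std::upper_bound) followed by integer interpolation between the two
// surrounding reference points.
static int64_t map_processed_to_original(
        const std::vector<vad_time_mapping> & table, int64_t processed_time) {
    if (table.empty()) {
        return processed_time; // nothing to map, assume identity
    }

    // First entry whose processed_time is strictly greater than the query.
    auto upper = std::upper_bound(
        table.begin(), table.end(), processed_time,
        [](int64_t value, const vad_time_mapping & m) {
            return value < m.processed_time;
        });

    if (upper == table.begin()) {
        // Query lies before the first reference point: clamp to it.
        return table.front().original_time;
    }
    if (upper == table.end()) {
        // Query lies at or after the last reference point: keep the offset.
        const vad_time_mapping & last = table.back();
        return last.original_time + (processed_time - last.processed_time);
    }

    const vad_time_mapping & lo = *(upper - 1);
    const vad_time_mapping & hi = *upper;

    // span > 0 is guaranteed: hi.processed_time > query >= lo.processed_time.
    const int64_t span = hi.processed_time - lo.processed_time;
    return lo.original_time +
           (processed_time - lo.processed_time) *
           (hi.original_time - lo.original_time) / span;
}
```

With integer timestamps the lookup stays exact at the reference points, and adding intermediate points inside long segments keeps the interpolation error between them small. A signed type also lets the subtraction in the interpolation go negative without wrapping, which fits the later switch to `int64_t`.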
1 parent e5e900d commit 98dfe8d

1 file changed: +195 -144 lines changed
