Merged
2 changes: 1 addition & 1 deletion doc/AmpSlice.rst
@@ -7,7 +7,7 @@
:discussion:
FluidAmpSlice is based on two envelope followers on a highpassed version of the signal: one slow that gives the trend, and one fast. Each has features that will interact. The example code below unfolds the various possibilities in order of complexity.

The process will return an audio stream with sample-long impulses at estimated starting points of the different slices.
The process will return an audio stream with single-sample impulses at estimated starting points of the different slices.

:output: An audio stream with square envelopes around the slices. The latency between the input and the output is **max(minLengthAbove + lookBack, max(minLengthBelow,lookAhead))**.
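
To make the latency formula in the ``:output:`` text concrete, here is a minimal Python sketch that evaluates it. The function name and the parameter values are hypothetical and chosen only for illustration; they are not defaults taken from the documentation.

```python
def amp_slice_latency(min_length_above, look_back, min_length_below, look_ahead):
    """Latency in samples, following the formula quoted in the :output: text:
    max(minLengthAbove + lookBack, max(minLengthBelow, lookAhead))."""
    return max(min_length_above + look_back, max(min_length_below, look_ahead))


# Hypothetical settings, for illustration only.
latency = amp_slice_latency(
    min_length_above=441, look_back=100, min_length_below=200, look_ahead=50
)
print(latency)
```

With these values the first term (441 + 100) dominates, so raising ``lookAhead`` or ``minLengthBelow`` would not change the latency until one of them exceeds that sum.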

44 changes: 22 additions & 22 deletions doc/BufTransientSlice.rst
@@ -5,75 +5,75 @@
:see-also: TransientSlice, BufOnsetSlice, BufNoveltySlice
:description: Transient-based slice extractor on buffers
:discussion:
This relies on the same algorithm as BufTransients using clicks/transients/derivation/anomalies in the signal to estimate the slicing points.

The process will return a buffer which contains indices (in sample) of estimated starting points of the different slices.
BufTransientSlice identifies slice points in a buffer by implementing a "de-clicking" algorithm based on the assumption that a transient is a sample or series of samples that are anomalous when compared to surrounding samples. It creates a model of the time series of samples, so that when a given sample doesn't fit the model (its "error" or anomalous-ness goes above ``threshFwd``) it is determined to be a transient and a slice point is identified.

The series of samples determined to be a transient will continue until the error goes below ``threshBack``, indicating that the samples are again more in-line with the model.

The process will return an ``indices`` buffer which contains the indices (in samples) of estimated starting points of the different slices.

The algorithm implemented is from chapter 5 of "Digital Audio Restoration" by Godsill, Simon J., Rayner, Peter J.W. with some bespoke improvements on the detection function tracking.

:process: This is the method that calls for the slicing to be calculated on a given source buffer.
:output: Nothing, as the destination buffer is declared in the function call.


:control source:

The index of the buffer to use as the source material to be sliced through transient identification. The different channels of multichannel buffers will be summed.
The |buffer| to use as the source material to detect transients in. The different channels of multichannel buffers will be summed.

:control startFrame:

Where in the srcBuf should the slicing process start, in sample.
Where in ``source`` the process should begin, in samples. The default is 0.

:control numFrames:

How many frames should be processed.
How many frames of ``source`` should be processed. The default of -1 means processing continues through the end of the buffer.

:control startChan:

For multichannel srcBuf, which channel should be processed.
For multichannel ``source``, which channel to begin processing from.

:control numChans:

For multichannel srcBuf, how many channel should be summed.
For multichannel ``source``, how many channels to process. The default of -1 indicates to process through the last channel in the buffer. Multichannel analyses are summed to mono before processing.

:control indices:

The index of the buffer where the indices (in sample) of the estimated starting points of slices will be written. The first and last points are always the boundary points of the analysis.
The buffer where the indices (in samples) of the estimated starting points of slices will be written.

:control order:

The order in samples of the impulse response filter used to model the estimated continuous signal. It is how many previous samples are used by the algorithm to predict the next one as reference for the model. The higher the order, the more accurate is its spectral definition, not unlike fft, improving low frequency resolution, but it differs in that it is not connected to its temporal resolution.
The number of previous samples used by the algorithm to create the model of the signal within the ``blockSize`` window of analysis. ``order`` must be less than ``blockSize``.

:control blockSize:

The size in samples of the frame on which the algorithm is operating. High values are more cpu intensive, and also determine the maximum transient size, which will not be allowed to be more than half that length in size.
The size of the audio chunk (in samples) on which the process is operating. This determines the maximum duration (in samples) of a detected transient, which cannot be more than half of ``blockSize - order``.

:control padSize:

The size of the handles on each sides of the block simply used for analysis purpose and avoid boundary issues.
The size (in samples) of analysis on each side of ``blockSize`` used to provide some historical context for analysis so that each ``blockSize`` isn't modelled completely independently of its predecessor.

:control skew:

The nervousness of the bespoke detection function with values from -10 to 10. It allows to decide how peaks are amplified or smoothed before the thresholding. High values increase the sensitivity to small variations.
The nervousness of the bespoke detection function. It ranges from -10 to 10 (it has no units) representing the strength and direction of some nonlinearity applied to the detection signal which controls how peaks are amplified or smoothed before the thresholding. Positive values increase the sensitivity to small variations.

:control threshFwd:

The threshold of the onset of the smoothed error function. It allows tight start of the identification of the anomaly as it proceeds forward.
The threshold applied to the smoothed forward prediction error for determining an onset. The units are roughly in standard deviations, thus can be considered how "deviant", or anomalous, the signal must be to be detected as a transient. It allows tight start of the identification of the anomaly as it proceeds forward.

:control threshBack:

The threshold of the offset of the smoothed error function. As it proceeds backwards in time, it allows tight ending of the identification of the anomaly.
The threshold applied to the smoothed backward prediction error for determining an offset. The units are roughly in standard deviations, thus can be considered how "deviant", or anomalous, the signal must be to be considered transient. When the smoothed error function goes below ``threshBack`` an offset is identified. As it proceeds backwards in time, it allows tight ending of the identification of the anomaly.

:control windowSize:

The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise, but the less false positives.
The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise the detection, but the fewer the false positives.

:control clumpLength:

The window size in sample within which positive detections will be clumped together to avoid overdetecting in time.
The window size in samples within which anomalous samples will be clumped together to avoid over detecting in time. This is similar to setting a minimum slice length.

:control minSliceLength:

The minimum duration of a slice in samples.

:control action:

A Function to be evaluated once the offline process has finished and indices instance variables have been updated on the client side. The function will be passed indices as an argument.
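
The interplay of ``threshFwd`` and ``threshBack`` described in the discussion can be illustrated with a small Python sketch. This is a simplification under stated assumptions, not the library's implementation: it takes a single, already smoothed error curve as a plain list (the real process derives forward and backward prediction errors from a model of the signal), and it ignores ``clumpLength`` and ``minSliceLength``. The function name and the sample values are hypothetical.

```python
def detect_transients(error, thresh_fwd, thresh_back):
    """Return (onset, offset) index pairs from a smoothed error curve.

    A transient starts when the error rises above thresh_fwd and continues
    until the error falls below thresh_back (simple hysteresis).
    """
    regions = []
    onset = None
    for i, e in enumerate(error):
        if onset is None:
            if e > thresh_fwd:
                onset = i  # error exceeded the forward threshold
        elif e < thresh_back:
            regions.append((onset, i))  # error back in line with the model
            onset = None
    if onset is not None:
        # Transient still open at the end of the data.
        regions.append((onset, len(error)))
    return regions


# Made-up error values, roughly "in standard deviations".
error = [0.1, 0.2, 3.0, 2.5, 1.5, 0.8, 0.2, 0.1, 2.2, 0.5]
regions = detect_transients(error, thresh_fwd=2.0, thresh_back=1.1)
slice_points = [onset for onset, _ in regions]
print(regions, slice_points)
```

Using a lower ``thresh_back`` than ``thresh_fwd`` keeps a transient "open" while the error hovers between the two values (index 4 above), instead of toggling on every small fluctuation.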


51 changes: 24 additions & 27 deletions doc/BufTransients.rst
@@ -3,81 +3,78 @@
:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
:sc-related: Guides/FluidCorpusManipulationToolkit
:see-also: Transients, BufHPSS, BufSines
:description: A transient extractor on buffers
:description: Separate Transients from a Signal in a Buffer
:discussion:
It implements declicking algorithm from chapter 5 of 'Digital Audio Restoration' by Godsill, Simon J., Rayner, Peter J.W. with some bespoke improvements on the detection function tracking.

The algorithm will take a buffer in, and will divide it in two outputs:
* the transients, estimated from the signal and extracted from it;
* the remainder of the material, as estimated by the algorithm.
This implements a "de-clicking" algorithm based on the assumption that a transient is a sample or series of samples that are anomalous when compared to surrounding samples. It creates a model of the time series of samples, so that when a given sample doesn't fit the model (its "error" or anomalous-ness goes above ``threshFwd``) it is determined to be a transient. The series of samples determined to be a transient will continue until the error goes below ``threshBack``, indicating that the samples are again more in-line with the model.

The algorithm then estimates what should have happened during the transient period if the signal had followed its non-anomalous path, and resynthesises this estimate to create the residual output. The transient output is ``input signal - residual signal``, such that summed output of the object (``transients + residual``) can still null-sum with the input.

The whole process is based on the assumption that a transient is an element that is deviating from the surrounding material, as sort of click or anomaly. The algorithm then estimates what should have happened if the signal had followed its normal path, and resynthesises this estimate, removing the anomaly which is considered as the transient.
The algorithm will return two outputs:
* the transients, estimated from the signal and extracted from it
* the residual of the material with the transients replaced with an estimate.

The algorithm implemented is from chapter 5 of "Digital Audio Restoration" by Godsill, Simon J., Rayner, Peter J.W. with some bespoke improvements on the detection function tracking.

:process: This is the method that calls for the transient extraction to be performed on a given source buffer.
:output: Nothing, as the various destination buffers are declared in the function call.


:control source:

The index of the buffer to use as the source material to be decomposed through the NMF process. The different channels of multichannel buffers will be processing sequentially.
The |buffer| to use as the source material to detect transients in. The different channels of multichannel buffers will be processed sequentially.

:control startFrame:

Where in the srcBuf should the NMF process start, in sample.
Where in ``source`` the process should begin, in samples. The default is 0.

:control numFrames:

How many frames should be processed.
How many frames of ``source`` should be processed. The default of -1 means processing continues through the end of the buffer.

:control startChan:

For multichannel srcBuf, which channel should be processed first.
For multichannel ``source``, which channel to begin processing from.

:control numChans:

For multichannel srcBuf, how many channel should be processed.
For multichannel ``source``, how many channels to process. The default of -1 indicates to process through the last channel in the buffer.

:control transients:

The index of the buffer where the extracted transient component will be reconstructed.
The buffer where the extracted transients component will be written.

:control residual:

The index of the buffer where the estimated continuous component will be reconstructed.
The buffer where the residual component with the transients replaced by estimates will be written.

:control order:

The order in samples of the impulse response filter used to model the estimated continuous signal. It is how many previous samples are used by the algorithm to predict the next one as reference for the model. The higher the order, the more accurate is its spectral definition, not unlike fft, improving low frequency resolution, but it differs in that it is not connected to its temporal resolution.
The number of previous samples used by the algorithm to create the model of the signal within the ``blockSize`` window of analysis. ``order`` must be less than ``blockSize``.

:control blockSize:

The size in samples of the frame on which the algorithm is operating. High values are more cpu intensive, and also determine the maximum transient size, which will not be allowed to be more than half that length in size.
The size of the audio chunk (in samples) on which the process is operating. This determines the maximum duration (in samples) of a detected transient, which cannot be more than half of ``blockSize - order``.

:control padSize:

The size of the handles on each sides of the block simply used for analysis purpose and avoid boundary issues.
The size (in samples) of analysis on each side of ``blockSize`` used to provide some historical context for analysis so that each ``blockSize`` isn't modelled completely independently of its predecessor.

:control skew:

The nervousness of the bespoke detection function with values from -10 to 10. It allows to decide how peaks are amplified or smoothed before the thresholding. High values increase the sensitivity to small variations.
The nervousness of the bespoke detection function. It ranges from -10 to 10 (it has no units) representing the strength and direction of some nonlinearity applied to the detection signal which controls how peaks are amplified or smoothed before the thresholding. High values increase the sensitivity to small variations.

:control threshFwd:

The threshold of the onset of the smoothed error function. It allows tight start of the identification of the anomaly as it proceeds forward.
The threshold applied to the smoothed forward prediction error for determining an onset. The units are roughly in standard deviations, thus can be considered how "deviant", or anomalous, the signal must be to be detected as a transient. It allows tight start of the identification of the anomaly as it proceeds forward.

:control threshBack:

The threshold of the offset of the smoothed error function. As it proceeds backwards in time, it allows tight ending of the identification of the anomaly.
The threshold applied to the smoothed backward prediction error for determining an offset. The units are roughly in standard deviations, thus can be considered how "deviant", or anomalous, the signal must be to be considered transient. When the smoothed error function goes below ``threshBack`` an offset is identified. As it proceeds backwards in time, it allows tight ending of the identification of the anomaly.

:control windowSize:

The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise, but the less false positive.
The averaging window of the error detection function. It needs smoothing as it is very jittery. The longer the window, the less precise the detection, but the fewer the false positives.

:control clumpLength:

The window size in sample within which positive detections will be clumped together to avoid overdetecting in time.

:control action:

A Function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [transients, residual] as an argument.

The window size in samples within which anomalous samples will be clumped together to avoid over detecting in time. This is like setting a minimum transient length.
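
The relationship ``transients = input signal - residual signal`` from the discussion can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the transient region is given by hand, and the "estimate" is plain linear interpolation between the neighbouring samples, which is a stand-in for the model-based resynthesis the documentation describes. The function name and data are hypothetical.

```python
def split_transients(signal, start, end):
    """Split signal into (transients, residual) for one region [start, end).

    ASSUMPTION: the residual estimate is a linear interpolation between
    signal[start - 1] and signal[end]; the real algorithm uses its model of
    the signal instead. Requires 1 <= start < end < len(signal).
    """
    residual = list(signal)
    left, right = signal[start - 1], signal[end]
    span = end - start + 1
    for k in range(start, end):
        t = (k - start + 1) / span
        residual[k] = left + t * (right - left)
    # Transient output is defined as input minus residual.
    transients = [s - r for s, r in zip(signal, residual)]
    return transients, residual


signal = [0.0, 1.0, 2.0, 9.0, -5.0, 5.0, 6.0]  # samples 3 and 4 are "clicks"
transients, residual = split_transients(signal, start=3, end=5)

# Null test: transients + residual should reconstruct the input.
null = [s - (t + r) for s, t, r in zip(signal, transients, residual)]
print(max(abs(x) for x in null))
```

Because the transient output is defined by subtraction, the null test holds (up to floating-point rounding) whatever estimate is used for the residual; the quality of the estimate only affects how the energy is divided between the two outputs.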
2 changes: 1 addition & 1 deletion doc/NoveltySlice.rst
@@ -7,7 +7,7 @@
:discussion:
A novelty curve is derived from running a kernel across the diagonal of the similarity matrix, and looking for peak of changes. It implements the algorithm published in 'Automatic Audio Segmentation Using a Measure of Audio Novelty' by J Foote.

The process will return an audio stream with sample-long impulses at estimated starting points of the different slices.
The process will return an audio stream with single-sample impulses at estimated starting points of the different slices.

:output: An audio stream with impulses at detected transients. The latency between the input and the output is hopSize * (((kernelSize+1)/2).asInteger + ((filterSize + 1) / 2).asInteger + 1) samples at maximum.
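
The maximum latency expression in the ``:output:`` text can be evaluated with a short Python sketch. It assumes that ``.asInteger`` truncates the result of the division toward zero, which ``int()`` mirrors here; the function name and the parameter values are hypothetical and are not documented defaults.

```python
def novelty_slice_max_latency(hop_size, kernel_size, filter_size):
    """Maximum latency in samples, following the :output: formula:
    hopSize * (((kernelSize+1)/2).asInteger + ((filterSize+1)/2).asInteger + 1).

    ASSUMPTION: .asInteger truncates toward zero, mirrored by int().
    """
    kernel_half = int((kernel_size + 1) / 2)
    filter_half = int((filter_size + 1) / 2)
    return hop_size * (kernel_half + filter_half + 1)


# Hypothetical settings, for illustration only.
max_latency = novelty_slice_max_latency(hop_size=512, kernel_size=3, filter_size=1)
print(max_latency)
```

The latency scales linearly with ``hopSize``, so halving the hop halves the worst-case delay for the same kernel and filter sizes.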
