Merged
2 changes: 1 addition & 1 deletion doc/AmpGate.rst
@@ -3,7 +3,7 @@
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulation
:see-also: BufAmpGate, AmpSlice, OnsetSlice, NoveltySlice, TransientSlice
:description: Absolute amplitude threshold gate detector on a real-time signal
:description: Absolute amplitude threshold gate detector on a realtime signal

:discussion:
AmpGate outputs a audio-rate, single-channel signal that is either 0, indicating the gate is closed, or 1, indicating the gate is open. The gate detects an onset (opens) when the internal envelope follower (controlled by ``rampUp`` and ``rampDown``) goes above a specified ``onThreshold`` (in dB) for at least ``minLengthAbove`` samples. The gate will stay open until the envelope follower goes below ``offThreshold`` (in dB) for at least ``minLengthBelow`` samples, which triggers an offset.
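The gate logic described in the AmpGate discussion above can be sketched as a small hysteresis state machine. This is only an illustration, not the library's implementation: it assumes the envelope follower output is already available as one dB value per sample (the ``rampUp`` / ``rampDown`` smoothing is not modelled), the function and argument names are hypothetical, and the real object may place the reported onset and offset at a different sample than this sketch does.

```python
def amp_gate(env_db, on_threshold, off_threshold,
             min_length_above=1, min_length_below=1):
    """Return a 0/1 gate signal from an envelope given in dB (one value per sample)."""
    gate = []
    is_open = False
    run = 0  # consecutive samples that satisfy the pending transition
    for level in env_db:
        if not is_open:
            # Closed: count samples above onThreshold, reset on any miss.
            run = run + 1 if level > on_threshold else 0
            if run >= min_length_above:
                is_open, run = True, 0
        else:
            # Open: count samples below offThreshold, reset on any miss.
            run = run + 1 if level < off_threshold else 0
            if run >= min_length_below:
                is_open, run = False, 0
        gate.append(1 if is_open else 0)
    return gate


envelope = [-60, -10, -10, -10, -30, -50, -50, -50]
print(amp_gate(envelope, on_threshold=-20, off_threshold=-40,
               min_length_above=2, min_length_below=2))
# prints [0, 0, 1, 1, 1, 1, 0, 0]
```

Using a lower ``offThreshold`` than ``onThreshold`` keeps the gate from chattering when the envelope hovers near a single level, as with the -30 dB sample in the example.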
4 changes: 2 additions & 2 deletions doc/AmpSlice.rst
@@ -5,7 +5,7 @@
:see-also: BufAmpSlice, AmpGate, OnsetSlice, NoveltySlice, TransientSlice
:description: Implements an amplitude-based slicer, with various customisable options and conditions to detect relative amplitude changes as onsets.
:discussion:
FluidAmpSlice is based on two envelope followers on a highpassed version of the signal: one slow that gives the trend, and one fast. Each have features that will interact. The example code below is unfolding the various possibilites in order of complexity.
FluidAmpSlice is based on two envelope followers on a high-passed version of the signal: one slow that gives the trend, and one fast. Each has features that will interact. The example code below is unfolding the various possibilities in order of complexity.

The process will return an audio stream with single sample impulses at estimated starting points of the different slices.

@@ -34,7 +34,7 @@

:control offThreshold:

The threshold in dB of the relative envelope follower to reset, aka to allow the differential envelop to trigger again.
The threshold in dB of the relative envelope follower to reset, aka to allow the differential envelope to trigger again.

:control floor:

2 changes: 1 addition & 1 deletion doc/AudioTransport.rst
@@ -8,7 +8,7 @@

:discussion:
Interpolates between the spectra of two sounds using the optimal transport algorithm. This enables morphing and hybridisation of the perceptual qualities of each source linearly.
See Henderson and Solomonm (2019) AUDIO TRANSPORT: A GENERALIZED PORTAMENTO VIA OPTIMAL TRANSPORT, DaFx
See Henderson and Solomon (2019) AUDIO TRANSPORT: A GENERALIZED PORTAMENTO VIA OPTIMAL TRANSPORT, DaFx

https://arxiv.org/abs/1906.06763

2 changes: 1 addition & 1 deletion doc/BufAmpFeature.rst
@@ -28,7 +28,7 @@

:control numChans:

For multichannel sources, how many channel should be summed.
For multichannel sources, how many channels should be summed.

:control features:

4 changes: 2 additions & 2 deletions doc/BufAmpGate.rst
@@ -1,4 +1,4 @@
:digest: Gate Detection on a Bfufer
:digest: Gate Detection on a Buffer
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulation
@@ -79,7 +79,7 @@

:control highPassFreq:

The frequency of the fourth-order Linkwitz-Riley high-pass filter (https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter) applied to the signal signal to minimise low frequency intermodulation with very short ramp lengths. A frequency of 0 bypasses the filter.
The frequency of the fourth-order Linkwitz-Riley high-pass filter (https://en.wikipedia.org/wiki/Linkwitz%E2%80%93Riley_filter) applied to the signal to minimise low frequency intermodulation with very short ramp lengths. A frequency of 0 bypasses the filter.

:control maxSize:

6 changes: 3 additions & 3 deletions doc/BufAmpSlice.rst
@@ -5,7 +5,7 @@
:see-also: AmpSlice, BufAmpGate, BufOnsetSlice, BufNoveltySlice, BufTransientSlice
:description: Implements an amplitude-based slicer, with various customisable options and conditions to detect relative amplitude changes as onsets.
:discussion:
FluidBufAmpSlice is based on two envelope followers on a highpassed version of the signal: one slow that gives the trend, and one fast. Each have features that will interact. The example code below is unfolding the various possibilites in order of complexity.
FluidBufAmpSlice is based on two envelope followers on a high-passed version of the signal: one slow that gives the trend, and one fast. Each has features that will interact. The example code below is unfolding the various possibilities in order of complexity.

The process will return a buffer which contains indices (in sample) of estimated starting points of different slices.

@@ -30,7 +30,7 @@

:control numChans:

For multichannel sources, how many channel should be summed.
For multichannel sources, how many channels should be summed.

:control indices:

@@ -58,7 +58,7 @@

:control offThreshold:

The threshold in dB of the relative envelope follower to reset, aka to allow the differential envelop to trigger again.
The threshold in dB of the relative envelope follower to reset, aka to allow the differential envelope to trigger again.

:control floor:

2 changes: 1 addition & 1 deletion doc/BufAudioTransport.rst
@@ -9,7 +9,7 @@
:discussion:
Interpolates between the spectra of two sounds using the optimal transport algorithm. This enables morphing and hybridisation of the perceptual qualities of each source linearly.

See Henderson and Solomonm (2019) AUDIO TRANSPORT: A GENERALIZED PORTAMENTO VIA OPTIMAL TRANSPORT, DaFx
See Henderson and Solomon (2019) AUDIO TRANSPORT: A GENERALIZED PORTAMENTO VIA OPTIMAL TRANSPORT, DaFx

https://arxiv.org/abs/1906.06763

4 changes: 2 additions & 2 deletions doc/BufChroma.rst
@@ -15,7 +15,7 @@

:control source:

The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processing sequentially.
The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processed sequentially.

:control startFrame:

@@ -31,7 +31,7 @@

:control numChans:

For multichannel srcBuf, how many channel should be processed.
For multichannel srcBuf, how many channels should be processed.

:control features:

8 changes: 4 additions & 4 deletions doc/BufCompose.rst
@@ -7,7 +7,7 @@
A utility for manipulating the contents of buffers.

:discussion:
This object is the swiss army knife for manipulating buffers and their contents. By specifing ranges of samples and channels to copy, as well as destination and source gains it can provide a powerful interface for performing actions such as a Left/Right to Mid/Side conversion and mixing down multichannel audio
This object is the swiss army knife for manipulating buffers and their contents. By specifying ranges of samples and channels to copy, as well as destination and source gains it can provide a powerful interface for performing actions such as a Left/Right to Mid/Side conversion and mixing down multichannel audio

:process: This method triggers the compositing.

@@ -24,15 +24,15 @@

:control numFrames:

The duration (in samples) to copy from the source buffer. The default (-1) copies the full lenght of the buffer.
The duration (in samples) to copy from the source buffer. The default (-1) copies the full length of the buffer.

:control startChan:

The first channel from which to copy in the source buffer.

:control numChans:

The number of channels from which to copy in the source buffer. This parameter will wrap around the number of channels in the source buffer. The default (-1) copies all of the buffer's channel.
The number of channels from which to copy in the source buffer. This parameter will wrap around the number of channels in the source buffer. The default (-1) copies all of the buffer's channels.

:control gain:

@@ -48,7 +48,7 @@

:control destStartChan:

The channel offest in the destination buffer to start writing the source at. The destination buffer will be resized if the number of channels to copy is overflowing.
The channel offset in the destination buffer to start writing the source at. The destination buffer will be resized if the number of channels to copy is overflowing.

:control destGain:

2 changes: 1 addition & 1 deletion doc/BufFlatten.rst
@@ -42,7 +42,7 @@

:control startChan:

For multichannel ``source`` buffers, which which channel to begin the processing. The default is 0.
For multichannel ``source`` buffers, which channel to begin the processing. The default is 0.

:control numChans:

8 changes: 4 additions & 4 deletions doc/BufHPSS.rst
@@ -9,13 +9,13 @@
HPSS takes in audio and divides it into two or three outputs, depending on the ``maskingMode``
* an harmonic component
* a percussive component
* a residual of the previous two if ``maskingMode`` is set to 2 (inter-dependant thresholds). See below.
* a residual of the previous two if ``maskingMode`` is set to 2 (interdependent thresholds). See below.

HPSS works by using median filters on the magnitudes of a spectrogram. It makes certain assumptions about what it is looking for in a sound: that in a spectrogram “percussive” elements tend to form vertical “ridges” (tall in frequency band, narrow in time), while stable “harmonic” elements tend to form horizontal “ridges” (narrow in frequency band, long in time). By using median filters across time and frequency respectively, we get initial estimates of the "harmonic-ness" and "percussive-ness" for every spectral bin of every spectral frame in the spectrogram. These are then combined into 'masks' that are applied to the original spectrogram in order to produce a harmonic and percussive output (and residual if ``maskingMode`` = 2).
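The median-filtering idea in the paragraph above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the object's actual code: the spectrogram is a plain list of frames of magnitudes, the filter sizes and function names are hypothetical, window edges are handled by clamping, and the soft mask uses the common squared-ratio ("Wiener-inspired") form, which the documentation text does not spell out.

```python
from statistics import median


def median_filter(values, size):
    """Median-filter a list with an odd window, clamping indices at the edges."""
    half = size // 2
    last = len(values) - 1
    return [
        median(values[min(max(j, 0), last)] for j in range(i - half, i + half + 1))
        for i in range(len(values))
    ]


def hpss_soft(spectrogram, harm_filter_size=3, perc_filter_size=3):
    """Split spectrogram[frame][bin] magnitudes into harmonic and percussive parts."""
    num_bins = len(spectrogram[0])
    # Median across time, one bin at a time: keeps horizontal (harmonic) ridges.
    per_bin = [
        median_filter([frame[b] for frame in spectrogram], harm_filter_size)
        for b in range(num_bins)
    ]
    # Median across frequency, one frame at a time: keeps vertical (percussive) ridges.
    per_frame = [median_filter(frame, perc_filter_size) for frame in spectrogram]

    harmonic, percussive = [], []
    for t, frame in enumerate(spectrogram):
        h_row, p_row = [], []
        for b, magnitude in enumerate(frame):
            h = per_bin[b][t] ** 2
            p = per_frame[t][b] ** 2
            # Soft mask: share of the magnitude given to the harmonic output.
            mask = h / (h + p) if h + p > 0 else 0.5
            h_row.append(mask * magnitude)
            p_row.append((1.0 - mask) * magnitude)
        harmonic.append(h_row)
        percussive.append(p_row)
    return harmonic, percussive


# Toy input: a sustained tone in bin 1 plus a click in frame 2.
spec = [[1.0 if (b == 1 or t == 2) else 0.0 for b in range(5)] for t in range(5)]
harm, perc = hpss_soft(spec)
```

In the toy input the sustained bin ends up in ``harm``, the click ends up in ``perc``, and the cell where they cross is shared equally, so the two outputs still sum to the input.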

The maskingMode parameter provides different approaches to combining estimates and producing masks. Some settings (especially in modes 1 & 2) will provide better separation but with more artefacts.

Driedger (2014) suggests that the size of the median filters don't affect the outcome as much as the ``fftSize``. with large FFT sizes, short percussive sounds have less representation, therefore the harmonic component is more strongly represented. The result is that many of the percussive sounds leak into the harmonic component. Small FFT sizes have less resolution in the frequency domain and often lead to a blurring of horizontal structures, therefore harmonic sounds tend to leak into the percussive component. As with all FFT based-processes, finding an FFT size that balances spectral and temporal resolution for a given source sound will benefit the use of this object.
Driedger (2014) suggests that the size of the median filters don't affect the outcome as much as the ``fftSize``. With large FFT sizes, short percussive sounds have less representation, therefore the harmonic component is more strongly represented. The result is that many of the percussive sounds leak into the harmonic component. Small FFT sizes have less resolution in the frequency domain and often lead to a blurring of horizontal structures, therefore harmonic sounds tend to leak into the percussive component. As with all FFT based-processes, finding an FFT size that balances spectral and temporal resolution for a given source sound will benefit the use of this object.

For more details visit https://learn.flucoma.org/reference/hpss

@@ -74,13 +74,13 @@
:enum:

:0:
Soft masks provide the fewest artefacts, but the weakest separation. Complimentary, soft masks are made for the harmonic and percussive parts by allocating some fraction of every magnitude in the spectrogram to each mask. The two resulting buffers will sum to exactly the original material. This mode uses soft mask in Fitzgerald's (2010) original method of 'Wiener-inspired' filtering.
Soft masks provide the fewest artefacts, but the weakest separation. Complimentary, soft masks are made for the harmonic and percussive parts by allocating some fraction of every magnitude in the spectrogram to each mask. The two resulting buffers will sum to exactly the original material. This mode uses a soft mask in Fitzgerald's (2010) original method of 'Wiener-inspired' filtering.

:1:
Binary masks provide better separation, but with more artefacts. The harmonic mask is constructed using a binary decision, based on whether a threshold is exceeded for every magnitude in the spectrogram (these are set using ``harmThreshFreq1``, ``harmThreshAmp1``, ``harmThreshFreq2``, ``harmThreshAmp2``, see below). The percussive mask is then formed as the inverse of the harmonic one, meaning that as above, the two components will sum to the original sound.

:2:
Soft masks (with a third stream containing a residual component). First, binary masks are made separately for the harmonic and percussive components using different thresholds (set with the respective ``harmThresh-`` and ``percThresh-`` parameters below). Because these masks aren't guaranteed to represent the entire spectrogram, any residual energy is considered as a third output. The independently created binary masks are converted to soft masks at the end of the process so that everything null sums.
Soft masks (with a third stream containing a residual component). First, binary masks are made separately for the harmonic and percussive components using different thresholds (set with the respective ``harmThresh-`` and ``percThresh-`` parameters below). Because these masks aren't guaranteed to represent the entire spectrogram, any residual energy is considered as a third output. The independently created binary masks are converted to soft masks at the end of the process so that everything null-sums.

:control harmThresh:

4 changes: 2 additions & 2 deletions doc/BufLoudness.rst
@@ -17,7 +17,7 @@

:control source:

The index of the buffer to use as the source material to be described. The different channels of multichannel buffers will be processing sequentially.
The index of the buffer to use as the source material to be described. The different channels of multichannel buffers will be processed sequentially.

:control startFrame:

@@ -33,7 +33,7 @@

:control numChans:

For multichannel srcBuf, how many channel should be processed.
For multichannel srcBuf, how many channels should be processed.

:control features:

10 changes: 5 additions & 5 deletions doc/BufMFCC.rst
@@ -6,19 +6,19 @@
:description: A classic timbral spectral descriptor, the Mel-Frequency Cepstral Coefficients (MFCCs).
:discussion:

MFCC stands for Mel-Frequency Cepstral Coefficients ("cepstral" is pronounced like "kepstral"). This analysis is often used for timbral description and timbral comparison. It compresses the overall spectrum into a smaller number of coefficients that, when taken together, describe the general contour the the spectrum.
MFCC stands for Mel-Frequency Cepstral Coefficients ("cepstral" is pronounced like "kepstral"). This analysis is often used for timbral description and timbral comparison. It compresses the overall spectrum into a smaller number of coefficients that, when taken together, describe the general contour of the spectrum.

The MFCC values are derived by first computing a mel-frequency spectrum, just as in :fluid-obj:`MelBands`. ``numCoeffs`` coefficients are then calculated by using that mel-frequency spectrum as input to the discrete cosine transform. This means that the shape of the mel-frequency spectrum is compared to a number of cosine wave shapes (different cosines shapes created from different different frequencies). Each MFCC value (i.e., "coefficient") represents how similar the mel-frequency spectrum is to one of these cosine shapes.
The MFCC values are derived by first computing a mel-frequency spectrum, just as in :fluid-obj:`MelBands`. ``numCoeffs`` coefficients are then calculated by using that mel-frequency spectrum as input to the discrete cosine transform. This means that the shape of the mel-frequency spectrum is compared to a number of cosine wave shapes (different cosine shapes created from different frequencies). Each MFCC value (i.e., "coefficient") represents how similar the mel-frequency spectrum is to one of these cosine shapes.

Other that the 0th coefficient, MFCCs are unchanged by differences in the overall energy of the spectrum (which relates to how we perceive loudness). This means that timbres with similar spectral contours, but different volumes, will still have similar MFCC values, other than MFCC 0. To remove any indication of loudness but keep the information about timbre, we can ignore MFCC 0 by setting the parameter ``startCoeff`` to 1.
Other than the 0th coefficient, MFCCs are unchanged by differences in the overall energy of the spectrum (which relates to how we perceive loudness). This means that timbres with similar spectral contours, but different volumes, will still have similar MFCC values, other than MFCC 0. To remove any indication of loudness but keep the information about timbre, we can ignore MFCC 0 by setting the parameter ``startCoeff`` to 1.
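The claim that only MFCC 0 reacts to overall level can be checked with a small sketch. It assumes the usual MFCC recipe in which the mel-band magnitudes are log-compressed before the discrete cosine transform (the text above does not state this step), uses an unnormalised DCT-II, and treats ``start_coeff`` as the index of the first returned coefficient; the function names and the toy band values are hypothetical.

```python
import math


def dct_ii(values, num_coeffs):
    """Unnormalised DCT-II: compare `values` against cosine shapes k = 0, 1, ..."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def mfcc_from_mel(mel_magnitudes, num_coeffs=4, start_coeff=0):
    """Return `num_coeffs` cepstral coefficients starting at index `start_coeff`."""
    log_mel = [math.log(m) for m in mel_magnitudes]  # assumed log compression
    return dct_ii(log_mel, start_coeff + num_coeffs)[start_coeff:]


mel = [1.0, 0.5, 0.25, 0.8, 0.3, 0.1, 0.05, 0.02]  # toy mel-band magnitudes
louder = [4.0 * m for m in mel]                    # same contour, higher level

quiet_coeffs = mfcc_from_mel(mel)
loud_coeffs = mfcc_from_mel(louder)
# Coefficient 0 differs; coefficients 1 and up match to rounding error.
timbre_only = mfcc_from_mel(louder, num_coeffs=3, start_coeff=1)
```

Scaling every band by the same gain adds a constant to the log spectrum, and a constant only correlates with the flat cosine shape of coefficient 0, which is why setting ``startCoeff`` to 1 discards the level information.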

For more information visit https://learn.flucoma.org/reference/mfcc/.

For an interactive explanation of this relationship, visit https://learn.flucoma.org/reference/mfcc/explain.

:control source:

The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processing sequentially.
The index of the buffer to use as the source material to be analysed. The different channels of multichannel buffers will be processed sequentially.

:control startFrame:

@@ -74,4 +74,4 @@

:control padding:

Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centring the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.
Controls the zero-padding added to either end of the source buffer or segment. Possible values are 0 (no padding), 1 (default, half the window size), or 2 (window size - hop size). Padding ensures that all input samples are completely analysed: with no padding, the first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function. Mode 1 has the effect of centering the first sample in the analysis window and ensuring that the very start and end of the segment are accounted for in the analysis. Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.
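The effect of the three ``padding`` modes can be illustrated by counting how many analysis windows cover each input sample. The padding amounts follow the description above; everything else is an assumption made for the sketch: only full windows are analysed, frames start at the beginning of the padded signal, and the function names are hypothetical.

```python
def padding_samples(mode, window_size, hop_size):
    """Zero-padding (in samples) added to each end for the given padding mode."""
    if mode == 0:
        return 0
    if mode == 1:
        return window_size // 2
    if mode == 2:
        return window_size - hop_size
    raise ValueError("padding mode must be 0, 1 or 2")


def frame_coverage(num_samples, window_size, hop_size, mode):
    """Count, for every input sample, how many analysis windows include it."""
    pad = padding_samples(mode, window_size, hop_size)
    padded_len = num_samples + 2 * pad
    counts = [0] * num_samples
    start = 0
    while start + window_size <= padded_len:  # assumption: full windows only
        for pos in range(start, start + window_size):
            i = pos - pad  # index in the unpadded input
            if 0 <= i < num_samples:
                counts[i] += 1
        start += hop_size
    return counts


# 16 samples, window 8, hop 2: overlap factor 4.
print(frame_coverage(16, 8, 2, mode=0)[0])  # prints 1
print(frame_coverage(16, 8, 2, mode=1)[0])  # prints 3
print(frame_coverage(16, 8, 2, mode=2)[0])  # prints 4
```

With an overlap factor of 4, only mode 2 gives the first and last samples the same coverage (4 windows) as samples in the middle of the segment.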