Here is an example usage of the extracted dynamics data in
Sonic Visualiser. The orange
display below is the waveform of an audio file. The red dots are
loudness values sampled from the smoothed power curve plugin data
at the specified points using this webpage. The red labels attached
to the dots give the measure:beat location of each dynamic
measurement in this case.
Note that the dynamic values can be measured anywhere in the audio
file, not just on the beats. For example, it might be useful to
ignore beats on which no events occur (such as measure 26 beat 3
in the above example).
You can use this webpage to extract loudness values for
all onsets using, for example, the mazurka plugin called
SpectralReflux::Onset Times instead of manually placed
event times.
Note that the dynamics extracted using the PowerCurve::Smoothed
Power mazurka plugin cannot differentiate loudnesses on a small
time-scale (because the data is smoothed), so dynamic measurements less
than about 100 milliseconds apart will not have a precise meaning,
since the loudnesses of the two close events will be blurred together.
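As a small illustration of the 100 millisecond caveat, the following sketch lists pairs of measurement times that lie closer together than that. The function name too_close and the example times are hypothetical; the 100 ms threshold is the approximate figure given above, not an exact limit.

```python
def too_close(times, min_gap=0.100):
    """Return consecutive pairs of event times (in seconds) closer than min_gap."""
    ordered = sorted(times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a < min_gap]

# Hypothetical event times in seconds; the first two are only 50 ms apart.
print(too_close([0.0, 0.05, 0.5]))
```

Measurements flagged this way can still be extracted, but their values should be read as a blend of the two events.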
The dynamic values are roughly in the range
from 0 to 100, and are approximately equivalent to dBSPL
values. More precisely, the values are measured as dBFS values
plus 100. The values are only approximately equal to dBSPL,
since the recording and playback sound levels can vary. Thus, the
absolute measured values may be offset by a fixed amount.
The relative values between the measurements are more interesting.
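The conversion described above can be sketched in a few lines. The function name dynamic_value and the two readings are illustrative assumptions; only the "dBFS plus 100" rule comes from the text.

```python
def dynamic_value(dbfs: float) -> float:
    """Convert a dBFS reading to the 0-100 dynamic scale (dBFS plus 100)."""
    return dbfs + 100.0

# Hypothetical smoothed-power readings in dBFS for two events.
first, second = -31.7, -35.6

# The difference between two dynamic values equals the difference in dBFS,
# so a fixed offset in the absolute level cancels out.
relative = dynamic_value(first) - dynamic_value(second)
print(round(dynamic_value(first), 1), round(dynamic_value(second), 1), round(relative, 1))
```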
A difference in dynamic values of 10 is about equal to one level change
in musical dynamics. For example, if the dynamic piano is assigned
to the numeric value 50, mezzo-piano would be at 60; mezzo-forte
at 70; forte at 80, etc.
In the options section, you can control the labels which are displayed
along with the data points. The following example shows the interpreted
dynamics option. The musical dynamic is given along with a fine dynamic
gradation (ten divisions between each dynamic). In the example, the
phrase starts piano and crescendos to mezzo-forte and
decrescendos to mezzo-piano. mp+5 means the basic dynamic
level is mezzo-piano plus 5 subdivisions (exactly half-way between
mp and mf).
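The labelling scheme can be sketched as follows. This is only an illustration: it assumes the example anchoring from the text (piano at 50, one dynamic level per 10 units), and the level list beyond p, mp, mf and f, as well as the function name interpreted_dynamic, are assumptions rather than the webpage's actual implementation.

```python
# Assumed ordering of dynamic levels; only p, mp, mf and f appear in the text.
LEVELS = ["ppp", "pp", "p", "mp", "mf", "f", "ff", "fff"]
P_VALUE = 50                   # assumed numeric value for piano (from the example)
P_INDEX = LEVELS.index("p")

def interpreted_dynamic(value: float) -> str:
    """Label a dynamic value as a level plus 0-9 subdivisions, e.g. 'mp+5'."""
    steps = round(value) - P_VALUE
    offset, sub = divmod(steps, 10)   # whole levels above p, then subdivisions
    index = P_INDEX + offset
    if not 0 <= index < len(LEVELS):
        raise ValueError("value outside the range covered by this sketch")
    name = LEVELS[index]
    return name if sub == 0 else f"{name}+{sub}"

print(interpreted_dynamic(50), interpreted_dynamic(65), interpreted_dynamic(70))
```

With this anchoring, a value of 65 is labelled mp+5, matching the half-way point between mp and mf described above.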
In order to extract good dynamic values from the audio recording,
you should identify the onset times of notes as accurately as possible.
If you use onset times defined by your tapping to the beats in a
recording, there will be significant noise in the extracted dynamic
values from dyn-a-matic. The reason for this is illustrated in the
following figure:
In this example the methodology of the dyn-a-matic plugin is shown.
The orange area represents the waveform of an audio file which contains
two note onsets. The vertical red lines show where a listener may tap
while listening to the recording. The vertical blue lines show where
the actual note onsets occur in the audio file.
Dynamatic takes two input annotation layers from Sonic Visualiser.
The first is a list of times at which you want a measurement of the
dynamics. The second annotation layer is the automatically generated
data from the Power Curve: Smoothed Power Vamp plugin (using the
default settings).
Dynamatic extracts the smoothed amplitude value exactly 100 milliseconds
after the user-specified time values in the first annotation layer. The
smoothed-power annotation data contains smoothed measurements of amplitude
in decibels every 10 milliseconds in the audio file. This means that
in the above figure, the amplitude value on the pink line is taken ten
dots after the vertical lines. The blue and red horizontal lines in the
figure demonstrate the time delay between the specified event time
and the value extracted from the smoothed power curve. (The 100 ms delay
is required because of (1) the smoothing process, and (2) the physics
of the piano, which causes the maximum loudness of a note to follow the
actual onset time.)
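The sampling step described above can be sketched as a lookup into the smoothed power curve. This is a minimal illustration, not the program's actual code: the function name sample_after_event, the nearest-point lookup, and the synthetic curve are all assumptions. Only the 100 ms delay and the 10 ms spacing come from the text.

```python
from bisect import bisect_left

DELAY = 0.100  # seconds after the event time, as described in the text

def sample_after_event(event_time, curve_times, curve_values, delay=DELAY):
    """Return the curve value whose time is nearest to event_time + delay."""
    target = event_time + delay
    i = bisect_left(curve_times, target)
    if i == 0:
        return curve_values[0]
    if i == len(curve_times):
        return curve_values[-1]
    before, after = curve_times[i - 1], curve_times[i]
    # Pick whichever neighbouring sample is closer to the target time.
    return curve_values[i] if after - target < target - before else curve_values[i - 1]

# Synthetic curve: one value every 10 ms, rising 1 dB per sample from -60 dB.
times = [k * 0.01 for k in range(50)]
values = [-60 + k for k in range(50)]

# An event at 0.05 s is measured at 0.15 s, i.e. ten samples later.
print(sample_after_event(0.05, times, values))
```

Choosing the nearest sample rather than interpolating keeps the sketch close to the "ten dots after" description; interpolation would be an equally plausible design.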
Consider the two events used in the above example. The red tap was
made after the actual blue onset time. This caused dyn-a-matic to measure
the loudness of the note(s) as -32.6 dB. But a more
accurate measurement would be -31.7 dB if measured from the
actual onset time rather than the listener's tap time. The difference is
-0.9 dB, which is not a very large difference.
However, now consider the second event where the listener's tap
occurs before the actual onset. In this case the loudness measure
is -42.1 dB, while a more accurate measurement would be -35.6 dB. This
is a difference of -6.5 dB, which is a significant difference.
The difference between adjacent dynamic levels is approximately 7-12 dB, so this
error is more than half of a dynamic level and may be close to one full dynamic level.
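The arithmetic for the second event can be checked with a short sketch. The function name level_error is hypothetical; the numbers are those quoted above, with 7 dB and 12 dB taken as the two ends of the stated range for one dynamic level.

```python
def level_error(measured_db, actual_db, level_sizes=(7.0, 12.0)):
    """Return the error in dB and its size as a fraction of one dynamic level."""
    error = measured_db - actual_db
    fractions = [abs(error) / size for size in level_sizes]
    return error, fractions

# Second event: the tap falls before the actual onset.
error, fractions = level_error(-42.1, -35.6)
print(round(error, 1), [round(f, 2) for f in fractions])
```

The two fractions bracket the error: it is a little over half a level if one level spans 12 dB, and nearly a full level if one level spans 7 dB.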
Note that listener taps which occur after the actual onsets
will not have much difference in loudness measurements when compared
to the actual earlier onset time. But when the listener tap comes
before the actual onset time, there will be a large difference between
the measured and actual loudness value.
One way of increasing the accuracy of your onset measurements
is to start with raw tapping data, and then filter the tapping data
through the tapsnap
online program which moves tapped beats to the nearest identified
onsets in the audio recording. The output from this program can
then be imported back into Sonic Visualiser and can be checked by
ear for accuracy. After the onset times have been proof-listened
to, you can process them through the dyn-a-matic online program to
extract estimates of the dynamics.
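The idea of moving each tap to the nearest identified onset can be sketched as below. This is not tapsnap's actual algorithm, which is not described here; the function name snap_to_onsets and the example times are assumptions used only to illustrate the nearest-onset idea.

```python
from bisect import bisect_left

def snap_to_onsets(taps, onsets):
    """Move each tap time to the nearest onset time (all times in seconds)."""
    ordered = sorted(onsets)
    if not ordered:
        return list(taps)  # nothing to snap to; leave the taps unchanged
    snapped = []
    for tap in taps:
        i = bisect_left(ordered, tap)
        # Only the onsets just before and just after the tap can be nearest.
        candidates = ordered[max(i - 1, 0):i + 1]
        snapped.append(min(candidates, key=lambda onset: abs(onset - tap)))
    return snapped

# Hypothetical taps slightly late and slightly early relative to detected onsets.
print(snap_to_onsets([1.02, 1.48], [1.0, 1.5, 2.0]))
```

A real tool would likely also limit how far a tap may move, which is one reason the result should still be checked by ear.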