Localization
Multiple output channels placed in various locations in a listening space (stereophonic, quadrophonic, surround sound)
Varying the amplitude and phase of the sound sent to each output channel gives the illusion that the sound source is moving
Localization Cues
Azimuth angle (direction on the horizontal plane from the listener)
Elevation (direction on the vertical plane from the listener)
Distance (from the listener)
Localization through Varying Intensity
The intensity of a sound is proportional to the square of its amplitude
The intensity of a sound is proportional to the inverse of the square of the distance of the sound from the listener
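Taken together, the two proportionalities above imply that amplitude falls off as 1/distance. The following is a minimal sketch of that arithmetic; the function names and the reference distance of 1.0 are assumptions made for illustration, not part of the course material.

```python
import math

def relative_amplitude(distance, reference_distance=1.0):
    """Amplitude scale factor for a source at `distance`, relative to the
    same source heard at `reference_distance` (hypothetical names)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Intensity ~ 1/d^2 and intensity ~ amplitude^2, so amplitude ~ 1/d.
    return reference_distance / distance

def level_change_db(distance, reference_distance=1.0):
    """Level change in dB relative to the reference distance."""
    return 20.0 * math.log10(relative_amplitude(distance, reference_distance))

print(relative_amplitude(2.0))          # 0.5
print(round(level_change_db(2.0), 2))   # -6.02
```

Doubling the distance halves the amplitude, which is a drop of about 6 dB.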
Example
Let the amplitudes of the two speakers be G1 (right) and G2 (left).
To keep a constant total intensity c distributed between the two speakers, require G1^2 + G2^2 = c.
Make G2 proportional to cos A and G1 proportional to sin A, where A is the angle from the listener to the virtual sound source along the horizontal plane. Because sin^2 A + cos^2 A = 1, the total intensity stays constant as A changes.
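The constant-intensity rule in this example can be checked numerically. This is a small sketch under stated assumptions: the function name `pan_gains` is hypothetical, and the angle is assumed to run from 0 degrees (fully left) to 90 degrees (fully right).

```python
import math

def pan_gains(angle_deg, c=1.0):
    """Return (g1_right, g2_left) for a virtual source at angle_deg.

    Assumption: 0 degrees is fully left and 90 degrees is fully right.
    """
    a = math.radians(angle_deg)
    scale = math.sqrt(c)          # so that g1^2 + g2^2 == c
    g1 = scale * math.sin(a)      # right speaker
    g2 = scale * math.cos(a)      # left speaker
    return g1, g2

for angle in (0, 30, 45, 60, 90):
    g1, g2 = pan_gains(angle)
    # The last column is the total intensity, which should stay at c.
    print(angle, round(g1, 3), round(g2, 3), round(g1**2 + g2**2, 3))
```

At 45 degrees both speakers receive the same gain, about 0.707 for c = 1, rather than 0.5 as a linear crossfade would give.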
Localization in Csound
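instr 1
k1 linseg 0, p3, 1 ; position ramps from 0 to 1 over the note duration
a1 oscil p4, p5, 2 ; p4 = amp, p5 = frequency, waveform from table 2
a2, a3, a4, a5 pan a1, k1, k1, 3 ; x and y both follow k1; table 3 shapes the pan
outq a2, a3, a4, a5
endin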
f2 0 2048 10 15 0 5 0 3
f3 0 8193 -8 0 819 .122 819 .294 819 .517 819 .769 819 1 819 1.154 819 1.207 819 1.176 819 1.098 822 1
Other Localization Techniques
Interaural timing differences (ITD)
Doppler effect
Echo effects
Reverberation
The process of dispersion of sound due to the multiple reflections of sound within the musical enclosure
The properties of the enclosure determine the effect of the reverberation
Example: carpeted walls vs. tiled walls
Echo (delay) in Csound
instr 1
k1 linseg 0, p3*0.1, p4, p3*0.9, 0
a1 oscil k1, p5, 1
a2 delay a1, p3*0.5
outs a1, a2
endin
Comb Filter
y[n] = x[n-D] + gy[n-D], where D is the sample delay.
The impulse response h[n] of this filter is given by h[n] = u[n-D] + g u[n-2D] + g^2 u[n-3D] + ...
where u[n] is the sequence {1,0,0,0,0....}.
The name comes from the frequency response graph, which looks like comb teeth.
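The difference equation above translates directly into a short loop. The sketch below is illustrative only (the function name and the test values D = 4, g = 0.5 are assumptions); it feeds in a unit impulse so that the output is the impulse response h[n].

```python
def comb_filter(x, delay, g):
    """Apply y[n] = x[n-D] + g*y[n-D] to the list x, with D = delay samples."""
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        # Samples before n = D have nothing to read back, so they stay 0.
        y[n] = x[n - delay] + g * y[n - delay]
    return y

impulse = [1.0] + [0.0] * 12      # u[n] = {1, 0, 0, ...}
h = comb_filter(impulse, delay=4, g=0.5)
print(h)
# [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.25]
```

The nonzero samples sit at n = D, 2D, 3D with values 1, g, g^2, matching the series for h[n].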
The time (in seconds) for the output to decay by 60 dB is given by T60 = 60D/(-20R log10 g), where R is the sampling rate in samples per second.
Example
g = 0.7
==> T60 ≈ 19.4t, or roughly 20t, where t is the loop time (sec) = D/R
If T60 = 2 sec, then t ≈ 0.1 sec (this means you have an echo density of about 10 per second).
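The arithmetic in this example can be reproduced with a few lines. This is a sketch under the definitions given in the notes (t = D/R is the loop time); the function names are hypothetical.

```python
import math

def t60_seconds(loop_time, g):
    """T60 = 60*t / (-20*log10(g)), with t = D/R the loop time in seconds."""
    if not 0.0 < g < 1.0:
        raise ValueError("g must be between 0 and 1 for a decaying filter")
    return 60.0 * loop_time / (-20.0 * math.log10(g))

def loop_time_for(t60, g):
    """Invert the formula: the loop time that gives a desired T60."""
    return t60 * (-20.0 * math.log10(g)) / 60.0

print(round(t60_seconds(0.1, 0.7), 2))    # 1.94
print(round(loop_time_for(2.0, 0.7), 3))  # 0.103
```

With g = 0.7 each pass through the loop loses about 3.1 dB, so roughly 20 passes are needed to reach 60 dB of decay.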
AllPass Filter
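y[n] = -g x[n] + x[n-D] + g y[n-D]
Equivalently, y[n] = -g x[n] + (1 - g^2) c[n], where c[n] = x[n-D] + g c[n-D] is the output of the comb filter above.
The magnitude of the frequency response of the allpass filter is 1 at every frequency.
Allpass filters have a non-uniform phase response, so different frequencies are delayed by different amounts.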
Natural reverberation effects can be achieved by combining comb and allpass filters.
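The flat magnitude and frequency-dependent phase of the allpass filter can be checked numerically. The sketch below assumes the standard Schroeder allpass form y[n] = -g x[n] + x[n-D] + g y[n-D], whose transfer function is H(z) = (-g + z^-D)/(1 - g z^-D); the function names and the values D = 7, g = 0.7 are chosen only for illustration.

```python
import cmath
import math

def allpass(x, delay, g):
    """Schroeder allpass (assumed form): y[n] = -g*x[n] + x[n-D] + g*y[n-D]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + xd + g * yd
    return y

def allpass_response(omega, delay, g):
    """H(e^{j*omega}) for the filter above; omega in radians per sample."""
    z = cmath.exp(-1j * omega * delay)   # z^-D evaluated on the unit circle
    return (-g + z) / (1.0 - g * z)

for k in range(5):
    omega = math.pi * k / 4
    H = allpass_response(omega, delay=7, g=0.7)
    # Magnitude stays at 1 while the phase changes with frequency.
    print(round(abs(H), 6), round(cmath.phase(H), 3))
```

Every line prints a magnitude of 1.0, while the phase column varies, which is what lets the filter add echo density without coloring the spectrum.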
Reverberation in Csound
instr 1
; p4 = amp, p5 = pitch, p6 = reverb time in seconds
k1 linseg 0, p3*.1, p4, p3*.9, 0
a1 oscil k1, p5, 1
a2 comb a1, p6, .0297 ; .0297 is loop time
a3 comb a1, p6, .0371
a4 comb a1, p6, .0411
a5 comb a1, p6, .0437
a6 = a2 + a3 + a4 + a5
a7 alpass a6, .09683, .005
a8 alpass a7, .03292, .0017
a9 reverb a1, p6 ; reverb is same as comb/alpass combination
outs a8, a9
endin
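The same structure, four parallel comb filters feeding two allpass filters in series, can be sketched offline to see how the tail decays. This is an illustration rather than a reimplementation of the Csound opcodes: the loop gain is derived from the T60 formula as g = 10^(-3t/T60), the allpass uses the standard Schroeder form, and the function names and the 8000 Hz sample rate are assumptions chosen to keep the example fast.

```python
def comb(x, delay, g):
    """y[n] = x[n-D] + g*y[n-D]"""
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        y[n] = x[n - delay] + g * y[n - delay]
    return y

def allpass(x, delay, g):
    """Assumed Schroeder form: y[n] = -g*x[n] + x[n-D] + g*y[n-D]"""
    y = [0.0] * len(x)
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + xd + g * yd
    return y

def loop_gain(loop_time, reverb_time):
    """Solve T60 = 60*t / (-20*log10(g)) for g."""
    return 10.0 ** (-3.0 * loop_time / reverb_time)

def schroeder_reverb(x, reverb_time, sample_rate):
    """Four parallel combs summed, then two allpasses in series."""
    mixed = [0.0] * len(x)
    for t in (0.0297, 0.0371, 0.0411, 0.0437):       # comb loop times (sec)
        d = max(1, round(t * sample_rate))
        out = comb(x, d, loop_gain(t, reverb_time))
        mixed = [m + o for m, o in zip(mixed, out)]
    for rvt, t in ((0.09683, 0.005), (0.03292, 0.0017)):
        d = max(1, round(t * sample_rate))
        mixed = allpass(mixed, d, loop_gain(t, rvt))
    return mixed

SAMPLE_RATE = 8000                                    # assumed, for speed
impulse = [1.0] + [0.0] * (2 * SAMPLE_RATE - 1)       # 2 seconds of input
tail = schroeder_reverb(impulse, reverb_time=0.5, sample_rate=SAMPLE_RATE)
print(max(abs(v) for v in tail[:SAMPLE_RATE // 2]))   # early part of the tail
print(max(abs(v) for v in tail[SAMPLE_RATE:]))        # after 1 second
```

With the allpass reverb times and loop times used in the instrument, `loop_gain` works out to about 0.7 for both allpass stages, the same g as in the comb filter example.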
Speech Synthesis
Speech consists of streams of phonemes (see handout)
Air (noise) is expelled from the lungs and passes through the voice box (larynx) which acts like a filter
Other parts of the vocal and nasal tract act like filters to further modify and shape the output sound
Types of Excitation Sources
Voiced - due to the periodic vibration of the glottis (has a detectable pitch)
Fricative (unvoiced) - due to turbulence created by constriction at some point along the vocal tract
Plosive (unvoiced) - due to the sudden release of pressure built up behind a closure in the vocal tract
Voiced Speech
Start with a harmonically rich periodic source, such as a band-limited pulse train (buzz).
Filter at the locations of the formants (frequencies that are emphasized during a specific speech sound; see handout)
Pass through additional filters to simulate nasalization and other vocal effects (vibrato, tremolo)
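A formant filter can be approximated by a two-pole resonator that boosts frequencies near a chosen center. The sketch below is a generic textbook resonator, y[n] = x[n] + a1 y[n-1] + a2 y[n-2], and is not claimed to match the coefficients of any particular Csound opcode; the function names, the 44100 Hz sample rate, and the 730 Hz center with a 36.5 Hz bandwidth are assumptions for illustration.

```python
import cmath
import math

def resonator_coeffs(center_hz, bandwidth_hz, sample_rate):
    """Feedback coefficients (a1, a2) for y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius
    theta = 2.0 * math.pi * center_hz / sample_rate       # pole angle
    return 2.0 * r * math.cos(theta), -r * r

def resonator_gain(freq_hz, center_hz, bandwidth_hz, sample_rate):
    """Magnitude of the resonator's frequency response at freq_hz."""
    a1, a2 = resonator_coeffs(center_hz, bandwidth_hz, sample_rate)
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate)  # z^-1
    return abs(1.0 / (1.0 - a1 * z1 - a2 * z1 * z1))

SR = 44100                      # assumed sample rate
for f in (400, 600, 730, 900, 1100):
    # Gain is largest near the 730 Hz center and falls away on both sides.
    print(f, round(resonator_gain(f, 730, 36.5, SR), 1))
```

Running several such resonators on the same source, each tuned to a different formant, and summing the results gives the basic shape of a vowel spectrum.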
Unvoiced Speech
Start with white noise (rand).
Filter to simulate fricative and plosive effects due to the lips, tongue and teeth
It is much more difficult to simulate unvoiced speech by computer than voiced sounds such as vowels or vowel combinations.
Speech Synthesis Example in Csound
; speech synthesis of /A/ (as in "hot")
; p4 = note amp
instr 1
iformant1 = 730
iformant2 = 1090
iformant3 = 2440
k1 linseg 0, .1, p4, p3-.2, p4, .1, 0
a1 buzz k1, 10, (sr/2)/10, 1
a2 reson a1, iformant1, 0.05*iformant1
a3 reson 0.5*a1, iformant2, 0.04*iformant2
a4 reson 0.33*a1, iformant3, 0.03*iformant3
out a2+a3+a4
endin
f1 0 8192 10 1