The Corner Effect

XII Colloquium on Musical Informatics
University of Udine
Gorizia, Italy
September 1998

Damián Keller, Chris Rolfe
damian_keller@sfu.ca, rolfe@sfu.ca

Abstract

We discuss some theoretical and practical aspects of real-time granulation of sampled sounds, such as windowing, grain overlap, synchronicity, and control through high-level events. Our analysis of the trapezoidal window shows that it approximates the response of a Gaussian window, with the addition of comb-shaped spectral effects. The position of the zeros depends on the position of the 'corners' of the window. We therefore call these artifacts the 'corner effect.'

Keywords: real-time granular synthesis, windowing, ecological models.

The introduction

This paper discusses some of the processes involved in granular synthesis (GS), in an effort to identify relevant variables in granular temporal and spectral transformations. Windowing, AM effects, grain overlap and their interaction produce complex time-varying spectral profiles. We address these issues in relation to the implementation of MacPod [11], a real-time GS system for the Macintosh PowerPC which is based on Truax's (1988) POD system. Furthermore, we discuss some new concepts and techniques relevant to the development of ecologically-based sound resynthesis, namely, the use of local or global parameters to define granular events, the control of phase-synchronicity among streams, and the simplification of windowing by using pre-stored grains [5].

The implementation of a real-time granular synthesis (GS) system on a personal computer presents two basic challenges: (1) an efficient use of computational resources to generate high grain densities, and (2) a simple and intuitive organization of synthesis parameters to facilitate real-time control. The first issue is directly related to the synthesis engine of the system, that is, how the source sounds are windowed and mixed. The second issue corresponds to the control level of the system, which concerns how the synthesis parameters are generated and how the user (performer or composer) interacts with them.

The complex

The interaction among processes in asynchronous GS generates rich sound results with fairly little source material. We have identified four causes for the increased complexity of granulated sound. (1) By applying an envelope, or window, we multiply the sampled sound by the window in the time domain, which is equivalent to convolving the spectrum of the sound with the spectrum of the window. In other words, the window applied to the original signal contributes a main lobe and several spectral side lobes which smear the original spectrum. (2) At subaudio grain rates, amplitude modulation adds upper and lower components to the granulated sound. The spectral modifications depend on the spectral content of the signal and the grain rate applied. (3) The overlap among grains in different voices produces time-varying cancellation and reinforcement which also modify the spectrum of the original signal. (4) When time-stretching is applied to a single sound file, time-delayed copies of the granulated signal are overlaid. This process produces temporal and spectral effects that depend on the stretch ratio being used.

The window

Windowing effects in audio signal processing are generally well understood. Window functions used in spectral analysis, such as von Hann and Hamming, minimize unwanted artifacts but increase the computation time [7, p. 149]. To reduce computational cost, the earliest real-time granular synthesis systems [12] used simple trapezoidal windows to good aural effect.
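The low cost of the trapezoidal window can be illustrated with a minimal sketch. It is not MacPod's code: the function name apply_trapezoid, the ramp parameter, and the choice of equal attack and decay ramps are assumptions made for illustration. Only the ramp samples are multiplied; the plateau is copied unchanged.

```python
def apply_trapezoid(grain, ramp):
    """Return a copy of `grain` shaped by a trapezoidal envelope.

    `ramp` is the length in samples of the linear attack and of the
    linear decay (assumed equal here); the samples in between keep
    a gain of 1 and are simply copied.
    """
    n = len(grain)
    if ramp < 0 or 2 * ramp > n:
        raise ValueError("both ramps must fit inside the grain")
    out = list(grain)               # plateau: no multiplication needed
    for i in range(ramp):
        gain = i / ramp             # 0, 1/ramp, ..., (ramp - 1)/ramp
        out[i] *= gain              # attack
        out[n - 1 - i] *= gain      # mirrored decay
    return out


# A constant test grain makes the envelope itself visible.
grain = [1.0] * 10
shaped = apply_trapezoid(grain, ramp=4)
```

With a smooth window such as von Hann, every sample of the grain would need a multiplication (and a table lookup or a cosine evaluation); here the cost grows only with the ramp length.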

We have focused our research on the effects of the lowly trapezoidal grain window, within the context of asynchronous granular synthesis. The trapezoidal window, in fact, resembles the popular Gaussian window, with the addition of ripples that produce an effect aurally similar to comb-filtering. The position of the zeros depends on the position of the 'corners' of the window; hence, we call these artifacts the 'corner effect' (see Fig. Spectrum of a trapezoidal window).
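One way to see why the zeros follow the corners is a standard identity rather than the analysis carried out for this paper: a trapezoid with linear, symmetric ramps is the convolution of two rectangles, so its spectrum is the product of their two spectra and inherits the zeros of both. The sketch below assumes that construction; the names trapezoid_from_rectangles and dtft_magnitude and the lengths 48 and 16 are illustrative only.

```python
import cmath
import math


def trapezoid_from_rectangles(a, b):
    """Convolve two rectangles of `a` and `b` samples (b <= a).

    The result is a trapezoid of a + b - 1 samples, normalized so that
    the plateau equals 1. Its upper corners sit at samples b - 1 and a - 1.
    """
    out = [0.0] * (a + b - 1)
    for i in range(a):
        for j in range(b):
            out[i + j] += 1.0 / b
    return out


def dtft_magnitude(window, freq):
    """Magnitude of the window spectrum at `freq`, in cycles per sample."""
    total = sum(w * cmath.exp(-2j * math.pi * freq * n)
                for n, w in enumerate(window))
    return abs(total)


window = trapezoid_from_rectangles(a=48, b=16)
peak = dtft_magnitude(window, 0.0)            # top of the main lobe
null_short = dtft_magnitude(window, 1 / 16)   # zero from the 16-sample rectangle
null_long = dtft_magnitude(window, 1 / 48)    # zero from the 48-sample rectangle
between = dtft_magnitude(window, 1 / 32)      # a ripple between two zeros
```

Under these assumptions the zeros fall at multiples of 1/a and 1/b cycles per sample, so moving a corner moves one family of zeros and leaves the other in place.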

While undesirable for most signal processing applications, this filtering effect is unobtrusive in GS. As we will discuss in 'The Overlap' section, aurally similar modifications of the signal are inherent in the GS technique due to the delay between overlapping grains. Therefore, we can confidently state that complex windowing is unwarranted for granular synthesis at medium-to-high grain densities. We invite the reader to compare the spectral effect of a triangular window with a trapezoidal window using identical synthesis settings. Both spectrograms show very similar results, with a slight 'smearing' of the spectrum when the trapezoidal window is used.

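Fig. Spectrum of a triangular window (horizontal axis: 0 kHz to 20 kHz; vertical axis: 0 dB to -96 dB).

Fig. Spectrum of a trapezoidal window (horizontal axis: 0 kHz to 20 kHz; vertical axis: 0 dB to -96 dB).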

Our particular focus is the application of GS to modeling environmental sounds. High grain densities (approaching one thousand grains per second) are needed to model complex, time-varying sound events. The chief objection to densities of this magnitude in real-time systems is the inefficiency of windowing and mixing the grains [2]. Using the trapezoidal function, however, we achieve real time GS with the required density on a standard Macintosh PowerPC.

The overlap

The grain overlap is defined as the time interval during which two or more grains are sounding simultaneously [4]. The average grain overlap can be estimated as the grain duration minus the average grain rate, where the grain rate is expressed as the time between successive grain onsets. If the grain duration is longer than the grain rate, overlap occurs. Thus, there are three possible configurations: (1) negative overlap, where there is a delay between the end of a grain and the onset of the following grain; (2) no overlap, where a grain starts when the previous one ends; and (3) positive overlap, where the next grain starts before the current one ends. In batch implementations, there can be as many overlapping grains as memory and patience allow. Real-time constraints, on the other hand, place a limit on the number of simultaneously sounding grains. MacPod can achieve up to 20 simultaneous grain streams, with a minimum grain rate of one millisecond.
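The three configurations can be sketched as a small calculation. The function names and the sample values are hypothetical, and the grain rate is taken as the time between onsets, in milliseconds.

```python
def average_overlap(grain_duration_ms, onset_interval_ms):
    """Estimate overlap as grain duration minus the time between onsets."""
    return grain_duration_ms - onset_interval_ms


def classify_overlap(overlap_ms):
    """Name the configuration that a given overlap value corresponds to."""
    if overlap_ms < 0:
        return "negative overlap"   # a gap separates consecutive grains
    if overlap_ms == 0:
        return "no overlap"         # each grain starts as the previous one ends
    return "positive overlap"       # the next grain starts before this one ends


# (grain duration, time between onsets), both in milliseconds
cases = [(20, 30), (20, 20), (20, 5)]
labels = [classify_overlap(average_overlap(d, i)) for d, i in cases]
```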

Although some GS systems combine several grain streams into a single voice [1], it is conceptually clearer to conceive of each voice as a separate stream. Thus, overlap can be controlled from a single parameter which stands for the coincidence [3], or phase-synchronicity, among grain onsets in all active voices. Following the central limit theorem [8, p. 174], it is reasonable to state that if each grain stream is defined as an independent random process, the overlap distribution will eventually approach a Gaussian probability distribution.

Careful control of phase-synchronicity among grain onsets in different streams produces transformations in the temporal and spectral profile of the granulated sound. With very fast grain rates (under 5 ms) and pitched sample material, we obtain formants akin to those produced by FOF synthesis. A small delay between grain onsets adds volume (as defined in [13]) to the original signal, producing an effect akin to early reflections in a reverberant space. Of course, we must keep in mind that all these processes are independent of the asynchronous grain rate established for each stream.

Fig. Grain stream configurations: synchronous stream, asynchronous stream, phase-synchronous streams, phase-asynchronous streams.

Within the context of ecologically-oriented resynthesis, phase-synchronicity is especially meaningful in the simulation of attacks. In striking a solid object, most resonant frequencies will be excited in the first fifty milliseconds or less. In contrast, if the excitation is produced by several small objects, each impact will excite different frequencies at various time delays, causing a granular sound texture. This type of sound can be heard when walking on pieces of glass or on snow.

The stream

A grain stream generator produces a series of grains with a given frequency, amplitude and duration. These parameters can vary in time. The concept of a grain generator implies that only a single grain can be produced at a time. Thus, when more than one simultaneous grain is desired (to produce overlaps), several grain generators have to be used. This introduces the need to define the phase relationship between the grain streams. The phase-asynchronous implementation, as found in asynchronous GS, produces streams which are completely independent. If the timing between grains in different streams is to be controlled, a phase-synchronous approach is necessary. As we stated before, in this case the grain onsets can be synchronized across streams or a short delay may be used. Therefore, there are three possible configurations: (1) a single stream generator, (2) multiple phase-asynchronous stream generators, and (3) multiple phase-synchronous stream generators.
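The difference between the multi-stream configurations can be sketched in terms of onset times alone. This is an illustration, not MacPod's scheduler: the function names, the uniform jitter around a mean interval, and the fixed per-stream delay are all assumptions. The jitter should stay below the mean interval so that onsets remain ordered.

```python
import random


def stream_onsets(count, mean_interval_ms, jitter_ms, rng):
    """Onset times (ms) of one stream; jitter_ms = 0 gives a synchronous stream."""
    onsets, t = [], 0.0
    for _ in range(count):
        onsets.append(t)
        t += mean_interval_ms + rng.uniform(-jitter_ms, jitter_ms)
    return onsets


def phase_asynchronous(streams, count, mean_interval_ms, jitter_ms, seed=0):
    """Every stream draws its own onsets, independently of the others."""
    rng = random.Random(seed)
    return [stream_onsets(count, mean_interval_ms, jitter_ms, rng)
            for _ in range(streams)]


def phase_synchronous(streams, count, mean_interval_ms, jitter_ms,
                      delay_ms=0.0, seed=0):
    """All streams share one onset pattern; stream k is offset by k * delay_ms."""
    rng = random.Random(seed)
    master = stream_onsets(count, mean_interval_ms, jitter_ms, rng)
    return [[t + k * delay_ms for t in master] for k in range(streams)]


async_streams = phase_asynchronous(3, 5, mean_interval_ms=10.0, jitter_ms=4.0)
sync_streams = phase_synchronous(3, 5, mean_interval_ms=10.0, jitter_ms=4.0,
                                 delay_ms=2.0)
```

With delay_ms set to zero the onsets coincide across streams; a small positive value gives the short inter-stream delay mentioned above.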

The waveform

GS techniques have used different types of source material: (1) sine waves, in FOF synthesis [10]; (2) FIR filters derived by spectral analysis, in pitch-synchronous granular synthesis; and (3) sampled sounds, in asynchronous GS [12], FOG, and pulsar synthesis [9]. Ecologically-based resynthesis adds the option of using pre-stored sample grains [6].

More specifically, in ecologically-based GS we create a grain pool before the synthesis stage, instead of retrieving arbitrary segments of the sound file. The samples keep the spectro-temporal characteristics of the short original sounds, avoiding the 'blurring' effect that occurs in asynchronous GS [9]. These samples are placed on a time-frequency grid according to meso-level time patterns which are, in turn, designed to match the temporal characteristics of naturally occurring sounds, e.g., bounce [6]. Given that this approach simplifies the windowing process, it may provide a good alternative to existing real-time methods.

The pointer

GS systems access the contents of the sound database in four different ways: (1) incremental, the file is read from beginning to end; (2) loop, the file is read repeatedly from beginning to end; (3) cycle, the file is read repeatedly from beginning to end and backwards; and (4) random, the file is read at random locations.
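The four access modes can be sketched as a function that returns successive read positions. The function name, the mode strings, and the one-sample advance per step are assumptions for illustration; a real system would advance the pointer by some amount per grain.

```python
import random


def pointer_positions(mode, length, steps, rng=None):
    """Return up to `steps` read positions in a file of `length` samples."""
    if mode == "incremental":
        return list(range(min(steps, length)))     # one pass, then stop
    if mode == "loop":
        return [i % length for i in range(steps)]  # wrap back to the start
    if mode == "cycle":
        period = max(2 * (length - 1), 1)          # forward, then backward
        out = []
        for i in range(steps):
            p = i % period
            out.append(p if p < length else period - p)
        return out
    if mode == "random":
        rng = rng or random.Random()
        return [rng.randrange(length) for _ in range(steps)]
    raise ValueError(f"unknown mode: {mode}")


incremental = pointer_positions("incremental", length=4, steps=6)
loop = pointer_positions("loop", length=4, steps=6)
cycle = pointer_positions("cycle", length=4, steps=8)
scattered = pointer_positions("random", length=4, steps=6, rng=random.Random(0))
```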

The current implementation of MacPod, following the POD model, uses a single pointer to the source material. Interestingly, the effect of the overlapping grains can be explained simply as a comb-filter delay. If one assumes a fixed grain envelope, an asynchronous grain six milliseconds later than the original is simply a six-millisecond delay mixed in with the original signal. By keeping the resolution at the sample level, we are able to explore a variety of spectral transformations - at subaudio rates - and reverb-like effects at slower rates.
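The six-millisecond case can be worked through with the textbook response of a signal mixed with one delayed copy of itself. The sketch assumes equal gain for both copies and ignores the grain envelope; the function name comb_magnitude and the mix parameter are illustrative.

```python
import cmath
import math


def comb_magnitude(freq_hz, delay_s, mix=1.0):
    """Gain of y(t) = x(t) + mix * x(t - delay_s) at `freq_hz`."""
    return abs(1 + mix * cmath.exp(-2j * math.pi * freq_hz * delay_s))


delay = 0.006                     # six milliseconds
first_null = 1 / (2 * delay)      # about 83.3 Hz
first_peak = 1 / delay            # about 166.7 Hz
null_gain = comb_magnitude(first_null, delay)
peak_gain = comb_magnitude(first_peak, delay)
```

Under these assumptions the nulls sit at odd multiples of 1/(2 * delay) and the peaks at multiples of 1/delay, so changing the onset delay by even one sample shifts the whole comb.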

The event

A logical implication of the ecological approach to sound resynthesis is to establish the sound event [6] as a high order unit of sound generation. Resynthesis parameters are thus directly linked to a finite time length. Rate of change is scaled according to the length of this event. Instead of fine-tuning unrelated parameters (such as amplitude or frequency of a given grain stream), transformations of a sound event are carried out along correlated variables within ecologically valid time ranges.

We point out two possible strategies: (1) High-level events are defined by global settings. These settings define ranges of possible values for the local parameters. (2) Local parameters determine the overall behavior of the high-level event. For example, the density of an event can be defined by two global parameters: duration and quantity of grains. If grains with fixed duration are evenly scattered along a predefined time span, we get an invariant average density. But let's say that we want to have a dense distribution that changes linearly to a sparse one:

Time-varying grain distribution.

If synthesis parameters vary independently, it will take several trials to find the right number of grains and the right rate of change in distribution. On the other hand, by using grain overlap as the only control variable and letting the quantity of grains and the overall duration change accordingly, we will be dealing directly with the relevant perceptual parameters. In this example, the only high-level variable that needs to be defined is the rate of change in grain overlap.
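A minimal sketch of this idea follows, assuming a fixed grain duration, a linear change in overlap, and an event length that is held fixed for simplicity; the number of grains is not specified but falls out of the overlap curve. The function name event_onsets and all numeric values are hypothetical.

```python
def event_onsets(grain_ms, overlap_start_ms, overlap_end_ms, event_ms):
    """Grain onsets (ms) for an event whose overlap changes linearly in time.

    The gap between onsets is the grain duration minus the current overlap,
    so a falling overlap spreads the grains further apart.
    """
    onsets, t = [], 0.0
    while t < event_ms:
        onsets.append(t)
        change = overlap_end_ms - overlap_start_ms
        overlap = overlap_start_ms + change * t / event_ms
        gap = grain_ms - overlap
        if gap <= 0:
            raise ValueError("overlap must stay below the grain duration")
        t += gap
    return onsets


# Dense to sparse: overlap falls from +15 ms to -60 ms over one second.
onsets = event_onsets(grain_ms=20.0, overlap_start_ms=15.0,
                      overlap_end_ms=-60.0, event_ms=1000.0)
gaps = [b - a for a, b in zip(onsets, onsets[1:])]
```

The grain count is simply len(onsets); changing the two overlap values reshapes the distribution without any separate tuning of density.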

The conclusion

We have investigated several issues involved in the implementation of a real-time granular synthesis application. The focus of our work has been the efficient use of computational resources, and a simplified method for synthesis parameter control.

Our results point to two effective approaches to windowing: (1) the use of a trapezoidal function, as suggested by Truax (1988), and (2) the use of a grain sample pool, as implemented in ecological sound resynthesis. By applying a trapezoidal window, we obtain aurally effective results with a drastic reduction in computation time. This type of window produces a spectral profile which depends on the placement of the 'corners' of the trapezoid. Thus, what DSP theory has regarded as an unwanted artifact becomes a useful parameter for sound synthesis.

Our current efforts are concentrated on bringing the ecological perspective to the real-time realm. By using events instead of low-level control parameters, we pave the way to a more intuitive interface between user input and sound output. At the other end, independent control of the grain rate and sample-level resolution of the grain overlap make it possible not only to work on the temporal characteristics of the sound, but also to shape its spectral profile.

MacPod: real-time granular synthesis for the Macintosh PowerPC.

The References

[1] Behles, G., Starke, S., & Röbel, A. (1998). Quasi-synchronous and pitch-synchronous granular sound processing with Stampede II. Computer Music Journal, 22(2), 44-51.

[2] Cook, P.R. (1997). Physically informed sonic modeling (PhISM): synthesis of percussive sounds. Computer Music Journal, 21(3), 38-49.

[3] Dziech, A. (1993). Random Pulse Streams and their Applications. Warszawa: Elsevier.

[4] Jones, D.L., & Parks, T.W. (1988). Generation and combination of grains for music synthesis. Computer Music Journal, 12(2), 27-33.

[5] Keller, D. (1998). ". . . soretes de punta." Compact disc Harangue II. Burnaby, BC: Earsay. http://earsay.com

[6] Keller, D., & Truax, B. (1998). Ecologically-based granular synthesis, Proceedings of the International Computer Music Conference. Ann Arbor, MI: University of Michigan. http://www.sfu.ca/~dkeller

[7] Lynn, P.A., & Fuerst, W. (1998). Introductory Digital Signal Processing with Computer Applications. Chichester: John Wiley.

[8] Mix, D.F. (1995). Random Signal Processing. Englewood Cliffs: Prentice Hall.

[9] Roads, C. (1997). Sound transformation by convolution, Musical Signal Processing, C. Roads, S.T. Pope, A. Piccialli, & G. De Poli (Eds.). Lisse: Swets & Zeitlinger, 411-438.

[10] Rodet, X. (1984). Time-domain formant wave function synthesis. Computer Music Journal, 8(3), 9-14.

[11] Rolfe, C. (1998). MacPod. Real-time asynchronous granular synthesis software for the Macintosh PowerPC. Vancouver, BC: Third Monk Inc. http://www3.bc.sympatico.ca/thirdmonk

[12] Truax, B. (1988). Real-time granular synthesis with a digital signal processor. Computer Music Journal, 12(2), 14-26.

[13] Truax, B. (1992). Electroacoustic music and soundscape: the inner and outer world, Companion to Contemporary Musical Thought, Vol. 1, J. Paynter, T. Howell, R. Orton, & P. Seymour (Eds.). London: Routledge, 374-398.

Reference: http://www.sfu.ca/~dkeller/CornerEffect/CornerEffect.html
