Thursday, December 9, 2004, 10:47 PM
Spectro-Morphology and the Musical Grammars

Music is an aural art. While many different compositional systems are possible, ignoring the listener is detrimental to any compositional process. This fact is at the root of arguments regarding the "cognitive opacity" of post-tonal music, because music is most naturally judged by what is heard, and what is heard is not always what the composer intends. If a listener cannot perceive a compositional structure in as much detail as other musical elements, that does not mean the structure is not worthwhile. However, the fact that such structures are not the most perceivable elements in a piece should be noted, so that the composer also attends to the more perceivable elements. These elements will likely be more important to the listener, who may have no idea that the less perceivable, more idiosyncratic structures even exist. In an aural art form, what is heard is what is important, and a compositional process which takes this into account is superior to one which does not.

In 1992, Fred Lerdahl published "Cognitive Constraints on Compositional Systems" in the Contemporary Music Review. In this article, he proposes that "serial organizations are cognitively opaque" (Lerdahl 231). His analysis and criticism of serial music have been controversial, and his argument is certainly flawed on some level. Probably the most debatable aspect of the paper lies in the second half, where he suggests that traditional tonal music be the only model for music because of hierarchies inherent in the diatonic scale (and, to a lesser degree, some other tonal scales), and because of the regularity of metrical structure. While it seems doubtful that all music written outside of the tonal tradition is a brick wall of cognitive opacity, his main point has a certain resonance to it. Serial procedures can be quite opaque, but that does not imply that the music itself is opaque. Lerdahl's paper begins by introducing a compositional/analytical model which defines two distinct kinds of musical grammar: the compositional grammar and the listening grammar. He posits that strictly serial compositional practices are deficient because of a fissure between the two grammars. Since music is an aural art form, the listening grammar must be paramount, and it follows that the "compositional grammar must be based on the listening grammar" (Lerdahl 236).

Lerdahl sets up seventeen compositional constraints which are intended to connect the listening grammar to the compositional grammar. Unfortunately, these constraints are severely limited, because they allow for nothing but a tonal compositional system. However, they can be separated into three main categories, which are less restrictive: constraints on event sequences, constraints on underlying materials, and constraints on pitch space. Constraints on event sequences mostly have to do with the segmentation and grouping of the musical surface into individual events. These events, according to Lerdahl, should rely on a hierarchical structure, because "most of human cognition relies on hierarchical structuring" (Lerdahl 239). Constraints on underlying materials define stability conditions, such that certain musical structures are judged to be innately more stable than other structures. Lastly, constraints on pitch space are related to the concept of a "framework of pitches that are cognized as relatively close to or far from one another" (Lerdahl 248). So, all tonal recommendations aside, to simplify Lerdahl's argument we can say that in order for a compositional grammar to be based on a listening grammar it must possess three characteristics. It must be capable of being segmented and grouped hierarchically, it must define stability conditions, and it must contain a concept of distance between sonic objects. Interestingly, Lerdahl seems to be aware that his constraints rely too heavily on the constructs of tonality, and he mentions, "it will be fascinating to discover how the new sound materials of computer music will be able to meet these constraints. In all probability the new materials will bring additional requirements into play" (Lerdahl 251). One person who has been influential in creating a compositional grammar which is based on a listening grammar is Denis Smalley, with his writings on spectro-morphology.

Smalley's framework is not tied to traditional tonality, but instead includes tonality as one part of a much larger system. Spectro-morphology is "an approach to sound materials and musical structures which concentrates on the spectrum of available pitches and their shaping in time" (Smalley 61). It is a fundamentally more robust model than the one Lerdahl puts forth, because it defines the important aural aspects of music without relying on any one musical construct, such as the diatonic scale. However, Smalley seems to agree strongly with Lerdahl, at least in principle: "today we continually need to reassert the primacy of aural experience in music. The heritage of twentieth-century formalism and the continuing propensity of composers to seek support in non-musical models have produced the undesirable side-effect of stressing concept at the expense of percept" (Smalley 62). A perceptual model is at the basis of spectro-morphology, and he notes that "listeners can only apprehend music if they discover a perceptual affinity with its materials and structure" (Smalley 62). In a statement that is almost identical to Lerdahl's claims regarding the compositional grammar and the listening grammar, Smalley states that "the practice of listening, and the perceptive observation of the listening process must therefore form the foundation of any musical investigation which seeks to explain the workings of spectro-morphology" (Smalley 63). The listening process is prominent in his model, which accommodates the musical exploration made possible by electronic technology by recognizing the "inherent musicality in all sounds" (Smalley 61).

Smalley begins his framework by noting that the "lack of a shared terminology is a serious problem confronting electroacoustic music because a description of sound materials and their relationships is a prerequisite for evaluative discussion" (Smalley 63). All too often, traditional musical terminology falls short in describing electroacoustic or computer music, and so we tend to search for non-musical terms; as Smalley puts it, "music involves mimesis: musical materials and structures find resemblances and echoes in the non-musical world" (Smalley 63). While all sounds possess a "dual potential" of abstract and concrete aspects, it is important to note that "music is always related in some way to human experience, which means that mimesis is always at work even in music regarded as abstract" (Smalley 64). In laying out the spectro-morphological framework, Smalley defines three main areas: spectral typology, morphology, and motion. Lastly, he deals with structuring processes.

Spectral typology arranges all possible sounds along a note-noise continuum. The term 'spectrum' is a generic catch-all for all aurally perceptible frequencies, and is used as a replacement for the words 'pitch' and 'timbre', which are too closely tied to the harmonic functionality of pitches in tonal music. The note-noise continuum is separated into three main groupings: note, node, and noise. The note group is broken down further into the note proper, the harmonic spectrum, and the inharmonic spectrum. The note proper "embraces traditional pitch perception: absolute pitches, intervallic and chordal combinations" (Smalley 65). In this way, spectro-morphology encompasses not only computer and electroacoustic music, but also tonal and serial music. Specific inter-relations of notes form the primary carriers of information in tonal music, but there is no need to limit musical information solely to these inter-relations. The differentiation between a note proper and the harmonic spectrum is not clear-cut, but the distinction relies on the fact that "once an harmonic spectrum is perceptible above a fundamental its spectral components can be featured as compositional values" (Smalley 66). The inharmonic spectrum sub-group is important to electroacoustic music, since the "medium makes viable the composition, decomposition, and development of spectral interiors". A nodal spectrum is defined as a "band or knot of sound which resists pitch identification". Lastly, there is a blurred buffer zone between note and noise which is a result of increased spectral density, and this is referred to as the pitch-effluvium continuum.

The effluvial state is important, as it defines the point where a listener needs to "change focal strategy as aural interest is forced away from charting the pitch behavior of internal components to follow the momentum of external shaping" (Smalley 67). This is precisely the point missed by Lerdahl in his recommendations of tonality over other forms of music. The context of music changes how the ear responds to structure. It's possible to set out a compositional grammar which is based on a listening grammar, but in order to do such a thing outside of the bounds of traditional tonality, it is necessary to expand the listening grammar to include salient characteristics of the music in question. There is structural potential to musical materials beyond the note proper, and this potential is "harnessed by comparing, relating, and transforming spectral types and their combinations" (Smalley 67).

Transference of energy is at the root of morphology, which is Smalley's second category of the spectro-morphological framework. Morphology deals with the "natural fundamentals of sound perception", most notably the reality that sound in the physical world is created only by energy being transferred from one party to another. So, "during the execution of a note, energy input is translated into changes in spectral richness or complexity" (Smalley 68). Smalley claims that ignoring these fundamentals in the composing of morphologies and structures results in a musical deficiency, which is detectable by the listener. There are three morphological archetypes of "instrumental sounds" (by "instrumental", Smalley is referring to sounds which are created in the physical world). These archetypes are attack-impulse, attack-decay, and graduated continuant. The attack-impulse archetype is a brief, detached note which has very little resonance. This results in practically no sonic decay of any sort. The attack-decay archetype is the opposite. While these sounds still have a short attack, they are resonated in some manner, and last longer than attack-impulse sounds. The graduated continuant is modeled on sustained sounds which have only a graduated attack and decay. For dealing with electroacoustic music, these morphological archetypes are extended into a listing of ten morphological models: attack-impulse archetype, closed attack-decay archetype, reversed closed attack-decay, open attack-decay archetype, reversed open attack-decay, linear attack-decay, reversed linear attack-decay, linear graduated continuant, swelled graduated continuant, and graduated continuant archetype. These morphological models are linked or merged into morphological strings which take advantage of correspondence, "a point or stage in time where a morphological shift takes place – a kind of morphological modulation" (Smalley 71). The internal tension of the morphological models is projected into higher structural levels, and thus the "spectral and dynamic tensions inherent in the sounding extensions of gesture are the foundations of musical structuring" (Smalley 73).

Motion is the last of Smalley's main groupings of spectro-morphology, and is of utmost importance, because it brings together the idea that "spectro-morphological composition, like other musical languages, is concerned with expectations gratified and foiled, and such expectations are founded on shared perceptual experience" (Smalley 75). Motion, in this sense, does not refer to spatial motion, but to real and imagined motion created by controlling spectral and dynamic shaping. Five basic motion categories are defined and expanded upon: unidirectional, bi-directional, reciprocal, centric/cyclic, and eccentric/multi-directional. These motion categories apply to a variety of structural levels, from the "shape of a brief sound-object to the motion of a large structure, from the groupings of objects to the groupings of larger structures" (Smalley 73).

In dealing with the structure of spectro-morphological pieces, Smalley notes that there is no low-level unit consistent across the entire breadth of post-tonal music, as is present in tonal music in the form of the note. There is also no consistent density referent, in the form of metrical pulse, which sets the pace on which the perception of music is based. While this lack of a low-level unit and metrical pulse makes it impossible to define consistent structural hierarchies (which Lerdahl would argue is a deficiency of non-tonal music), it is still possible and important for structures to be multi-leveled. This allows us to "vary our perceptual focus throughout a range of levels during the listening process" (Smalley 81). He points out that even when permanent hierarchical relationships may not be found in an electroacoustic work, certain fractured hierarchies of varying temporal dimensions may frequently be found. However, an "inability to maintain control over the focal scanning of structural levels during the process of composition" is a "crucial reason for the failure of many electroacoustic works" (Smalley 81).

Tonal music does have certain advantages over post-tonal music when perceptual focus is considered. However, through attention to, and expansion of, the concept of a listening grammar, it is possible to define a compositional model which does not sacrifice perception and is not limited to purely tonal structures. Lerdahl's argument, simplified, yields three main characteristics which a compositional grammar must possess in order for it to be based on the listening grammar. These characteristics are satisfied by tonal music, but they are also satisfied by the model proposed by Smalley. The spectro-morphological approach can be segmented and grouped hierarchically, at least in some form. While the hierarchy might not be as strong as the one present in the diatonic scale, it is possible to create other types of hierarchies through attention to multi-leveled aspects of a composition. Stability conditions are met through the concept of spectral typology, which classifies sounds in a broad sense, and morphological motion, which attends to musical "expectations gratified and foiled". The concept of "distance" between sonic objects is satisfied by the concept of morphological motion between such objects, and the transference of energy to create a morphology. In order to move between one object and another, there must be a perceivable distance between the two objects. In satisfying these three characteristics, "spectro-morphology reaffirms the primacy of aural perception which has been so heinously ignored in the recent past" (Smalley 93).

While he makes a compelling argument, Smalley's proposal of a spectro-morphological model is not without flaw. Many of his attempts at categorizing typologies and morphologies seem overly concerned with creating huge lists of sub-groupings, sometimes accompanied by complicated-looking pseudo-hierarchic graphical representations. In generating these lists, he makes distinctions between semantically similar words such as vortex and helix, but doesn't bother to define the words in the context in which he is using them. This is especially noticeable in his hierarchical listing of Motion Typologies. He also leaves out obvious categories. In his definition of "instrumental archetypes", Smalley lists two attack-based archetypes and one "graduated continuant", where a sound swells into being and decays slowly once it is done sounding. However, it's quite possible for an instrumental sound to swell slowly but end abruptly, or start with a sudden attack but end with a slow and steady decay. As he expands his list of archetypes into morphological models, he runs into even more problems, because it's still easy to come up with groupings he has left out. For instance, he lists three different types of graduated continuant: linear, swelled, and archetype. To this, one could add exponential, Gaussian, parabolic, etc. He would have been better off sticking with a shorter list of more general groupings. There's also no mention of emotional or relational aspects of music, which should be closely tied to a listening grammar.

Flawed as it may be, spectro-morphology is an important turning point, especially in electroacoustic music. The twentieth century is full of idiosyncratic compositional practices, but throughout the life-span of electroacoustic music, a disproportionate amount of compositional attention has been given to the underlying technological processes present in the music, which may not be of any interest to the listener. In a situation even more severe than is the case with serial procedures, technologies which are new and exciting today become mundane and gimmicky tomorrow. A piece written by John Chowning in the 1980s which uses nothing but Frequency Modulation (FM) synthesis was exciting at the time, but once FM synthesis was available to every rock band with a Yamaha synthesizer, the signature sound of the process became clichéd and mundane. Even a listener capable of recognizing the underlying processes may find themselves utterly uninterested in those processes. Attention to the musical aspects of a composition, as proposed by Smalley, answers the call Lerdahl makes to base compositional grammars on a listening grammar, and it helps to focus attention back on the listener.

--Tom Gersic

Lerdahl, Fred. (1992) "Cognitive Constraints on Compositional Systems." 97-121. Contemporary Music Review. New York: Harwood Academic.

Smalley, Denis. (1986) "Spectro-Morphology and Structuring Processes." 61-93. The Language of Electroacoustic Music. Edited by Simon Emmerson. New York: Harwood Academic.