Come To Daddy Analysis
Wednesday, June 15, 2005, 11:32 PM
An analysis of the music video directed by Chris Cunningham


The 1997 release of the music video for Aphex Twin’s Come to Daddy, Pappy mix catapulted its director, Chris Cunningham, from relative obscurity to being a highly sought-after video director. Prior to this video’s release, Cunningham primarily worked on special effects for Hollywood productions, such as the movie AI, while it was under the direction of Stanley Kubrick. However, within a year, he had written and directed music videos for Madonna’s Frozen (at Madonna’s personal request), Portishead’s Only You, Leftfield’s Afrika Shox, Squarepusher’s Come on My Selector, Aphex Twin’s Windowlicker, and Björk’s All Is Full of Love. When asked how she chose Cunningham to direct her video, Björk’s response was, “This Aphex Twin video happened, and everybody knew, here comes a genius” (Brown 2003). While the video got relatively little airplay in America, MTV executive James Hyman is quoted as saying that it is “one of the most powerful, certainly the most disturbing I’ve ever seen” (Wikipedia). What made this video so powerful that it immediately grabbed the attention of artists like Madonna and Björk, both known for having many excellent music videos?

In an analysis of a music video, it is necessary to consider that the predominant role and raison d’être of the MTV-era music video is that of a marketing tool. As stated by Carol Vernallis, author of Experiencing Music Video, the “video must sell the song” (2004). In order to “sell the song”, the music in the music video must be the dominant element. Since the video is almost always produced after the music has already been recorded, it is typically the goal of the director to follow the form of the song, and to highlight the music, frequently at the expense of the visuals. This goal is the opposite of the typical classical Hollywood film director’s goal, where music is intentionally allowed to fall into the background in order to manipulate the viewer’s perception and cognition of the film through subconscious means. While music video is not the first form of multimedia to highlight aural elements at the expense of the visuals, it is distinct in its role of highlighting a particular piece of popular music and a particular artist, for the explicit mass-marketing purpose of increasing the sales totals of an album. Ensuring that the music does not fall into the background and that it is received well by the intended audience is of paramount importance. Were a music video not to highlight the music, it might not even be released. As stated by Joe Gow in the article Music Video as Communication: Popular Formulas and Emerging Genres, the music becomes the “dominant formal element” (1990).

It is possible to divide music videos into two groups, performance videos and concept videos. These groups represent the two “most basic formal possibilities of music video” (Gow 1990). The most popular form is that of the performance video. In a study of 138 videos identified as “the most well liked and influential clips in music video history”, Gow determined that “performance dominates just about all of the 138 clips” (1990). Of these videos, which were collected from three different TV shows airing on MTV and NBC (“Top 100 Videos of All Time”, airing on MTV, April 30, 1989; “Greatest Videos of the Decade”, airing on NBC, September 15, 1989; and “Most Requested Videos of All Time”, airing on MTV, March 11 and 12, 1990), only two videos were identified as something other than performance, and these were George Michael’s Father Figure, and The Stand by R.E.M. These two videos were recognized by Gow as being narrative and abstract, respectively. Gow finds specifically that “given R.E.M.’s well publicized disdain for the trappings of the commercial music industry, this is not surprising” (1990) because “anti-performance pieces invite viewers to see their stars as wishing to run against the video mainstream” (1990). As the name suggests, the main subject of performance videos is the performing musician: “the performance oriented visuals cue viewers that, indeed, the recording of the music is the most significant element” (Gow 1990). Lisa St. Clair Harvey, in her article Temporary Insanity: Fun, Games, and Transformational Ritual in American Music Video, depicts the proliferation of the music video in the 1980s as the result of a kind of mass-cultural TV-watching ritual, and she quotes Pat Aufderheide, author of Music Videos: The Look of the Sound, as saying that “music video performers mix-and-match their public images without missing a beat or dropping a stitch…many performers hold up as a model the career of David Bowie, who pioneered the concept of the disposable image…with rock videos, personal ‘identity’ has become a central element of commodity production” (Harvey 1990).

The Come to Daddy video decidedly bucks the performance video trend. While Come to Daddy does indeed contain lyrics, they are minimal, and the short phrases that make up the lyrics are repeated multiple times in a ritualistic manner. Also, while the song is separated into a number of sections, many of these sections are lyric-free, and are composed entirely of repetitive bass lines and the dense textures of drum breakbeats, which is a common characteristic of the Intelligent Dance Music (IDM, also known as Drill and Bass, and Braindance) genre, attributed to Aphex Twin, Squarepusher, and other artists, frequently those signed to the Warp label. Of the sections that contain vocals, each is focused on a certain phrase, of which there are only two: “I want your soul, I will eat your soul”, and “come to daddy”. Aside from the breakbeats and bass lines, the other musical material is electronic and heavily processed. The drum patterns are so complex that they leave little question that this piece was never intended to be performed in any traditional sense. To show an artist performing such a piece would not only raise some significant questions regarding exactly what it is that she or he should be shown playing, it would also likely result in a comical video. The piece is an electronic piece constructed from repetitive musical motives, and the instrumentation is made up of sampled drums, synthesized bass, and various other electronic instruments. In the terminology of Nicholas Cook, to show an artist lip-syncing a song to the camera is a highly conformant relationship between the audio and visuals because we would see a direct representation of what we are hearing (Cook 1998). However, Chris Cunningham’s vision for the Come to Daddy video fits the complementation paradigm, and the video is enhanced by this relationship. The visuals, while not entirely consistent with the music, represent a wonderfully warped narrative and associational interpretation of an already warped song and of the medium of music video.

In contrast to the performance video, the goal of the concept video is to stand apart from the crowd. The concept video allows for more flexibility with artistic direction, as the lip-synced performances in performance videos have the effect of “blunting narrative drive” (Vernallis 2004), along with blunting other thematic material. From a marketing perspective, “if a director is to produce a clip that might somehow stand apart from the numerous other videos appearing on a service such as MTV…then it is to non-performance oriented alternatives that she or he must turn” (Gow 1990). Since the concept video is to be set apart from the performance video, directors must turn toward the other formal systems available within the realm of traditional filmmaking. Of the five formal systems described by Bordwell and Thompson in Film Art: An Introduction, narrative form is by far the most common in western culture. To be considered a narrative, a piece should contain “a chain of events in cause-effect relationship occurring in time and space” (Bordwell 1997).

While the “chain of events” in Come to Daddy does not reflect a fully developed narrative in the strictly Aristotelian sense (because of the compressed nature of the music video, and the foregrounding of the music, the characters are more archetypal than the fully formed characters consistent with the definition of traditional narrative), it is a video based heavily on narrative form, although the narrative is forced to follow the formal structure of the music. The video places the viewer in a surreal, off-color world where evil lurks around every corner. That evil, embodied in its many different forms throughout, looks startlingly like many poorly-engineered copies (or clones) of Richard D. James, the Aphex Twin. In the beginning of the video, we find ourselves outside a dirty and barren housing complex. The first character that we see onscreen is that of an old woman (played by Coral Lorne) walking her dog. As the woman comes upon a pile of garbage, she allows her dog to sniff at it. As she waits for her dog to examine a small TV, we get the impression that she is being watched. This effect is achieved through the use of a shot-reverse-shot sequence, and subtle motion in the shadows at the edge of the screen. A shadowy figure who partially blocks the left edge of the screen might have gone entirely unnoticed if not for the visual synchronization of the shadow with the introduction of a new sound in the audio track. This synchronization draws our attention to the framing of the shot, something that traditional film would avoid. However, this is not the case within the construct of music video: “the framing in music video makes us as aware of the edge of the frame as of the figure itself” (Vernallis 2004).

The fact that audio-visual synchronization is important to the Come to Daddy video is established right from the beginning by the camera’s motion through what seems to be either a viaduct or tunnel. As we exit the tunnel, we see the housing complex for the first time, directly in sync with a sound that is vaguely reminiscent of a landing spaceship from a science-fiction movie. This synchrony is a subtle effect, but it forces the viewer to notice the camera movement without necessarily consciously attending to it. After sniffing at the TV screen for a bit and then choosing to pee on the old lady instead, the dog is shocked by the TV screen, and begins an attempt to attack it, only to be held back by the surprisingly strong old lady. A ghostly, distorted image of Richard D. James appears on the screen, and his lip-synced proclamation, “I want your soul”, ends the introduction section of the video.

This introduction to the world of Come to Daddy, however, is not accompanied by the song Come to Daddy itself. When compared with the compact disc recording, the Come to Daddy song proper begins only after the distorted image of Richard D. James first appears in the TV set. Throughout the first minute and sixteen seconds, we hear an introduction section that appears either to have been scored specifically for this video, or at the very least to have been pulled from some unreleased Aphex Twin material. While the illustrated book that accompanies the Work of Director Chris Cunningham DVD makes no mention of the introduction section, the lack of any additional credits, along with the unique electronic style, would suggest that it was scored by Richard D. James. However, it does raise the interesting question of whether the music was composed for the visuals, or the visuals were composed for the music in this section, and also whether it really matters.

The process of creation is not necessarily discernable by the receiver, as is evidenced by the “cognitive opacity” (Lerdahl 1992) of the permutational structures common in serialist music. In a discussion of Pierre Boulez’s Le Marteau sans Maître, which was “hailed as a masterpiece of post-war serialism” (Lerdahl 1992), Fred Lerdahl points out that for twenty-three years, “nobody could figure out, much less hear, how the piece was serial” (1992). In Music and Discourse, Jean-Jacques Nattiez defines the roles of the act of creation (poietic process), and the act of receiving (esthesic process), from a semiologist’s perspective, as separate parts of a whole, or the “total musical fact” (1990). As he states, “the esthesic process and the poietic process do not necessarily correspond” (1990). Since it is possible for a compositional system to impart meaning into a work that is impossible for the audience to perceive, questioning whether the music or the visuals came first is largely unrelated to the perceivable elements of the video. The sound track throughout this section is made up of ambient electronic sounds, mostly textural in nature. However, certain gestural sounds (sounds with a relatively short envelope) are synchronized with visual imagery, especially visual glitches, and other gestural sounds aren’t synchronized with anything.

The music in the introduction is quite distinct from the rest of the piece because of the lack of any periodic metrical structures. Throughout the majority of this section, the accent structure alignments between various aural and visual events draw attention either to the visual editing or to the framing and motion of the camera, but not to the content of the image. While the woman clearly says something to her dog about thirty seconds into the video, we don’t hear what she says. It becomes clear that even though we are in the presence of a dense electronic soundtrack, we are also in the midst of a diegetic silence. Near the end of the introduction, we finally see an accent structure alignment that draws attention to the content of the image: an arc of electricity that shocks the old lady’s dog. Accompanying this arc is the first sound which could possibly be considered diegetic because it is synchronized with an object contained within the diegesis. Whether the sound is actually diegetic or not isn’t entirely clear. The arc of electricity is followed by the dog barking at the TV set, and the dog’s motions are pseudo-synced with a sound that sounds similar to a dog bark, but is resonant, and drawn out in time. This sound has none of the typical envelope characteristics of a dog bark sound, and so we wonder if it is intended to be perceived as a dog bark at all. Most likely, this section is intended as a bridge between the diegetic silence of the introduction section, and what is to come—the image of James in the TV set speaking the lyrics of the song, “I want your soul”. Up until this point, because the audio-visual synchronization is based around nondiegetic visual material, we are forced to consider the medium of the film, and the music that accompanies it, as opposed to its content.

While the narrative of this music video is an important element, frustrating the audience’s ability to fully commit to the narrative is necessary to its function. The power of a narrative to dominate all other filmic elements is so strong that in order for a narrative video to work successfully as a music video, and not to be perceived as a short film, it is necessary for the audience’s attention to be diverted away from the narrative. If the director were to allow the narrative to come to the foreground, the music would fall into the background, as in a typical Hollywood-style film. Because of this common relationship of music with film, it is not uncommon to hear claims that music is subservient to visuals or to the narrative in traditional filmmaking. Music in cinema is considered by Gorbman to lessen “defenses against the fantasy structures to which narrative provides access”, and to “free the image from strict realism” (Gorbman 1987). In considering film, Gorbman states “where narrative is the excuse for, the cement of, and the raison d’être of the film’s existence, we opt to focus attention on the narrative and visual realities on the screen before us” (Gorbman 1987). And most striking, Gorbman claims that “film music and easy-listening music have much in common. They are both utilitarian; both are received in a larger, nonmusical context; neither is designed to be closely-attended to” (Gorbman 1987). This is in stark contrast to the necessary dominance of music within a music video.

In describing her “cognitive framework for understanding musical soundtracks”, Annabel Cohen states, “when auditory information and visual information are structurally congruent (e.g. share temporal accent patterns), the visual congruent information becomes the figure, the visual focus of attention” (2001). Her framework depicts the ways in which music can augment visual meaning and structure, and how it can affect the short term memory of visual information and of narrative, but it completely ignores the possibility for visual information to affect the perception of auditory information, or for visual information to affect how auditory information is stored in short term memory. Her statement, “it is…synchronization that contributes to the inaudibility of the music” (Cohen 2001), is only true when this synchronization serves the narrative by ensuring that the content of the frame remains the subject of the film. When synchronization frustrates the narrative, it serves a different function by drawing attention to the formal elements of the video in a self-reflexive manner. In the words of Paul Newell Campbell, author of The Reflexive Function of Bergman’s Persona, regarding the section of Persona where the film suddenly splits and burns, “instead of the usual self-transcendent participation in and dependence on the narrative form, we now retreat frantically to the self-assertive shelter of our ‘real’ selves, to our own familiar personnae”.

While it may be true that music is subservient to the visual and narrative elements in most traditional Hollywood film, this is decidedly not the case with music video, and so there is no reason to believe that it has to be the case with other forms of multimedia either. The visual style of a music video is designed to serve the song, and the accent structure synchronization in Come to Daddy draws attention away from the narrative in order to draw attention to the musical structures. The editing style common in music video is one that reverses many of the tenets of traditional continuity editing, and Come to Daddy is no exception. During the introduction section, the 180-degree rule is broken repeatedly so that we frequently have almost no idea from which perspective we are viewing the action. This allows for a momentary spatiotemporal distortion of our perception of the video that goes unnoticed without careful attention: When the old woman first notices the horde of evil-looking little people, wearing dresses, all with Richard D. James’ smiling face, she backs up in fear. They run past her to pick up the TV. From earlier visual cues, their trip to pick up the TV should be a short one, but it involves a lengthy run down an alleyway that exists in a space that was previously nonexistent. Vernallis states that “disjunctive editing keeps us within the ever-changing surface of the song. Though such edits may create a momentary sense of disequilibrium, they force the viewer to focus on musical and visual cues, allowing the viewer to regain a sense of orientation” (Vernallis 2004). By synchronizing various visual cues with the music, and through fast-paced disjunctive editing which is also frequently synched to the music, the editing diverts the viewer’s attention, and “prevents powerful images from acquiring too much weight and stopping the flow of information” (Vernallis 2004).

As important as the narrative drive may be in Come to Daddy, as with Ingmar Bergman’s Persona, “any account which leaves out or dismisses as incidental the way [it] begins and ends hasn’t been talking about the film that [the director] made” (Sontag 1967). The very beginning of Persona tells us, without question, that the film is not meant to be understood as a traditional narrative, but as a film that is in part not about the characters in the film, but about film itself. Campbell states, “Persona announces its reflexive nature at once and with emphasis: carbons inside an arc lamp flare into brilliance; a portion of blank leader flashes by; and there is a series of fast-moving images that includes a fragment of what appears to be a slapstick comedy, a nail hammered through a hand, corpses of an old man and woman, a young boy lying under a sheet, the old woman’s face hanging upside down over the head of the bed or cart, the old woman’s eyes suddenly opening” (1979). After this seemingly-abstract entrance, the movie enters into what seems like a traditional narrative about a nurse helping to coax a catatonic actress back to health by taking her to a summer cottage. When the nurse, Alma, discovers that the patient, Elisabet, has not been ill, but has been studying her, she is outraged. In a climactic meeting of the two women, the film suddenly seems to split and burn, and we see images from the beginning again. The medium of the film again announces itself as a primary subject. This splitting of the film, which has the effect of destroying the narrative drive, tells us, according to Campbell, that the “film is equally about itself and the audience. Though it presents itself to us in what is often conventional narrative-form, it gives us frequent and pointed indications of a nonnarrative mode” (1979).

The opening of Come to Daddy is quite similar to the opening of Persona. While the first shot that we see is of the housing complex where the narrative takes place, the next shot is what at first seems to be an abstract image, with the words “Come to daddy” at the bottom of the screen. This image is dominated by the presence of a bright light, much as is the first scene of Persona, where we see the carbons of a film projector’s arc lamp “flare into brilliance”. This light, however, is not the light of a film projector, or in fact any other sort of light associated with film. The light is that of a copy machine, scrolling across the screen, presumably making a copy. This image repeats in a slightly different manner two more times, once announcing Aphex Twin, and the last time announcing “Pictures by Chris Cunningham”. Even the use of the word “pictures” instead of “video” or the phrase “directed by” seems to point toward the act of copying, or to borrow a term from Walter Benjamin, mechanical reproduction.

In fact, while the narrative is about the emergence from a television of a giant creature that rounds up its various children, the plot becomes just one element of the many inter-related elements of the video. Another element is the theme of mechanical reproduction. As Benjamin writes in The Work of Art in the Age of Mechanical Reproduction, “to an ever greater degree the work of art reproduced becomes the work of art designed for reproducibility. From a photographic negative, for example, one can make any number of prints; to ask for the ‘authentic’ print makes no sense” (1936). This theme of mechanical reproduction is self-reflexive because it takes note of its own presence within the world as a work of art “designed for reproducibility”. While this theme is used to frustrate the narrative in such a way that the music does not fall into the background, it is also a fully developed theme in its own right. The “pictures” of the copy machine hint at the importance of mechanical reproduction, but these pictures, along with the static of the thrown-out TV set, make it possible for the video to contain visual-glitch material throughout, synced to the music, without us wondering why the video is so glitchy. Even the characters in the video are imperfect reproductions of Richard D. James, whom we never see fully throughout the video. We see his face in the TV, on each of the little people, and at the end of the video, on the giant creature (“thin man”, played by Al Stokes), who has emerged from the TV set (albeit a slightly larger one than in the beginning), blurring the line between human and machine.

In Come to Daddy, the narrative, camera positioning, editing, and an associationist theme of mechanical reproduction all create a constant interplay that allows the music to take the foreground without any of the narrative-frustrating devices being merely utilitarian in that sense. Were the video to have been created in a traditional narrative form, the music would have fallen into the background as it would in any traditional Hollywood style film. Given the importance and necessary predominance of the music in a music video, both from an artistic and a marketing standpoint, it is necessary to ensure that the narrative not take over. In the words of Campbell, we as the audience cannot be allowed to “play our usual role of observer projecting the self into the personae narrated by the film” (1979). Cunningham manages to employ a narrative and then frustrate that narrative through the use of abstract and associational imagery and editing. From a position in the artistic counterculture, he also managed to place a video on TV that subverts the MTV “disposable image” (Harvey 1990) by creating an underlying theme of art and/or humanity “designed for reproducibility”. The complex interplay of these elements, and the terrific Aphex Twin music, all contribute to the artistic success of the Come to Daddy music video.


Come to Daddy, Pappy mix, by Aphex Twin, accompanies the Come to Daddy video. However, Aphex Twin has three pieces named Come to Daddy. These are Come to Daddy, Pappy mix; Come to Daddy, Little Lord Faulteroy mix; and Come to Daddy, Mummy mix. The three tracks have very little to do with one another, other than the fact that they all co-exist on the Come to Daddy album. They are all entirely different in style and lyrical content. However, Come to Daddy, Pappy mix is generally the best known, and is often referred to simply as Come to Daddy. The music video makes no mention of anything other than this simplified version of the title.


Benjamin, W. (1936). The Work of Art in the Age of Mechanical Reproduction. Retrieved May 22, 2005.

Bordwell, David and Kristin Thompson. (1997). Film art: an introduction. New York: McGraw-Hill.

Brown, Richard (producer) and Payne, John (producer). (2003). The Work of Director Chris Cunningham [DVD and Book]. New York: Palm Pictures, LLC.

Campbell, Paul Newell. (1979). The Reflexive Function of Bergman’s Persona. In Cinema Journal, 19:1.

Cook, Nicholas. (1998). Analysing Musical Multimedia. New York: Oxford University Press.

Gorbman, C. (1987). Unheard Melodies: Narrative Film Music. Bloomington, IN: Indiana University Press.

Gow, Joe. (1990). Music video as communication: popular formulas and emerging genres. In Journal of Popular Culture, 26(2), pp. 41-70.

Harvey, Lisa St. Clair. (1990). Temporary Insanity: Fun, Games, and Transformational Ritual in American Music Video. In Journal of Popular Culture, 24:1.

Lerdahl, Fred. (1992). Cognitive Constraints on Compositional Systems. In Contemporary Music Review, 6, pp. 231-259.

Nattiez, Jean-Jacques. (1990). Music and Discourse: toward a semiology of music. Princeton, NJ: Princeton University Press.

Sontag, Susan. (1967). Persona. In Sight and Sound, Autumn.

Vernallis, Carol. (2004). Experiencing music video. New York: Columbia University Press.

Wikipedia. (entry last updated 2005). Come to Daddy. Retrieved May 19, 2005.