Proposal for an Artificial Intelligence Research Project
Proposed Subject: Computer Generated Theme and Variations
Prepared for: University Fellows Research Program, Texas A&M University
By: Richard Wesley Todd
Submitted on March 26, 1998
I. The Coherence Roadblock in Automated Music
No one disputes the ability of a computer to ‘learn’ and use the rules of harmony to create music that sounds good. Forming and maintaining a musically appropriate structure at the note level is a problem successfully addressed in the past. Two problems with no current satisfactory response are large-scale coherence of form and original melody creation. The most successful venture into these areas so far is Experiments in Musical Intelligence (or EMI) [Smoliar 1994]. Surprisingly, code originally written to create lexically correct haiku poems forms the base of David Cope’s EMI program [Cope 1987]. EMI creates music by locating stylistically interesting points in existing works and incorporating these into generic, ‘context-free’ music [Cope 1991 & 1992]. This recombinant form of creation can be convincing for a few measures, but in terms of large-scale form becomes disjointed and incoherent [Smoliar 1994]. Cope says himself that "continuity deserves the most attention at this juncture," as computer music programs tend to lump musically correct phrases together with no interest in their potential connectivity [Cope 1991].
II. My Research Objective: Automated Variations on a Theme
The goal of my proposed research is a computer program that takes a pre-composed musical theme and generates a logically coherent piece of music in Theme and Variations form (a musical form in which a single theme is transformed repeatedly with little or no extra material introduced). The program will use applicable current research in linguistics and natural language study as a starting point, due to the volume of research in those areas, the success of language ventures such as EMI, and the many acknowledged parallels between music and language study [Lerdahl and Jackendoff 1983]. Throughout my discussion, I will point out the parallel language research.
There are many reasons to choose Theme and Variations form for the first incarnation of a music generator.
First, it allows the program to avoid one of the two unsolved problems listed above: generating an original melody. This is the least understood and most ignored area of study. Successful projects (like EMI, discussed above) must generate their melodies by recombining existing music, a procedure no human composer would use. My approach will also reuse its music input, but in emulation of actual human practice.
Also, though the form of a set of variations is more complex than many other types (a minuet, for example, has a simple A->B->A form), the inherent constraint of only one theme greatly reduces the complexity of the task. My program can safely disregard the problem of choosing compatible themes. Monothematic form also forces the program to use the actual material to shape the music instead of relying on thematic variety. No existing computer design (to my knowledge) has this capability at present.
III. The Overall Variation Program Design
The structures and high-level patterns of music composition will be represented using knowledge representation techniques developed in artificial intelligence (AI). Examples of these include frames, rules, prototypes, and constraints. The composition process will involve applying these techniques to parts of an input melody, and synthesizing the resulting musical elements using AI tasks like heuristic searching, rule-based reasoning, or constraint-based reasoning. I present the main aspects of my program and the first steps I plan to take below.
The two high-level tasks of this research will be generating an appropriate form for the entire set of variations, and creating variations that fit this mold. Two parts of the program will represent these tasks: the form designer and the variation designer. These modules will interact through dynamic goal-setting.
The form designer will develop a large-scale plan for the music based on a set of goals (to be determined as part of my research, but including vague terms like lush harmony, about eight parts, or very thin texture). Then, as the variation designer runs (to the specifications of the current plan), any new possibilities or limitations it discovers are fed back to the form designer. Any changes made at this point will then alter the output from the variation designer in a dynamic give-and-take process. The result will be a coherent piece of music, as the program is aware of the large-scale implications of musical decisions at all times. This two-way flow of information between modules has been successfully explored and utilized in the domain of natural language generation [Hovy 1988]. In the language field, goals influence paragraph-shaping. Unfortunately, the differences between words and notes are such that language theories cannot be merely translated [Lerdahl and Jackendoff 1983]; constructs and methodologies must be rebuilt in a musical context.
The final aspect of the high-level design is a mechanism that can determine the degree of similarity of the variations to their theme. Thus, if a variation strays too far, it might be erased, or the next variation might be constrained to compensate by strongly resembling the melody. Various pattern-matching schemes will be employed to compare the new melodies to the old. A fuzzy-logic approach to comparing melodic contour such as the one described in [Quinn 1997] could be useful. Also, the underlying (extramelodic) patterns in the music should be considered. The system that performs this analysis is also needed to generate complex variations, and is described next.
Subordinate to the high-level design is a musical analysis system. The variation program needs the ability to break down the given melody in as many ways as possible.
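As a rough illustration of the ideas above, the Python sketch below breaks a melody into a few separate properties (pitch intervals, rhythm, and contour) and scores how closely a variation's contour follows the theme's. It is only a crisp, position-by-position comparison, not the fuzzy method of [Quinn 1997]; the Note class, the MIDI pitch encoding, and all function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    pitch: int       # MIDI note number (an assumed encoding)
    duration: float  # length in beats


def intervals(melody):
    """Signed semitone steps between consecutive notes."""
    return [b.pitch - a.pitch for a, b in zip(melody, melody[1:])]


def rhythm(melody):
    """Durations only, with pitch information discarded."""
    return [note.duration for note in melody]


def contour(melody):
    """Direction of each step: +1 up, -1 down, 0 repeated pitch."""
    return [(step > 0) - (step < 0) for step in intervals(melody)]


def contour_similarity(theme, variation):
    """Fraction of positions where both contours move the same way.

    Positions that exist in only one melody count as mismatches.
    """
    a, b = contour(theme), contour(variation)
    length = max(len(a), len(b))
    if length == 0:
        return 1.0  # two single-note melodies: nothing to disagree on
    matches = sum(x == y for x, y in zip(a, b))
    return matches / length


# Hypothetical four-note theme and two candidate variations.
theme = [Note(60, 1), Note(62, 1), Note(64, 1), Note(60, 1)]
reshaped = [Note(67, 0.5), Note(71, 0.5), Note(72, 1), Note(65, 2)]
inverted = [Note(60, 1), Note(58, 1), Note(56, 1), Note(60, 1)]

print(contour_similarity(theme, reshaped))  # same directions throughout
print(contour_similarity(theme, inverted))  # every direction reversed
```

Under this measure a variation that transposes the theme and alters its intervals and rhythm still scores 1.0 as long as every step moves in the same direction, while a strict inversion scores 0.0.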
Any musical fragment can be reduced to components (several are discussed here), and the essence of variation is keeping some of these constant while changing others [Apel and Daniel 1960]. Solomon points out a constraint in his biography of Beethoven [Solomon 1977]: tonality should be preserved. In other words, a variation tends to lose touch with its theme if the point of modulation (change of key) is moved too far or elided completely. That aside, the number of possible variations by this definition is a function of the number of discrete musical properties recognized. By giving my program an understanding of its given theme, I will maximize the possibilities for variation.
Fortunately, a great deal of work toward isolating musical properties has already been done in music theory. Most important factors can be seen in terms of a swinging pendulum, with a degree of intensity that fluctuates as music progresses. Fundamental aspects to consider include consonance, rhythmic activity, harmonic and tonal motion, and texture [Berry 1987]. Subtle tempo effects (such as rubato playing) will not be considered at this time. They are ubiquitous in human performance, but have been shown to have little or no effect on the perceived expressiveness of music [Kamenetsky, Hill, and Trehub 1997].
Consonance (roughly, the degree of ‘pleasantness’ of a group of sounds) stands among the most important musical elements to codify, as it is a property of any sound and plays a role in determining higher-level qualities (such as intensity). The degree of consonance also helps define the form of musical phrases [Berry 1987]. The relative consonance and dissonance (the opposite of consonance) of opposing notes have been calculated by (to name a few methods) pure intuition, numerology, and analysis of the overtone series [Vidyamurthy and Chakrapani 1992]. Most approaches agree on the relative order of consonance of the various musical intervals.
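To make the overtone-series idea concrete, the following Python sketch scores an interval by how many of the first few overtones of its two notes coincide. This is a deliberately simple stand-in, not the method of [Vidyamurthy and Chakrapani 1992]; the just-intonation ratios, the choice of 30 overtones, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction

# Just-intonation frequency ratios (upper note / lower note).
# Assumed for illustration; the proposal does not fix a tuning system.
INTERVALS = {
    "octave": Fraction(2, 1),
    "perfect fifth": Fraction(3, 2),
    "perfect fourth": Fraction(4, 3),
    "major third": Fraction(5, 4),
    "minor third": Fraction(6, 5),
    "major second": Fraction(9, 8),
    "minor second": Fraction(16, 15),
}


def overtones(fundamental, count):
    """First `count` integer multiples of a fundamental frequency."""
    return {fundamental * k for k in range(1, count + 1)}


def overtone_overlap(ratio, count=30):
    """Share of overtones common to both notes (Jaccard index).

    The lower note is fixed at frequency 1, the upper note at `ratio`.
    Exact fractions avoid floating-point near-misses when comparing.
    """
    lower = overtones(Fraction(1), count)
    upper = overtones(ratio, count)
    return Fraction(len(lower & upper), len(lower | upper))


def rank_by_consonance(intervals=INTERVALS, count=30):
    """Interval names ordered from most to least overtone overlap."""
    return sorted(
        intervals,
        key=lambda name: overtone_overlap(intervals[name], count),
        reverse=True,
    )


for name in rank_by_consonance():
    print(f"{name:15s} {float(overtone_overlap(INTERVALS[name])):.3f}")
```

With these assumptions the ranking runs from the octave through the fifth and fourth down to the minor second. The score depends only on frequency ratios, so register, loudness, timing, and timbre play no part in it.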
Unfortunately, none of these methods of measuring consonance is flexible enough for use in music creation. They disregard important factors (as described in [Berry 1987]) such as the octave distance between widespread notes, the relative loudness of the pitches involved, the extent of simultaneity of the pitch events, and the impact of differing timbres. Though I have yet to see it in print, I additionally hypothesize that dissonance is attenuated if the listener expects it (for instance, when the same melody is heard twice). Numerical dissonance models like [Smith 1997] are especially useless outside of pure, evenly-played intervals. The method based on the overtone series (a set of incidental pitches produced by an instrument as a result of wave interactions), on the other hand, is both mathematically sound and extendable. Thus, the first step in the thematic analysis stage of my research will be to elaborate the scheme in [Vidyamurthy and Chakrapani 1992] to match a more musically complete view of consonance/dissonance.
Music fundamentals like texture and melodic groupings will complete the analysis engine and will be codified using music theory as a base. Most work in music theory relies too heavily on human intuition to be useful in a computer programming context, but efforts to formalize the theory exist. Formal musical theories have been put to work in AI programs before with success [Widmer 1992], and I am confident that aspects of them will be useful. One important formal theory describes rhythm, large-scale phrase grouping, and harmony in terms of ‘well-formedness,’ ‘preference,’ and ‘transformational’ rules [Lerdahl and Jackendoff 1983]. These rules are not only easy to translate into computer code but also happen to be rooted in linguistic research. Another formal approach to music claims the ability to analyze the entire space of possible melodies in terms of 215 unique fragments [Narmour 1992].
Narmour’s implication-realization model makes judgements based on the music’s tendency to proceed toward an implied point or thwart the listener’s expectations. AI natural language systems that try to formulate realistic case-based explanations of news articles use a very similar expectation model [Schank, Kass, and Riesbeck 1994].
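To suggest how a preference rule of the kind cited above might look in code, the Python sketch below marks a likely phrase boundary wherever the time between two note attacks is longer than the gaps on either side. It is loosely modeled on the proximity idea among the grouping preference rules of [Lerdahl and Jackendoff 1983] and is a simplification: the function name and the list-of-onsets input are assumptions, and a fuller treatment would weigh several rules against one another.

```python
def proximity_boundaries(onsets):
    """Indices i where a group boundary is preferred between note i and note i+1.

    `onsets` is a list of attack times in beats, in ascending order.
    A boundary is preferred where the gap between two attacks is strictly
    longer than both the gap before it and the gap after it.
    """
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    boundaries = []
    # gaps[i] separates note i from note i+1; the first and last gaps
    # have only one neighbour, so they are not tested.
    for i in range(1, len(gaps) - 1):
        if gaps[i] > gaps[i - 1] and gaps[i] > gaps[i + 1]:
            boundaries.append(i)
    return boundaries


# Hypothetical melody: three quick notes, a pause, three more quick notes.
print(proximity_boundaries([0, 1, 2, 4, 5, 6]))
```

For the sample onsets, the only boundary falls between the third and fourth notes, where the pause occurs; an evenly spaced melody produces no boundaries at all under this rule.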
References
Apel, Willi and Daniel, Ralph. 1960. The Harvard Brief Dictionary of Music. Pocket Books: New York.
Berry, Wallace. 1987. Structural Functions in Music. Dover Publications, Inc.: New York.
Cope, David. 1987. "An Expert System for Computer-Assisted Composition." Computer Music Journal 11(4): 30-46.
Cope, David. 1991. Computers and Musical Style. A-R Editions, Inc.: Madison, Wisconsin.
Cope, David. 1992. "Computer Modeling of Musical Intelligence in EMI." Computer Music Journal 16(2): 69-83.
Hovy, Eduard H. 1988. Generating Natural Language under Pragmatic Constraints. Lawrence Erlbaum Associates: New Jersey.
Kamenetsky, S., Hill, D., and Trehub, S. 1997. "Effect of Tempo and Dynamics On the Perception of Emotion in Music." Psychology of Music 25: 149-160.
Lerdahl, Fred, and Jackendoff, Ray. 1983. A Generative Theory of Tonal Music. MIT Press: Cambridge, Massachusetts.
Narmour, Eugene. 1992. The Analysis and Cognition of Melodic Complexity. University of Chicago Press: Chicago, Illinois.
Quinn, Ian. 1997. "Fuzzy Extensions to the Theory of Contour." Music Theory Spectrum 19(2): 248-263.
Schank, Roger, Kass, Alex, and Riesbeck, Christopher. 1994. Inside Case-Based Explanation. Lawrence Erlbaum Associates: New Jersey.
Smith, Allan. 1997. "Cumulative Method of Quantifying Tonal Consonance." Music Perception 15(2): 183.
Smoliar, Stephen. 1994. "Computers Compose Music, But Do We Listen?" Music Theory Online 0(6): (Web Page) <http://boethius.music.ucsb.edu/mto/issues/mto.94.0.6/mto.94.0.6.smoliar.art>
Solomon, Maynard. 1977. Beethoven. Simon & Schuster Macmillan: New York.
Vidyamurthy, G. and Chakrapani, Jaishankar. 1992. "Cognition of Tonal Centers: A Fuzzy Approach." Computer Music Journal 16(2): 45-50.
Widmer, Gerhard. 1992. "Qualitative Perception Modeling and Intelligent Musical Learning." Computer Music Journal 16(2): 51-66.