Chord sequence generation with semiotic patterns
Darrell Conklin
Chord sequences play a fundamental role in music analysis and generation. This paper considers the task of chord sequence generation for electronic dance music, specifically for pieces in the sub-genre of trance. These pieces contain repetitive harmonic sequences with a specific semiotic structure and coherence. The paper presents a formal representation for the semiotic structure of chord sequences, and describes a method for ranking the instances of a semiotic pattern using a statistical model trained on a corpus. Examples of generated chord sequences for a full trance piece are presented.
1. Introduction
The visionary enthusiasm of Ada Lovelace (1815–1852) that a mechanical device “might compose elaborate and scientific pieces of music of any degree of complexity or extent” is as motivating and fascinating today as it was pure science fiction in the nineteenth century. From the 1950s, with the first experiments in computational music generation (Brooks et al. 1957; Hiller 1970; Hiller and Isaacson 1959), through to today, the dream and excitement of computational creativity has never faded for computer scientists and musicologists.
Music generation methods can be broadly divided into two categories: rule-based methods use specified rules for style emulation and algorithmic composition; machine learning methods build generative statistical models from training corpora. A classic problem faced by all methods for music generation is how to appropriately balance general extraopus stylistic features with intraopus coherence created through reference to earlier music material in a piece. This duality has been referred to variously as long-term versus short-term models (Conklin and Witten 1995), prior knowledge versus on-the-fly knowledge (Cunha and Ramalho 1999), extraopus versus intraopus style (Narmour 1990), schematic versus contextual probabilities (Temperley 2014), and schematic versus veridical expectancies (Bharucha and Todd 1989).
The coherence problem poses a serious challenge for machine learning methods, because succinctly modeling intraopus reference is beyond finite-state, and even context-free, grammars. This can be shown formally by considering the copy language (sequences of the form e … e for any sequence e) which, despite its simplicity, cannot be generated even by a context-free grammar, as it contains an arbitrary number of crossing dependencies (see Figure 1). The presence of crossing dependencies means that the entire class of context models (e.g. Markov models, n-gram models, multiple viewpoint models, hidden Markov models) – which have great practical and computational advantages for machine learning and statistical modeling – cannot handle the coherence problem. This is especially a problem for genres such as electronic dance music, which are “unapologetically repetitive” (Garcia 2005).
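The limitation can be illustrated with a small experiment (a toy sketch, not drawn from the paper; all names and probabilities here are illustrative). A first-order Markov model is trained exclusively on copy-language strings of the form e … e, yet sequences sampled from it almost never restate their opening material, because the model sees only adjacent pairs and cannot encode the long-range identity constraint:

```python
import random
from collections import defaultdict

def is_copy(seq, k):
    """True if the first k events are restated at the end of the sequence."""
    return len(seq) >= 2 * k and seq[:k] == seq[-k:]

def train_bigram(corpus):
    """Estimate first-order Markov transition probabilities from counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in corpus:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

def sample(model, start, length, rng):
    """Sample a sequence of the given length from the bigram model."""
    seq = [start]
    for _ in range(length - 1):
        nxt = model.get(seq[-1])
        if not nxt:
            break
        events, probs = zip(*nxt.items())
        seq.append(rng.choices(events, weights=probs)[0])
    return seq

rng = random.Random(0)
alphabet = "CDEFGAB"  # stand-in events; could be chords, motifs, etc.
corpus = []
for _ in range(200):
    e = [rng.choice(alphabet) for _ in range(4)]
    filler = [rng.choice(alphabet) for _ in range(3)]
    corpus.append(e + filler + e)  # every training string is a copy

model = train_bigram(corpus)
samples = [sample(model, rng.choice(alphabet), 11, rng) for _ in range(1000)]
copies = sum(is_copy(s, 4) for s in samples)
print(f"{copies}/1000 sampled sequences are in the copy language")
```

Despite the training data being 100% copies, only a small fraction of samples happen to be copies by chance, confirming that context models capture extraopus statistics but not intraopus reference.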
Figure 1. An instance of the copy language in music. There are two four-event sequences, separated by an arbitrary sequence (indicated by “…”). The events could be any type of music object: notes, motifs, phrases, chords, entire sections, and so on. The network shows that there will exist arbitrarily long-range crossing dependencies, both literal (solid arrows) and non-literal (the dashed arrow).
One way to handle the coherence problem, while retaining the advantages and tractability of context models, is to adopt the structure of a template piece, and disallow sequences which do not satisfy that structure (Conklin 2003). To do this, it is necessary to use a pattern formalism that can describe the desired intraopus references in the template piece. Then instances of patterns can be ranked using the statistical model, or in another view, pieces generated by the model can be constrained by the pattern. The pattern captures the intraopus coherence, inherited from the template piece, and the statistical model captures the extraopus style. This idea is developed by Collins et al. (2016), who view a piece as a network of patterns described and related by p...
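The idea can be sketched in code (a hedged illustration under assumed structures, not the paper's exact formalism: the pattern, chord vocabulary, segment length, and transition probabilities below are all invented for the example). A semiotic pattern is treated as a sequence of segment labels; equal labels must be instantiated by identical chord segments, enforcing intraopus coherence by construction, and the resulting candidate sequences are ranked by a toy bigram model standing in for the corpus-trained extraopus style model:

```python
import itertools
import math

PATTERN = ["A", "A", "B", "A"]        # semiotic structure of a template piece
CHORDS = ["i", "VI", "III", "VII"]    # illustrative chord vocabulary
SEG_LEN = 2                           # chords per segment (toy value)

# Illustrative bigram probabilities; a real system would train these
# on a corpus of pieces in the target style.
P = {
    ("i", "VI"): 0.4, ("i", "VII"): 0.3, ("i", "III"): 0.2, ("i", "i"): 0.1,
    ("VI", "VII"): 0.5, ("VI", "III"): 0.3, ("VI", "i"): 0.2,
    ("III", "VII"): 0.6, ("III", "i"): 0.4,
    ("VII", "i"): 0.7, ("VII", "VI"): 0.3,
}

def log_prob(seq):
    """Bigram log-probability of a chord sequence (unseen pairs score -inf)."""
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        p = P.get((a, b), 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total

def instances(pattern, chords, seg_len):
    """Enumerate pattern instances: one chord segment per distinct label,
    so repeated labels receive identical segments."""
    labels = sorted(set(pattern))
    segments = list(itertools.product(chords, repeat=seg_len))
    for combo in itertools.product(segments, repeat=len(labels)):
        binding = dict(zip(labels, combo))
        yield [c for lab in pattern for c in binding[lab]]

ranked = sorted(instances(PATTERN, CHORDS, SEG_LEN),
                key=log_prob, reverse=True)
best = ranked[0]
print("best instance:", best)
```

Every enumerated instance satisfies the semiotic structure of the template, so the statistical model is only used to rank stylistic quality; the coherence itself is guaranteed by the pattern.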