Notice and Invitation
Oral Defense of Doctoral Dissertation
The Volgenau School of Engineering, George Mason University
Yun-Sheng Wang
Bachelor of Science, Feng Chia University, 1991
Master of Science, George Mason University, 2002
Unsupervised Bayesian Musical Key and Chord Recognition
Room 4801, Engineering Building
All are invited to attend.
Dr. Harry Wechsler, Chair
Dr. Jim Chen
Dr. Jessica Lin
Dr. Andrew Loerch
Dr. Pearl Wang
Many tasks in Music Information Retrieval can be approached indirectly through data abstraction. Raw music signals can be abstracted and represented by a combination of melody, harmony, or rhythm for musical structural analysis, emotion or mood projection, and efficient search of large music collections. In this dissertation, we focus on two tasks: analyzing the tonality and the harmony of music signals. Our approach transcribes Western popular music into its tonal and harmonic content directly from the audio signals. While the majority of previously proposed methods adopt a supervised approach, which requires scarce manually transcribed training data, our approach is unsupervised: model parameters for tonality and harmony are estimated directly from the target audio data. First, raw audio signals in the time domain are transformed using the undecimated wavelet transform as a basis for building an enhanced 12-dimensional pitch class profile (PCP) in the frequency domain, which serves as the feature representation of the target music piece. Second, a bag of local keys is extracted from the frame-by-frame PCPs using an infinite Gaussian mixture, which allows the audio data to "speak for itself" without presetting the number of Gaussian components used to model the local keys. Third, the bag of local keys is applied to adjust the energy levels in the PCPs for chord extraction.
Experimental results demonstrate that our approach, which is much simpler than most existing methods, performs as well as or better than many of the much more complex models on both tasks without using any training data.
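For readers unfamiliar with the pitch class profile, the following minimal sketch illustrates the general idea of folding spectral energy into 12 pitch classes. It is not the dissertation's method: it assumes the spectral peaks are already available as (frequency, magnitude) pairs rather than derived from an undecimated wavelet transform, it omits the infinite Gaussian mixture and the key-based energy adjustment entirely, and the chord step is a plain triad template match swapped in for illustration. The function names `pitch_class_profile` and `best_triad`, the 440 Hz reference tuning, and the input format are all assumptions.

```python
import math

PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_profile(peaks, ref_a4=440.0):
    """Fold (frequency_hz, magnitude) pairs into a normalized 12-bin PCP.

    Hypothetical helper: index 0 is C, index 9 is A. Energy is taken as
    magnitude squared and the profile is normalized to sum to 1.
    """
    pcp = [0.0] * 12
    for freq, mag in peaks:
        if freq <= 0:
            continue  # skip DC and invalid entries
        # MIDI note number of the nearest equal-tempered pitch (A4 = 69).
        midi = round(69 + 12 * math.log2(freq / ref_a4))
        pcp[midi % 12] += mag * mag
    total = sum(pcp)
    return [e / total for e in pcp] if total > 0 else pcp


def best_triad(pcp):
    """Return (root_name, quality) of the best-matching major/minor triad.

    Simple template matching, used here only as an illustrative stand-in
    for chord extraction: the score is the PCP energy on the triad's notes.
    """
    templates = {"maj": (0, 4, 7), "min": (0, 3, 7)}
    best, best_score = None, -1.0
    for root in range(12):
        for quality, intervals in templates.items():
            score = sum(pcp[(root + i) % 12] for i in intervals)
            if score > best_score:
                best, best_score = (PITCH_NAMES[root], quality), score
    return best


if __name__ == "__main__":
    # Fundamentals of C4, E4, and G4 with equal magnitude.
    frame = [(261.63, 1.0), (329.63, 1.0), (392.00, 1.0)]
    profile = pitch_class_profile(frame)
    print(profile)
    print(best_triad(profile))
```

In a frame-by-frame setting, one such profile would be computed per analysis frame; the resulting sequence of 12-dimensional vectors is the kind of feature on which a mixture model for local keys could then be fit.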
A copy of this doctoral dissertation is on reserve at the Johnson Center Library.