Notice and Invitation
Oral Defense of Doctoral Dissertation
The Volgenau School of Engineering, George Mason University
Yun-Sheng Wang
Bachelor of Science, Feng Chia University, 1991
Master of Science, George Mason University, 2002
Unsupervised Bayesian Musical Key and Chord Recognition
Wednesday, April 9, 2014, 1:00 pm
Room 4801, Engineering Building
All are invited to attend.
Committee
Dr. Harry Wechsler, Chair
Dr. Jim Chen
Dr. Jessica Lin
Dr. Andrew Loerch
Dr. Pearl Wang
Abstract
Many tasks in Music Information
Retrieval can be approached using indirection in terms of data
abstraction. Raw music signals can be abstracted and represented
by using a combination of melody, harmony, or rhythm for musical
structural analysis, emotion or mood projection, as well as
efficient search of large collections of music. In this
dissertation, we focus on two tasks: analyzing tonality and
harmony of music signals. Our approach concentrates on
transcribing Western popular music into its tonal and harmonic
content directly from the audio signals. While most existing
methods are supervised and require scarce, manually transcribed
training data, our approach is unsupervised: model parameters
for tonality and harmony are estimated directly from the target
audio data. First, raw audio signals in the time domain are
transformed using an undecimated wavelet transform as a basis
for building an enhanced 12-dimensional pitch class profile
(PCP) in the frequency domain, which serves as the feature set
for the target music piece. Second, a bag of local keys is
extracted from the frame-by-frame PCPs using an infinite
Gaussian mixture, which allows the audio data to “speak for
itself” without presetting the number of Gaussian components
used to model the local keys. Third, the bag of local keys
is applied to adjust the energy levels in the PCPs for chord
extraction.
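As a rough illustration of what a 12-dimensional pitch class profile is, the sketch below folds spectral peaks into the twelve pitch classes. It is not the dissertation's method: it assumes the peaks are already available as (frequency, magnitude) pairs, skips the undecimated wavelet transform and the PCP enhancement entirely, and assumes A4 = 440 Hz tuning. The names `pitch_class` and `pcp` are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; not stated in the abstract
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz):
    """Map a frequency in Hz to a pitch class index 0..11 (0 = C)."""
    # MIDI note number of the nearest equal-tempered pitch (A4 = 69).
    midi = 69 + 12 * math.log2(freq_hz / A4_HZ)
    return int(round(midi)) % 12


def pcp(peaks):
    """Fold (frequency_hz, magnitude) pairs into a normalized 12-bin profile."""
    bins = [0.0] * 12
    for freq_hz, magnitude in peaks:
        if freq_hz > 0:  # log2 is undefined for non-positive frequencies
            bins[pitch_class(freq_hz)] += magnitude
    total = sum(bins)
    # Normalize so the profile sums to 1; leave an all-zero frame unchanged.
    return [b / total for b in bins] if total > 0 else bins


# Hypothetical frame: the notes C4, E4, and G4 (a C major triad).
frame = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.6)]
profile = pcp(frame)
strongest = PITCH_CLASSES[profile.index(max(profile))]
```

In this toy frame the energy lands in the C, E, and G bins. A sequence of such per-frame profiles is the kind of input that the key and chord steps described above would operate on.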
Experimental results show that our approach, which is much
simpler than most existing methods, performs as well as or
better than many far more complex models on both tasks without
using any training data.
A copy of this doctoral dissertation is on reserve at the
Johnson Center Library.