TY - GEN
T1 - Dans figürlerinin işitsel-görsel analizi için işitsel özniteliklerin değerlendirilmesi
TT - Evaluation of audio features for audio-visual analysis of dance figures
AU - Demir, Y.
AU - Ofli, F.
AU - Erzin, E.
AU - Yemez, Y.
AU - Tekalp, A.M.
PY - 2008
Y1 - 2008
N2 - We present a framework for selecting the best audio features for audiovisual analysis and synthesis of dance figures. Dance figures are performed synchronously with the musical rhythm. They can be analyzed through the audio spectrum using spectral and rhythmic musical features. In the proposed audio feature evaluation system, dance figures are manually labeled over the video stream. The music segments that correspond to the labeled dance figures are used to train hidden Markov models (HMMs) that learn the temporal spectral patterns of the dance figures. The dance figure recognition performance of the HMMs is evaluated for various spectral feature sets. The audio features that maximize dance figure recognition performance are selected as the best audio features for the analyzed audiovisual dance recordings. In our evaluations, mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, spectral centroid, spectral flux, and spectral roll-off are used as candidate audio features. The selected audio features can be used for the analysis and synthesis of audio-driven body animation.
AB - We present a framework for selecting the best audio features for audiovisual analysis and synthesis of dance figures. Dance figures are performed synchronously with the musical rhythm. They can be analyzed through the audio spectrum using spectral and rhythmic musical features. In the proposed audio feature evaluation system, dance figures are manually labeled over the video stream. The music segments that correspond to the labeled dance figures are used to train hidden Markov models (HMMs) that learn the temporal spectral patterns of the dance figures. The dance figure recognition performance of the HMMs is evaluated for various spectral feature sets. The audio features that maximize dance figure recognition performance are selected as the best audio features for the analyzed audiovisual dance recordings. In our evaluations, mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, spectral centroid, spectral flux, and spectral roll-off are used as candidate audio features. The selected audio features can be used for the analysis and synthesis of audio-driven body animation.
KW - Audio-driven body animation
KW - Audio-visual analysis
UR - http://www.scopus.com/inward/record.url?scp=56449097955&partnerID=8YFLogxK
U2 - 10.1109/SIU.2008.4632707
DO - 10.1109/SIU.2008.4632707
M3 - Conference contribution
AN - SCOPUS:56449097955
SN - 9781424419999
T3 - 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU
BT - 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU
T2 - 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU
Y2 - 20 April 2008 through 22 April 2008
ER -