Date of this Version
The variability and reduction characteristic of speech in natural interaction make it very difficult to detect prominence in conversational speech. In this paper, we present analytic studies and automatic detection results for pitch accent and for the realization of information structure phenomena such as givenness and focus. For pitch accent, our conditional random field model combining acoustic and textual features achieves an accuracy of 78%, substantially better than the chance performance of 58%. For givenness and focus, our analysis demonstrates that even in conversational speech there are measurable differences in acoustic properties, and that an automatic detector for these categories can perform significantly above chance.
Date Posted: 31 July 2012