As speech datasets used in sociolinguistic research increase in size, manual orthographic transcription, which is laborious and time-intensive, becomes a bottleneck that limits the amount of (transcribed) data that can be analysed. In this paper, I discuss the use of (commercial) automatic speech recognition (ASR) as a tool in sociolinguistic research through a case study: the Lothian Diary Project. I describe the kinds of errors produced by two commercial ASR systems for British English, situate them within the broader context of algorithmic bias in ASR, and suggest some best practices for working with ASR in sociolinguistic research.
"(Commercial) Automatic Speech Recognition as a Tool in Sociolinguistic Research," University of Pennsylvania Working Papers in Linguistics: Vol. 28: Iss. 2, Article 11.
Available at: https://repository.upenn.edu/pwpl/vol28/iss2/11