(Commercial) Automatic Speech Recognition as a Tool in Sociolinguistic Research

Markl, Nina

Files

Markl_paginated.pdf (126.69 KB)

Penn collection

University of Pennsylvania Working Papers in Linguistics

Permalink

https://repository.upenn.edu/handle/20.500.14332/45363

Author

Markl, Nina

Abstract

As speech datasets used in sociolinguistic research increase in size, laborious and time-intensive manual orthographic transcription becomes a bottleneck, limiting the amount of transcribed data that can be analysed. In this paper, I discuss the use of (commercial) automatic speech recognition (ASR) as a tool in sociolinguistic research through a case study: the Lothian Diary Project. I describe the kinds of errors produced by two commercial ASR systems for British English, situate them within the broader context of algorithmic bias in ASR, and suggest some best practices for working with ASR in sociolinguistic research.

Publication date

2022-09-19

Journal Issue

Selected Papers from NWAV 49

Collection

Working Papers