Computational Speech Processing | Language documentation | Phonology
About Me
I am a PhD student in Linguistics specializing in computational speech processing, fieldwork, and phonology.
Research Interests
Automatic Speech Recognition (ASR) for low-resource languages
ASR for codemixed and bilingual audio
Computational methods for aiding language documentation
My research focuses on computational methods in language documentation, particularly on adapting foundational ASR architectures for processing fieldwork data.
I work with Tira, a Kordofanian language spoken in Sudan.
I am interested in how to improve automatic speech recognition for low-resource languages, especially when they are code-mixed with high-resource languages, and in how ASR methods can aid language documentation.
Publications
Wei-Jen Ko, Cutter Dalton, Mark Simmons, Eliza Fisher, Greg Durrett and Junyi Jessy Li. Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kaldhol, Nina and Sharon Rose and Mark Simmons. Prosody of topic and focus in Tira. Forthcoming. Cross-disciplinary approaches to Information Structure in Niger-Congo languages. (Contemporary African Linguistics). Berlin: Language Science Press.
Simmons, Mark and Patience Epps. Tonogenesis in the Naduhup family of northwest Amazonia. Diachronica. Forthcoming.
Simmons, Mark. Data augmentation for low-resource bilingual ASR from Tira linguistic elicitation using Whisper. Paper accepted for the Eighth Workshop on the Use of Computational Methods in the Study of Endangered Languages, Honolulu, Hawai'i.
Presentations and talks
Simmons, Mark. Morphologically constrained metaphony in Tira. A talk presented at Phonetics and Phonology in Europe 2023 Satellite Workshop 'Metaphony', June 1, 2023.
Rose, Sharon and Simmons, Mark. Focus, topic and prosody in Tira. A talk presented at the 6th African Linguistics School in Porto Novo, Bénin.
Simmons, Mark. Reconstructing word-final voicing in Nadëb. A poster presented at the Annual Meeting of the Linguistic Society of America, January 6-9, 2022.
Teaching
Teaching assistant for LIGN 168: Computational speech processing at UCSD, Spring 2024.
Teaching assistant for LIGN 8: Languages of America at UCSD, Winter 2024.
Teaching assistant for LIGN 110: Phonetics at UCSD, Fall 2023.
Teaching assistant for LIGN 8: Languages of America at UCSD, Spring 2023.
Guest lecture on vowel harmony in Tira for LIGN 111.
Teaching assistant for LIGN 111: Phonology at UCSD, Winter 2023.
Teaching assistant for LIGN 110: Phonetics at UCSD, Fall 2022.
Languages and skills
Spanish
Portuguese
Python
LaTeX
Awards and grants
Brython-Davis Fellowship. Spring 2024 and Fall 2024
Cota Robles Fellowship. 2021-2022 and 2024-2025
George H. Mitchell Award. May 2021
Work experience
Undergraduate research assistant: Naduhup documentation team. Documentation and description of the Nadëb language (Naduhup, Brazil) under Prof. Patience Epps, UT Austin, 2018-2021
Undergraduate research assistant: Discourse Question Answering framework. Data annotation for discourse question answering research team under Prof. Jessy Li, UT Austin, Summer 2021
Graduate research assistant: Tira language documentation. Transcribing fieldwork interviews and entering morphological data for the Tira language project. Summer 2022.
Graduate research assistant: Richard Montague archiving project. Transcribing audio interviews, using ASR and speaker diarization to speed up transcription. Summer 2024.