Michael Ginn
/maɪkəl dʒɪn/
I am a fourth-year Ph.D. student at the University of Colorado, working in the LECS Lab under the supervision of Prof. Alexis Palmer and Prof. Mans Hulden. I'm in the Department of Computer Science and the Institute of Cognitive Science, where I study natural language processing and computational linguistics. I earned my bachelor's in CS from WashU in 2022.
Research Interests
- Long-tail phenomena in LLMs: How effective and predictable are LLMs on long-tail tasks in rare languages, such as translation and glossing? How can we reliably improve performance using labeled examples? How do LLMs model the long tail, and can we gain insights that enable mechanistic interventions?
- Metalearning and Continual Learning: Can LLMs learn efficiently from descriptions of complex processes and rules, especially metalinguistic information? Under any metalearning approach, can LLMs learn new processes continually, or do they catastrophically forget their original capabilities?
- Low-Resource ML and Synthetic Data: What are the most reliable approaches to training ML models on very small datasets? For tasks with ample domain knowledge or verification (as in low-resource programming languages), can we effectively generate synthetic data that provides training benefits?
- Neurosymbolic AI: How can we model dialogue properties using symbolic representations? Can we ever effectively interpret simple neural networks with symbolic machines or formal languages? Can we train LLMs to be robust symbolic reasoners?
Industry Experience
Currently, I'm collaborating with Amazon on research modeling conversational search dialogs as directed graphs. I've been a five-time intern at Apple across several teams, working on localization software, input experience ML, and AI developer tools. Most recently, I trained an on-device model for next-edit prediction in Swift using fully synthetic data. I also contributed to the finetuning framework for Apple's MM1 model, which is used by teams throughout Apple.