Research
I’m interested in machine learning for text, scientific data systems, and visualization-driven exploration. This page collects the themes that show up across my work.
Current directions
- Natural language processing for extracting structured signals from unstructured text.
- Automatic speech recognition and related speech modeling questions.
- Bilingualism, language variation, and how language data can be modeled computationally.
- Time-series, EEG, and biomedical datasets with a focus on reusable data standards.
Recent research-adjacent work
- Curating and validating EEG datasets with BIDS (Brain Imaging Data Structure) and HED (Hierarchical Event Descriptors) for cross-study reuse.
- Building NLP pipelines for regulatory risk signals from board meeting minutes.
- Designing visualization systems for long-running, exploratory analysis of climate and text data.