Peer-Reviewed Papers
-
A Concise Survey of OCR for Low-Resource Languages
AmericasNLP @ NAACL 2024
-
LIMIT: Language Identification, Misidentification, and Translation using Hierarchical Models in 350+ Languages
EMNLP 2023
-
Queer In AI: A Case Study in Community-Led Participatory AI
FAccT 2023
-
PALI: A Language Identification Benchmark for Perso-Arabic Scripts
VarDial 2023 @ EACL 2023
-
Are We There Yet? – Building an equitable future with low-resource and endangered language research
Queer in AI @ NeurIPS 2021
-
Collecting Verified COVID-19 Question-Answer Pairs
Proceedings of the 1st Workshop on NLP for COVID-19 at EMNLP 2020
Computational Biology
-
WebSeq: A Genomic Data Analytics Platform for Monogenic Disease Discovery
Journal of Bioinformatics and Systems Biology. 2023.
-
Multiplotlib: a python library for multi-view data visualization.
Richard Tapia Conference, Sept 2020.
-
Genomic Data Analysis for Rare Mendelian Disorders
Johns Hopkins CARES Symposium. August 2019.