Research
My research is in computational biology and epidemiology, with most of my current work using large-scale health records to study how diseases develop, cluster, and change over time. A smaller part of my work is in single-cell genomics. The projects below are grouped by collaboration.
I work with a de-identified insurance claims dataset of over 200 million US patients (Merative MarketScan, 2003 to 2024). Claims data records what clinicians billed for rather than what a patient truly had, which introduces noise, but the scale allows population-level questions that cohort studies cannot answer.
Birth Order and Disease Risk Across the Phenome
with Steven Kushner and Andrey Rzhetsky
Older studies of birth order typically rely on a few thousand families and test a handful of outcomes. I ran a phenome-wide scan on 10.3 million individuals from 5.1 million two-child MarketScan families, testing 569 diseases with two complementary designs: a between-family matched cohort and a within-family sibling comparison. In total, 150 diseases show Bonferroni-significant birth-order associations, with later-borns at elevated risk for most of them (including several psychiatric, metabolic, and immunological conditions). The within-family design rules out most family-level confounders, which is the part I believe matters most methodologically.
medRxiv
Accelerometry for Frailty Prediction
with Yanan Long, Megan Huisingh-Scheetz, and Andrey Rzhetsky
I worked with roughly a week of free-living hip and wrist accelerometer data from older adults enrolled in a longitudinal aging study, and trained a set of models to classify frailty status at baseline and to predict 12-month frailty decline. The free-living signal carries more information than I initially expected, and a small set of engineered features (activity fragmentation, intensity bout structure) accounts for most of it. The clinical point of the project is whether a wearable stream can substitute for a clinic-based frailty phenotype, which matters for large epidemiological studies where in-person assessment is infeasible.
medRxiv
Outside of the Rzhetsky Lab, I work on single-cell RNA-sequencing in the Bryson Lab at MIT, where I have contributed to a cross-disease atlas of human skin immune cells and to granuloma myeloid analysis in tuberculosis.
Cross-Disease Atlas of Human Skin Immune Cells
with Bryan Bryson and Robert Modlin
I have been assembling a harmonized single-cell atlas of human skin across 22 diseases and roughly 341,000 immune cells, split into a myeloid compartment of 16 clusters and a T/NK compartment of 21 clusters. The aim is to identify cell states that are genuinely disease-specific rather than disease-biased (these are very different things, and I have learned the hard way how easy it is to confuse them). Current findings include cancer-specific expansion of intermediate monocytes, cross-disease expansion of CXCL13+ T peripheral helper cells, and a shared Treg signature across autoimmune skin conditions. Manuscript in preparation.
Myeloid Signaling in Tuberculosis Granulomas
with Joshua Peters, Bryan Bryson, and collaborators
I contributed to a study of myeloid cell signaling in non-human primate tuberculosis granulomas, where we reconstructed conserved myeloid states across a time course of infection and associated them with IFN-γ and TGF-β signaling. My work was on applying CellTypist-based annotation to the granuloma compartments and on downstream comparative analysis. Authorship was offered after the preprint was posted.
bioRxiv
Before UChicago and MIT, my research was in renal physiology at Case Western Reserve with Agustin Gonzalez-Vicente and Jeffrey Garvin, and in satellite-based air pollution analysis during a Berkeley REU with Misbath Daouda. A list of the resulting publications and abstracts is on the Publications page.