About Me

Hi! I’m a PhD candidate in Biomedical Informatics at Harvard Medical School focused on using large-scale clinical data to improve healthcare.

My work centers on evaluating and improving widely used clinical tools and understanding how they influence care in practice. I work with longitudinal health data at scale, including datasets of hundreds of thousands to millions of patients, to study patient trajectories and identify ways to better use routinely collected measurements to support clinical decision-making.

More broadly, I’m passionate about applying data science and machine learning to problems that have real impact in healthcare, whether that means improving patient outcomes, enhancing patient experiences, or reducing clinician burden.

Outside of research, I maintain creative interests in music and ceramics.

B.S. in Biomedical Engineering
Johns Hopkins University
Baltimore, MD

B.M. in Vocal Performance
Peabody Conservatory
Baltimore, MD

Current PhD Student
Bioinformatics and Integrative Genomics
Harvard Medical School
Boston, MA

Selected Research Projects

Laboratory Trajectories Improve Kidney Failure Risk Estimation

Current kidney failure risk equations rely only on a patient's most recent laboratory values, even though patients often have years of routinely collected lab history. This project examines whether longitudinal lab trajectories can improve risk estimation while remaining interpretable and clinically grounded.

From a large EHR dataset spanning over 5 million individuals, we constructed a cohort of 270K patients with chronic kidney disease. We developed a trajectory-based extension of standard risk models that incorporates features derived from repeated laboratory measurements over time. These models improve identification of high-risk patients compared with latest-value approaches, particularly for longer-term prediction.
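As a rough illustration of what "features derived from repeated laboratory measurements" could look like, here is a minimal sketch. The specific features (latest value, mean, least-squares slope), the function name `trajectory_features`, and the example numbers are all assumptions made for illustration, not the feature set or data used in the project.

```python
import numpy as np


def trajectory_features(times_years, values):
    """Summarize one patient's repeated lab measurements.

    Hypothetical feature set for illustration only.
    """
    t = np.asarray(times_years, dtype=float)
    v = np.asarray(values, dtype=float)

    # Sort by time so that "latest" really is the most recent measurement.
    order = np.argsort(t)
    t, v = t[order], v[order]

    features = {
        "latest": float(v[-1]),
        "mean": float(v.mean()),
        "n_measurements": int(v.size),
        "slope_per_year": None,  # stays None when a slope cannot be estimated
    }

    # A linear trend needs at least two distinct time points.
    if np.unique(t).size >= 2:
        slope, _intercept = np.polyfit(t, v, deg=1)
        features["slope_per_year"] = float(slope)

    return features


# Made-up values for demonstration: a lab value falling steadily over 3 years.
example = trajectory_features([0.0, 1.0, 2.0, 3.0], [60.0, 55.0, 50.0, 45.0])
print(example)
```

A latest-value model would see only the final measurement here, while the slope captures how quickly the value has been changing.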

The results highlight how routinely collected laboratory histories contain meaningful signal beyond a single measurement, and how simple, interpretable trajectory features can improve risk stratification in clinical practice.

AI-Clinician Collaboration via Disagreement Prediction
A decision pipeline and retrospective analysis of real-world radiologist-AI interactions

Model development doesn’t end at deployment. In this project, which resulted in a first-author publication in Cell Reports Medicine, we explore how post-deployment data can be used to improve AI systems in practice.

This work develops a decision pipeline that integrates disagreement prediction with clinical significance and prediction quality to guide how AI outputs are used. Using real-world radiology data, we show that disagreement signals can be used to selectively escalate high-risk or uncertain cases, enabling a balance between reducing unnecessary clinician workload and maintaining diagnostic safety.
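One way to picture how these three signals might combine into a routing decision is the sketch below. The names (`CaseSignals`, `route_case`), the rule ordering, and the threshold defaults are hypothetical choices for illustration; the published pipeline may combine the signals differently.

```python
from dataclasses import dataclass


@dataclass
class CaseSignals:
    """Per-case inputs to the routing rule (hypothetical structure)."""
    p_disagree: float             # predicted probability of clinician-AI disagreement
    clinically_significant: bool  # whether the finding matters clinically
    prediction_quality: float     # 0 to 1, higher means a more trustworthy AI output


def route_case(signals, disagree_threshold=0.5, quality_threshold=0.7):
    """Return "escalate" or "routine" for one case.

    Thresholds are placeholder values, not tuned or published numbers.
    """
    # Low-quality AI outputs always go to a clinician.
    if signals.prediction_quality < quality_threshold:
        return "escalate"
    # Likely disagreement only triggers escalation when it is clinically relevant.
    if signals.p_disagree >= disagree_threshold and signals.clinically_significant:
        return "escalate"
    return "routine"


print(route_case(CaseSignals(p_disagree=0.8, clinically_significant=True, prediction_quality=0.9)))
print(route_case(CaseSignals(p_disagree=0.8, clinically_significant=False, prediction_quality=0.9)))
```

Raising or lowering the thresholds trades clinician workload against the share of cases that receive a second look.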

Cancer Treatment and Trial Eligibility Changes from Estimating Kidney Function without Race

Clinical equations can shape care far beyond the specialty where they were developed. This project examines how the shift to race-free kidney function estimation may affect cancer treatment decisions and clinical trial eligibility across the U.S.

Using nationally representative NHANES data corresponding to ~19M U.S. cancer patients, we quantify how updated eGFR equations change kidney function estimates and assess downstream impacts on oncology care. The analysis projects changes affecting over 500K patients, highlighting how seemingly technical updates to clinical equations can have large-scale impacts on treatment access and eligibility.
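To make the equation change concrete, the sketch below compares two creatinine-based eGFR equations. It assumes the equations in question are the 2009 CKD-EPI equation (which includes a race coefficient) and the 2021 race-free CKD-EPI refit; the text above does not name them, so treat that as an assumption. The helper `crosses_threshold` and its `threshold` parameter are hypothetical, and no particular eligibility cutoff is implied.

```python
def egfr_ckd_epi_2009(scr_mg_dl, age_years, female, black):
    """2009 CKD-EPI creatinine equation, in mL/min/1.73 m^2."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * (min(ratio, 1.0) ** alpha)
            * (max(ratio, 1.0) ** -1.209)
            * (0.993 ** age_years))
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # race coefficient removed in the 2021 refit
    return egfr


def egfr_ckd_epi_2021(scr_mg_dl, age_years, female):
    """2021 race-free CKD-EPI creatinine equation, in mL/min/1.73 m^2."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * (min(ratio, 1.0) ** alpha)
            * (max(ratio, 1.0) ** -1.200)
            * (0.9938 ** age_years))
    if female:
        egfr *= 1.012
    return egfr


def crosses_threshold(old_egfr, new_egfr, threshold):
    """True if the two estimates fall on opposite sides of a cutoff (hypothetical helper)."""
    return (old_egfr >= threshold) != (new_egfr >= threshold)


# Same made-up patient under both equations.
old = egfr_ckd_epi_2009(scr_mg_dl=1.2, age_years=60, female=False, black=True)
new = egfr_ckd_epi_2021(scr_mg_dl=1.2, age_years=60, female=False)
print(round(old, 1), round(new, 1))
```

Because the 2021 equation drops the race coefficient, estimates for Black patients generally decrease and estimates for other patients generally increase slightly, so patients near a dosing or eligibility cutoff can end up on a different side of it.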

Technical Areas

Machine learning on longitudinal healthcare data, survival analysis, rare-event prediction, and clinical impact evaluation

Preferred Programming Languages

  • Python (primary)
  • R

Tools

  • Cloud Computing (AWS, GCP)
  • Git
  • Docker
  • Deep Learning (PyTorch, Keras)

To learn more about my journey and introduction to research, check out my story.