Research

Research areas

The lab develops computational methods for biological data analysis, with emphasis on cancer genomics, structural variation, machine learning, and sequencing-based inference. Our work combines algorithm design, statistical modeling, and practical bioinformatics applications to address important questions in modern biology.

Overview

A computational view of complex biological systems

Our research focuses on extracting meaningful biological insight from large and often noisy genomic data. We are particularly interested in problems where careful modeling, scalable computation, and real biological interpretation must work together.

Genomics: Large-scale sequencing data analysis
Cancer: Tumor evolution and heterogeneity
ML: Machine learning and inference
Bioinfo: Practical tool development

Theme 01

Structural Variation Detection

Structural variation plays an important role in genome function and disease. The lab develops computational approaches to detect and characterize insertions, deletions, rearrangements, and other large-scale genomic alterations across different sequencing technologies.

What we study

We design methods for identifying structural variants from short-read, long-read, and other complementary sequencing technologies. These methods aim to improve sensitivity, specificity, and interpretability in complex genomic regions.
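
To make the idea of a sequencing signal concrete, here is a minimal sketch of one classic short-read cue for deletions: read pairs whose mapped span is much larger than the library's expected insert size. It is an illustration under simplifying assumptions, not a description of the lab's tools. The function name, the fragment representation as (start, end) reference coordinates, and the thresholds z and min_support are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class DeletionCandidate:
    start: int        # left bound of the region shared by all supporting pairs
    end: int          # right bound of that shared region
    support: int      # number of discordant pairs in the cluster
    est_size: float   # mean observed span minus expected insert size


def find_deletion_candidates(
    fragments: List[Tuple[int, int]],
    mean_insert: float,
    sd_insert: float,
    z: float = 3.0,
    min_support: int = 2,
) -> List[DeletionCandidate]:
    """Cluster read pairs whose mapped span is unusually large.

    Each fragment is (start, end): the outer reference coordinates of a
    mapped read pair. This toy model ignores orientation, mapping quality,
    and split reads, all of which real callers use.
    """
    cutoff = mean_insert + z * sd_insert
    discordant = sorted((s, e) for s, e in fragments if e - s > cutoff)

    clusters: List[dict] = []
    for s, e in discordant:
        # Join the current cluster only if this pair overlaps the region
        # shared by every pair already in it.
        if clusters and s < clusters[-1]["end"]:
            c = clusters[-1]
            c["start"] = max(c["start"], s)
            c["end"] = min(c["end"], e)
            c["spans"].append(e - s)
        else:
            clusters.append({"start": s, "end": e, "spans": [e - s]})

    return [
        DeletionCandidate(c["start"], c["end"], len(c["spans"]),
                          mean(c["spans"]) - mean_insert)
        for c in clusters
        if len(c["spans"]) >= min_support
    ]


if __name__ == "__main__":
    toy = [(100, 400), (1000, 2300), (1020, 2310), (1050, 2350), (8000, 9500)]
    for cand in find_deletion_candidates(toy, mean_insert=300, sd_insert=30):
        print(cand)
```

Requiring several independent pairs before reporting a candidate is a simple way to trade sensitivity for specificity; a lone discordant pair is more likely a mapping artifact than a variant.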

Why it matters

Structural variation can alter gene function, disrupt regulation, and contribute to human disease, including cancer. Better computational detection enables stronger biological interpretation and more reliable downstream analysis.

Sequencing integration

Combining signals from multiple sequencing technologies to improve variant detection in difficult genomic contexts.

Algorithm development

Building scalable and robust computational tools for large genomic datasets and diverse experimental platforms.

Biological interpretation

Translating detected variants into biologically meaningful hypotheses about genome structure, function, and disease.

Theme 02

Cancer Heterogeneity and Evolution

Cancer is inherently heterogeneous, with tumors often containing multiple distinct subclonal populations. The lab develops methods to characterize this heterogeneity and reconstruct the evolutionary history of cancer from genomic data.

What we study

We investigate how tumor cell populations diversify over time and how genomic signals can be used to recover subclonal structure, lineage relationships, and progression patterns from bulk and single-cell data.

Why it matters

Understanding heterogeneity is essential for explaining disease progression, treatment resistance, and tumor development. Computational reconstruction of these patterns helps reveal the biological processes shaping cancer.

Subclone inference

Identifying distinct tumor populations and estimating their genomic signatures from mixed or single-cell data.
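
As a toy illustration of subclone inference from bulk data, the sketch below groups variant allele frequencies (VAFs) with a plain one-dimensional k-means and converts each cluster center to an approximate cancer cell fraction. This is a deliberately simplified stand-in, not the lab's method: it assumes a pure tumor sample, diploid copy number, heterozygous mutations (so the cell fraction is about twice the VAF), and a number of clusters k chosen in advance. All names are hypothetical.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int, iters: int = 100) -> Tuple[List[float], List[List[float]]]:
    """Plain 1-D k-means with a deterministic, quantile-based initialization."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    pts = sorted(values)
    centers = [pts[int((i + 0.5) * len(pts) / k)] for i in range(k)]
    groups: List[List[float]] = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in pts:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers, groups


def infer_subclones(vafs: List[float], k: int) -> List[Tuple[float, float, int]]:
    """Return (mean VAF, approximate cell fraction, variant count) per cluster.

    Assumes purity 1.0 and heterozygous diploid mutations: fraction = 2 * VAF.
    """
    centers, groups = kmeans_1d(vafs, k)
    clones = [(c, min(1.0, 2.0 * c), len(g)) for c, g in zip(centers, groups)]
    return sorted(clones)


if __name__ == "__main__":
    toy_vafs = [0.48, 0.50, 0.52, 0.49, 0.24, 0.26, 0.25, 0.11, 0.09, 0.10]
    for mean_vaf, fraction, n in infer_subclones(toy_vafs, k=3):
        print(f"mean VAF {mean_vaf:.3f}  cell fraction ~{fraction:.2f}  variants {n}")
```

Real datasets generally call for models that account for read depth, copy number, and tumor purity, and that choose the number of clusters from the data rather than fixing it.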

Evolutionary reconstruction

Recovering lineage relationships and progression histories to better understand how tumors emerge and diversify.
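
One widely used consistency check in clone-tree reconstruction is the sum rule: the cell fractions of a clone's children cannot add up to more than the fraction of the parent. The sketch below tests a candidate tree against that rule. The data layout (a dict of parent to children, a dict of cell fractions) and the tolerance tol are assumptions made for illustration, not part of any particular tool.

```python
from typing import Dict, List


def sum_rule_violations(
    children: Dict[str, List[str]],
    ccf: Dict[str, float],
    tol: float = 0.02,
) -> List[str]:
    """Return the parent clones whose children exceed their cell fraction.

    children maps each parent clone to its direct descendants.
    ccf maps each clone to its estimated cancer cell fraction.
    tol absorbs estimation noise in the fractions.
    """
    violations = []
    for parent, kids in children.items():
        if sum(ccf[kid] for kid in kids) > ccf[parent] + tol:
            violations.append(parent)
    return violations


if __name__ == "__main__":
    tree = {"root": ["A", "B"], "A": ["C"]}
    fractions = {"root": 1.0, "A": 0.6, "B": 0.3, "C": 0.2}
    print(sum_rule_violations(tree, fractions))
```

An empty result means the tree is compatible with the fractions under this rule; it does not mean the tree is the true history, since several trees can satisfy the same constraints.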

Cancer-focused computation

Developing biologically informed methods tailored to real cancer datasets and clinically relevant genomic complexity.

Theme 03

Machine Learning for Genomics and Single-Cell Data

Modern genomics generates high-dimensional datasets that require methods capable of detecting structure, reducing noise, and uncovering biologically meaningful patterns. The lab uses machine learning and statistical inference to address these challenges.

What we study

We develop and apply computational models for large-scale biological data, especially in genomics and single-cell sequencing, where the goal is to extract interpretable signals from complex measurements.

Why it matters

Machine learning can reveal hidden structure in biological systems, improve prediction, and support new discoveries when combined with domain knowledge and rigorous statistical reasoning.

Single-cell analysis

Modeling cell-to-cell variation to understand heterogeneity, lineage, and biological state in high-resolution sequencing data.

Statistical inference

Designing principled approaches that quantify uncertainty while remaining useful for real biological applications.
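
A small example of what quantifying uncertainty can look like in practice: instead of reporting a variant allele frequency as a single number, report an interval that reflects read depth. The sketch below uses the standard Wilson score interval for a binomial proportion. Treating reads as independent binomial draws is an assumption that ignores sequencing error and mapping bias, and the function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(alt_reads: int, depth: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the allele frequency alt_reads / depth.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    p = alt_reads / depth
    z2 = z * z
    denom = 1.0 + z2 / depth
    center = (p + z2 / (2 * depth)) / denom
    half = z * math.sqrt(p * (1 - p) / depth + z2 / (4 * depth * depth)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Same observed frequency (0.30) at two depths: the interval narrows with depth.
    print(wilson_interval(12, 40))
    print(wilson_interval(120, 400))
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves sensibly at low depth and at frequencies near 0 or 1, which are common for subclonal variants.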

Interpretable models

Emphasizing methods that are both accurate and informative about biological mechanisms and data structure.

Approach

How we work

The lab’s methodology is grounded in strong computational foundations while remaining closely connected to biological applications and real sequencing data.

Algorithmic thinking

We approach biological questions through efficient algorithms, optimization, and scalable computation.

Statistical rigor

We build models that account for uncertainty, noise, and the complexity of genomic measurement processes.

Biological relevance

We focus on methods that address important real-world questions in cancer genomics and computational biology.

Interested in collaborating or joining the lab?

We welcome collaborations and students motivated by bioinformatics, machine learning, cancer genomics, and computational biology.