CS8395 Special Topics in Computational Biology

🧭 Description

This graduate seminar examines how computational methods are designed, assessed, and advanced to tackle modern biological questions, with emphasis on applying machine learning and AI to model, integrate, and interpret single-cell and spatial transcriptomic data. In particular, the course explores how choices in data representation, model specification, and algorithm design shape what can be inferred from such datasets.

Through discussions and student presentations of recent work from both conferences (e.g., ISMB, RECOMB, NeurIPS, MLCB) and journals (e.g., Nature Methods, Nature Biotechnology), students will learn how to critically evaluate existing models, identify their underlying assumptions and limitations, and develop their own ideas. The course also includes hands-on mini-projects in which students will design, implement, and assess a computational method inspired by the papers discussed in class.



💡 Beyond running analyses, we focus on how to think like a method developer — turning ideas into models that connect computation and biology.

Logistics

  • Course Code: CS8395-04
  • Term: Spring 2026
  • Class Times: Tuesdays & Thursdays, 02:45 PM - 04:00 PM
  • Location: Olin Hall 131

🧬 Computational Biology in a Nutshell

🎯 Learning Goals

By the end of this course, students will be able to:

  • Define computational biology and its scope across computer science and life sciences
  • Understand major biological data types — single-cell, spatial, proteomic, genomic, and multi-omics
  • Formulate computational methods to address biological problems
  • Evaluate and reconstruct mathematical models from research literature
  • Appreciate the role of simulation and benchmarking in method development
  • Translate mathematical models into efficient, reproducible implementations

🗓️ Lecture Outline

| Lecture / Date | Theme / Discussion Paper(s) | Computational Topics | Hands-on Notebook |
|---|---|---|---|
| L1 / Tue, Jan 06, 2026 | What is computational biology: Way et al.; Eraslan et al. | How to read a paper | |
| L2 / Thu, Jan 08, 2026 | In-class project discussion | How do we present a paper effectively? What not to do? | TBD |
| L3 / Tue, Jan 13, 2026 | Mathematical foundation, representation of data: Scanpy | Distributions, linear algebra | play_with_distribution.ipynb |
| L4 / Thu, Jan 15, 2026 | Dimensionality reduction: Sun et al.; Yin et al. | PCA, SVD, t-SNE, UMAP, unsupervised/supervised clustering | pca_tsne.ipynb |
| L5 / Tue, Jan 20, 2026 | Clustering algorithms: Zappia et al.; Yang et al. | Leiden and Louvain algorithms | hvg_selection.ipynb |
| L6 / Thu, Jan 22, 2026 | Clustering algorithms: Eraslan et al.; Benzio et al. | Leiden and Louvain algorithms | representation_learning.ipynb |
| L7 / Tue, Jan 27, 2026 | Clustering algorithms | Discrete structure in latent space | Canceled because of school closure |
| L8 / Thu, Jan 29, 2026 | Trajectory inference: Haghverdi et al. 2016, Nature Methods; Qiu et al. 2022, Nature Genetics | Non-linear representation, autoencoders, nonlinear manifolds | Canceled because of school closure |
| L9 / Tue, Feb 03, 2026 | Spatial transcriptomics: Squidpy, Palla et al. 2022, Nature Methods | Spatial data structures, spot deconvolution | TBD |
| Feb 05, 2026 | Initial Proposal Submission (1 page) | Will be graded | TBD |
| L10 / Thu, Feb 05, 2026 | VAE (continued) | | TBD |
| L11 / Tue, Feb 10, 2026 | Probabilistic spatial deconvolution: DestVI, Lopez et al. 2022, Nature Biotech; RCTD, Cable et al. 2019, Nature Biotech | NMF/PNMF, VAEs for spatial data | TBD |
| L12 / Thu, Feb 12, 2026 | GNN foundation: GCN, Kipf & Welling 2016, ICLR; GraphSAGE, Hamilton et al. 2017, NeurIPS; GAT, Veličković et al. 2018, ICLR | Graph representations, GNNs | gnn_basics.ipynb |
| L13 / Tue, Feb 17, 2026 | GNN/GCN for spatial transcriptomics data: GraphST; STAGATE | Graph representations, GNNs, spatial autocorrelation | TBD |
| L14 / Thu, Feb 19, 2026 | Bridging histology and spatial transcriptomics: Tangram; SpaGCN; iStar | | histology.ipynb |
| L15 / Tue, Feb 24, 2026 | Spatial transcriptomics (continued), spatial statistics: spatialDE; SPARK; DESpace | Detecting spatially variable genes | spatial_cov.ipynb |
| L16 / Thu, Feb 26, 2026 | Segmentation I: CellPose; CellVIT | Cell segmentation basics | spatial_cov.ipynb |
| L17 / Tue, Mar 03, 2026 | Segmentation II: Baysor | Bayesian view of cell segmentation | spatial_cov.ipynb |
| L18 / Thu, Mar 05, 2026 | Multi-omics integration (papers TBD) | | TBD |
| Tue, Mar 10, 2026 | No class (Spring Break) | | TBD |
| Thu, Mar 12, 2026 | No class (Spring Break) | | |
| Mar 16, 2026 | Deadline for midterm report submission | Will be graded | TBD |
| Tue, Mar 17, 2026 | Mid-semester project presentations I | Will be graded | TBD |
| Thu, Mar 19, 2026 | Mid-semester project presentations II | Will be graded | TBD |
| L19 / Tue, Mar 24, 2026 | Simulating multi-omics data (papers TBD) | TBD | TBD |
| L20 / Thu, Mar 26, 2026 | Simulation and benchmarking (papers TBD) | TBD | TBD |
| L21 / Tue, Mar 31, 2026 | Benchmarking (papers TBD) | | TBD |
| L22 / Thu, Apr 02, 2026 | Advanced topics: Neural flow | | TBD |
| L23 / Tue, Apr 07, 2026 | Advanced topics: Neural ODE and other physics-inspired models I | | TBD |
| L24 / Thu, Apr 09, 2026 | Advanced topics: Neural ODE and other physics-inspired models II | | TBD |
| L25 / Tue, Apr 14, 2026 | Final project discussion | | TBD |
| L26 / Thu, Apr 16, 2026 | Final Group Presentations I and report submission | Will be graded | TBD |
| L27 / Tue, Apr 21, 2026 | Final Group Presentations II and report submission | Will be graded | TBD |

🧩 Core Topics (Summary)

  1. Foundations
    • What is computational biology?
    • Designing methods vs running tools
  2. Data Representation
    • Single-cell, spatial, and multi-omics data
    • Structuring biological datasets
  3. Exploratory Analysis
    • Visualization, dimensionality reduction, clustering
    • Feature selection in multi-omics contexts
  4. Problem Formulation
    • Translating biological questions into computational problems
    • Case studies from recent literature
  5. Modeling & Software
    • Statistical, probabilistic, and ML frameworks
    • Coding (Python, R, C++, Rust), efficiency, and reproducibility
  6. Simulation & Benchmarking
    • Role of simulations and benchmark datasets
    • Designing fair, reproducible evaluations

🏗️ Course Structure

Lectures/Discussions: Weekly topics introduced through short lectures and literature discussions (held back-to-back or on two days of the same week).

Readings: Research papers, book chapters, and review articles.

Assignments: Focused problem sets and small coding projects.

Final Project: Students will select a computational biology problem, survey existing literature, and either (1) reconstruct/improve an existing method or (2) propose/implement a novel approach.

👥 Evaluation (Groups)

| Component | Weight |
|---|---|
| Paper Presentation (Presentation, Critical Evaluation) | 20% |
| Initial Proposal Presentation | 15% |
| Midterm Project Report | 15% |
| Midterm Project Presentation | 15% |
| Final Project Report | 15% |
| Final Project Presentation | 15% |
| Peer Review | 10% |
| Attendance (counted only after February 19, per the announcement) | 10% |

📚 Suggested References

  • Compeau & Pevzner, Bioinformatics Algorithms: An Active Learning Approach
  • John Tukey, Exploratory Data Analysis
  • Pevzner & Shamir, Computational and Systems Biology
  • Blum, Hopcroft & Kannan, Foundations of Data Science
  • Single-cell Best Practices
  • Selected research papers (to be assigned weekly)

🌎 Similar Courses at Other Universities