Syllabus

| Week | Dates | Topic | Related Reading/Content | Homework |
|---|---|---|---|---|
| 1 | 8/25 | Introduction (slides) | | |
| | 8/27 | Sequence Modelling and Generation with Transformer (slides) | | |
| 2 | 9/1 | Modeling Structure with Graph Neural Networks (slides) | | |
| | 9/3 | Single-cell biology Research Overview (slides) | | |
| 3 | 9/8 | Diffusion Models (slides); Small molecule drug design via MARS (slides) | MARS: Markov Molecular Sampling for Multi-objective Drug Discovery; Multi-Objective Molecule Generation using Interpretable Substructures | |
| | 9/10 | Multi-domain Distribution Learning for De Novo Drug Design | A 3D Generative Model for Structure-Based Drug Design; Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets; Structure-based drug design with equivariant diffusion models; Reinforced genetic algorithm for structure-based drug design | |
| 4 | 9/15 | Simulating 500 million years of evolution with a language model | ESM2 and ProGen2 | |
| | 9/17 | Importance Weighted Expectation-Maximization for Protein Sequence Design | Proximal Exploration for Model-guided Protein Sequence Design | |
| 5 | 9/22 | Robust deep learning based protein sequence design using ProteinMPNN | A Deep SE(3)-Equivariant Model for Learning Inverse Protein Folding | |
| | 9/24 | Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates | Scaffolding protein functional sites using deep learning | |
| 6 | 9/29 | Highly accurate protein structure prediction with AlphaFold (AlphaFold2) | Protein Structure Prediction: AlphaFold/AlphaFold2 | |
| | 10/1 | De novo design of protein structure and function with RFdiffusion | Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem; PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design | |
| 7 | 10/6 | De novo design of luciferases using deep learning | | |
| | 10/8 | Accurate structure prediction of biomolecular interactions with AlphaFold 3 | RosettaFold3, RosettaFoldAllAtom | |
| 8 | 10/13, 10/15 | Fall break, no class | | |
| 9 | 10/20 | Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling | DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome; Nucleotide Transformer; HyenaDNA: Long-range genomic sequence modeling at single nucleotide resolution; DNABERT-2: Efficient foundation model and benchmark for multi-species genome | |
| | 10/22 | Predicting DNA structure using a deep learning method | | |
| 10 | 10/27 | CodonBERT: Large Language Models for mRNA design and optimization | | |
| | 10/29 | AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model | A sequence-based global map of regulatory activity for deciphering human genetics | |
| 11 | 11/3 | No class | | |
| | 11/5 | Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences | | |
| 12 | 11/10 | A programmable reaction-diffusion system for spatiotemporal cell signaling circuit design | | |
| | 11/12 | Transfer learning enables predictions in network biology | | |
| 13 | 11/17 | scGPT: toward building a foundation model for single-cell multi-omics using generative AI | scGPT: toward building a foundation model for single-cell multi-omics using generative AI; Universal Cell Embeddings: A Foundation Model for Cell Biology | |
| | 11/19 | FlowMol3: Flow Matching for 3D De Novo Small-Molecule Generation | Autonomous, multiproperty-driven molecular discovery: From predictions to measurements and back | |
| 14 | 11/24 | A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models | | |
| | 11/26 | Thanksgiving, no class | | |
| 15 | 12/1 | Conditional Antibody Design as 3D Equivariant Graph Translation | End-to-End Full-Atom Antibody Design; Conditional Antibody Design as 3D Equivariant Graph Translation; Atomically accurate de novo design of single-domain antibodies | |
| | 12/3 | A generative deep learning approach to de novo antibiotic design | | |
| Optional | | Base-resolution models of transcription-factor binding reveal soft motif syntax | | |