NCI Data Science Learning Exchange
Machine Learning for Drug Function Classification: A Hands-On Tutorial
The presentation and video recording are now available.
Overview: This two-part workshop introduces machine learning concepts and tools for generating molecular descriptors and using them for drug function classification. You will receive hands-on instruction to generate and explore small-molecule (drug-like) chemical structures, compute chemical descriptors, and create and analyze machine learning classification models. The workshop will use open source cheminformatics software and the scikit-learn library to compute key pharma-relevant descriptors and to generate and analyze drug classification models.
Part 1: a 30-minute presentation followed by a 20-minute hands-on code/tools review. This includes:
- Introduction to ML concepts to create molecular structures and extract features or chemical descriptors.
- How to generate and analyze molecular fingerprint descriptors
- How to use the following two tools for chemical data analysis and feature generation:
  - RDKit, an open source cheminformatics toolkit with Python libraries
  - Mordred and other open source software to generate molecular features
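The featurization step listed above can be sketched as follows. This is a minimal illustration, not the workshop's own code: the two molecules, the fingerprint settings (Morgan, radius 2, 2048 bits), the three descriptors chosen, and the helper name `featurize` are all assumptions made for the example. It uses RDKit only; Mordred is not shown.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import Descriptors, rdFingerprintGenerator

# Illustrative molecules (assumed for this sketch, not from the workshop).
SMILES = {
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
}

N_BITS = 2048  # assumed fingerprint length
morgan = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=N_BITS)


def featurize(smiles):
    """Hypothetical helper: SMILES -> (bit vector, numpy bits, descriptors)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")

    # Circular (Morgan) fingerprint as an RDKit bit vector.
    fp = morgan.GetFingerprint(mol)

    # Copy the bits into a numpy array so scikit-learn can consume them.
    bits = np.zeros((N_BITS,), dtype=np.uint8)
    DataStructs.ConvertToNumpyArray(fp, bits)

    # A few whole-molecule descriptors computed by RDKit.
    descriptors = {
        "MolWt": Descriptors.MolWt(mol),
        "LogP": Descriptors.MolLogP(mol),
        "HBD": Descriptors.NumHDonors(mol),
    }
    return fp, bits, descriptors


features = {name: featurize(smi) for name, smi in SMILES.items()}

# Tanimoto similarity compares two fingerprints on a 0-to-1 scale.
similarity = DataStructs.TanimotoSimilarity(
    features["aspirin"][0], features["caffeine"][0]
)

for name, (_, bits, desc) in features.items():
    print(name, "on bits:", int(bits.sum()), desc)
print("aspirin vs caffeine Tanimoto:", round(similarity, 3))
```

The numpy arrays produced here are the kind of fixed-length feature vectors that a classifier can take as input, while the RDKit bit vectors remain useful for similarity comparisons.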
Part 2: a 30-minute presentation followed by a 20-minute hands-on tools review. We will extend the concepts demonstrated in Part 1 to build machine learning classification models that predict small-molecule (drug-like) function (e.g., CNS agent, GI agent). Tools include:
- Scikit-learn for creating Random Forest classification models
- A modeling workflow that includes data collection and curation, featurization (fingerprints), classification modeling with ensemble-based methods, and analysis, based on the lessons learned from the AMPL publication
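The classification step of such a workflow might look like the sketch below. It is an assumption-laden illustration rather than the workshop's or AMPL's pipeline: the data are synthetic random bit vectors standing in for fingerprints, the binary label is defined by a made-up rule on the first three bits, and the hyperparameters are arbitrary. With real data, the feature matrix would hold computed fingerprints and the labels would be curated drug function classes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic placeholder data: 600 "molecules" with 32-bit "fingerprints".
N_SAMPLES, N_BITS = 600, 32
X = rng.integers(0, 2, size=(N_SAMPLES, N_BITS)).astype(np.uint8)

# Made-up labeling rule: class 1 when at least two of the first three
# bits are set. Real labels would come from curated drug classes.
y = (X[:, :3].sum(axis=1) >= 2).astype(int)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Random Forest: an ensemble of decision trees trained on bootstrap samples.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Feature importances hint at which fingerprint bits drive the prediction.
importances = model.feature_importances_
top_bits = np.argsort(importances)[::-1][:5]

print("test accuracy:", round(accuracy, 3))
print("most important bits:", top_bits.tolist())
```

Because the label here depends only on three bits, the importance ranking offers a quick sanity check on the model; with real fingerprints, important bits would need to be mapped back to substructures before interpretation.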
Date: Thursday, July 16, 2020
Time: 1:00 – 3:00 p.m.
Location: WebEx
Supporting Link: GitHub
Instructor: Sarangan Ravichandran, PhD, PMP [C], Data Scientist, Frederick National Laboratory for Cancer Research and Adjunct Professor in Bioinformatics, Hood College
Questions? Contact the NCI Data Science Learning Exchange