NCI Hub - Group: NCI Data Science Learning Exchange ~ Calendar




Created 08 Sep 2021

NCI Data Science Learning Exchange


Calendar


ATOM Modeling Pipeline (AMPL) for Drug Discovery - A hands-on tutorial

	Tuesday, September 14, 2021, 2:00 pm – 3:30 pm EDT
	Do you want to know how to use Machine Learning (ML) to accelerate drug discovery? Join us on September 14, 2:00 pm – 3:30 pm ET, for the second workshop on this topic. The workshop focuses on the ATOM Modeling PipeLine (AMPL), an open-source, conda-based software package that automates key drug discovery steps. AMPL is designed to take molecular binding data (e.g., IC50, Ki) and carry out the ML steps with minimal user intervention (see the figure shown above). The first workshop, held in June, highlighted AMPL's capabilities for creating ML-ready datasets.

	Date: Tuesday, Sep 14, 2021
	Time: 2:00 p.m. – 3:30 p.m. ET
	Location: Webex
	Registration: Not required
	Presenter: Sarangan Ravichandran, PhD, PMP, Senior Data Scientist, ATOM Consortium/Frederick National Laboratory for Cancer Research (FNLCR), and Adjunct Professor in Bioinformatics, Hood College
	Supporting materials: Tutorial and "AMPL: A Data-Driven Modeling Pipeline for Drug Discovery"

	The second workshop on September 14 will demonstrate three preliminary in-silico drug discovery topics on AMPL:
		Data ingestion
		Cleaning/tidying
		Curation

	Note: This session will run 90 minutes and will use Google Colab notebooks (compatible with Jupyter notebooks) for demonstration. Please see the outline below.

	Notebook-1: Ingestion, Cleaning and Exploratory Data Analysis (EDA) of Binding Assay Data (30 minutes)
		Issues associated with data ingestion and curation (data sources: Drug Target Commons (DTC), ChEMBL, and ExCAPE-DB)
		Exploratory data analysis of the ingested datasets
		Standardization of outcome units such as IC50 (e.g., µM to nM)
		Data visualization and comparison

	Notebook-2: Standardization of SMILES, Featurization and Compound Overlap/Diversity using a Python Jupyter Notebook (30 minutes)
		Compound overlap
		SMILES standardization
		Explore compound diversity using featurization and Tanimoto distance
		Create plots/heatmaps for analysis

	Notebook-3: Curate, Merge Datasets to Create the Final ML-ready Dataset (30 minutes)
		Removal of duplicates
		Filter extreme data
		Merge the DTC, ChEMBL, and ExCAPE-DB datasets to create a curated dataset
		Data curation on the merged data
		Creation of ML-ready dataset

	To learn more about the software, visit the AMPL GitHub repository at this link.

	Questions? Contact ncidatasciencelearningexchange@mail.nih.gov
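	For readers who want a feel for the curation steps named in the notebook outline above (unit standardization, Tanimoto distance, duplicate removal and merging), here is a minimal sketch in plain Python. It is not AMPL code and does not use the AMPL API. The function names, the unit labels, the record layout, and the choice of the median to resolve duplicates are all assumptions made for illustration. It also assumes SMILES strings have already been standardized, so that identical compounds share an identical string.

```python
from statistics import median

# Multipliers that convert a concentration to nanomolar (nM).
# The unit labels used as keys are an assumption, not an AMPL convention.
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def to_nanomolar(value, unit):
    """Convert a concentration such as an IC50 to nM."""
    return value * TO_NM[unit]


def tanimoto_distance(bits_a, bits_b):
    """Return 1 - (shared bits / total distinct bits) for two fingerprints.

    Each fingerprint is given as a collection of 'on' bit positions.
    """
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / len(union)


def merge_and_deduplicate(*sources):
    """Pool (smiles, value, unit) records from several sources.

    Duplicate compounds are collapsed to the median of their nM values;
    using the median here is an illustrative assumption.
    """
    pooled = {}
    for records in sources:
        for smiles, value, unit in records:
            pooled.setdefault(smiles, []).append(to_nanomolar(value, unit))
    return {smiles: median(values) for smiles, values in pooled.items()}


# Hypothetical records standing in for two of the data sources.
dtc_like = [("CCO", 1.0, "uM")]
chembl_like = [("CCO", 2000.0, "nM"), ("c1ccccc1", 5.0, "nM")]
curated = merge_and_deduplicate(dtc_like, chembl_like)
```

	In practice, fingerprints and standardized SMILES would come from a cheminformatics toolkit, and the merge would typically run on data frames instead of plain lists.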
