  • Created 08 Sep 2021
ATOM Modeling Pipeline (AMPL) for Drug Discovery
Tuesday, June 08, 2021, 1:00 pm to 2:00 pm EDT

Want to learn how machine learning (ML) can accelerate drug discovery? Join us on June 8, 1:00 pm – 2:00 pm ET, for the first in a series of workshops on the ATOM Modeling PipeLine (AMPL), open-source, conda-based software that automates key drug discovery steps. AMPL is designed to take molecular binding data (e.g., IC50, Ki) and carry out key ML steps with minimal user intervention. The first workshop will introduce AMPL and highlight its capabilities for creating ML-ready datasets. Follow-on workshops will be offered during the summer and will cover modeling methods and inference.

Date: Tuesday, June 8, 2021

Time: 1:00 p.m. – 2:00 p.m. ET

Recording: here

Presentation: here

Location: Webex

Registration: Not required

Presenter: Sarangan Ravichandran, PhD, PMP, Senior Data Scientist, ATOM Consortium/Frederick National Laboratory for Cancer Research (FNLCR), and Adjunct Professor in Bioinformatics, Hood College

Supporting materials: Tutorial and AMPL: A Data-Driven Modeling Pipeline for Drug Discovery

The workshop on June 8 will include two parts: a short presentation followed by a hands-on tutorial.

Part 1: A 20-minute presentation that will cover the following topics:

  • Introduction to small-molecule binding and the database sources

  • Issues associated with data ingestion and curation

  • Exploratory data analysis of the ingested and curated datasets

  • Use of different featurization methods like molecular fingerprints or properties (Molecular Weight, number of hydrogen-bond acceptors, etc.)

  • Creation of ML-ready datasets
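As a rough illustration of the kind of curation the topics above refer to, the sketch below converts IC50 values to pIC50 and averages replicate measurements per compound. It uses only the Python standard library and is not AMPL's API: the function names `ic50_nm_to_pic50` and `curate`, the input layout, and the sample values are hypothetical. It also assumes that identical SMILES strings denote the same compound, whereas a real pipeline would canonicalize structures first.

```python
import math
from collections import defaultdict
from statistics import mean


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def curate(records):
    """Drop unusable rows, convert to pIC50, and average replicates.

    `records` is an iterable of (smiles, ic50_nm) pairs (hypothetical layout).
    """
    grouped = defaultdict(list)
    for smiles, ic50_nm in records:
        smiles = (smiles or "").strip()
        # Skip rows with no structure or no usable measurement.
        if not smiles or ic50_nm is None or ic50_nm <= 0:
            continue
        grouped[smiles].append(ic50_nm_to_pic50(ic50_nm))
    # One averaged pIC50 per compound.
    return {smiles: mean(values) for smiles, values in grouped.items()}


# Made-up example rows: two replicates, one clean row, two unusable rows.
raw = [
    ("CCO", 1000.0),
    ("CCO", 10.0),
    ("c1ccccc1", 100.0),
    ("", 50.0),
    ("CCN", None),
]
curated = curate(raw)
```

Averaging on the log scale (pIC50) rather than on raw IC50 values is a common choice because binding measurements typically span several orders of magnitude.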

Part 2: A 35-minute AMPL code demonstration followed by a 5-minute Q&A. We will share a Python Jupyter notebook that will cover the following ML steps: data ingestion/curation, featurization, and visualization to create ML-ready datasets. Here are the key sections of the notebook:

  • Highlights of AMPL functions that are designed to address the common issues encountered during the data ingestion and curation of drug discovery or small-molecule-focused projects

  • Introduction of the extensible AMPL featurizer module and a demonstration on how simple keyword choices can lead to the computation of a range of different feature sets

  • Exploratory Data Analysis and visualization code templates that can be adopted for other drug discovery projects with very little modification
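The idea that a keyword selects a feature set can be sketched with a small registry, shown below. This is only an illustration of the pattern, not AMPL's featurizer module: the names `FEATURIZERS`, `register`, and `featurize` are hypothetical, and the two toy featurizers (string length and crude character counts) stand in for real fingerprints or computed properties, which would normally come from a cheminformatics toolkit.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a keyword to a featurizer function.
FEATURIZERS: Dict[str, Callable[[str], List[float]]] = {}


def register(name: str):
    """Decorator that registers a featurizer under a keyword."""
    def decorator(func):
        FEATURIZERS[name] = func
        return func
    return decorator


@register("length")
def smiles_length(smiles: str) -> List[float]:
    # Toy feature: number of characters in the SMILES string.
    return [float(len(smiles))]


@register("element_counts")
def element_counts(smiles: str) -> List[float]:
    # Toy feature: counts of the characters N, O, and S.
    # This is a crude placeholder, not a chemically rigorous descriptor.
    return [float(smiles.count(ch)) for ch in "NOS"]


def featurize(smiles_list: List[str], keyword: str) -> List[List[float]]:
    """Compute the feature set selected by `keyword` for each molecule."""
    if keyword not in FEATURIZERS:
        raise ValueError(f"Unknown featurizer keyword: {keyword!r}")
    func = FEATURIZERS[keyword]
    return [func(smiles) for smiles in smiles_list]


length_features = featurize(["CCO", "CCN"], "length")
count_features = featurize(["CCO"], "element_counts")
```

With this layout, adding a new feature set means registering one more function, and callers switch between feature sets by changing only the keyword.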

To learn more about the software, visit the AMPL GitHub repository at this link

Questions? Contact the NCI Data Science Learning Exchange