The information packet and video recording are now available.
Agenda - click the Sessions to view presentations
1:00 pm – 1:10 pm : Welcome and Introduction of speakers: NCI and FNL
1:10 pm – 1:45 pm : Session I: What is Data Science? Mark Jensen, PhD, FNLCR BIDS
1:45 pm – 2:20 pm : Session II: Data Science Methodology: Randall Johnson, PhD, FNLCR ABCS
2:20 pm – 2:45 pm : PANEL Discussion: Q & A for Sessions I and II
2:45 pm – 2:55 pm : BREAK
2:55 pm – 3:30 pm : Session III: Formulating the Question: Martin Skarzynski, PhD, NCI DCEG Fellow
3:30 pm – 4:15 pm : Session IV: Data Gathering: Simina Boca, PhD, Georgetown University Medical Center
4:15 pm – 4:50 pm : PANEL Discussion: Q & A for Sessions III and IV
4:50 pm – 5:00 pm : Wrap-Up: Feedback and Next Workshop
Topics:
What is Data Science? - Instructor: Mark Jensen, PhD, FNLCR BIDS
- Definition
- History
Data Science Methodology - Instructor: Randall Johnson, PhD, FNLCR ABCS
- Methodology for cancer data science that includes knowledge of the problem, understanding the data, data preparation, etc.
Formulating the Question - Instructor: Martin Skarzynski, PhD, NCI DCEG Fellow
- Understanding the research question
- Analytic Approach: use cases on
  - Regression
  - Classification
  - Clustering
Data Gathering - Instructor: Simina Boca, PhD, Georgetown University Medical Center
Data requirements
- Content, format, representation
Data collection
- Data understanding
- Descriptive statistics and visualization
- Additional data collection
- Data preparation
- Importing data into Jupyter notebook using Pandas
- Cleaning data and taking care of missing values in Python
- Exploring Data
a. using Pandas functions (.head, .shape, .tail, etc.)
b. through visualization with Matplotlib
- Determining the right algorithms to use for further Machine Learning (ML) analysis based on data characteristics
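
As a rough illustration of the data preparation steps listed above (importing with Pandas, handling missing values, and exploring with .head, .shape, and .tail), here is a minimal sketch. It is not taken from the workshop materials: the column names and values are hypothetical, the data is read from an in-memory string rather than a real file, and filling gaps with the column median is only one of several reasonable choices.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real data file (assumption).
csv_text = """patient_id,age,tumor_size_mm
1,54,12.5
2,61,
3,,8.0
4,47,15.2
"""

# Import: pd.read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Explore: first rows, (rows, columns), and last rows.
print(df.head())
print(df.shape)
print(df.tail(2))

# Count missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Clean: fill each missing numeric value with its column median.
# Dropping incomplete rows with df.dropna() is another common option.
cleaned = df.fillna(df.median(numeric_only=True))
print(cleaned)
```

In a Jupyter notebook the print calls can be dropped, since the last expression in a cell is displayed automatically.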
Date: Tuesday, April 21, 2019
Time: 1:00-5:00 p.m.
Location: NCI Shady Grove, Seminar 406 (Terrace East Building)
Questions? Contact the NCI Data Science Learning Exchange