
April 10th, 2020

Topic: Cancer Data Analytics on the ISB-CGC platform

Dr. Kawther Abdilleh, Bioinformatics Scientist, GDIT, ISB-CGC

Dr. Fabian Seidl, Bioinformatics Scientist, GDIT, ISB-CGC

The ISB Cancer Genomics Cloud (ISB-CGC) is one of three Cancer Cloud Resources funded by the National Cancer Institute. It serves to democratize access to large cancer datasets and to high-performance compute resources on the Google Cloud Platform. With a focus on Data as a Service (DaaS), ISB-CGC offers multiple avenues for accessing and analyzing large-scale cancer datasets, including TCGA and TARGET, as well as important reference resources such as GENCODE and COSMIC. ISB-CGC is intentionally designed as an open platform, allowing users with diverse skill sets to choose the approaches best suited to the task at hand. Users can analyze petabytes of data using complex workflows written in the workflow language of their choice (including, but not limited to, CWL, WDL, Snakemake, and Nextflow). They can develop new analysis methods in common languages such as Python, R, and SQL; conduct multivariate data analysis on easily accessible, queryable tables of large-scale cancer data; and use interactive web tools designed for cohort creation, data discovery, and exploration. Here we will demonstrate how the flexible computing infrastructure of ISB-CGC enables researchers to analyze cloud-hosted data with their own tools as well as with a collection of powerful Google Cloud Platform native tools and technologies, including Google BigQuery for big data analysis and Google Compute Engine for complex workflow execution.

Created by Durga Addepalli. Last modified Wed November 4, 2020 10:59 pm by Durga Addepalli.