September 10th, 2021
Title: WFPM: A novel WorkFlow Package Manager to enable collaborative bioinformatics workflow development
Presenter: Junjun Zhang
Title: Senior Bioinformatics Manager
Organization: Ontario Institute for Cancer Research
Recent advances in bioinformatics workflow development solutions have focused on reproducibility and portability but lag significantly behind in supporting component reuse and sharing. As a result, the Don’t Repeat Yourself (DRY) principle and the divide-and-conquer strategy, both widely practiced in software development, remain poorly adopted in workflow development.
To address these limitations, the International Cancer Genome Consortium Accelerating Research in Genomic Oncology (ICGC ARGO) initiative (https://www.icgc-argo.org) has adopted a modular approach in which "best practice" genome analysis workflows are encapsulated in well-defined packages that are then incorporated into higher-level workflows. Following this approach, we have developed five production workflows that extensively reuse component packages. This flexible architecture enables ARGO developers spread across the globe to build uniform workflows collaboratively, with different developers focusing on different components. All ARGO component packages are reusable: the general bioinformatics community can import them as modules to build their own workflows.
Recently, we have developed an open-source command-line interface (CLI) tool called WorkFlow Package Manager (WFPM) CLI, which assists developers throughout the entire workflow development lifecycle in implementing best practices and the aforementioned modular approach. With a highly streamlined process and automation of template code generation, continuous integration testing, and releasing, WFPM CLI significantly lowers the barriers for users to develop standard, reusable workflow packages. WFPM CLI source code: https://github.com/icgc-argo/wfpm, documentation: https://wfpm.readthedocs.io