Taskblaster: A generic framework for automated computational workflows
Abstract
We introduce Taskblaster, a generic Python framework for composing, executing, and managing computational workflows with automated error handling. Taskblaster supports dynamic workflows, in which computational tasks are created in response to the output of other tasks. Flow control with branching and iteration (if and while logic) is provided at the workflow level, making the system Turing complete. A workflow may apply other workflows, which are then referred to as sub-workflows. This promotes a modular structure that improves readability and eases workflow design and code reuse. Each task is associated with a file system path, so that tasks can be accessed intuitively within a directory tree. A command-line interface allows monitoring and control of workflows. Tasks are executed by worker processes that may run directly in a terminal or be submitted through a queueing system, allowing task-specific resource control. We provide a library (ASR-lib) of workflows for common materials simulations employing the Atomic Simulation Environment and the GPAW electronic structure code. Taskblaster can equally well be used with other computational codes.
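To make the notions of dynamic task creation, workflow-level flow control, sub-workflows, and path-addressed tasks concrete, the following is a minimal conceptual sketch in plain Python. It does not use the Taskblaster API: the names ToyRegistry, halve, refine_workflow, and main_workflow are hypothetical and chosen only for illustration, and the in-memory dictionary merely stands in for a directory tree of tasks.

```python
from typing import Any, Callable, Dict


class ToyRegistry:
    """Hypothetical in-memory task store (not part of Taskblaster).

    Keys are path-like strings that mimic a directory tree of tasks.
    """

    def __init__(self) -> None:
        self.results: Dict[str, Any] = {}

    def run(self, path: str, func: Callable[..., Any], *args: Any) -> Any:
        # Run a task only once; later calls reuse the stored result.
        if path not in self.results:
            self.results[path] = func(*args)
        return self.results[path]


def halve(x: float) -> float:
    """A stand-in for an expensive computational task."""
    return x / 2.0


def refine_workflow(reg: ToyRegistry, prefix: str, start: float, tol: float) -> float:
    """Sub-workflow with while logic.

    New tasks are created as long as the output of the previous
    task is not yet below the tolerance.
    """
    value = start
    step = 0
    while value >= tol:
        value = reg.run(f"{prefix}/step-{step:03d}", halve, value)
        step += 1
    return value


def main_workflow(reg: ToyRegistry, start: float) -> float:
    """Top-level workflow that applies the sub-workflow twice."""
    coarse = refine_workflow(reg, "tree/coarse", start, tol=1.0)
    # If logic: the second stage is created only when the first
    # stage's output exceeds a threshold.
    if coarse > 0.25:
        return refine_workflow(reg, "tree/fine", coarse, tol=0.1)
    return coarse


reg = ToyRegistry()
final = main_workflow(reg, 8.0)
task_paths = sorted(reg.results)
print(final)
for path in task_paths:
    print(path)
```

In this sketch the number of tasks is not known in advance; it depends on the intermediate results, which is the sense in which the workflow is dynamic. Calling main_workflow a second time with the same registry would reuse the stored results instead of recomputing them.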