Kepler/pPOD: Scientific workflow and provenance support for assembling the tree of life

  • Shawn Bowers (b) (Author)
  • Timothy McPhillips (b) (Author)
  • Sean Riddle (b) (Author)
  • Manish Kumar Anand (a) (Author)
  • Bertram Ludäscher (a, b) (Author)

  (a) University of California
  (b) University of California, Davis
Research Output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open access

Abstract

The complexity of scientific workflows for analyzing biological data creates a number of challenges for current workflow and provenance systems. This complexity is due in part to the nature of scientific data (e.g., heterogeneous, nested data collections) and the programming constructs required for automation (e.g., nested workflows, looping, pipeline parallelism). We present an extended version of the Kepler scientific workflow system to address these challenges, tailored for the systematics community. Our system combines novel approaches for representing scientific data, modeling and automating complex analyses, and recording and browsing associated provenance information.