
Approaches for semantically annotating and discovering scientific observational data

  • Huiping Cao [c] (Author)
  • Shawn Bowers [a] (Author)
  • Mark P. Schildhauer [b] (Author)
  • [b] National Center for Ecological Analysis and Synthesis
  • [c] New Mexico State University
Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Observational data plays a critical role in many scientific disciplines, and scientists are increasingly interested in performing broad-scale analyses using data collected in many smaller scientific studies. Although these data sets often contain similar types of information, they are typically represented using very different structures and carry little semantic information about the data itself. This creates significant challenges for researchers who wish to discover existing data sets based on data semantics (observation and measurement types) and data content (the values of measurements within a data set). To address these challenges, we present a formal framework consisting of a semantic observational model, a high-level semantic annotation language, and a declarative query language that allows researchers to express data-discovery queries over heterogeneous (annotated) data sets. To demonstrate the feasibility of our framework, we also present implementation approaches for efficiently answering discovery queries over semantically annotated data sets.
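As a rough illustration of the idea described in the abstract, the following minimal Python sketch shows how semantic annotations might let one query find data sets with different column layouts. It is not the paper's annotation language or query language: the names `Annotation`, `DataSet`, and `discover`, the entity/characteristic vocabulary, and the sample values are all hypothetical, and the sketch assumes every matching column already uses the same unit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: links one column to a measurement type."""
    column: str          # column name as it appears in the data set
    entity: str          # what was observed, e.g. "Tree"
    characteristic: str  # what was measured, e.g. "Height"
    unit: str            # assumed identical across data sets in this sketch


@dataclass
class DataSet:
    name: str
    rows: List[Dict[str, float]]
    annotations: List[Annotation]


def discover(
    datasets: List[DataSet],
    entity: str,
    characteristic: str,
    predicate: Optional[Callable[[float], bool]] = None,
) -> List[str]:
    """Return names of data sets that measure (entity, characteristic).

    Without a predicate, the match is on data semantics only. With a
    predicate, at least one measured value must also satisfy it, which
    adds a condition on data content.
    """
    matches = []
    for ds in datasets:
        for ann in ds.annotations:
            if (ann.entity, ann.characteristic) != (entity, characteristic):
                continue
            values = [row[ann.column] for row in ds.rows if ann.column in row]
            if predicate is None or any(predicate(v) for v in values):
                matches.append(ds.name)
                break  # one matching column is enough for this data set
    return matches


# Three made-up data sets with different structures.
plot_a = DataSet(
    name="plot_a",
    rows=[{"ht": 12.5}, {"ht": 21.0}],
    annotations=[Annotation("ht", "Tree", "Height", "meter")],
)
plot_b = DataSet(
    name="plot_b",
    rows=[{"tree_height_m": 8.0}, {"tree_height_m": 15.2}],
    annotations=[Annotation("tree_height_m", "Tree", "Height", "meter")],
)
soil = DataSet(
    name="soil",
    rows=[{"ph": 6.4}],
    annotations=[Annotation("ph", "Soil", "Acidity", "pH")],
)
catalog = [plot_a, plot_b, soil]

semantic_hits = discover(catalog, "Tree", "Height")
content_hits = discover(catalog, "Tree", "Height", lambda v: v > 20)
print(semantic_hits)
print(content_hits)
```

The first call matches on measurement type alone and finds both tree data sets despite their different column names; the second adds a condition on values and keeps only the data set containing a height above 20. A real system would also need unit conversion and a shared ontology for the entity and characteristic terms, which this sketch leaves out.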