Feature engineering of time-series data to extract discriminative features

  • David Stolze (Author)
  • Katie McConky (Author)
  • Michael Kuhl (Author)

All authors: Rochester Institute of Technology
Research output: Contribution to conference › Paper › Peer-review

Abstract

Time-series data streams often contain predictive value in the form of unique patterns. While these patterns may be used as leading indicators for event prediction, a lack of prior knowledge of pattern shape and irregularities can render traditional forecasting methods ineffective. This research tests an automated means of predetermining the most effective combination of transformations to be applied to time-series data when training a classification algorithm. This method relies on using meta-features of a provided data set such as coefficient of variation and length to determine optimal transformations to test based on past trials. The transformations applied include converting values of the time-series data stream into a binary data set, with each anomalous value being labeled as a “1”. The number of binary points to be used for training is varied to determine an optimal length. The training set is then aggregated into bins containing a set number of data points, with each bin represented by the summation of the contained binary values. Application of these transformations creates a simplified set of values with which a classifier is trained. By comparing the performance of multiple trained classifiers generated using different transformation parameters, an optimal combination may be determined.
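
The transformation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how anomalous values are identified, so the z-score rule and its threshold here are assumptions made only for illustration, as are the function names `meta_features`, `binarize_anomalies`, and `bin_sums`, the choice to keep the most recent points when limiting the training length, and the dropping of leftover points that do not fill a whole bin.

```python
import numpy as np


def meta_features(series):
    """Compute the two meta-features named in the abstract.

    Returns the coefficient of variation (std / |mean|) and the length.
    """
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    cv = x.std() / abs(mean) if mean != 0 else float("inf")
    return {"cv": cv, "length": int(x.size)}


def binarize_anomalies(series, z_thresh=2.0):
    """Label each value 1 if anomalous, else 0.

    ASSUMPTION: "anomalous" means more than z_thresh standard deviations
    from the series mean. The abstract does not specify the rule.
    """
    x = np.asarray(series, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros(x.size, dtype=int)
    z = np.abs(x - x.mean()) / std
    return (z > z_thresh).astype(int)


def bin_sums(binary, n_points, bin_size):
    """Keep n_points binary values and sum them in bins of bin_size.

    ASSUMPTION: the most recent n_points are kept, and any oldest
    leftover points that do not fill a complete bin are dropped.
    """
    if n_points <= 0 or bin_size <= 0:
        raise ValueError("n_points and bin_size must be positive")
    recent = np.asarray(binary)[-n_points:]
    usable = (recent.size // bin_size) * bin_size
    if usable == 0:
        return np.zeros(0, dtype=int)
    recent = recent[recent.size - usable:]
    return recent.reshape(-1, bin_size).sum(axis=1)


# Example: a mostly flat series with two spikes.
series = [1, 1, 1, 1, 1, 50, 1, 1, 1, 1, 50, 1]
binary = binarize_anomalies(series)
features = bin_sums(binary, n_points=8, bin_size=4)
```

Each combination of `n_points` and `bin_size` yields a different simplified feature vector. Training one classifier per combination and comparing their performance corresponds to the search for an optimal combination that the abstract describes.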