Feature extraction issue when series name is a number #97

@windischbauer

I have a dataset in which each series has a number as its key, and I would like to extract features from it.
I adapted the code from the tutorial to reproduce the issue; I am using v0.3.0:

import pandas as pd; import scipy.stats as ss; import numpy as np
from tsflex.features import FeatureDescriptor, FeatureCollection, FuncWrapper

# 1. -------- Get your time-indexed data --------
# Data contains 1 column; ["TMP"]
url = "https://github.com/predict-idlab/tsflex/raw/main/examples/data/empatica/"
data = pd.read_parquet(url + "tmp.parquet").set_index("timestamp")

# I renamed the column name to showcase the issue:
data.rename(columns={'TMP': 1234}, inplace=True)

# 2 -------- Construct your feature collection --------
fc = FeatureCollection(
    feature_descriptors=[
        FeatureDescriptor(
            function=FuncWrapper(func=ss.skew, output_names="skew"),
            series_name="1234",
            window="5min", stride="2.5min",
        )
    ]
)
# -- 2.1. Add features to your feature collection
# NOTE: tsflex allows features to have different windows and strides
# fc.add(FeatureDescriptor(np.min, "TMP", '0.5min', '2.5min'))

# 3 -------- Calculate features --------
fc.calculate(data=data, return_df=True)  # which outputs:

IndexError: list index out of range (when the series name is given in quotation marks, i.e. series_name="1234")
TypeError: argument of type 'int' is not iterable (when the series name is given as a number, i.e. series_name=1234)
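
A possible workaround, sketched here under the assumption that tsflex matches `series_name` against string column labels (not verified against the library internals), is to cast the column labels to strings before building the feature collection. The small DataFrame below is a hypothetical stand-in for the Empatica data, and only pandas calls are used:

```python
import pandas as pd

# Hypothetical stand-in for the tutorial data: one numeric column label.
idx = pd.date_range("2021-01-01", periods=4, freq="1min", name="timestamp")
data = pd.DataFrame({1234: [30.1, 30.2, 30.4, 30.3]}, index=idx)

# Cast every column label to str so that series_name="1234" can refer to it.
data = data.rename(columns=str)

print(list(data.columns))
```

With string labels in place, the FeatureDescriptor above (series_name="1234") would refer to an existing column; whether that avoids both errors in v0.3.0 is an assumption.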
