Commit 452859e

Add time series segmentation example

1 parent 7d81714
File tree

9 files changed: +443 −0 lines changed


README.md

Lines changed: 1 addition & 0 deletions

@@ -62,6 +62,7 @@ Check the **Required parameters** column to see if you need to set any additiona
  | [sklearn_text_classifier](/label_studio_ml/examples/sklearn_text_classifier) | Text classification with [scikit-learn](https://scikit-learn.org/stable/) |||| None | Arbitrary |
  | [spacy](/label_studio_ml/examples/spacy) | NER by [SpaCy](https://spacy.io/) |||| None | Set [(see documentation)](https://spacy.io/usage/linguistic-features) |
  | [tesseract](/label_studio_ml/examples/tesseract) | Interactive OCR. [Details](https://github.com/tesseract-ocr/tesseract) |||| None | Set (characters) |
+ | [timeseries_segmenter](/label_studio_ml/examples/timeseries_segmenter) | Time series segmentation using scikit-learn |||| None | Set |
  | [watsonX](/label_studio_ml/examples/watsonx) | LLM inference with [WatsonX](https://www.ibm.com/products/watsonx-ai) and integration with [WatsonX.data](watsonx.data) |||| None | Arbitrary |
  | [yolo](/label_studio_ml/examples/yolo) | All YOLO tasks are supported: [YOLO](https://docs.ultralytics.com/tasks/) |||| None | Arbitrary |

Dockerfile

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
# syntax=docker/dockerfile:1
ARG PYTHON_VERSION=3.11

FROM python:${PYTHON_VERSION}-slim AS python-base
ARG TEST_ENV

WORKDIR /app

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PORT=${PORT:-9090} \
    PIP_CACHE_DIR=/.cache \
    WORKERS=1 \
    THREADS=8

# Update the base OS
RUN --mount=type=cache,target="/var/cache/apt",sharing=locked \
    --mount=type=cache,target="/var/lib/apt/lists",sharing=locked \
    set -eux; \
    apt-get update; \
    apt-get upgrade -y; \
    apt install --no-install-recommends -y \
        git; \
    apt-get autoremove -y

# install base requirements
COPY requirements-base.txt .
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    pip install -r requirements-base.txt

# install custom requirements
COPY requirements.txt .
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    pip install -r requirements.txt

# install test requirements if needed
COPY requirements-test.txt .
# build only when TEST_ENV="true"
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
    if [ "$TEST_ENV" = "true" ]; then \
      pip install -r requirements-test.txt; \
    fi

COPY . .

EXPOSE 9090

CMD gunicorn --preload --bind :$PORT --workers $WORKERS --threads $THREADS --timeout 0 _wsgi:app
README.md (timeseries_segmenter example)

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Time Series Segmenter for Label Studio

This example demonstrates a minimal ML backend that performs time series segmentation.
It trains a logistic regression model on labeled CSV data and predicts segments
for new tasks. The backend expects the labeling configuration to use
`<TimeSeries>` and `<TimeSeriesLabels>` tags.

## Before you begin

1. Install the [Label Studio ML backend](https://github.com/HumanSignal/label-studio-ml-backend?tab=readme-ov-file#quickstart).
2. Set `LABEL_STUDIO_HOST` and `LABEL_STUDIO_API_KEY` in `docker-compose.yml`
   so the backend can download labeled tasks for training.
## Quick start

```bash
# build and run
docker-compose up --build
```

Connect the model from the **Model** page in your project settings. The default
URL is `http://localhost:9090`.
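To confirm the container is reachable before connecting it, you can query the backend's health endpoint. This is a minimal sketch, assuming the standard `/health` route exposed by `label_studio_ml`-based servers and the default port from the Quick start above:

```python
import requests

# Hypothetical smoke test for the running backend; adjust the URL if you changed the port.
response = requests.get("http://localhost:9090/health", timeout=5)
response.raise_for_status()
print(response.json())  # expect a small JSON status payload if the server is up
```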
## Labeling configuration

Use a configuration similar to the following:

```xml
<View>
  <TimeSeriesLabels name="label" toName="ts">
    <Label value="Run"/>
    <Label value="Walk"/>
  </TimeSeriesLabels>
  <TimeSeries name="ts" valueType="url" value="$csv_url" timeColumn="time">
    <Channel column="sensorone" />
    <Channel column="sensortwo" />
  </TimeSeries>
</View>
```

The backend reads the time column and channels to build feature vectors for
training and prediction. Each CSV referenced by `csv_url` is expected to contain
at least the time column and the listed channels.
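As an illustration of the feature-building step described above, the sketch below loads one such CSV with pandas and assembles a per-row feature matrix. The file name `example.csv` and the column names are placeholders taken from the labeling configuration above, not fixed names used by the backend:

```python
import pandas as pd

# Sketch only: turn one task's CSV into per-row feature vectors.
# "time", "sensorone" and "sensortwo" mirror the labeling config above;
# the backend reads the actual column names from the <TimeSeries> tag.
df = pd.read_csv("example.csv")                        # CSV referenced by $csv_url
features = df[["sensorone", "sensortwo"]].to_numpy()   # one feature vector per row
timestamps = df["time"].to_numpy()                     # used to map rows back to segments
print(features.shape, timestamps[:5])
```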
## Training

Training starts automatically when annotations are created or updated. The model
collects all labeled segments, extracts the sensor values inside each segment, and
fits a logistic regression classifier. Model artifacts are stored in `MODEL_DIR`
(which defaults to the current directory).
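For a concrete picture of this step, here is a rough sketch of segment-based training with scikit-learn. It is not the backend's exact code: the `segments` list, its field names, and the column names are assumptions layered on top of the description above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed inputs for this sketch:
#   df       - the task's CSV loaded with pandas (see the previous sketch)
#   segments - labeled ranges taken from the annotations, e.g.
#              [{"start": 0.0, "end": 4.5, "label": "Run"}, ...] on the time column
X_parts, y = [], []
for seg in segments:
    mask = (df["time"] >= seg["start"]) & (df["time"] <= seg["end"])
    X_parts.append(df.loc[mask, ["sensorone", "sensortwo"]].to_numpy())  # rows inside the segment
    y.extend([seg["label"]] * int(mask.sum()))                           # one label per row

clf = LogisticRegression(max_iter=1000)
clf.fit(np.vstack(X_parts), y)  # per-row classifier reused at prediction time
# The real backend persists its artifacts under MODEL_DIR, e.g. with joblib.dump(...)
```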
## Prediction

For each task, the backend loads the CSV, applies the trained classifier to each
row, and groups consecutive predictions into labeled segments. Prediction scores
are averaged per segment and returned to Label Studio.
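The following sketch shows how per-row predictions might be collapsed into segments with averaged scores, continuing the hypothetical `clf` and `df` objects from the sketches above; the real backend turns such segments into Label Studio results for the `<TimeSeriesLabels>` control:

```python
# Sketch only: group consecutive rows with the same predicted label into one
# segment and average the winning class probability over that segment.
labels = clf.predict(df[["sensorone", "sensortwo"]])
scores = clf.predict_proba(df[["sensorone", "sensortwo"]]).max(axis=1)

segments = []
start = 0
for i in range(1, len(labels) + 1):
    if i == len(labels) or labels[i] != labels[start]:
        segments.append({
            "start": df["time"].iloc[start],
            "end": df["time"].iloc[i - 1],
            "label": str(labels[start]),
            "score": float(scores[start:i].mean()),  # averaged per segment
        })
        start = i

print(segments[:3])
```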
_wsgi.py

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
import os
import argparse
import json
import logging
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "[%(asctime)s] [%(levelname)s] [%(name)s::%(funcName)s::%(lineno)d] %(message)s"
        }
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": os.getenv('LOG_LEVEL'),
            "stream": "ext://sys.stdout",
            "formatter": "standard"
        }
    },
    "root": {
        "level": os.getenv('LOG_LEVEL'),
        "handlers": [
            "console"
        ],
        "propagate": True
    }
})

from label_studio_ml.api import init_app
from model import TimeSeriesSegmenter


_DEFAULT_CONFIG_PATH = os.path.join(os.path.dirname(__file__), 'config.json')


def get_kwargs_from_config(config_path=_DEFAULT_CONFIG_PATH):
    if not os.path.exists(config_path):
        return dict()
    with open(config_path) as f:
        config = json.load(f)
    assert isinstance(config, dict)
    return config


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Label studio')
    parser.add_argument(
        '-p', '--port', dest='port', type=int, default=9090,
        help='Server port')
    parser.add_argument(
        '--host', dest='host', type=str, default='0.0.0.0',
        help='Server host')
    parser.add_argument(
        '--kwargs', '--with', dest='kwargs', metavar='KEY=VAL', nargs='+', type=lambda kv: kv.split('='),
        help='Additional LabelStudioMLBase model initialization kwargs')
    parser.add_argument(
        '-d', '--debug', dest='debug', action='store_true',
        help='Switch debug mode')
    parser.add_argument(
        '--log-level', dest='log_level', choices=['DEBUG', 'INFO', 'WARNING', 'ERROR'], default=None,
        help='Logging level')
    parser.add_argument(
        '--model-dir', dest='model_dir', default=os.path.dirname(__file__),
        help='Directory where models are stored (relative to the project directory)')
    parser.add_argument(
        '--check', dest='check', action='store_true',
        help='Validate model instance before launching server')
    parser.add_argument('--basic-auth-user',
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_USER', None),
                        help='Basic auth user')

    parser.add_argument('--basic-auth-pass',
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_PASS', None),
                        help='Basic auth pass')

    args = parser.parse_args()

    # setup logging level
    if args.log_level:
        logging.root.setLevel(args.log_level)

    def isfloat(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    def parse_kwargs():
        param = dict()
        for k, v in args.kwargs:
            if v.isdigit():
                param[k] = int(v)
            elif v == 'True' or v == 'true':
                param[k] = True
            elif v == 'False' or v == 'false':
                param[k] = False
            elif isfloat(v):
                param[k] = float(v)
            else:
                param[k] = v
        return param

    kwargs = get_kwargs_from_config()

    if args.kwargs:
        kwargs.update(parse_kwargs())

    if args.check:
        print('Check "' + TimeSeriesSegmenter.__name__ + '" instance creation..')
        model = TimeSeriesSegmenter(**kwargs)

    app = init_app(model_class=TimeSeriesSegmenter, basic_auth_user=args.basic_auth_user, basic_auth_pass=args.basic_auth_pass)

    app.run(host=args.host, port=args.port, debug=args.debug)

else:
    # for uWSGI use
    app = init_app(model_class=TimeSeriesSegmenter)
docker-compose.yml

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
version: "3.8"

services:
  timeseries_segmenter:
    container_name: timeseries_segmenter
    image: heartexlabs/label-studio-ml-backend:timeseries-segmenter
    init: true
    build:
      context: .
      args:
        TEST_ENV: ${TEST_ENV}
    environment:
      # LABEL_STUDIO_HOST: This is the host URL for Label Studio, used for training.
      # It can be set via environment variable "LABEL_STUDIO_HOST".
      # If not set, it defaults to 'http://localhost:8080'.
      - LABEL_STUDIO_HOST=${LABEL_STUDIO_HOST:-http://localhost:8080}
      # LABEL_STUDIO_API_KEY: This is the API key for Label Studio, used for training.
      # It can be set via environment variable "LABEL_STUDIO_API_KEY".
      # There is no default value for this, so it must be set.
      - LABEL_STUDIO_API_KEY=${LABEL_STUDIO_API_KEY}
      # START_TRAINING_EACH_N_UPDATES: This is the number of updates after which training starts.
      # It is an integer value and can be set via environment variable "START_TRAINING_EACH_N_UPDATES".
      # If not set, it defaults to 10.
      - START_TRAINING_EACH_N_UPDATES=${START_TRAINING_EACH_N_UPDATES:-10}
      # specify these parameters if you want to use basic auth for the model server
      - BASIC_AUTH_USER=
      - BASIC_AUTH_PASS=
      # set the log level for the model server
      - LOG_LEVEL=DEBUG
      # any other parameters that you want to pass to the model server
      - ANY=PARAMETER
      # specify the number of workers and threads for the model server
      - WORKERS=1
      - THREADS=8
      # specify the model directory (likely you don't need to change this)
      - MODEL_DIR=/data/models
    ports:
      - "9090:9090"
    volumes:
      - "./data/server:/data"
