Commit 7ab9712

Merge branch 'master' of github.com:heartexlabs/label-studio-ml-backend into fix/gliner
2 parents 727e2a1 + 5603e9f commit 7ab9712

12 files changed, +499 -245 lines changed

.github/workflows/build.yml

Lines changed: 2 additions & 2 deletions
@@ -80,7 +80,7 @@ jobs:
           echo "image_branch_version=$image_branch_version" >> $GITHUB_OUTPUT

       - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3.6.1
+        uses: docker/setup-buildx-action@v3.7.1

       - name: Login to DockerHub
         if: ${{ !github.event.pull_request.head.repo.fork }}
@@ -109,7 +109,7 @@ jobs:
             core.setOutput("tags", tags);

       - name: Push Docker image
-        uses: docker/build-push-action@v6.7.0
+        uses: docker/build-push-action@v6.9.0
         id: docker_build_and_push
         with:
           context: "${{ env.DOCKER_EXAMPLES_DIRECTORY }}/${{ env.backend_dir_name }}"

.github/workflows/tests.yml

Lines changed: 2 additions & 2 deletions
@@ -121,15 +121,15 @@ jobs:

       - name: "Upload general coverage to Codecov"
         if: ${{ matrix.backend_dir_name == 'the_simplest_backend' }}
-        uses: codecov/codecov-action@v4.5.0
+        uses: codecov/codecov-action@v4.6.0
         with:
           name: codecov-general
           files: ./tests/${{ matrix.backend_dir_name }}_coverage.xml
           token: ${{ secrets.CODECOV_TOKEN }}
           fail_ci_if_error: false

       - name: "Upload ml-backend ${{ matrix.backend_dir_name }} coverage to Codecov"
-        uses: codecov/codecov-action@v4.5.0
+        uses: codecov/codecov-action@v4.6.0
         with:
           name: codecov-${{ matrix.backend_dir_name }}
           files: ./label_studio_ml/examples/${{ matrix.backend_dir_name }}/coverage.xml

label_studio_ml/examples/yolo/README.md

Lines changed: 234 additions & 56 deletions
Large diffs are not rendered by default.

label_studio_ml/examples/yolo/README_TIMELINE_LABELS.md

Lines changed: 2 additions & 2 deletions
@@ -82,7 +82,7 @@ This tutorial uses the [YOLO example](https://github.com/HumanSignal/label-studi
 | `model_classifier_f1_threshold` | float | 0.95 | F1 score threshold for early stopping during training. Set to prevent overfitting. |
 | `model_classifier_accuracy_threshold` | float | 1.00 | Accuracy threshold for early stopping during training. Set to prevent overfitting. |
 | `model_score_threshold` | float | 0.5 | Minimum confidence threshold for predictions. Labels with confidence below this threshold will be disregarded. |
-| `model_path` | string | None | Path to the custom YOLO model. See more in the section "Custom YOLO Models." |
+| `model_path` | string | None | Path to the custom YOLO model. See more in the section [Your own custom YOLO models](./README.md#your-own-custom-yolo-models). |

 **Note:** You can customize the neural network parameters directly in the labeling configuration by adjusting the attributes in the `<TimelineLabels>` tag.

@@ -158,7 +158,7 @@ and it generates predictions for each frame in the video.

 #### Custom YOLO models for feature extraction

-You can load your own YOLO models using the steps described in the [main README](./README.md#custom-yolo-models).
+You can load your own YOLO models using the steps described in the [main README](./README.md#your-own-custom-yolo-models).
 However, it should have similar architecture as `yolov8-cls` models. See `utils/neural_nets.py::cached_feature_extraction()` for more details.

 #### Cache folder
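
The `model_score_threshold` row in the table above describes dropping labels whose confidence falls below a cutoff. A minimal sketch of that filtering, assuming per-label confidences arrive as a plain dict; the function name `filter_by_score` and the sample data are hypothetical and not part of the repository:

```python
def filter_by_score(scores, threshold=0.5):
    """Keep only labels whose confidence is at or above the threshold.

    `scores` maps label name -> confidence. This shape is an assumption
    made for illustration, not the backend's actual data structure.
    """
    return {label: conf for label, conf in scores.items() if conf >= threshold}


# Hypothetical per-frame confidences for two timeline labels
frame_scores = {"Ball touch": 0.91, "Ball in frame": 0.42}
kept = filter_by_score(frame_scores, threshold=0.5)
print(kept)
```

Labels exactly at the threshold are kept here because the README only says that labels *below* it are disregarded.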

label_studio_ml/examples/yolo/control_models/base.py

Lines changed: 1 addition & 1 deletion
@@ -198,4 +198,4 @@ def __str__(self):

     class Config:
         arbitrary_types_allowed = True
-        protected_namespaces = ('__.*__', '_.*')  # Excludes 'model_'
+        protected_namespaces = ("__.*__", "_.*")  # Excludes 'model_'

label_studio_ml/examples/yolo/control_models/keypoint_labels.py

Lines changed: 10 additions & 7 deletions
@@ -36,7 +36,7 @@ def is_control_matched(cls, control) -> bool:

     def build_point_mapping(self):
         """Build a mapping between points and Label Studio labels, e.g.
-        <Label value="left_eye" predicted_values="person" model_index="2" /> => {"person::2": "left_eye"}
+        <Label value="nose" predicted_values="person" model_index="0" /> => {"person::0": "nose"}
         """
         mapping = {}
         for value, label_tag in self.control.labels_attrs.items():
@@ -80,12 +80,15 @@ def create_keypoints(self, results, path):
             )  # Convert normalized keypoints to percentages
             model_label = model_names[int(results[0].boxes.cls[bbox_index])]

+            point_logs = "\n".join(
+                [f' model_index="{i}", xy={xyn}' for i, xyn in enumerate(point_xyn)]
+            )
             logger.debug(
                 "----------------------\n"
                 f"task id > {path}\n"
                 f"type: {self.control}\n"
-                f"keypoints > {point_xyn}\n"
                 f"model label > {model_label}\n"
+                f"keypoints >\n{point_logs}\n"
                 f"confidences > {bbox_conf}\n"
             )

@@ -115,7 +118,7 @@ def create_keypoints(self, results, path):
                     logger.warning(
                         f"Point {index_name} not found in point map, "
                         f"you have to define it in the labeling config, e.g.:\n"
-                        f'<Label value="nose" predicted_values="person" index="1" />'
+                        f'<Label value="nose" predicted_values="person" model_index="0" />'
                     )
                     continue
                 point_label = self.point_map[index_name]
@@ -126,10 +129,10 @@ def create_keypoints(self, results, path):
                     "to_name": self.to_name,
                     "type": "keypointlabels",
                     "value": {
-                        "keypointlabels": [point_label],  # Keypoint label
-                        "width": self.point_size
-                        / image_width
-                        * 100,  # Keypoint width, just visual styling
+                        # point label
+                        "keypointlabels": [point_label],
+                        # point width, just visual styling
+                        "width": self.point_size / image_width * 100,
                         "x": x,
                         "y": y,
                     },
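
The `build_point_mapping` docstring in this file maps `<Label value="nose" predicted_values="person" model_index="0" />` to `{"person::0": "nose"}`. A minimal standalone sketch of that mapping, assuming the label attributes are available as a plain dict of dicts (a simplification of `self.control.labels_attrs`) and that `predicted_values` may hold a comma-separated list; both are assumptions for illustration, not the repository's exact code:

```python
def build_point_mapping(labels_attrs):
    """Map "<model class>::<keypoint index>" keys to Label Studio label values."""
    mapping = {}
    for value, attrs in labels_attrs.items():
        model_index = attrs.get("model_index")
        predicted_values = attrs.get("predicted_values")
        if model_index is None or not predicted_values:
            continue  # this label is not tied to a model keypoint
        # Assumption: several model classes may be given as a comma-separated list
        for model_class in predicted_values.split(","):
            mapping[f"{model_class.strip()}::{model_index}"] = value
    return mapping


# Hypothetical attributes, as they might be parsed from two <Label> tags
labels_attrs = {
    "nose": {"predicted_values": "person", "model_index": "0"},
    "left_eye": {"predicted_values": "person", "model_index": "1"},
}
point_map = build_point_mapping(labels_attrs)
print(point_map)
```

A keypoint whose `"<class>::<index>"` key is missing from this map is the case that triggers the "not found in point map" warning shown in the diff above.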

label_studio_ml/examples/yolo/control_models/timeline_labels.py

Lines changed: 41 additions & 20 deletions
@@ -7,7 +7,7 @@
     BaseNN,
     MultiLabelLSTM,
     cached_feature_extraction,
-    cached_yolo_predict
+    cached_yolo_predict,
 )
 from utils.converter import (
     get_label_map,
@@ -22,7 +22,7 @@
 class TimelineLabelsModel(ControlModel):
     """
     Class representing a TimelineLabels control tag for YOLO model.
-    See README_TIMELINE_LABELS.md for more details.
+    See README_TIMELINE_LABELS.md for more details.
     """

     type = "TimelineLabels"
@@ -50,7 +50,7 @@ def create(cls, *args, **kwargs):
                 f"TimelinesLabels model works in simple mode (without training), "
                 f"but no labels from YOLO model names are matched:\n{instance.control.name}\n"
                 f"Add labels from YOLO model names to the labeling config or use `predicted_values` to map them. "
-                f"As alternative option, you can set `model_trainable=\"true\"` in the TimelineLabels control tag "
+                f'As alternative option, you can set `model_trainable="true"` in the TimelineLabels control tag '
                 f"to train the model on the labels from the labeling config."
             )
         return instance
@@ -64,15 +64,21 @@ def predict_regions(self, video_path) -> List[Dict]:
     def create_timelines_simple(self, video_path):
         logger.debug(f"create_timelines_simple: {self.from_name}")
         # get yolo predictions
-        frame_results = cached_yolo_predict(self.model, video_path, self.model.model_name)
+        frame_results = cached_yolo_predict(
+            self.model, video_path, self.model.model_name
+        )

         # Initialize a dictionary to keep track of ongoing segments for each label
         model_names = self.model.names
         needed_ids = [i for i, name in model_names.items() if name in self.label_map]
-        needed_labels = [name for i, name in model_names.items() if name in self.label_map]
+        needed_labels = [
+            name for i, name in model_names.items() if name in self.label_map
+        ]

         probs = [frame.probs.data[needed_ids].cpu().numpy() for frame in frame_results]
-        label_map = {self.label_map[label]: idx for idx, label in enumerate(needed_labels)}
+        label_map = {
+            self.label_map[label]: idx for idx, label in enumerate(needed_labels)
+        }

         return convert_probs_to_timelinelabels(
             probs, label_map, self.control.name, self.model_score_threshold
@@ -81,7 +87,9 @@ def create_timelines_simple(self, video_path):
     def create_timelines_trainable(self, video_path):
         logger.debug(f"create_timelines_trainable: {self.from_name}")
         # extract features based on pre-trained yolo classification model
-        frame_results = cached_feature_extraction(self.model, video_path, self.model.model_name)
+        frame_results = cached_feature_extraction(
+            self.model, video_path, self.model.model_name
+        )

         yolo_probs = [frame.probs for frame in frame_results]
         path = self.get_classifier_path(self.project_id)
@@ -95,12 +103,22 @@ def create_timelines_trainable(self, video_path):
         # run predict and convert to timelinelabels
         probs = classifier.predict(yolo_probs)
         regions = convert_probs_to_timelinelabels(
-            probs, classifier.get_label_map(), self.control.name, self.model_score_threshold
+            probs,
+            classifier.get_label_map(),
+            self.control.name,
+            self.model_score_threshold,
         )

         return regions

     def fit(self, event, data, **kwargs):
+        if not self.trainable:
+            logger.debug(
+                'TimelineLabels model is in not trainable mode. '
+                'Use model_trainable="true" to enable training.'
+            )
+            return
+
         """Fit the model."""
         if event == "START_TRAINING":
             # TODO: the full training makes a lot of sense here, but it's not implemented yet
@@ -109,17 +127,20 @@ def fit(self, event, data, **kwargs):
             )

         if event in ("ANNOTATION_CREATED", "ANNOTATION_UPDATED"):
-            features, labels, label_map, project_id = self.load_features_and_labels(data)
+            features, labels, label_map, project_id = self.load_features_and_labels(
+                data
+            )
             classifier, path = self.load_classifier(features, label_map, project_id)
             return self.train_classifier(classifier, features, labels, path)

     def train_classifier(self, classifier, features, labels, path):
-        """ Train the classifier model for timelinelabels using incremental partial learning.
-        """
+        """Train the classifier model for timelinelabels using incremental partial learning."""
         # Stop training when accuracy or f1 score reaches this threshold, it helps to avoid overfitting
         # because we partially train it on a small dataset from one annotation only
         get = self.control.attr.get
-        epochs = int(get("model_classifier_epochs", 1000))  # Maximum number of training epochs
+        epochs = int(
+            get("model_classifier_epochs", 1000)
+        )  # Maximum number of training epochs
         f1_threshold = float(get("model_classifier_f1_threshold", 0.95))
         accuracy_threshold = float(get("model_classifier_accuracy_threshold", 1.00))

@@ -129,13 +150,13 @@ def train_classifier(self, classifier, features, labels, path):
             labels,
             epochs=epochs,
             f1_threshold=f1_threshold,
-            accuracy_threshold=accuracy_threshold
+            accuracy_threshold=accuracy_threshold,
         )
         classifier.save_and_cache(path)
         return result

     def load_classifier(self, features, label_map, project_id):
-        """ Load or create a classifier model for timelinelabels.
+        """Load or create a classifier model for timelinelabels.
         1. Load neural network parameters from labeling config.
         2. Try loading classifier model from memory cache, then from disk.
         3. Or create a new classifier instance if there wasn't successful loading, or if parameters have changed.
@@ -155,11 +176,11 @@ def load_classifier(self, features, label_map, project_id):
         # Create a new classifier instance if it doesn't exist
         # or if labeling config has changed
         if (
-                not classifier
-                or classifier.label_map != label_map
-                or classifier.sequence_size != sequence_size
-                or classifier.hidden_size != hidden_size
-                or classifier.num_layers != num_layers
+            not classifier
+            or classifier.label_map != label_map
+            or classifier.sequence_size != sequence_size
+            or classifier.hidden_size != hidden_size
+            or classifier.num_layers != num_layers
         ):
             logger.info("Creating a new classifier model for timelinelabels")
             input_size = len(features[0])
@@ -176,7 +197,7 @@ def load_classifier(self, features, label_map, project_id):
         return classifier, path

     def load_features_and_labels(self, data):
-        """ Load features and labels from the annotation
+        """Load features and labels from the annotation
         Args:
             data: event data, dictionary with keys 'task' and 'annotation'
         Returns:
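
The comments in `train_classifier` describe stopping training once accuracy or F1 reaches a threshold, to avoid overfitting on data from a single annotation. A minimal sketch of that stopping rule, assuming a hypothetical `train_one_epoch` callable that returns an `(accuracy, f1)` tuple; this is an illustration and not the `partial_fit` implementation from `utils/neural_nets.py`:

```python
def train_with_early_stopping(train_one_epoch, epochs=1000,
                              f1_threshold=0.95, accuracy_threshold=1.00):
    """Run up to `epochs` epochs, stopping once either metric reaches its threshold.

    `train_one_epoch` is a hypothetical callable that trains for one epoch
    and returns an (accuracy, f1) tuple.
    """
    epoch = 0
    accuracy = f1 = 0.0
    for epoch in range(1, epochs + 1):
        accuracy, f1 = train_one_epoch()
        if accuracy >= accuracy_threshold or f1 >= f1_threshold:
            break  # good enough: stop before overfitting the small dataset
    return {"epochs_run": epoch, "accuracy": accuracy, "f1": f1}


# Toy stand-in for a real training step: both metrics improve by 0.1 per epoch
metrics = iter((0.1 * i, 0.1 * i) for i in range(1, 11))
result = train_with_early_stopping(lambda: next(metrics), epochs=10, f1_threshold=0.75)
print(result["epochs_run"])
```

The defaults mirror the values read from the labeling config in the diff (`model_classifier_epochs` 1000, `model_classifier_f1_threshold` 0.95, `model_classifier_accuracy_threshold` 1.00).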

label_studio_ml/examples/yolo/control_models/video_rectangle.py

Lines changed: 3 additions & 1 deletion
@@ -74,7 +74,9 @@ def predict_regions(self, path) -> List[Dict]:

         # run model track
         try:
-            results = self.model.track(path, conf=conf, iou=iou, tracker=tracker, stream=True)
+            results = self.model.track(
+                path, conf=conf, iou=iou, tracker=tracker, stream=True
+            )
         finally:
             # clean temporary file
             if tmp_yaml and os.path.exists(tmp_yaml):

label_studio_ml/examples/yolo/tests/test_neural_nets.py

Lines changed: 8 additions & 1 deletion
@@ -30,7 +30,14 @@ def test_multi_label_lstm():
     ).tolist()  # Shape: (batch_size, seq_len, output_size)

     # Perform partial training with batch size of 16
-    model.partial_fit(data, labels, batch_size=16, epochs=1000, accuracy_threshold=0.999, f1_threshold=0.999)
+    model.partial_fit(
+        data,
+        labels,
+        batch_size=16,
+        epochs=1000,
+        accuracy_threshold=0.999,
+        f1_threshold=0.999,
+    )

     # Example prediction
     predictions = model.predict(data)
