Description
Describe the Bug
I'm trying to train the YOLO11n model with Comet ML enabled. However, after the first epoch finishes I get a "KeyError: 85", and the stack trace leads into the Comet callback.
Expected behavior
Training should have continued to the next epochs.
Where is the issue?
- Comet Python SDK
- Comet UI
- Third Party Integrations (Hugging Face, TensorboardX, PyTorch Lightning, etc.)
To Reproduce
Here is my Python code:
import comet_ml
from ultralytics import YOLO
comet_ml.login(project_name='yolo11code_teste1')
model = YOLO("yolo11n.pt")
results = model.train(
    data="coco.yaml",
    project="yolo11code_teste1",
    batch=16,
    save_period=1,
    save_json=True,
    epochs=100,
    imgsz=320,
)
Stack Trace
If possible please include the full stack trace of your issue here
Ultralytics 8.3.59 🚀 Python-3.10.12 torch-2.0.0+cu117 CUDA:0 (NVIDIA GeForce RTX 3060 Laptop GPU, 5938MiB)
engine/trainer: task=detect, mode=train, model=yolo11n.pt, data=coco.yaml, epochs=100, time=None, patience=100, batch=16, imgsz=320, save=True, save_period=1, cache=False, device=None, workers=8, project=yolo11code_teste1, name=train14, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=True, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=None, workspace=None, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, copy_paste_mode=flip, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=yolo11code_teste1/train14
                   from  n    params  module                                       arguments
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]
  2                  -1  1      6640  ultralytics.nn.modules.block.C3k2            [32, 64, 1, False, 0.25]
  3                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
  4                  -1  1     26080  ultralytics.nn.modules.block.C3k2            [64, 128, 1, False, 0.25]
  5                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
  6                  -1  1     87040  ultralytics.nn.modules.block.C3k2            [128, 128, 1, True]
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
  8                  -1  1    346112  ultralytics.nn.modules.block.C3k2            [256, 256, 1, True]
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]
 10                  -1  1    249728  ultralytics.nn.modules.block.C2PSA           [256, 256, 1]
 11                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 12             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 13                  -1  1    111296  ultralytics.nn.modules.block.C3k2            [384, 128, 1, False]
 14                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 15             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 16                  -1  1     32096  ultralytics.nn.modules.block.C3k2            [256, 64, 1, False]
 17                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
 18            [-1, 13]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 19                  -1  1     86720  ultralytics.nn.modules.block.C3k2            [192, 128, 1, False]
 20                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
 21            [-1, 10]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 22                  -1  1    378880  ultralytics.nn.modules.block.C3k2            [384, 256, 1, True]
 23        [16, 19, 22]  1    464912  ultralytics.nn.modules.head.Detect           [80, [64, 128, 256]]
YOLO11n summary: 319 layers, 2,624,080 parameters, 2,624,064 gradients, 6.6 GFLOPs
Transferred 499/499 items from pretrained weights
COMET WARNING: To get all data logged automatically, import comet_ml before the following modules: torch.
COMET WARNING: As you are running in a Jupyter environment, you will need to call `experiment.end()` when finished to ensure all metrics and code are logged before exiting.
COMET INFO: Experiment is live on comet.com https://www.comet.com/antoniodourado/yolo11code-teste1/512d631dea2143288b7b45553c89faa2
COMET INFO: Couldn't find a Git repository in '/home/dourado/ml/yolo11code' nor in any parent directory. Set `COMET_GIT_DIRECTORY` if your Git Repository is elsewhere.
Freezing layer 'model.23.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks...
AMP: checks passed ✅
train: Scanning /home/dourado/ml/datasets/coco/labels/train2017.cache... 117266 images, 1021 backgrounds, 0 corrupt: 100%|██████████| 118287/118287 [00:00<?, ?it/s]
val: Scanning /home/dourado/ml/datasets/coco/labels/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
Plotting labels to yolo11code_teste1/train14/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: SGD(lr=0.01, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0)
Image sizes 320 train, 320 val
Using 8 dataloader workers
Logging results to yolo11code_teste1/train14
Starting training for 100 epochs...
      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      1/100     0.856G      1.298      1.607       1.17        238        320: 100%|██████████| 7393/7393 [07:33<00:00, 16.31it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 157/157 [00:17<00:00, 8.78it/s]
                   all       5000      36335      0.538      0.332      0.345      0.232
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/tmp/ipykernel_186493/1161498898.py in <module>
----> 1 results = model.train(
2 data="coco.yaml",
3 project="yolo11code_teste1",
4 batch=16,
5 save_period=1,
~/.local/lib/python3.10/site-packages/ultralytics/engine/model.py in train(self, trainer, **kwargs)
804
805 self.trainer.hub_session = self.session # attach optional HUB session
--> 806 self.trainer.train()
807 # Update model and cfg after training
808 if RANK in {-1, 0}:
~/.local/lib/python3.10/site-packages/ultralytics/engine/trainer.py in train(self)
205
206 else:
--> 207 self._do_train(world_size)
208
209 def _setup_scheduler(self):
~/.local/lib/python3.10/site-packages/ultralytics/engine/trainer.py in _do_train(self, world_size)
451 self.scheduler.last_epoch = self.epoch # do not move
452 self.stop |= epoch >= self.epochs # stop if exceeded epochs
--> 453 self.run_callbacks("on_fit_epoch_end")
454 self._clear_memory()
455
~/.local/lib/python3.10/site-packages/ultralytics/engine/trainer.py in run_callbacks(self, event)
166 """Run all existing callbacks associated with a particular event."""
167 for callback in self.callbacks.get(event, []):
--> 168 callback(self)
169
170 def train(self):
~/.local/lib/python3.10/site-packages/ultralytics/utils/callbacks/comet.py in on_fit_epoch_end(trainer)
358 _log_confusion_matrix(experiment, trainer, curr_step, curr_epoch)
359 if _should_log_image_predictions():
--> 360 _log_image_predictions(experiment, trainer.validator, curr_step)
361
362
~/.local/lib/python3.10/site-packages/ultralytics/utils/callbacks/comet.py in _log_image_predictions(experiment, validator, curr_step)
261
262 image_path = Path(image_path)
--> 263 annotations = _fetch_annotations(
264 img_idx,
265 image_path,
~/.local/lib/python3.10/site-packages/ultralytics/utils/callbacks/comet.py in _fetch_annotations(img_idx, image_path, batch, prediction_metadata_map, class_label_map)
192 img_idx, image_path, batch, class_label_map
193 )
--> 194 prediction_annotations = _format_prediction_annotations_for_detection(
195 image_path, prediction_metadata_map, class_label_map
196 )
~/.local/lib/python3.10/site-packages/ultralytics/utils/callbacks/comet.py in _format_prediction_annotations_for_detection(image_path, metadata, class_label_map)
180 cls_label = prediction["category_id"]
181 if class_label_map:
--> 182 cls_label = str(class_label_map[cls_label])
183
184 data.append({"boxes": [boxes], "label": cls_label, "score": score})
KeyError: 85
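The last frame of the trace can be reproduced without Ultralytics or Comet. This is only a minimal sketch of what seems to be happening, under two assumptions that are not confirmed here: that `class_label_map` is the model's 80-entry names dict (keys 0 to 79, matching `Detect [80, ...]` in the log), and that with `save_json=True` on COCO the predictions carry COCO's 91-class category ids, which go above 79. `label_for` is a hypothetical helper written for illustration, not a function from either library.

```python
# Stand-in for the names dict of an 80-class model (assumption: keys 0..79).
class_label_map = {i: f"class_{i}" for i in range(80)}

# Stand-in for one prediction record; 85 is the id from the KeyError above.
prediction = {"category_id": 85, "score": 0.9}


def label_for(prediction, class_label_map):
    """Same lookup as the failing line, but falling back to the raw id."""
    cls_label = prediction["category_id"]
    if class_label_map:
        # dict.get returns the raw id when it is missing from the map
        cls_label = class_label_map.get(cls_label, cls_label)
    return str(cls_label)


# The direct indexing used in the traceback fails for ids outside the map.
try:
    class_label_map[prediction["category_id"]]
    raised = False
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")

# The tolerant lookup keeps going and reports the raw id as the label.
print(label_for(prediction, class_label_map))
```

If the assumptions hold, the crash depends on the combination of `save_json=True` and Comet image-prediction logging, so running without `save_json=True` may be worth trying as a temporary check.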