Commit 27684ee

Add custom models (#1687)
1 parent cc740c9 commit 27684ee

5 files changed: +274 -7 lines changed

docs/source/docs/objectDetection/about-object-detection.md

Lines changed: 9 additions & 5 deletions
@@ -4,16 +4,16 @@

 PhotonVision supports object detection using neural network accelerator hardware built into Orange Pi 5/5+ coprocessors. The Neural Processing Unit, or NPU, is [used by PhotonVision](https://github.com/PhotonVision/rknn_jni/tree/main) to massively accelerate certain math operations like those needed for running ML-based object detection.

-For the 2025 season, PhotonVision does not currently ship with a pre-trained detector. If teams are interested in using object detection, they can follow the custom process outlined {ref}`below <docs/objectDetection/about-object-detection:Uploading Custom Models>`.
+For the 2025 season, PhotonVision does not currently ship with a pre-trained detector. If teams are interested in using object detection, they can follow the custom process outlined {ref}`below <docs/objectDetection/about-object-detection:Uploading Custom Models>`.

 ## Tracking Objects

 Before you get started with object detection, ensure that you have followed the previous sections on installation, wiring, and networking. Next, open the Web UI, go to the top right card, and switch to the “Object Detection” type. You should see a screen similar to the image above.

-PhotonVision currently ships with a NOTE detector based on a [YOLOv5 model](https://docs.ultralytics.com/yolov5/). This model is trained to detect one or more object "classes" (such as cars, stoplights, or in our case, NOTES) in an input image. For each detected object, the model outputs a bounding box around where in the image the object is located, what class the object belongs to, and a unitless confidence between 0 and 1.
+PhotonVision does not currently ship with a pretrained model. Models are trained to detect one or more object "classes" (such as cars or stoplights) in an input image. For each detected object, the model outputs a bounding box around where in the image the object is located, what class the object belongs to, and a unitless confidence between 0 and 1.

 :::{note}
-This model output means that while its fairly easy to say that "this rectangle probably contains a NOTE", we don't have any information about the NOTE's orientation or location. Further math in user code would be required to make estimates about where an object is physically located relative to the camera.
+This model output means that while it's fairly easy to say that "this rectangle probably contains an object", we don't have any information about the object's orientation or location. Further math in user code would be required to make estimates about where an object is physically located relative to the camera.
 :::

 ## Tuning and Filtering
@@ -40,7 +40,11 @@ Coming soon!
 ## Uploading Custom Models

 :::{warning}
-PhotonVision currently ONLY supports YOLOv5 models trained and converted to `.rknn` format for RK3588 CPUs! Other models require different post-processing code and will NOT work. The model conversion process is also highly particular. Proceed with care.
+PhotonVision currently ONLY supports 640x640 YOLOv5 & YOLOv8 models trained and converted to `.rknn` format for RK3588 CPUs! Other models require different post-processing code and will NOT work. The model conversion process is also highly particular. Proceed with care.
 :::

-Use a program like WinSCP or FileZilla to access your coprocessor's filesystem, and copy the new `.rknn` model file into /home/pi. Next, SSH into the coprocessor and `sudo mv /path/to/new/model.rknn /opt/photonvision/photonvision_config/models/NEW-MODEL-NAME.rknn`. Repeat this process with the labels file, which should contain one line per label the model outputs with no training newline. Next, restart PhotonVision via the web UI.
+In the settings, under `Device Control`, there's an option to upload a new object detection model. Naming convention
+should be `name-verticalResolution-horizontalResolution-modelType`. Additionally, the labels
+file ought to have the same name as the RKNN file, with `-labels` appended to the end. For example, if the
+RKNN file is named `note-640-640-yolov5s.rknn`, the labels file should be named
+`note-640-640-yolov5s-labels.txt`.
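
The naming convention above can be checked before uploading. The following is a minimal sketch, not part of the commit: the regular expression is copied from the server-side check in `RequestHandler.java` further down in this diff, while the helper `labelsNameFor` is a hypothetical name introduced only for illustration.

```typescript
// Pattern copied from the server-side check in RequestHandler.java (this commit).
const MODEL_NAME_PATTERN = /^[a-zA-Z0-9]+-\d+-\d+-yolov[58][a-z]*\.rknn$/;

// Hypothetical helper (not in the commit): returns the labels file name that
// pairs with a model file name, or null when the model name does not follow
// the name-verticalResolution-horizontalResolution-modelType convention.
function labelsNameFor(modelFileName: string): string | null {
  if (!MODEL_NAME_PATTERN.test(modelFileName)) return null;
  // Swap the ".rknn" suffix for "-labels.txt".
  return modelFileName.replace(/\.rknn$/, "-labels.txt");
}

console.log(labelsNameFor("note-640-640-yolov5s.rknn")); // note-640-640-yolov5s-labels.txt
console.log(labelsNameFor("my_model.rknn")); // null
```

Note that the name segment accepts only letters and digits, so a name containing a hyphen or underscore (for example `power-cell-640-640-yolov8n.rknn`) would be rejected by this pattern.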
photon-client/src/components/settings/ObjectDetectionCard.vue

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
+<script setup lang="ts">
+import { ref, computed } from "vue";
+import axios from "axios";
+import { useStateStore } from "@/stores/StateStore";
+import { useSettingsStore } from "@/stores/settings/GeneralSettingsStore";
+
+const showObjectDetectionImportDialog = ref(false);
+const importRKNNFile = ref<File | null>(null);
+const importLabelsFile = ref<File | null>(null);
+
+const handleObjectDetectionImport = () => {
+  if (importRKNNFile.value === null || importLabelsFile.value === null) return;
+
+  const formData = new FormData();
+  formData.append("rknn", importRKNNFile.value);
+  formData.append("labels", importLabelsFile.value);
+
+  useStateStore().showSnackbarMessage({
+    message: "Importing Object Detection Model...",
+    color: "secondary",
+    timeout: -1
+  });
+
+  axios
+    .post("/utils/importObjectDetectionModel", formData, {
+      headers: { "Content-Type": "multipart/form-data" }
+    })
+    .then((response) => {
+      useStateStore().showSnackbarMessage({
+        message: response.data.text || response.data,
+        color: "success"
+      });
+    })
+    .catch((error) => {
+      if (error.response) {
+        useStateStore().showSnackbarMessage({
+          color: "error",
+          message: error.response.data.text || error.response.data
+        });
+      } else if (error.request) {
+        useStateStore().showSnackbarMessage({
+          color: "error",
+          message: "Error while trying to process the request! The backend didn't respond."
+        });
+      } else {
+        useStateStore().showSnackbarMessage({
+          color: "error",
+          message: "An error occurred while trying to process the request."
+        });
+      }
+    });
+
+  showObjectDetectionImportDialog.value = false;
+  importRKNNFile.value = null;
+  importLabelsFile.value = null;
+};
+
+// Filters out models that are not supported by the current backend, and returns a flattened list.
+const supportedModels = computed(() => {
+  const { availableModels, supportedBackends } = useSettingsStore().general;
+  return supportedBackends.flatMap((backend) => availableModels[backend] || []);
+});
+</script>
+
+<template>
+  <v-card dark class="mb-3" style="background-color: #006492">
+    <v-card-title class="pa-6">Object Detection</v-card-title>
+    <div class="pa-6 pt-0">
+      <v-row>
+        <v-col cols="12">
+          <v-btn color="secondary" @click="() => (showObjectDetectionImportDialog = true)" class="justify-center">
+            <v-icon left class="open-icon"> mdi-import </v-icon>
+            <span class="open-label">Import New Model</span>
+          </v-btn>
+          <v-dialog
+            v-model="showObjectDetectionImportDialog"
+            width="600"
+            @input="
+              () => {
+                importRKNNFile = null;
+                importLabelsFile = null;
+              }
+            "
+          >
+            <v-card color="primary" dark>
+              <v-card-title>Import New Object Detection Model</v-card-title>
+              <v-card-text>
+                Upload a new object detection model to this device that can be used in a pipeline. Naming convention
+                should be <code>name-verticalResolution-horizontalResolution-modelType</code>. Additionally, the labels
+                file ought to have the same name as the RKNN file, with <code>-labels</code> appended to the end. For
+                example, if the RKNN file is named <code>note-640-640-yolov5s.rknn</code>, the labels file should be
+                named <code>note-640-640-yolov5s-labels.txt</code>. Note that ONLY 640x640 YOLOv5 & YOLOv8 models
+                trained and converted to <code>.rknn</code> format for RK3588 CPUs are currently supported!
+                <v-row class="mt-6 ml-4 mr-8">
+                  <v-file-input label="RKNN File" v-model="importRKNNFile" accept=".rknn" />
+                </v-row>
+                <v-row class="mt-6 ml-4 mr-8">
+                  <v-file-input label="Labels File" v-model="importLabelsFile" accept=".txt" />
+                </v-row>
+                <v-row
+                  class="mt-12 ml-8 mr-8 mb-1"
+                  style="display: flex; align-items: center; justify-content: center"
+                  align="center"
+                >
+                  <v-btn
+                    color="secondary"
+                    :disabled="importRKNNFile === null || importLabelsFile === null"
+                    @click="handleObjectDetectionImport"
+                  >
+                    <v-icon left class="open-icon"> mdi-import </v-icon>
+                    <span class="open-label">Import Object Detection Model</span>
+                  </v-btn>
+                </v-row>
+              </v-card-text>
+            </v-card>
+          </v-dialog>
+        </v-col>
+      </v-row>
+      <v-row>
+        <v-col cols="12">
+          <v-simple-table fixed-header height="100%" dense dark>
+            <thead style="font-size: 1.25rem">
+              <tr>
+                <th class="text-left">Available Models</th>
+              </tr>
+            </thead>
+            <tbody>
+              <tr v-for="model in supportedModels" :key="model">
+                <td>{{ model }}</td>
+              </tr>
+            </tbody>
+          </v-simple-table>
+        </v-col>
+      </v-row>
+    </div>
+  </v-card>
+</template>
+
+<style scoped lang="scss">
+.v-btn {
+  width: 100%;
+}
+@media only screen and (max-width: 351px) {
+  .open-icon {
+    margin: 0 !important;
+  }
+  .open-label {
+    display: none;
+  }
+}
+.v-data-table {
+  width: 100%;
+  height: 100%;
+  text-align: center;
+  background-color: #006492 !important;
+
+  th,
+  td {
+    background-color: #006492 !important;
+    font-size: 1rem !important;
+    color: white !important;
+  }
+
+  td {
+    font-family: monospace !important;
+  }
+
+  tbody :hover td {
+    background-color: #005281 !important;
+  }
+
+  ::-webkit-scrollbar {
+    width: 0;
+    height: 0.55em;
+    border-radius: 5px;
+  }
+
+  ::-webkit-scrollbar-track {
+    -webkit-box-shadow: inset 0 0 6px rgba(0, 0, 0, 0.3);
+    border-radius: 10px;
+  }
+
+  ::-webkit-scrollbar-thumb {
+    background-color: #ffd843;
+    border-radius: 10px;
+  }
+}
+</style>
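
To illustrate what the `supportedModels` computed property in this component returns, here is a standalone sketch with made-up data. The backend keys (`rknn`, `other`, `missing`) and model names are hypothetical, and a plain loop stands in for `flatMap` so the snippet does not depend on any particular compiler target; the real store types may differ.

```typescript
// Assumed shape: a map from backend name to the model names available for it.
type ModelsByBackend = Record<string, string[] | undefined>;

// Collects the models of every supported backend into one flat list.
// Backends without an entry in availableModels contribute nothing.
function supportedModels(availableModels: ModelsByBackend, supportedBackends: string[]): string[] {
  const flattened: string[] = [];
  for (const backend of supportedBackends) {
    flattened.push(...(availableModels[backend] ?? []));
  }
  return flattened;
}

// Hypothetical data: "other" is available but not supported, "missing" is
// supported but has no models, so only the "rknn" model is listed.
const exampleModels: ModelsByBackend = {
  rknn: ["note-640-640-yolov5s"],
  other: ["unused-model"]
};
console.log(JSON.stringify(supportedModels(exampleModels, ["rknn", "missing"]))); // ["note-640-640-yolov5s"]
```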

photon-client/src/views/GeneralSettingsView.vue

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,7 @@
 <script setup lang="ts">
 import MetricsCard from "@/components/settings/MetricsCard.vue";
 import DeviceControlCard from "@/components/settings/DeviceControlCard.vue";
+import ObjectDetectionCard from "@/components/settings/ObjectDetectionCard.vue";
 import NetworkingCard from "@/components/settings/NetworkingCard.vue";
 import LightingControlCard from "@/components/settings/LEDControlCard.vue";
 import { useSettingsStore } from "@/stores/settings/GeneralSettingsStore";
@@ -12,6 +13,7 @@ import ApriltagControlCard from "@/components/settings/ApriltagControlCard.vue";
     <MetricsCard />
     <DeviceControlCard />
     <NetworkingCard />
+    <ObjectDetectionCard v-if="useSettingsStore().general.supportedBackends.length > 0" />
     <LightingControlCard v-if="useSettingsStore().lighting.supported" />
     <ApriltagControlCard />
   </div>

photon-server/src/main/java/org/photonvision/server/RequestHandler.java

Lines changed: 72 additions & 2 deletions
@@ -29,6 +29,7 @@
 import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.Optional;
+import java.util.regex.Pattern;
 import javax.imageio.ImageIO;
 import org.apache.commons.io.FileUtils;
 import org.opencv.core.Mat;
@@ -37,6 +38,7 @@
 import org.opencv.imgcodecs.Imgcodecs;
 import org.photonvision.common.configuration.ConfigManager;
 import org.photonvision.common.configuration.NetworkConfig;
+import org.photonvision.common.configuration.NeuralNetworkModelManager;
 import org.photonvision.common.dataflow.DataChangeDestination;
 import org.photonvision.common.dataflow.DataChangeService;
 import org.photonvision.common.dataflow.events.IncomingWebSocketEvent;
@@ -98,7 +100,8 @@ public static void onSettingsImportRequest(Context ctx) {

         ConfigManager.getInstance().setWriteTaskEnabled(false);
         ConfigManager.getInstance().disableFlushOnShutdown();
-        // We want to delete the -whole- zip file, so we need to teardown loggers for now
+        // We want to delete the -whole- zip file, so we need to teardown loggers for
+        // now
         logger.info("Writing new settings zip (logs may be truncated)...");
         Logger.closeAllLoggers();
         if (ConfigManager.saveUploadedSettingsZip(tempFilePath.get())) {
@@ -543,6 +546,72 @@ public static void onProgramRestartRequest(Context ctx) {
         restartProgram();
     }

+    public static void onObjectDetectionModelImportRequest(Context ctx) {
+        try {
+            // Retrieve the uploaded files
+            var modelFile = ctx.uploadedFile("rknn");
+            var labelsFile = ctx.uploadedFile("labels");
+
+            if (modelFile == null || labelsFile == null) {
+                ctx.status(400);
+                ctx.result(
+                        "No File was sent with the request. Make sure that the model and labels files are sent at the keys 'rknn' and 'labels'");
+                logger.error(
+                        "No File was sent with the request. Make sure that the model and labels files are sent at the keys 'rknn' and 'labels'");
+                return;
+            }
+
+            if (!modelFile.extension().contains("rknn") || !labelsFile.extension().contains("txt")) {
+                ctx.status(400);
+                ctx.result(
+                        "The uploaded files were not of type 'rknn' and 'txt'. The uploaded files should be a .rknn and .txt file.");
+                logger.error(
+                        "The uploaded files were not of type 'rknn' and 'txt'. The uploaded files should be a .rknn and .txt file.");
+                return;
+            }
+
+            // verify naming convention
+            // this check will need to be modified if different model types are added
+
+            Pattern modelPattern = Pattern.compile("^[a-zA-Z0-9]+-\\d+-\\d+-yolov[58][a-z]*\\.rknn$");
+
+            Pattern labelsPattern =
+                    Pattern.compile("^[a-zA-Z0-9]+-\\d+-\\d+-yolov[58][a-z]*-labels\\.txt$");
+
+            if (!modelPattern.matcher(modelFile.filename()).matches()
+                    || !labelsPattern.matcher(labelsFile.filename()).matches()) {
+                ctx.status(400);
+                ctx.result("The uploaded files were not named correctly.");
+                logger.error("The uploaded object detection model files were not named correctly.");
+                return;
+            }
+
+            // TODO move into neural network manager
+
+            var modelPath =
+                    Paths.get(
+                            ConfigManager.getInstance().getModelsDirectory().toString(), modelFile.filename());
+            var labelsPath =
+                    Paths.get(
+                            ConfigManager.getInstance().getModelsDirectory().toString(), labelsFile.filename());
+
+            try (FileOutputStream out = new FileOutputStream(modelPath.toFile())) {
+                modelFile.content().transferTo(out);
+            }
+
+            try (FileOutputStream out = new FileOutputStream(labelsPath.toFile())) {
+                labelsFile.content().transferTo(out);
+            }
+
+            NeuralNetworkModelManager.getInstance()
+                    .discoverModels(ConfigManager.getInstance().getModelsDirectory());
+
+            ctx.status(200).result("Successfully uploaded object detection model");
+        } catch (Exception e) {
+            ctx.status(500).result("Error processing files: " + e.getMessage());
+        }
+    }
+
     public static void onDeviceRestartRequest(Context ctx) {
         ctx.status(HardwareManager.getInstance().restartDevice() ? 204 : 500);
     }
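
The order of checks in `onObjectDetectionModelImportRequest` can be summarized as a pure function. This is a sketch for reading along, not code from the commit: the types `UploadedFileInfo` and `ImportResult`, the function `validateImport`, and the shortened messages are hypothetical, while the two patterns are copied from the handler.

```typescript
// Hypothetical stand-ins for the uploaded-file fields the handler reads.
interface UploadedFileInfo {
  filename: string;
  extension: string;
}

interface ImportResult {
  httpStatus: number;
  message: string;
}

// Patterns copied from the handler in this commit.
const MODEL_PATTERN = /^[a-zA-Z0-9]+-\d+-\d+-yolov[58][a-z]*\.rknn$/;
const LABELS_PATTERN = /^[a-zA-Z0-9]+-\d+-\d+-yolov[58][a-z]*-labels\.txt$/;

// Mirrors the handler's checks in order: presence, extension, then naming.
function validateImport(model: UploadedFileInfo | null, labels: UploadedFileInfo | null): ImportResult {
  if (model === null || labels === null) {
    return { httpStatus: 400, message: "missing file" };
  }
  if (model.extension.indexOf("rknn") === -1 || labels.extension.indexOf("txt") === -1) {
    return { httpStatus: 400, message: "wrong file type" };
  }
  if (!MODEL_PATTERN.test(model.filename) || !LABELS_PATTERN.test(labels.filename)) {
    return { httpStatus: 400, message: "wrong file name" };
  }
  return { httpStatus: 200, message: "accepted" };
}

// The two names are validated independently, so a mismatched pair still passes.
const mismatched = validateImport(
  { filename: "note-640-640-yolov5s.rknn", extension: ".rknn" },
  { filename: "other-320-320-yolov8n-labels.txt", extension: ".txt" }
);
console.log(mismatched.httpStatus); // 200
```

As far as the code in this hunk shows, nothing ties the labels file name to the model file name beyond each matching its own pattern; whether a later step pairs them is not visible here.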
@@ -602,7 +671,8 @@ public static void onCalibrationSnapshotRequest(Context ctx) {
             return;
         }

-        // encode as jpeg to save even more space. reduces size of a 1280p image from 300k to 25k
+        // encode as jpeg to save even more space. reduces size of a 1280p image from
+        // 300k to 25k
         var jpegBytes = new MatOfByte();
         Mat img = null;
         try {

photon-server/src/main/java/org/photonvision/server/Server.java

Lines changed: 3 additions & 0 deletions
@@ -127,6 +127,9 @@ private static void start(int port) {

         // Utilities
         app.post("/api/utils/offlineUpdate", RequestHandler::onOfflineUpdateRequest);
+        app.post(
+                "/api/utils/importObjectDetectionModel",
+                RequestHandler::onObjectDetectionModelImportRequest);
         app.get("/api/utils/photonvision-journalctl.txt", RequestHandler::onLogExportRequest);
         app.post("/api/utils/restartProgram", RequestHandler::onProgramRestartRequest);
         app.post("/api/utils/restartDevice", RequestHandler::onDeviceRestartRequest);
