feat: add yolov13 implementation for TensorRT 10 #1699
mpj1234 merged 1 commit into wang-xinyu:trt10 from
Pull request overview
This PR adds a full YOLOv13 detection pipeline with TensorRT 10 support, including C++ engine building/inference, GPU-based pre/post-processing, a custom YOLO TensorRT plugin, and two Python inference entrypoints (PyCUDA and cuda-python). The changes introduce new model construction blocks, calibration utilities for INT8, and build tooling (CMake) plus documentation for setup and usage.
Changes:
- Add a C++ detection executable (`yolov13-det`) with TensorRT 10 support, including engine build/serialize, dynamic buffer setup using tensor names, and batched GPU preprocessing/postprocessing.
- Implement a custom YOLO plugin (`YoloLayer_TRT`) and associated CUDA kernels for decoding and NMS, plus shared type, preprocess, and postprocess utilities.
- Provide Python detection scripts for TensorRT (PyCUDA and cuda-python variants) and a README with environment requirements, build instructions, and Python usage.
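Since TensorRT 10 drops the old binding-index API, buffers are attached by I/O tensor name. As a hedged sketch of that pattern (the `engine`, `context`, and `device_ptrs` names are illustrative, not taken from the PR), the name-based binding loop looks roughly like:

```python
# Sketch: name-based tensor binding in TensorRT >= 8.5 / 10.
# The import is guarded so the sketch is inspectable without TensorRT installed.
try:
    import tensorrt as trt  # noqa: F401
except ImportError:
    trt = None


def bind_by_name(engine, context, device_ptrs):
    """Attach device pointers to each I/O tensor by name.

    `device_ptrs` is assumed to map tensor name -> device address (int).
    """
    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        context.set_tensor_address(name, device_ptrs[name])
```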
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| yolov13/yolov13_det_trt_cuda-python.py | Adds a TensorRT 10 detection script using the cuda-python driver/runtime APIs and manual tensor-address binding. |
| yolov13/yolov13_det_trt.py | Adds a TensorRT detection script using PyCUDA, mirroring the C++ pipeline at a higher level. |
| yolov13/yolov13_det.cpp | Implements C++ engine build/deserialize, buffer preparation, and batched inference with optional GPU or CPU postprocessing. |
| yolov13/src/preprocess.cu | Implements CUDA-based affine warp, normalization, and batched preprocessing into NCHW float tensors. |
| yolov13/src/postprocess.cu | Implements CUDA decode and NMS kernels to postprocess YOLO outputs on GPU. |
| yolov13/src/postprocess.cpp | Adds CPU-side NMS, GPU decode result handling, and bounding-box drawing utilities. |
| yolov13/src/model.cpp | Builds the YOLOv13 TensorRT network (backbone, neck, heads, plugin attachment) and serializes the engine for TRT 10. |
| yolov13/src/calibrator.cpp | Implements an INT8 entropy calibrator for TensorRT using OpenCV preprocessing. |
| yolov13/src/block.cpp | Provides reusable TensorRT network building blocks (convs, attention, hypergraph modules, YOLO head wiring, plugin setup). |
| yolov13/readme.md | Documents environment requirements, model export to .wts, C++ build/run, and Python inference usage, including TensorRT 10 notes. |
| yolov13/plugin/yololayer.h | Declares the YOLO TensorRT plugin and its creator for the detection head. |
| yolov13/plugin/yololayer.cu | Implements the YOLO plugin, including serialization and CUDA-based detection decoding across feature maps. |
| yolov13/include/utils.h | Adds image preprocessing helpers, directory scanning, label loading, and small string utilities. |
| yolov13/include/types.h | Defines shared detection/affine structures and bbox element constants used across CPU/GPU code. |
| yolov13/include/preprocess.h | Declares the CUDA preprocessing API used by the C++ inference pipeline. |
| yolov13/include/postprocess.h | Declares CPU/GPU postprocessing, decode, NMS, and drawing interfaces. |
| yolov13/include/model.h | Declares the YOLOv13 TensorRT engine build function. |
| yolov13/include/macros.h | Adds export and TensorRT-compatibility macros for plugins and loggers. |
| yolov13/include/logging.h | Brings in TensorRT sample-style logging utilities and test helpers. |
| yolov13/include/cuda_utils.h | Adds a CUDA error-checking macro for all CUDA calls. |
| yolov13/include/config.h | Centralizes model/input configuration, thresholds, and build-time precision macros. |
| yolov13/include/calibrator.h | Declares the INT8 entropy calibrator class used in engine building. |
| yolov13/include/block.h | Declares all network construction primitives and plugin wiring used in model.cpp. |
| yolov13/gen_wts.py | Adds a tool to convert a PyTorch YOLOv13 .pt checkpoint into a .wts file for TensorRT building. |
| yolov13/CMakeLists.txt | Adds build configuration for the plugin library and the yolov13-det executable against CUDA, TensorRT, and OpenCV. |
```python
class YoLov13TRT(object):
    """
```
cuda.cuCtxCreate is being called with a None pointer as the first argument, but the CUDA Python driver API expects a (flags, device) signature (e.g. cuCtxCreate(0, dev)), so this call will fail at runtime with an argument error and prevent context creation. Update the call to match the CUDA Python API (omit the None parameter and pass just flags and device) so that the context is created correctly on TensorRT 10.
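A minimal sketch of the corrected call, assuming the cuda-python package's driver API (the `create_context` helper name is illustrative, and the import is guarded so the pattern is visible even where cuda-python is not installed):

```python
# Sketch: cuCtxCreate takes (flags, device) in cuda-python -- no leading
# None argument. Guarded import so the call pattern is inspectable anywhere.
try:
    from cuda import cuda  # cuda-python driver bindings
    HAVE_CUDA = True
except ImportError:
    HAVE_CUDA = False


def create_context(device_ordinal=0, flags=0):
    """Create a CUDA driver context with the (flags, device) signature."""
    (err,) = cuda.cuInit(0)
    err, device = cuda.cuDeviceGet(device_ordinal)
    # Correct for cuda-python: cuCtxCreate(flags, device).
    err, ctx = cuda.cuCtxCreate(flags, device)
    return ctx


if HAVE_CUDA:
    context = create_context()
```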
```cpp
nvinfer1::ILayer* DownsampleConv(nvinfer1::INetworkDefinition* network,
        std::map<std::string, nvinfer1::Weights> weightMap, nvinfer1::ITensor& input,
        int in_channels, std::string lname, bool channel_adjust = true);
```
DownsampleConv is declared twice with the same signature (lines 100-102 and again here with a default for channel_adjust), which redeclares the default argument and can cause a compilation error on some compilers. Remove one of the duplicate declarations or keep only a single declaration with the desired default value to avoid ODR/default-argument redefinition issues.
```cpp
#pragma once
#include <dirent.h>
#include <fstream>
#include <opencv2/opencv.hpp>
```
This header uses std::string, std::ostringstream, and std::unordered_map further down but only includes <dirent.h>, <fstream>, and OpenCV headers, relying on transitive includes for the C++ standard library types. To make this header self-contained and avoid brittle build issues, explicitly include <string>, <sstream>, and <unordered_map> here.
Suggested change:

```cpp
#include <opencv2/opencv.hpp>
#include <string>
#include <sstream>
#include <unordered_map>
#include <vector>
```
```cpp
#pragma once

#include <map>
```
This header declares functions that use std::vector (e.g. std::vector<cv::Mat>&) but does not include <vector>, instead depending on transitive includes from other headers. For a self-contained and portable header, add an explicit #include <vector> so that any translation unit including this header compiles reliably.
Suggested change:

```cpp
#include <map>
#include <vector>
```
```cpp
#pragma once

#include <opencv2/opencv.hpp>
#include "NvInfer.h"
#include "cuda_utils.h"
#include "types.h"
```
This header defines interfaces using std::vector (e.g. std::vector<Detection> and std::vector<cv::Mat>) but does not include <vector>, relying on indirect includes instead. Please add an explicit #include <vector> so the header is self-contained and does not depend on transitive standard library includes.
```python
import sys  # noqa: F401
```
Import of 'sys' is not used.

Suggested change: remove this line.

```python
import sys  # noqa: F401
```
```python
import time
import cv2
import numpy as np
import pycuda.autoinit  # noqa: F401
```
Import of 'pycuda' is not used.
No description provided.