How to configure and train a yolov8n model with a pyramid enhancement block (Pre module to the yolov8) #23681
👋 Hello @Mahanthvadlamoodi, thank you for your interest in Ultralytics 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered. This is an automated response 🤖; an Ultralytics engineer will also assist soon 🛠️.

If this is a 🐛 Bug Report, please provide a minimum reproducible example (MRE) to help us debug it. In your case (custom module + model parsing/training), an MRE is especially important.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the Ultralytics community where it suits you best. For real-time chat, head to Discord 🎧. Prefer in-depth discussions? Check out Discourse. Or dive into threads on our Subreddit to share knowledge with the community.

Upgrade: Upgrade to the latest version with `pip install -U ultralytics`.

Environments: YOLO may be run in any of the up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled).

Status: If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLO Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
@glenn-jocher can you please help me here?
I have a Pyramid Enhancement Network (PENet), which helps the detector mainly in low-light scenarios. Following the PE-YOLO paper, I am trying to combine it with the yolov8n model (about 3M parameters); PENet adds around 90K parameters.
I am having trouble adding PENet to the yolov8n model. PENet does not have a separate loss function: because this is recognition-driven enhancement, it should be trained jointly with the detector using only the YOLO loss.
Below is the PENet code:
```python
# PENet: Pyramid Enhancement Network (
import torch
import torch.nn as nn
import torch.nn.functional as F


# -----------------------
# Laplacian Pyramid utils
# -----------------------
class Lap_Pyramid_Conv(nn.Module):
    def __init__(self, num_high=3, kernel_size=5, channels=3):
        super().__init__()
        self.num_high = num_high
        # register fixed kernel buffer for AMP safety (dtype/device later matched)
        self.register_buffer("kernel", self.gauss_kernel(kernel_size, channels), persistent=False)


# -----------------------
# Small enhancement blocks
# -----------------------
class ResidualBlock(nn.Module):
    def __init__(self, in_features, out_features=None):
        super().__init__()
        if out_features is None:
            out_features = in_features
        self.block = nn.Sequential(
            nn.Conv2d(in_features, in_features, 3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(in_features, in_features, 3, padding=1),
        )
        self.conv_out = nn.Conv2d(in_features, out_features, 3, padding=1)


class DPM(nn.Module):
    def __init__(self, inplanes, planes, act=nn.LeakyReLU(0.2, inplace=True), bias=False):
        super().__init__()
        self.conv_mask = nn.Conv2d(inplanes, 1, 1, bias=bias)
        self.softmax = nn.Softmax(dim=2)
        self.channel_add_conv = nn.Sequential(
            nn.Conv2d(inplanes, planes, 1, bias=bias),
            act,
            nn.Conv2d(planes, inplanes, 1, bias=bias)
        )


# -----------------------
# Sobel filter (AMP-safe)
# -----------------------
def sobel(img):
    dtype, device, ch = img.dtype, img.device, img.shape[1]
    gx = torch.tensor([[1, 0, -1],
                       [2, 0, -2],
                       [1, 0, -1]], dtype=dtype, device=device).view(1, 1, 3, 3)
    gy = torch.tensor([[1, 2, 1],
                       [0, 0, 0],
                       [-1, -2, -1]], dtype=dtype, device=device).view(1, 1, 3, 3)
    edge_x = F.conv2d(img, gx.repeat(ch, 1, 1, 1), padding=1, groups=ch)
    edge_y = F.conv2d(img, gy.repeat(ch, 1, 1, 1), padding=1, groups=ch)
    return torch.sqrt(edge_x ** 2 + edge_y ** 2 + 1e-6)


class LowPassModule(nn.Module):
    def __init__(self, in_channel, sizes=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList([nn.AdaptiveAvgPool2d((s, s)) for s in sizes])
        self.relu = nn.ReLU()
        ch = in_channel // 4
        self.channel_splits = [ch] * 4


class AE(nn.Module):
    def __init__(self, n_feat=3, bias=False):
        super().__init__()
        self.edge_conv = nn.Conv2d(3, 3, 1, bias=bias)
        self.res1 = ResidualBlock(3, 32)
        self.dpm = DPM(32, 32)
        self.res2 = ResidualBlock(32, 3)
        self.low_conv1 = nn.Conv2d(3, 32, 1)
        self.low_conv2 = nn.Conv2d(32, 3, 1)
        self.low_pass = LowPassModule(32)


# -----------------------
# PENet main
# -----------------------
class PENet(nn.Module):
    def __init__(self, c1=3, num_high=3, gauss_kernel=5):
        super().__init__()
        self.num_high = num_high
        self.lap_pyramid = Lap_Pyramid_Conv(num_high, gauss_kernel, channels=c1)
        for i in range(self.num_high + 1):
            self.__setattr__(f"AE_{i}", AE(c1))
        self.out_conv = nn.Conv2d(c1, c1, 1)


# -----------------------
# YOLO compatibility wrapper
# -----------------------
class PENetWrapper(PENet):
    """
    A thin wrapper so parse_model (YOLO) can find expected attributes:
    - .f (from index), .i (index), .type (name)
    These attributes are set here as defaults; YOLO's model builder will update indices.
    """
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # YOLO's graph builder expects these fields
        self.f = -1  # takes input from previous layer by default
        self.i = 0
        self.type = "PENet"


# expose
__all__ = ["PENet", "PENetWrapper"]
```
Please help me with the model setup and training strategy. I am using the ExDark dataset.
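One way to read the requirement in the question (no separate enhancement loss, joint training driven only by the detection loss) is to put the enhancer in front of the detector and update both with a single optimizer. The sketch below shows only that gradient flow in plain PyTorch. `TinyEnhancer`, `TinyDetector`, the random data, and the MSE loss are hypothetical stand-ins chosen for illustration; they are not PENet, yolov8n, or the Ultralytics detection loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyEnhancer(nn.Module):
    """Hypothetical stand-in for PENet: 3-channel image in, 3-channel image out."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x):
        # Residual form keeps the input shape, so the detector sees a normal image.
        return x + self.conv(x)


class TinyDetector(nn.Module):
    """Hypothetical stand-in for the detector: predicts 4 numbers per image."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1),
            nn.SiLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(8, 4)

    def forward(self, x):
        return self.head(self.backbone(x).flatten(1))


enhancer, detector = TinyEnhancer(), TinyDetector()
model = nn.Sequential(enhancer, detector)  # enhancer runs first, detector second

# A single optimizer over all parameters: the enhancer has no loss of its own.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

images = torch.rand(2, 3, 64, 64)  # placeholder batch
targets = torch.rand(2, 4)         # placeholder targets

optimizer.zero_grad()
predictions = model(images)
loss = nn.functional.mse_loss(predictions, targets)  # stand-in for the detection loss
loss.backward()

# The task loss alone produces gradients for the enhancer's parameters.
enhancer_has_grad = all(p.grad is not None for p in enhancer.parameters())
optimizer.step()
print(f"loss={loss.item():.4f}, enhancer received gradients: {enhancer_has_grad}")
```

In Ultralytics the analogous setup would make the enhancer the first layer of the model so that the existing trainer and loss are reused unchanged. How a custom module is registered depends on the installed version, so it is worth checking `parse_model` in `ultralytics/nn/tasks.py` before editing a model YAML.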
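The posted `Lap_Pyramid_Conv` registers a Gaussian kernel buffer, but the decomposition itself is not shown. For readers unfamiliar with the idea, here is a generic Laplacian pyramid in plain PyTorch. It assumes a 5x5 binomial kernel, reflect padding, and bilinear upsampling; the function names are illustrative, and this is not necessarily how PENet implements it.

```python
import torch
import torch.nn.functional as F


def binomial_kernel(channels: int) -> torch.Tensor:
    """5x5 binomial approximation of a Gaussian, one copy per channel (depthwise)."""
    k = torch.tensor([1.0, 4.0, 6.0, 4.0, 1.0])
    k2d = torch.outer(k, k)
    k2d = k2d / k2d.sum()
    return k2d.expand(channels, 1, 5, 5).contiguous()


def blur(x: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Depthwise blur that preserves the spatial size."""
    x = F.pad(x, (2, 2, 2, 2), mode="reflect")
    return F.conv2d(x, kernel, groups=x.shape[1])


def decompose(x: torch.Tensor, num_high: int = 3) -> list:
    """Return [high_0, ..., high_{num_high-1}, low], from finest to coarsest."""
    # Match the input dtype/device so the kernel also works under mixed precision.
    kernel = binomial_kernel(x.shape[1]).to(dtype=x.dtype, device=x.device)
    levels, current = [], x
    for _ in range(num_high):
        down = blur(current, kernel)[:, :, ::2, ::2]  # blur, then drop every other pixel
        up = F.interpolate(down, size=current.shape[-2:], mode="bilinear", align_corners=False)
        levels.append(current - up)  # high-frequency residual at this scale
        current = down
    levels.append(current)  # remaining low-frequency image
    return levels


def reconstruct(levels: list) -> torch.Tensor:
    """Invert decompose() by upsampling and adding the residuals back."""
    current = levels[-1]
    for high in reversed(levels[:-1]):
        current = F.interpolate(current, size=high.shape[-2:], mode="bilinear", align_corners=False) + high
    return current


x = torch.rand(1, 3, 64, 64)
levels = decompose(x, num_high=3)
recon = reconstruct(levels)
print([tuple(level.shape[-2:]) for level in levels])
print("max reconstruction error:", (recon - x).abs().max().item())
```

Because each residual is defined against the same upsampling used in reconstruction, the round trip is lossless up to floating-point error. That is why an enhancer can modify the individual levels and still return an image of the original size.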