Why does the PFhead implementation in the official code differ from what the OCRv4 technical report describes? #14733
Unanswered
huangqiu15444
asked this question in
Q&A
Replies: 1 comment
-
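In the OCRv4 technical report, PFhead is described as follows: "the output is obtained through a 3x3 convolution, then concatenated with the result of the transposed-convolution branch and passed through a 1x1 convolution layer; finally, the result of the 1x1 convolution is added to the result of the transposed convolution to obtain the final probability map." This means the PFhead structure should be: a 3x3 convolution first, then concatenation of its result with the transposed-convolution branch, then a 1x1 convolution, and finally the sum of the 1x1 convolution result and the transposed-convolution result.

In the official code, however:

    class LocalModule(nn.Layer):
        def __init__(self, in_c, mid_c, use_distance=True):
            super(self.__class__, self).__init__()
            self.last_3 = ConvBNLayer(in_c + 1, mid_c, 3, 1, 1, act="relu")
            self.last_1 = nn.Conv2D(mid_c, 1, 1, 1, 0)

        def forward(self, x, init_map, distance_map):
            outf = paddle.concat([init_map, x], axis=1)
            # last Conv
            # This differs from "(1) PFhead: multi-branch fusion head structure"
            # in the report: here the order is concat + conv3 + conv1.
            out = self.last_1(self.last_3(outf))
            return out

In the code, [...]

Possible reasons: [...]

Summary: the official code [...]

Response generated by 🤖 feifei-bot | deepseek-chat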
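The order of operations in the quoted `forward` can be traced at the shape level. The following is a minimal sketch in plain Python (no Paddle required). It assumes that the positional arguments `3, 1, 1` and `1, 1, 0` mean kernel size, stride, and padding, so both convolutions preserve height and width, and that `init_map` has a single channel (inferred from `in_c + 1`). The helper names and the sizes `in_c=24`, `mid_c=12`, 160x160 are hypothetical and are not part of PaddleOCR.

```python
# Shape-only trace of the quoted LocalModule.forward.
# Shapes are (channels, height, width) for a single sample.
# Helper names and sizes are illustrative assumptions, not PaddleOCR APIs.


def concat_channels(a, b):
    """Mimic paddle.concat([a, b], axis=1) on the channel dimension."""
    assert a[1:] == b[1:], "spatial sizes must match for channel concat"
    return (a[0] + b[0], a[1], a[2])


def conv_same(shape, in_c, out_c):
    """Mimic a stride-1 convolution whose padding preserves H and W."""
    assert shape[0] == in_c, f"expected {in_c} input channels, got {shape[0]}"
    return (out_c, shape[1], shape[2])


def trace_local_module(x, init_map, in_c, mid_c):
    """Return the (step, output shape) sequence of the quoted forward pass."""
    steps = []
    outf = concat_channels(init_map, x)      # concat([init_map, x], axis=1)
    steps.append(("concat", outf))
    out = conv_same(outf, in_c + 1, mid_c)   # last_3: 3x3 conv on the concatenated map
    steps.append(("conv3x3", out))
    out = conv_same(out, mid_c, 1)           # last_1: 1x1 conv down to one channel
    steps.append(("conv1x1", out))
    return steps


if __name__ == "__main__":
    for name, shape in trace_local_module((24, 160, 160), (1, 160, 160), 24, 12):
        print(name, shape)
```

In this trace the concatenation happens before the 3x3 convolution, and no addition step appears after the 1x1 convolution inside the module itself.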
-
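    class LocalModule(nn.Layer):
        def __init__(self, in_c, mid_c, use_distance=True):
            super(self.__class__, self).__init__()
            self.last_3 = ConvBNLayer(in_c + 1, mid_c, 3, 1, 1, act="relu")
            self.last_1 = nn.Conv2D(mid_c, 1, 1, 1, 0)

The technical report says: "the output is obtained through a 3x3 convolution, then concatenated with the result of the transposed-convolution branch and passed through a 1x1 convolution layer; finally, the result of the 1x1 convolution is added to the result of the transposed convolution to obtain the final probability map." My understanding is that this should be conv3 + concat + conv1. Could someone who knows the details please explain?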
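One way to make the difference between the two orderings concrete is to check which layer absorbs the extra concatenated channel. The sketch below counts convolution kernel weights only (bias and batch norm excluded) for the ordering in the quoted code and for the conv3 + concat + conv1 ordering read from the report. It assumes that the concatenated map has a single channel and that the 3x3 convolution in the report ordering takes `in_c` input channels. Neither is stated in the quoted excerpt, and `in_c=24`, `mid_c=12` are hypothetical sizes.

```python
# Kernel-weight counts for the two orderings discussed above.
# All function names and channel sizes are illustrative assumptions.


def conv_weights(in_c, out_c, k):
    """Number of kernel weights in a k x k convolution (bias excluded)."""
    return in_c * out_c * k * k


def code_ordering(in_c, mid_c):
    """concat -> conv3x3 -> conv1x1, as in the quoted LocalModule."""
    return {
        "conv3x3": conv_weights(in_c + 1, mid_c, 3),  # sees the concatenated map
        "conv1x1": conv_weights(mid_c, 1, 1),
    }


def report_ordering(in_c, mid_c):
    """conv3x3 -> concat -> conv1x1, the asker's reading of the report (assumed)."""
    return {
        "conv3x3": conv_weights(in_c, mid_c, 3),
        "conv1x1": conv_weights(mid_c + 1, 1, 1),     # sees the concatenated map
    }


if __name__ == "__main__":
    print("code:  ", code_ordering(24, 12))
    print("report:", report_ordering(24, 12))
```

Under these assumptions the `+ 1` lands on the input of the 3x3 convolution in the code, but on the input of the 1x1 convolution in the report ordering. The final addition described in the report carries no weights, so it does not show up in this count.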