feat(deepseek-ocr): deepseek-ocr support (On working) #486
base: v2
Conversation
- Implement Conv2DOp and LayerNorm2DOp for CPU backend
- Add ARM NEON optimized kernels for conv2d and layernorm2d
- Integrate new ops into IR and op registration system
- Extend Tensor API with size() and flatten() methods
- Update RTTI and OpTypes to support new operations
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the repository's CodeRabbit configuration file; you can disable this status message in that configuration.

Walkthrough

This PR adds Conv2D and LayerNorm2D operations to the CPU backend with ARM-optimized kernels, introduces the DeepSeek OCR model architecture including vision encoders and tokenizers, extends the Tensor API with utility methods, integrates the new operations into the IR/RTTI system, and provides comprehensive test coverage for the new kernels.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant User
participant CPUConv2DOp
participant ARM Kernels
participant BLAS
participant Output
User->>CPUConv2DOp: load(weights, bias)
CPUConv2DOp->>ARM Kernels: conv2d_fp32_im2col_weight(src, packed_weights)
ARM Kernels-->>CPUConv2DOp: packed_weights
User->>CPUConv2DOp: forward(input)
CPUConv2DOp->>ARM Kernels: conv2d_fp32_im2col_input(input → col_data)
ARM Kernels-->>CPUConv2DOp: col_data
CPUConv2DOp->>BLAS: matmul(packed_weights, col_data)
BLAS-->>CPUConv2DOp: output_gemm
CPUConv2DOp->>CPUConv2DOp: apply_bias(output_gemm)
CPUConv2DOp-->>Output: final_output
```
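As a GEMM-level summary of this flow (the standard identity behind im2col convolution, stated here for orientation rather than quoted from the PR), each batch computes

$$Y_{C_{out} \times (H_{out} W_{out})} = W_{C_{out} \times (C_{in} k_h k_w)} \, X^{\mathrm{col}}_{(C_{in} k_h k_w) \times (H_{out} W_{out})} + b,$$

where $X^{\mathrm{col}}$ is the im2col expansion of the input and the bias $b$ is broadcast across the output columns.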
```mermaid
sequenceDiagram
participant Input
participant LayerNorm2DOp
participant ARM NEON
participant Output
Input->>LayerNorm2DOp: forward(x, weight, bias)
Note over LayerNorm2DOp: Iterate per (batch, spatial_pos)
LayerNorm2DOp->>ARM NEON: compute_mean_channels(x)
ARM NEON-->>LayerNorm2DOp: mean_per_pos
LayerNorm2DOp->>ARM NEON: compute_variance_channels(x, mean)
ARM NEON-->>LayerNorm2DOp: variance_per_pos
LayerNorm2DOp->>LayerNorm2DOp: inv_std = 1/sqrt(var + eps)
LayerNorm2DOp->>ARM NEON: normalize_affine((x-mean)*inv_std*weight+bias)
ARM NEON-->>LayerNorm2DOp: normalized
LayerNorm2DOp-->>Output: y
```
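For orientation, a plain scalar reference of the per-position, across-channel normalization the diagram describes might look like the sketch below. The names and the NCHW indexing are assumptions for illustration; the actual op is NEON-vectorized.

```cpp
#include <cmath>

// LayerNorm2D reference: for each (batch, y, x) position, normalize across
// the channel axis. Input x is NCHW; weight/bias each hold `channels` values.
void layernorm2d_ref(const float* x, const float* weight, const float* bias,
                     float* y, int batch, int channels, int height, int width,
                     float eps) {
  const int plane = height * width;
  for (int b = 0; b < batch; ++b) {
    for (int p = 0; p < plane; ++p) {
      float mean = 0.0f;
      for (int c = 0; c < channels; ++c) mean += x[(b * channels + c) * plane + p];
      mean /= static_cast<float>(channels);
      float var = 0.0f;
      for (int c = 0; c < channels; ++c) {
        const float d = x[(b * channels + c) * plane + p] - mean;
        var += d * d;
      }
      var /= static_cast<float>(channels);
      const float inv_std = 1.0f / std::sqrt(var + eps);
      for (int c = 0; c < channels; ++c) {
        const int idx = (b * channels + c) * plane + p;
        y[idx] = (x[idx] - mean) * inv_std * weight[c] + bias[c];
      }
    }
  }
}
```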
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Rationale: This PR spans heterogeneous components requiring distinct reasoning paths: ARM NEON kernel optimizations with vectorized intrinsics (medium density), multi-layer operation integration across aops/backends/nn (structural consistency), DeepSeek model architecture with transformer blocks and attention mechanisms (high logic density), and IR/RTTI registration patterns (straightforward but multi-file). The change count exceeds 30 files with mixed complexity (simple includes, intricate kernels, architecture definitions). While many changes follow established patterns, the NEON optimizations, Conv2D im2col logic, LayerNorm2D channel-wise statistics, and Vision Transformer blocks require careful verification.
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Comment

@coderabbitai review

✅ Actions performed

Review triggered.
Actionable comments posted: 30
🧹 Nitpick comments (32)
mllm/models/deepseek_ocr/modeling_deepseek_ocr.hpp (1)
1-2: Consider deferring empty placeholder files.

This file contains only the copyright header with no implementation. Since the PR is marked as "On working", consider deferring the commit of empty placeholder files until they contain actual code to reduce noise in the codebase.
mllm/models/deepseek_ocr/conversation.hpp (1)
1-2: Consider deferring empty placeholder files.

This file contains only the copyright header with no implementation. Consider deferring the commit of empty placeholder files until they contain actual code.
mllm/core/Tensor.hpp (2)
479-485: Consider improving the documentation for size().

The method declaration is fine, but the doc comment `@brief` is empty. Consider adding a description such as:

```cpp
/**
 * @brief Gets the size of a specific dimension.
 *
 * @param id Dimension index (supports negative indexing)
 * @return int32_t Size of the specified dimension
 */
```
533-539: Consider improving the documentation for flatten().

The method declaration is fine and follows the existing pattern of using `0x7fffffff` as a sentinel value (consistent with other methods in this file). However, the doc comment `@brief` is empty. Consider adding a description such as:

```cpp
/**
 * @brief Flattens the tensor from the specified dimension onward.
 *
 * @param dim Starting dimension for flattening (default: flatten all dimensions)
 * @return Tensor Flattened tensor view
 */
```

mllm/nn/layers/LayerNorm2D.hpp (1)
17-17: Remove ineffective const on parameter and avoid magic number default.

Const on a by-value parameter is noise, and 1e-6 should be a named default.

Apply this diff:

```diff
- explicit LayerNorm2D(const int32_t num_channels, float eps = 1e-6);
+ static constexpr float kDefaultEps = 1e-6f;
+ explicit LayerNorm2D(int32_t num_channels, float eps = kDefaultEps);
```

mllm/backends/cpu/ops/LayerNorm2DOp.cpp (1)
4-4: Remove unused include.

`<cstring>` isn't used. Drop it to speed up builds slightly.

```diff
-#include <cstring>
```

mllm/nn/layers/Conv2D.hpp (2)
12-16: Prefer fixed-size containers for 2D parameters.

kernel_size/stride/padding/dilation are always 2D. Using std::array<int32_t, 2> conveys intent, avoids heap allocations, and prevents size bugs.

```diff
- Conv2D(int32_t in_channels, int32_t out_channels, const std::vector<int32_t>& kernel_size,
-        const std::vector<int32_t>& stride_size, const std::vector<int32_t>& padding_size,
-        const std::vector<int32_t>& dilation_size, bool bias = true,
+ Conv2D(int32_t in_channels, int32_t out_channels, const std::array<int32_t, 2>& kernel_size,
+        const std::array<int32_t, 2>& stride_size, const std::array<int32_t, 2>& padding_size,
+        const std::array<int32_t, 2>& dilation_size, bool bias = true,
         aops::Conv2DOpImplType impl_type = aops::Conv2DOpImplType::kDefault);
```
19-21: Mark accessors noexcept.

These are non-throwing in normal operation; add noexcept to signal intent and enable minor optimizations.

```diff
- [[nodiscard]] Tensor weight() const;
+ [[nodiscard]] Tensor weight() const noexcept;
- [[nodiscard]] Tensor bias() const;
+ [[nodiscard]] Tensor bias() const noexcept;
```

mllm/backends/cpu/ops/Conv2DOp.cpp (3)
4-4: Remove unused include.

`<cstring>` isn't referenced.

```diff
-#include <cstring>
```
86-97: out_h/out_w derived from output; compute/validate explicitly.

Use input + params to compute out_h/out_w and verify against output.shape() to prevent silent shape mismatches.

```diff
- int MATMUL_M = options_.out_channels;
- int MATMUL_K = options_.in_channels * kernel_size[0] * kernel_size[1];
- int MATMUL_N = output.shape()[2] * output.shape()[3];
+ const int MATMUL_M = options_.out_channels;
+ const int MATMUL_K = options_.in_channels * kernel_size[0] * kernel_size[1];
+ const int in_h = input.shape()[2], in_w = input.shape()[3];
+ const int out_h = (in_h + 2 * padding[0] - dilation[0] * (kernel_size[0] - 1) - 1) / stride[0] + 1;
+ const int out_w = (in_w + 2 * padding[1] - dilation[1] * (kernel_size[1] - 1) - 1) / stride[1] + 1;
+ MLLM_CHECK(output.shape()[2] == out_h && output.shape()[3] == out_w, "Conv2D output shape mismatch");
+ int MATMUL_N = out_h * out_w;  // batch handled below
```
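For reference, the added lines apply the standard convolution output-size formula

$$\text{out} = \left\lfloor \frac{\text{in} + 2p - d(k - 1) - 1}{s} \right\rfloor + 1$$

independently to the height and width axes; in the C++ above the floor is performed by integer division, which is valid because the operands are non-negative for sane parameters.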
105-126: Bias/arch paths fine; improve error messages and ensure batch-aware matmul.
- After packing B batches, set MATMUL_N = B * out_h * out_w and pass that to matmul.
- Minor: “Pls” → “Please”.
```diff
 case aops::MatMulOpType::kBLAS: {
 #if defined(MLLM_USE_BLAS)
-      blas::matmul_fp32(weight_.ptr<mllm_fp32_t>(), packed_inputs.ptr<mllm_fp32_t>(), output.ptr<mllm_fp32_t>(),
-                        options_.bias ? bias_.ptr<mllm_fp32_t>() : nullptr, MATMUL_M, MATMUL_N, MATMUL_K, false, false);
+      blas::matmul_fp32(weight_.ptr<mllm_fp32_t>(), packed_inputs.ptr<mllm_fp32_t>(), output.ptr<mllm_fp32_t>(),
+                        options_.bias ? bias_.ptr<mllm_fp32_t>() : nullptr,
+                        MATMUL_M, /*N=*/B * MATMUL_N, MATMUL_K, false, false);
 #else
-      NYI("BLAS not supported. Pls set MLLM_USE_BLAS=ON to enable BLAS supports in cmake.");
+      NYI("BLAS not supported. Please set MLLM_USE_BLAS=ON to enable BLAS in CMake.");
 #endif
       break;
     }
     case aops::MatMulOpType::kMllmBlas: {
       auto thread_count = options_.getThreads();
 #if defined(MLLM_HOST_ARCH_ARM64) || defined(MLLM_HOST_ARCH_ARM)
-      arm::mllm_blas_matmul_fp32(MATMUL_M, MATMUL_K, MATMUL_N, output.ptr<mllm_fp32_t>(), weight_.ptr<mllm_fp32_t>(),
-                                 packed_inputs.ptr<mllm_fp32_t>(), options_.bias ? bias_.ptr<mllm_fp32_t>() : nullptr,
+      arm::mllm_blas_matmul_fp32(MATMUL_M, MATMUL_K, /*N=*/B * MATMUL_N,
+                                 output.ptr<mllm_fp32_t>(), weight_.ptr<mllm_fp32_t>(),
+                                 packed_inputs.ptr<mllm_fp32_t>(),
+                                 options_.bias ? bias_.ptr<mllm_fp32_t>() : nullptr,
                                  false, false, thread_count);
```

mllm/backends/cpu/ops/Conv2DOp.hpp (1)
20-25: Minor: mark factory method noexcept/final.

Creation shouldn't throw; also mark the class final for clarity.

```diff
-class CPUConv2DOpFactory : public TypedOpFactory<OpTypes::kConv2D, aops::Conv2DOpOptions> {
+class CPUConv2DOpFactory final : public TypedOpFactory<OpTypes::kConv2D, aops::Conv2DOpOptions> {
  public:
-  std::shared_ptr<BaseOp> createOpImpl(const aops::Conv2DOpOptions& options) override {
+  std::shared_ptr<BaseOp> createOpImpl(const aops::Conv2DOpOptions& options) noexcept override {
     return std::make_shared<CPUConv2DOp>(options);
   }
 };
```

mllm/backends/cpu/kernels/arm/conv2d.hpp (1)
21-31: Parameter types and const/noexcept polish.

Use fixed-width ints for dims and add noexcept; this keeps the ABI explicit and hints the compiler.

```diff
-void conv2d_fp32_im2col_input(const float* input_data, const int channels, const int height, const int width,
-                              const int kernel_h, const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
-                              const int stride_w, const int dilation_h, const int dilation_w, float* col_data);
+void conv2d_fp32_im2col_input(const float* input_data,
+                              int32_t channels, int32_t height, int32_t width,
+                              int32_t kernel_h, int32_t kernel_w,
+                              int32_t pad_h, int32_t pad_w,
+                              int32_t stride_h, int32_t stride_w,
+                              int32_t dilation_h, int32_t dilation_w,
+                              float* col_data) noexcept;
@@
-void conv2d_fp32_im2col_weight(const float* src_weight, float* packed_weight, int out_channels, int in_channels, int kernel_h,
-                               int kernel_w);
+void conv2d_fp32_im2col_weight(const float* src_weight, float* packed_weight,
+                               int32_t out_channels, int32_t in_channels,
+                               int32_t kernel_h, int32_t kernel_w) noexcept;
```
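As a reading aid for these signatures, a scalar reference of the im2col input transform could look like the sketch below. The row-per-(channel, tap), column-per-output-pixel layout is an assumption inferred from the packed-weight shape elsewhere in this PR, not a statement about the actual ARM kernel.

```cpp
#include <cstdint>

// Scalar im2col: expand each (channel, ky, kx) tap into a row of col_data,
// one column per output pixel. Taps that fall in the padding are zero-filled.
void im2col_ref(const float* input, int32_t channels, int32_t height, int32_t width,
                int32_t kernel_h, int32_t kernel_w, int32_t pad_h, int32_t pad_w,
                int32_t stride_h, int32_t stride_w, int32_t dilation_h, int32_t dilation_w,
                float* col_data) {
  const int32_t out_h = (height + 2 * pad_h - dilation_h * (kernel_h - 1) - 1) / stride_h + 1;
  const int32_t out_w = (width + 2 * pad_w - dilation_w * (kernel_w - 1) - 1) / stride_w + 1;
  for (int32_t c = 0; c < channels; ++c)
    for (int32_t ky = 0; ky < kernel_h; ++ky)
      for (int32_t kx = 0; kx < kernel_w; ++kx)
        for (int32_t oy = 0; oy < out_h; ++oy)
          for (int32_t ox = 0; ox < out_w; ++ox) {
            const int32_t iy = oy * stride_h - pad_h + ky * dilation_h;
            const int32_t ix = ox * stride_w - pad_w + kx * dilation_w;
            *col_data++ = (iy >= 0 && iy < height && ix >= 0 && ix < width)
                              ? input[(c * height + iy) * width + ix]
                              : 0.0f;
          }
}
```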
mllm/core/aops/LayerNorm2DOp.hpp (2)

11-14: Initialize options and avoid magic number.

Give num_channels a safe default and name eps.

```diff
-struct LayerNorm2DOpOptions : public BaseOpOptions<LayerNorm2DOpOptions> {
-  int32_t num_channels;
-  float eps = 1e-6;
-};
+struct LayerNorm2DOpOptions : public BaseOpOptions<LayerNorm2DOpOptions> {
+  static constexpr float kDefaultEps = 1e-6f;
+  int32_t num_channels{-1};
+  float eps{kDefaultEps};
+};
```
32-36: Add const accessors to avoid unintended mutation.

Expose const refs alongside mutable ones.

```diff
-  inline Tensor& weight() { return weight_; }
-  inline Tensor& bias() { return bias_; }
-  inline LayerNorm2DOpOptions& options() { return options_; }
+  inline Tensor& weight() { return weight_; }
+  inline const Tensor& weight() const { return weight_; }
+  inline Tensor& bias() { return bias_; }
+  inline const Tensor& bias() const { return bias_; }
+  inline LayerNorm2DOpOptions& options() { return options_; }
+  inline const LayerNorm2DOpOptions& options() const { return options_; }
```

mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp (1)
45-50: Initialize BPE explicitly (clarity).

Make construction explicit and consider wiring file_path when implemented.

```diff
- preprocessor::BPE bpe_;
+ preprocessor::BPE bpe_{};
```

mllm/backends/cpu/kernels/arm/layernorm2d.cpp (2)
60-79: Optional: readability of loop index 'c'.

Declare 'int c = 0;' once before the NEON sections to avoid reusing scope across separated blocks.

```diff
-#if defined(__ARM_NEON)
-  float32x4_t inv_std_vec = vdupq_n_f32(inv_std);
-  c = 0;
+#if defined(__ARM_NEON)
+  int c = 0;
+  float32x4_t inv_std_vec = vdupq_n_f32(inv_std);
```
11-85: Optional: aliasing/perf hints.

If your toolchain supports it, annotate pointers with restrict (project macro) and consider a fused mean/var pass to reduce memory traffic. Not required for functionality.
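To illustrate the fused-pass idea, here is a minimal scalar sketch that accumulates sum and sum-of-squares in one sweep; the names are illustrative, and the E[x²] − mean² formulation trades some numerical robustness for a single pass.

```cpp
#include <cmath>

// One-pass mean/variance over `channels` values at one (batch, spatial) position.
// var = E[x^2] - mean^2; adequate for fp32 LayerNorm inputs of moderate magnitude.
inline void fused_mean_var(const float* x, int channels, float eps,
                           float& mean, float& inv_std) {
  float sum = 0.0f, sum_sq = 0.0f;
  for (int c = 0; c < channels; ++c) {
    sum += x[c];
    sum_sq += x[c] * x[c];
  }
  mean = sum / static_cast<float>(channels);
  const float var = sum_sq / static_cast<float>(channels) - mean * mean;
  inv_std = 1.0f / std::sqrt(var + eps);
}
```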
mllm/core/aops/LayerNorm2DOp.cpp (2)
16-21: Validate pulled tensors before view; enforce channel count.

Protect against missing params and mismatched sizes before reshaping.

```diff
 void LayerNorm2DOp::load(const ParameterFile::ptr_t& ploader) {
-  weight_ = ploader->pull(getName() + ".weight");
-  weight_ = weight_.view({options_.num_channels});
-  bias_ = ploader->pull(getName() + ".bias");
-  bias_ = bias_.view({options_.num_channels});
+  weight_ = ploader->pull(getName() + ".weight");
+  bias_ = ploader->pull(getName() + ".bias");
+
+  if (!weight_) { MLLM_ERROR_EXIT(ExitCode::kCoreError, "Missing parameter: {}.weight", getName()); }
+  if (!bias_) { MLLM_ERROR_EXIT(ExitCode::kCoreError, "Missing parameter: {}.bias", getName()); }
+
+  MLLM_RT_ASSERT_EQ(weight_.numel(), options_.num_channels);
+  MLLM_RT_ASSERT_EQ(bias_.numel(), options_.num_channels);
+  weight_ = weight_.view({options_.num_channels});
+  bias_ = bias_.view({options_.num_channels});
 }
```
49-54: Avoid pushing empty params.

Skip pushing absent tensors.

```diff
-  p->push(getName() + ".weight", weight_);
-  p->push(getName() + ".bias", bias_);
+  if (weight_) p->push(getName() + ".weight", weight_);
+  if (bias_) p->push(getName() + ".bias", bias_);
```

mllm/core/aops/Conv2DOp.cpp (3)
52-53: Remove unused local `_op`.

Avoid unused-variable warnings.

```diff
-  auto _op = ir_ctx->create<ir::linalg::Conv2DOp>(shared_from_this(), i_irs, o_irs);
+  ir_ctx->create<ir::linalg::Conv2DOp>(shared_from_this(), i_irs, o_irs);
```
63-68: Unreachable fallback after fatal error macro.

MLLM_ERROR_EXIT likely terminates; the fallback tensor creation is dead code. Prefer one behavior.

```diff
-    if (ishape.size() != 4) {
-      MLLM_ERROR_EXIT(ExitCode::kCoreError, "Conv2DOp expects 4D input, got {} D", ishape.size());
-      outputs.emplace_back(Tensor::empty(i.shape(), i.dtype(), i.device()));
-      return;
-    }
+    if (ishape.size() != 4) {
+      MLLM_ERROR_EXIT(ExitCode::kCoreError, "Conv2DOp expects 4D input, got {} D", ishape.size());
+      return;
+    }
```
106-111: Skip pushing absent bias.

Avoid writing an empty bias tensor when options_.bias is true but bias_ wasn't loaded.

```diff
-  p->push(getName() + ".weight", weight_);
-  if (options_.bias) { p->push(getName() + ".bias", bias_); }
+  if (weight_) p->push(getName() + ".weight", weight_);
+  if (options_.bias && bias_) { p->push(getName() + ".bias", bias_); }
```

tests/cpu/Conv2DKernelTest.hpp (4)
13-16: Reduce swappable-parameter risk in naive_conv2d.

Bundle shape/stride/pad/dilation into a struct to avoid call-site mix-ups and simplify the signature.

```diff
-void naive_conv2d(const float* input_data, const float* weight_data, const float* bias_data, float* output_data,
-                  int in_channels, int in_h, int in_w, int out_channels, int kernel_h, int kernel_w, int pad_h, int pad_w,
-                  int stride_h, int stride_w, int dilation_h, int dilation_w) {
+struct Conv2DArgs {
+  int in_channels, in_h, in_w, out_channels;
+  int kernel_h, kernel_w, pad_h, pad_w, stride_h, stride_w, dilation_h, dilation_w;
+};
+void naive_conv2d(const float* input_data, const float* weight_data, const float* bias_data, float* output_data,
+                  const Conv2DArgs& a) {
```

…and replace uses (in_channels → a.in_channels, etc.). Update the sole call site accordingly.
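Under that refactor, a call site could use C++20 designated initializers to keep every field explicit; the buffer names and values below are illustrative, not taken from the test.

```cpp
// Designated initializers name each parameter at the call site, which is
// exactly the mix-up protection the struct is meant to provide.
Conv2DArgs args{.in_channels = 3, .in_h = 224, .in_w = 224, .out_channels = 16,
                .kernel_h = 3, .kernel_w = 3, .pad_h = 1, .pad_w = 1,
                .stride_h = 1, .stride_w = 1, .dilation_h = 1, .dilation_w = 1};
naive_conv2d(input, weights, bias, output, args);
```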
53-57: Avoid partially-initialized module; remove default ctor.

Conv2DModule() isn't used and leaves conv2d_ unconfigured. Delete it to prevent accidental misuse.

```diff
-class Conv2DModule : public nn::Module {
+class Conv2DModule : public nn::Module {
   nn::Conv2D conv2d_;
  public:
-  Conv2DModule() = default;
+  Conv2DModule() = delete;
```
86-94: Don't push bias when unused.

Keep test inputs aligned with op options.

```diff
-  param->push("emb.bias", bias_param);
+  if (bias) { param->push("emb.bias", bias_param); }
```
95-103: Add deterministic seed for reproducibility.

Reduce flaky diffs across runs.

```diff
-  auto input = Tensor::random({1, in_channel, I_H, I_W}, -1, 1, kFloat32, kCPU);
+  test::setRandomSeed(42);
+  auto input = Tensor::random({1, in_channel, I_H, I_W}, -1, 1, kFloat32, kCPU);
```

If test::setRandomSeed isn't available, use the project's equivalent or seed the RNG used by Tensor::random.

mllm/core/aops/Conv2DOp.hpp (1)
60-65: Offer const accessors to preserve invariants.

Expose read-only views by default; keep mutable ones only if needed.

```diff
-  inline Tensor& weight() { return weight_; }
-  inline Tensor& bias() { return bias_; }
-  inline Conv2DOpOptions& options() { return options_; }
+  inline const Tensor& weight() const { return weight_; }
+  inline const Tensor& bias() const { return bias_; }
+  inline const Conv2DOpOptions& options() const { return options_; }
+  inline Tensor& weight() { return weight_; }    // keep if mutation needed
+  inline Tensor& bias() { return bias_; }        // keep if mutation needed
+  inline Conv2DOpOptions& options() { return options_; }
```

mllm/models/deepseek_ocr/configuration_deepseek_ocr.hpp (1)
14-99: Harden config parsing with defaults/guards.

Several reads assume presence; add contains() checks or helper getters with defaults to avoid exceptions on missing keys.

I can draft a small get_or<T>(json, key, default) helper to wrap contains()/is_null() and casting across the repeated fields. Do you want that?
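Such a helper could look like the sketch below, assuming the config is parsed with nlohmann::json (an assumption about the project's JSON library; adapt to whatever parser the repo actually uses).

```cpp
#include <nlohmann/json.hpp>
#include <string>

// Returns json[key] cast to T, or `fallback` when the key is absent or null.
template <typename T>
T get_or(const nlohmann::json& j, const std::string& key, T fallback) {
  if (!j.contains(key) || j.at(key).is_null()) { return fallback; }
  return j.at(key).get<T>();
}

// Usage: int hidden = get_or(cfg, "hidden_size", 1024);
```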
mllm/models/deepseek_ocr/deepencoder.hpp (3)

185-189: Avoid hard-coded 1024; use embed_dim_.

Keeps the code future-proof if CLIP dims change.

```diff
- auto class_embeds = class_embedding_.weight().view({1, 1, 1024});
+ auto class_embeds = class_embedding_.weight().view({1, 1, embed_dim_});
```
404-423: Reserved identifier: rename __interpolateLinear1d.

A leading double underscore is reserved in C++; also update the call site.

```diff
- Tensor __interpolateLinear1d(const Tensor& input, int output_size) {
+ Tensor interpolateLinear1d(const Tensor& input, int output_size) {
@@
- rel_pos_resized = __interpolateLinear1d(rel_pos, max_rel_dist);
+ rel_pos_resized = interpolateLinear1d(rel_pos, max_rel_dist);
```

Also applies to: 431-433
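For context, here is a standalone sketch of the kind of 1-D linear resampling such a helper performs, using plain float buffers instead of the project's Tensor type and the align_corners=false sampling convention (both assumptions for illustration).

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Resample `input` to out_size points by linear interpolation, sampling at
// the center of each output cell (align_corners=false convention).
std::vector<float> interpolate_linear_1d(const std::vector<float>& input, int out_size) {
  const int in_size = static_cast<int>(input.size());
  std::vector<float> out(out_size);
  const float scale = static_cast<float>(in_size) / static_cast<float>(out_size);
  for (int i = 0; i < out_size; ++i) {
    const float src = (static_cast<float>(i) + 0.5f) * scale - 0.5f;
    const int x0 = static_cast<int>(std::floor(src));
    const int x1 = std::min(x0 + 1, in_size - 1);
    const float w1 = src - static_cast<float>(x0);   // weight of right neighbor
    const int x0c = std::max(x0, 0);                 // clamp left neighbor at edge
    out[i] = input[x0c] * (1.0f - w1) + input[x1] * w1;
  }
  return out;
}
```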
64-79: Prefer member-initializers and constants/config instead of magic numbers.

Move embed/image/patch sizes to compile-time constants or DpskOcrConfig; initialize members in the ctor initializer list.

```diff
- CLIPVisionEmbeddings(const std::string& name, const DpskOcrConfig& config) : nn::Module(name) {
-   embed_dim_ = 1024;
-   image_size_ = 224;
-   patch_size_ = 14;
+ CLIPVisionEmbeddings(const std::string& name, const DpskOcrConfig& config)
+     : nn::Module(name),
+       embed_dim_(/*config.clip_embed_dim*/ 1024),
+       image_size_(/*config.clip_image_size*/ 224),
+       patch_size_(/*config.clip_patch*/ 14) {
```

If config already carries these, wire them instead of literals. Based on learnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (39)
- mllm/backends/cpu/CPUBackend.cpp (2 hunks)
- mllm/backends/cpu/kernels/Kernels.hpp (1 hunks)
- mllm/backends/cpu/kernels/arm/conv2d.cpp (1 hunks)
- mllm/backends/cpu/kernels/arm/conv2d.hpp (1 hunks)
- mllm/backends/cpu/kernels/arm/layernorm2d.cpp (1 hunks)
- mllm/backends/cpu/kernels/arm/layernorm2d.hpp (1 hunks)
- mllm/backends/cpu/ops/Conv2DOp.cpp (1 hunks)
- mllm/backends/cpu/ops/Conv2DOp.hpp (1 hunks)
- mllm/backends/cpu/ops/LayerNorm2DOp.cpp (1 hunks)
- mllm/backends/cpu/ops/LayerNorm2DOp.hpp (1 hunks)
- mllm/compile/ir/GeneratedRTTIKind.hpp (2 hunks)
- mllm/compile/ir/NodeRTTIClassOfImpl.hpp (2 hunks)
- mllm/compile/ir/linalg/Op.cpp (1 hunks)
- mllm/compile/ir/linalg/Op.hpp (2 hunks)
- mllm/compile/ir/rtti_kind_gen.py (1 hunks)
- mllm/core/OpTypes.hpp (2 hunks)
- mllm/core/Tensor.cpp (2 hunks)
- mllm/core/Tensor.hpp (2 hunks)
- mllm/core/aops/Conv2DOp.cpp (1 hunks)
- mllm/core/aops/Conv2DOp.hpp (1 hunks)
- mllm/core/aops/LayerNorm2DOp.cpp (1 hunks)
- mllm/core/aops/LayerNorm2DOp.hpp (1 hunks)
- mllm/models/deepseek_ocr/README.md (1 hunks)
- mllm/models/deepseek_ocr/configuration_deepseek_ocr.hpp (1 hunks)
- mllm/models/deepseek_ocr/conversation.hpp (1 hunks)
- mllm/models/deepseek_ocr/deepencoder.hpp (1 hunks)
- mllm/models/deepseek_ocr/modeling_deepseek_ocr.hpp (1 hunks)
- mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp (1 hunks)
- mllm/nn/Functional.cpp (1 hunks)
- mllm/nn/Functional.hpp (1 hunks)
- mllm/nn/Module.hpp (1 hunks)
- mllm/nn/Nn.hpp (2 hunks)
- mllm/nn/layers/Conv2D.cpp (1 hunks)
- mllm/nn/layers/Conv2D.hpp (1 hunks)
- mllm/nn/layers/LayerNorm2D.cpp (1 hunks)
- mllm/nn/layers/LayerNorm2D.hpp (1 hunks)
- tests/cpu/Conv2DKernelTest.hpp (1 hunks)
- tests/cpu/FlashAttentionKernelTest.hpp (0 hunks)
- tests/cpu/KernelTest.cpp (1 hunks)
💤 Files with no reviewable changes (1)
- tests/cpu/FlashAttentionKernelTest.hpp
🧰 Additional context used
🪛 Clang (14.0.6)
mllm/nn/layers/LayerNorm2D.hpp
[error] 6-6: 'mllm/nn/Layer.hpp' file not found
(clang-diagnostic-error)
[error] 17-17: parameter 'num_channels' is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions
(readability-avoid-const-params-in-decls,-warnings-as-errors)
[error] 17-17: 1e-6 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
mllm/backends/cpu/kernels/arm/layernorm2d.hpp
[error] 6-6: 'mllm/utils/CPUArchHelper.hpp' file not found
(clang-diagnostic-error)
mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp
[error] 11-11: 'vector' file not found
(clang-diagnostic-error)
[error] 23-23: constructor does not initialize these fields: bpe_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
mllm/nn/layers/Conv2D.hpp
[error] 3-3: 'mllm/nn/Layer.hpp' file not found
(clang-diagnostic-error)
mllm/backends/cpu/kernels/arm/conv2d.cpp
[error] 3-3: 'mllm/backends/cpu/kernels/arm/conv2d.hpp' file not found
(clang-diagnostic-error)
mllm/backends/cpu/ops/LayerNorm2DOp.cpp
[error] 4-4: 'cstring' file not found
(clang-diagnostic-error)
mllm/nn/layers/LayerNorm2D.cpp
[error] 4-4: 'mllm/core/aops/LayerNorm2DOp.hpp' file not found
(clang-diagnostic-error)
mllm/core/aops/LayerNorm2DOp.hpp
[error] 6-6: 'mllm/core/BaseOp.hpp' file not found
(clang-diagnostic-error)
[error] 11-11: constructor does not initialize these fields: num_channels
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 13-13: 1e-6 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 39-39: member variable 'weight_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 40-40: member variable 'bias_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 41-41: member variable 'options_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
mllm/backends/cpu/ops/Conv2DOp.cpp
[error] 4-4: 'cstring' file not found
(clang-diagnostic-error)
mllm/backends/cpu/kernels/arm/layernorm2d.cpp
[error] 3-3: 'mllm/backends/cpu/kernels/arm/conv2d.hpp' file not found
(clang-diagnostic-error)
mllm/compile/ir/linalg/Op.hpp
[error] 217-217: parameter name 'p' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 217-217: class 'LayerNorm2DOp' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
mllm/backends/cpu/ops/LayerNorm2DOp.hpp
[error] 6-6: 'mllm/core/BaseOp.hpp' file not found
(clang-diagnostic-error)
mllm/nn/layers/Conv2D.cpp
[error] 1-1: 'mllm/nn/layers/Conv2D.hpp' file not found
(clang-diagnostic-error)
mllm/core/aops/Conv2DOp.hpp
[error] 6-6: 'mllm/core/BaseOp.hpp' file not found
(clang-diagnostic-error)
[error] 15-15: constructor does not initialize these fields: in_channels, out_channels, kernel_size, stride, padding, dilation
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 29-29: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 39-39: variable name 'it' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 67-67: member variable 'weight_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 68-68: member variable 'bias_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 69-69: member variable 'options_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
mllm/core/Tensor.hpp
[error] 539-539: 0x7fffffff is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
mllm/backends/cpu/kernels/arm/conv2d.hpp
[error] 6-6: 'mllm/utils/CPUArchHelper.hpp' file not found
(clang-diagnostic-error)
mllm/backends/cpu/ops/Conv2DOp.hpp
[error] 6-6: 'mllm/core/BaseOp.hpp' file not found
(clang-diagnostic-error)
mllm/models/deepseek_ocr/deepencoder.hpp
[error] 5-5: 'cmath' file not found
(clang-diagnostic-error)
[error] 51-51: constructor does not initialize these fields: embed_dim_, image_size_, patch_size_, class_embedding_, patch_embedding_, num_patches_, num_positions_, position_embedding_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 64-64: constructor does not initialize these fields: class_embedding_, patch_embedding_, position_embedding_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 64-64: 2 adjacent parameters of 'CLIPVisionEmbeddings' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 65-65: 'embed_dim_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 65-65: 1024 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 66-66: 'image_size_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 66-66: 224 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 67-67: 'patch_size_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 67-67: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 68-68: 'num_patches_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 69-69: 'num_positions_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 81-81: 2 adjacent parameters of 'getAbsPos' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 103-103: variable 'channels' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 108-108: parameter name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 109-109: variable name 'a' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 109-109: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 111-111: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 112-112: 2.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 112-112: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 112-112: 3.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 112-112: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 112-112: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 113-113: do not use 'else' after 'return'
(readability-else-after-return,-warnings-as-errors)
[error] 113-113: 2.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 113-113: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 114-114: 5.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 114-114: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 114-114: 8.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 114-114: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 114-114: 4.0f is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 114-114: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 116-116: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 120-120: variable 'scale_y' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 121-121: variable 'scale_x' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 123-123: loop variable name 'c' is too short, expected at least 2 characters
(readability-identifier-length,-warnings-as-errors)
[error] 124-124: variable 'src_channel_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 125-125: variable 'dst_channel_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 163-163: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 169-169: repeated branch in conditional chain
(bugprone-branch-clone,-warnings-as-errors)
[error] 194-194: constructor does not initialize these fields: fc1_, fc2_, act_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 202-202: constructor does not initialize these fields: fc1_, fc2_, act_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 202-202: 4 adjacent parameters of 'NoTPFeedForward' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 208-208: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 213-213: constructor does not initialize these fields: num_heads_, n_local_heads_, head_dim_, max_seq_len_, qkv_proj_, out_proj_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 224-224: constructor does not initialize these fields: qkv_proj_, out_proj_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 224-224: 2 adjacent parameters of 'NoTPAttention' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 225-225: 'num_heads_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 225-225: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 226-226: 'n_local_heads_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 226-226: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 227-227: 'head_dim_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 227-227: 1024 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 227-227: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 228-228: 'max_seq_len_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 228-228: 256 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 234-234: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 236-236: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 256-256: constructor does not initialize these fields: n_heads_, dim_, head_dim_, layer_norm1_, layer_norm2_, layer_id_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 266-266: member variable 'layer_id_' has public visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
[error] 270-270: constructor does not initialize these fields: layer_norm1_, layer_norm2_, layer_id_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 270-270: 2 adjacent parameters of 'NoTPTransformerBlock' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 271-271: 'n_heads_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 271-271: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 272-272: 'dim_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 272-272: 1024 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 273-273: 'head_dim_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 273-273: 1024 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 273-273: 16 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 280-280: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 281-281: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 283-283: variable name 'h' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 289-289: constructor does not initialize these fields: num_layers_, layers_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 296-296: constructor does not initialize these fields: layers_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 296-296: 2 adjacent parameters of 'NoTPTransformer' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 297-297: 'num_layers_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 297-297: 24 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 302-302: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 310-310: constructor does not initialize these fields: pre_layernorm_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 318-318: constructor does not initialize these fields: pre_layernorm_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 318-318: 2 adjacent parameters of 'VitModel' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 327-327: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 328-328: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 341-341: constructor does not initialize these fields: proj_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 347-347: constructor does not initialize these fields: proj_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 347-347: 2 adjacent parameters of 'PatchEmbed' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 352-352: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 353-353: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 359-359: constructor does not initialize these fields: lin1_, lin2_, act_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 367-367: constructor does not initialize these fields: lin1_, lin2_, act_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 367-367: 4 adjacent parameters of 'MLPBlock' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 373-373: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 378-378: constructor does not initialize these fields: num_heads_, use_rel_pos_, qkv_, proj_, rel_pos_h_, rel_pos_w_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 390-390: constructor does not initialize these fields: qkv_, proj_, rel_pos_h_, rel_pos_w_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 390-390: 3 adjacent parameters of 'Attention' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 391-391: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 393-393: 'num_heads_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 394-394: 'use_rel_pos_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 404-404: declaration uses identifier '__interpolateLinear1d', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
[error] 404-404: 2 adjacent parameters of '__interpolateLinear1d' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 406-406: variable 'input_size' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 407-407: narrowing conversion from 'int' to 'float'
(bugprone-narrowing-conversions,cppcoreguidelines-narrowing-conversions,-warnings-as-errors)
[error] 410-410: narrowing conversion from 'int' to 'float'
(bugprone-narrowing-conversions,cppcoreguidelines-narrowing-conversions,-warnings-as-errors)
[error] 411-411: variable name 'x0' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 412-412: variable 'x1' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 412-412: variable name 'x1' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 413-413: variable name 'w1' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 413-413: narrowing conversion from 'int' to 'float'
(bugprone-narrowing-conversions,cppcoreguidelines-narrowing-conversions,-warnings-as-errors)
[error] 414-414: variable name 'w0' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 414-414: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 426-426: 2 adjacent parameters of 'getRelPos' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 428-428: variable 'rel_pos_resized' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 430-430: if with identical then and else branches
(bugprone-branch-clone,-warnings-as-errors)
[error] 439-439: variable 'q_scale' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 440-440: variable 'k_scale' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 446-446: narrowing conversion from 'int' to 'float'
(bugprone-narrowing-conversions,cppcoreguidelines-narrowing-conversions,-warnings-as-errors)
[error] 447-447: variable 'embedding_dim' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 452-452: variable 'relative_coord_float' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 453-453: variable 'relative_coord_long' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 458-458: loop variable name 'd' is too short, expected at least 2 characters
(readability-identifier-length,-warnings-as-errors)
[error] 470-470: 2 adjacent parameters of 'addDecomposedRelPos' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 470-470: 2 adjacent parameters of 'addDecomposedRelPos' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 475-475: variable name 'Rh' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 475-475: invalid case style for variable 'Rh'
(readability-identifier-naming,-warnings-as-errors)
[error] 476-476: variable name 'Rw' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 476-476: invalid case style for variable 'Rw'
(readability-identifier-naming,-warnings-as-errors)
[error] 478-478: variable name 'B' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 478-478: invalid case style for variable 'B'
(readability-identifier-naming,-warnings-as-errors)
[error] 487-487: variable 'r_q_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 488-488: variable 'Rh_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 488-488: invalid case style for variable 'Rh_ptr'
(readability-identifier-naming,-warnings-as-errors)
[error] 489-489: variable 'rel_h_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 508-508: variable 'r_q_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 509-509: variable 'Rw_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 509-509: invalid case style for variable 'Rw_ptr'
(readability-identifier-naming,-warnings-as-errors)
[error] 510-510: variable 'rel_w_ptr' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 535-535: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 536-536: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 537-537: variable name 'B' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 537-537: invalid case style for variable 'B'
(readability-identifier-naming,-warnings-as-errors)
[error] 538-538: variable name 'H' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 538-538: invalid case style for variable 'H'
(readability-identifier-naming,-warnings-as-errors)
[error] 539-539: variable name 'W' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 539-539: invalid case style for variable 'W'
(readability-identifier-naming,-warnings-as-errors)
[error] 580-580: constructor does not initialize these fields: norm1_, norm2_, window_size_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 590-590: constructor does not initialize these fields: norm1_, norm2_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 590-590: 4 adjacent parameters of 'Block' of convertible types are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 591-591: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 599-599: 'window_size_' should be initialized in a member initializer of the constructor
(cppcoreguidelines-prefer-member-initializer,-warnings-as-errors)
[error] 662-662: constructor does not initialize these fields: blocks_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 668-668: constructor does not initialize these fields: blocks_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 668-668: 4 adjacent parameters of 'Blocks' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 671-671: variable 'is_in' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 672-672: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 681-681: constructor does not initialize these fields: pos_embed_, neck_, net_2_, net_3_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 692-692: constructor does not initialize these fields: pos_embed_, neck_, net_2_, net_3_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 692-692: 2 adjacent parameters of 'ImageEncoderViT' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 719-719: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 720-720: variable name 'x' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/core/aops/LayerNorm2DOp.cpp
[error] 4-4: 'mllm/core/aops/LayerNorm2DOp.hpp' file not found
(clang-diagnostic-error)
mllm/core/aops/Conv2DOp.cpp
[error] 4-4: 'mllm/core/aops/Conv2DOp.hpp' file not found
(clang-diagnostic-error)
tests/cpu/Conv2DKernelTest.hpp
[error] 3-3: 'unordered_map' file not found
(clang-diagnostic-error)
[error] 13-13: function 'naive_conv2d' has cognitive complexity of 33 (threshold 25)
(readability-function-cognitive-complexity,-warnings-as-errors)
[error] 13-13: 2 adjacent parameters of 'naive_conv2d' of similar type ('const float *') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 14-14: 2 adjacent parameters of 'naive_conv2d' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 14-14: 3 adjacent parameters of 'naive_conv2d' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 14-14: 2 adjacent parameters of 'naive_conv2d' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 14-14: 2 adjacent parameters of 'naive_conv2d' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 15-15: 2 adjacent parameters of 'naive_conv2d' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 21-21: floating point literal has suffix 'f', which is not uppercase
(readability-uppercase-literal-suffix,-warnings-as-errors)
[error] 25-25: variable name 'iy' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 26-26: variable name 'ix' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 32-32: do not use pointer arithmetic
(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)
[error] 32-32: do not use pointer arithmetic
(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)
[error] 38-38: do not use pointer arithmetic
(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)
[error] 41-41: do not use pointer arithmetic
(cppcoreguidelines-pro-bounds-pointer-arithmetic,-warnings-as-errors)
[error] 47-47: constructor does not initialize these fields: conv2d_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 53-53: constructor does not initialize these fields: conv2d_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 53-53: 3 adjacent parameters of 'Conv2DModule' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 53-53: 2 adjacent parameters of 'Conv2DModule' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 53-53: 2 adjacent parameters of 'Conv2DModule' of similar type ('int') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 59-59: 2 adjacent parameters of 'forward' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
[error] 65-65: class 'Conv2DKernelTest' defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator
(cppcoreguidelines-special-member-functions,-warnings-as-errors)
[error] 70-70: method 'testConv2DOnce' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 73-73: invalid case style for variable 'I_H'
(readability-identifier-naming,-warnings-as-errors)
[error] 74-74: invalid case style for variable 'I_W'
(readability-identifier-naming,-warnings-as-errors)
[error] 75-75: invalid case style for variable 'K_H'
(readability-identifier-naming,-warnings-as-errors)
[error] 76-76: invalid case style for variable 'K_W'
(readability-identifier-naming,-warnings-as-errors)
[error] 77-77: invalid case style for variable 'S_H'
(readability-identifier-naming,-warnings-as-errors)
[error] 78-78: invalid case style for variable 'S_W'
(readability-identifier-naming,-warnings-as-errors)
[error] 79-79: invalid case style for variable 'P_H'
(readability-identifier-naming,-warnings-as-errors)
[error] 80-80: invalid case style for variable 'P_W'
(readability-identifier-naming,-warnings-as-errors)
[error] 98-98: variable name 'GT' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 98-98: invalid case style for variable 'GT'
(readability-identifier-naming,-warnings-as-errors)
[error] 107-107: redundant boolean literal in conditional return statement
(readability-simplify-boolean-expr,-warnings-as-errors)
[error] 113-113: method 'testConv2D' can be made static
(readability-convert-member-functions-to-static,-warnings-as-errors)
[error] 113-113: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
mllm/compile/ir/NodeRTTIClassOfImpl.hpp
[error] 186-186: function-like macro 'RTTI_RK_OP_LINALGIROP_LAYERNORM2DOP_IMPL' used; consider a 'constexpr' template function
(cppcoreguidelines-macro-usage,-warnings-as-errors)
mllm/models/deepseek_ocr/configuration_deepseek_ocr.hpp
[error] 5-5: 'mllm/core/aops/LinearOp.hpp' file not found
(clang-diagnostic-error)
[error] 11-11: constructor does not initialize these fields: clip_linear_impl_type, sam_linear_impl_type
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 14-14: constructor does not initialize these fields: clip_linear_impl_type, sam_linear_impl_type
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
[error] 139-139: 3.7362 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
tests/cpu/KernelTest.cpp
[error] 873-873: all parameters should be named in a function
(readability-named-parameter,-warnings-as-errors)
[error] 876-876: 1024 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 877-877: 224 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 878-878: 224 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 879-879: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 880-880: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 881-881: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
[error] 882-882: 14 is a magic number; consider replacing it with a named constant
(cppcoreguidelines-avoid-magic-numbers,readability-magic-numbers,-warnings-as-errors)
mllm/nn/Functional.cpp
[error] 121-121: 4 adjacent parameters of 'scaledDotProductAttention' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
🪛 markdownlint-cli2 (0.18.1)
mllm/models/deepseek_ocr/README.md
51-51: Images should have alternate text (alt text)
(MD045, no-alt-text)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-android
- GitHub Check: build-macos
```cpp
@@ -0,0 +1,89 @@
// Copyright (c) MLLM Team.
// Licensed under the MIT License.
#include "mllm/backends/cpu/kernels/arm/conv2d.hpp"
```
Remove unrelated conv2d include; include CPUArchHelper directly.
This header isn’t used here and caused a missing-header error in tools. Include the arch helper instead.
-#include "mllm/backends/cpu/kernels/arm/conv2d.hpp"
+#include "mllm/utils/CPUArchHelper.hpp"

📝 Committable suggestion
#include "mllm/utils/CPUArchHelper.hpp"
🧰 Tools
🪛 Clang (14.0.6)
[error] 3-3: 'mllm/backends/cpu/kernels/arm/conv2d.hpp' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
In mllm/backends/cpu/kernels/arm/layernorm2d.cpp around line 3, remove the
unrelated #include "mllm/backends/cpu/kernels/arm/conv2d.hpp" and replace it
with the CPU arch helper header; add an include for the CPUArchHelper header
(e.g. #include "mllm/backends/cpu/cpu_arch_helper.hpp") so the file uses the
correct helper instead of pulling in conv2d.hpp.
mllm/backends/cpu/ops/Conv2DOp.cpp

  // Pack data
  switch (options_.impl_type) {
    case aops::Conv2DOpImplType::kDefault: {
      // We will do im2col algorithm when using default impl. We will packing weight here.
      MLLM_INFO("Packing Conv2D weight to im2col format. kh={}, kw={}, pw={}, ph={}, dw={}, dh={}, sw={}, sh={}",
                kernel_size[0], kernel_size[1], padding[0], padding[1], dilation[0], dilation[1], stride[0], stride[1]);
      auto packed_weight = Tensor::empty(
                               {
                                   options_.out_channels,
                                   options_.in_channels * kernel_size[0] * kernel_size[1],
                               },
                               weight_.dtype(), weight_.device())
                               .alloc();
#if defined(MLLM_HOST_ARCH_ARM64) || defined(MLLM_HOST_ARCH_ARM)
      arm::conv2d_fp32_im2col_weight(weight_.ptr<mllm_fp32_t>(), packed_weight.ptr<mllm_fp32_t>(), options_.out_channels,
                                     options_.in_channels, kernel_size[0], kernel_size[1]);
#else
      MLLM_ERROR_EXIT(ExitCode::kCoreError, "Unsupported architecture for packing conv2d weight into im2col format.");
#endif
      weight_ = packed_weight;
      break;
    }
    default: {
      NYI("Unsupported impl type")
    }
  }
🛠️ Refactor suggestion | 🟠 Major
ARM-only weight packing; add baseline fallback or guard earlier.
Currently exits on non-ARM. If x86 builds are supported anywhere, provide a scalar packing path or guard the op factory registration to avoid selecting Conv2D on unsupported hosts.
-#else
- MLLM_ERROR_EXIT(ExitCode::kCoreError, "Unsupported architecture for packing conv2d weight into im2col format.");
+#else
+ NYI("conv2d weight packing (im2col) not implemented on this host arch");
+ return;
 #endif

🤖 Prompt for AI Agents
In mllm/backends/cpu/ops/Conv2DOp.cpp around lines 41 to 67, the current weight
packing only has an ARM-specific path and calls MLLM_ERROR_EXIT for other
architectures; implement a non-ARM fallback or prevent registration on non-ARM
hosts. Fix by either (A) adding a portable scalar im2col weight-packing
implementation using simple nested loops to transform weight_ into the packed
layout and use that when not on ARM, or (B) move the architecture check earlier
and guard the Conv2D op factory/registration so this implementation is never
selected on non-ARM builds; choose one approach and remove the MLLM_ERROR_EXIT,
ensuring weight_ is set to the packed tensor in all compiled configurations.
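For option (A), a portable scalar packing path could be as small as the sketch below. It assumes the source weight is contiguous OIHW ([O, C, KH, KW]), in which case the packed [O, C*KH*KW] layout is a plain row-major copy; the function name mirrors the ARM kernel's signature, but the implementation is illustrative:

```cpp
#include <cstddef>

// Scalar im2col weight packing sketch; assumes contiguous OIHW source layout.
static void conv2d_fp32_im2col_weight_ref(const float* src, float* dst,
                                          int out_channels, int in_channels,
                                          int kh, int kw) {
  const std::size_t row = static_cast<std::size_t>(in_channels) * kh * kw;
  for (int o = 0; o < out_channels; ++o) {
    const float* s = src + static_cast<std::size_t>(o) * row;
    float* d = dst + static_cast<std::size_t>(o) * row;
    for (std::size_t i = 0; i < row; ++i) d[i] = s[i];  // or std::memcpy
  }
}
```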
mllm/backends/cpu/ops/Conv2DOp.cpp

#if defined(MLLM_HOST_ARCH_ARM64) || defined(MLLM_HOST_ARCH_ARM)
  // step 1. im2col inputs to tmp
  auto packed_inputs = Tensor::empty({MATMUL_K, MATMUL_N}, input.dtype(), input.device()).alloc();
  arm::conv2d_fp32_im2col_input(input.ptr<mllm_fp32_t>(), options_.in_channels, input.shape()[1], input.shape()[2],
                                kernel_size[0], kernel_size[1], padding[0], padding[1], stride[0], stride[1],
                                dilation[0], dilation[1], packed_inputs.ptr<mllm_fp32_t>());
  // step 2. Do matmul
🧩 Analysis chain
Bug: wrong H/W arguments and batch dimension ignored in im2col.
- You pass input.shape()[1] (C) where H is expected.
- Batch dimension B is ignored, producing wrong results for B > 1.
- auto packed_inputs = Tensor::empty({MATMUL_K, MATMUL_N}, input.dtype(), input.device()).alloc();
- arm::conv2d_fp32_im2col_input(input.ptr<mllm_fp32_t>(), options_.in_channels, input.shape()[1], input.shape()[2],
- kernel_size[0], kernel_size[1], padding[0], padding[1], stride[0], stride[1],
- dilation[0], dilation[1], packed_inputs.ptr<mllm_fp32_t>());
+ const int B = input.shape()[0];
+ auto packed_inputs =
+ Tensor::empty({MATMUL_K, B * MATMUL_N}, input.dtype(), input.device()).alloc();
+ const int in_h = input.shape()[2], in_w = input.shape()[3];
+ const int out_h = output.shape()[2], out_w = output.shape()[3];
+ const int in_plane = options_.in_channels * in_h * in_w;
+ for (int b = 0; b < B; ++b) {
+ const mllm_fp32_t* in_ptr = input.ptr<mllm_fp32_t>() + b * in_plane;
+ mllm_fp32_t* out_ptr = packed_inputs.ptr<mllm_fp32_t>() + b * (MATMUL_K * out_h * out_w);
+ arm::conv2d_fp32_im2col_input(in_ptr, options_.in_channels, in_h, in_w,
+ kernel_size[0], kernel_size[1],
+ padding[0], padding[1],
+ stride[0], stride[1],
+ dilation[0], dilation[1],
+ out_ptr);
+  }

🏁 Script executed:
#!/bin/bash
# Confirm conv2d_fp32_im2col_input signature (param order)
rg -nP 'conv2d_fp32_im2col_input\s*\(' -S -C3 mllm | sed -n '1,120p'

Length of output: 2469
I've confirmed the function signature and checked how packed_inputs is used after the im2col call.

Bug confirmed: critical H/W parameter swap and missing batch handling.

The review comment is correct. The function signature expects:

conv2d_fp32_im2col_input(input_data, channels, height, width, ...)

But the current call passes:

input.ptr<mllm_fp32_t>(), options_.in_channels, input.shape()[1], input.shape()[2], ...

For an NCHW input this hands (C, H) to the (height, width) slots, while im2col expects (C, H, W) per image:

- input.shape()[1] is C (channels), not H
- input.shape()[2] is H (height), not W
- Correct is input.shape()[2] for H and input.shape()[3] for W

Additionally, batch processing requires looping over batch samples per the standard im2col pattern, which the current code omits entirely; only the first batch element is processed.
The proposed fix correctly addresses both issues with batch-aware tensor allocation and per-batch im2col calls.
🤖 Prompt for AI Agents
In mllm/backends/cpu/ops/Conv2DOp.cpp around lines 98–104, the im2col call
passes wrong spatial dims and omits batching: swap the height/width arguments so
you pass input.shape()[2] as height and input.shape()[3] as width (not
shape()[1] and shape()[2]), allocate the packed_inputs tensor to include the
batch dimension (size MATMUL_K per image times batch) and then loop over the
batch: for each batch index compute the per-image input pointer offset, call
conv2d_fp32_im2col_input with that per-image pointer and correct H/W, and write
the output into the corresponding slice of packed_inputs so subsequent matmul
uses all images.
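For reference, the column layout the fix assumes looks like this scalar sketch of the im2col input transform for a single NCHW image (parameter order follows the signature confirmed above; this is illustrative, not the NEON kernel):

```cpp
// Scalar im2col sketch: writes a [C*KH*KW, OH*OW] column matrix for one image.
// Out-of-bounds taps introduced by padding are written as zero.
static void conv2d_fp32_im2col_input_ref(const float* in, int C, int H, int W,
                                         int kh, int kw, int ph, int pw,
                                         int sh, int sw, int dh, int dw,
                                         float* col) {
  const int OH = (H + 2 * ph - dh * (kh - 1) - 1) / sh + 1;
  const int OW = (W + 2 * pw - dw * (kw - 1) - 1) / sw + 1;
  for (int c = 0; c < C; ++c)
    for (int ky = 0; ky < kh; ++ky)
      for (int kx = 0; kx < kw; ++kx) {
        const int row = (c * kh + ky) * kw + kx;
        for (int oy = 0; oy < OH; ++oy)
          for (int ox = 0; ox < OW; ++ox) {
            const int iy = oy * sh - ph + ky * dh;
            const int ix = ox * sw - pw + kx * dw;
            const bool inside = iy >= 0 && iy < H && ix >= 0 && ix < W;
            col[row * OH * OW + oy * OW + ox] = inside ? in[(c * H + iy) * W + ix] : 0.0f;
          }
      }
}
```

Calling this once per batch element, as the fix does, makes the (C, H, W) contract explicit.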
void CPULayerNorm2DOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
  auto& i = inputs[0];
  auto& o = outputs[0];

  auto i_shape = i.shape();
  auto N = i_shape[0];
  auto C = i_shape[1];
  auto H = i_shape[2];
  auto W = i_shape[3];

  switch (i.dtype()) {
    case kFloat32: {
#if defined(MLLM_HOST_ARCH_X86_64) || defined(MLLM_HOST_ARCH_X86)
      NYI("Not impl for x86 64");
#elif defined(MLLM_HOST_ARCH_ARM64) || defined(MLLM_HOST_ARCH_ARM)
      arm::layernorm2d_fp32(i.ptr<mllm_fp32_t>(), weight_.ptr<mllm_fp32_t>(), bias_.ptr<mllm_fp32_t>(), o.ptr<mllm_fp32_t>(), N,
                            C, H, W, options_.eps);
#endif
      break;
    }
    default: {
      NYI("Not support data type");
    }
  }
Add shape/dtype checks and handle optional gamma/beta.
- Enforce 4D NCHW inputs and matching output shape.
- Guard against missing/empty weight_/bias_ by passing nullptrs.
- Validate C matches options_.num_channels if set.
- Keep ARM path; return NYI for others, but report clearer error.
void CPULayerNorm2DOp::forward(const std::vector<Tensor>& inputs, std::vector<Tensor>& outputs) {
- auto& i = inputs[0];
- auto& o = outputs[0];
+ MLLM_CHECK(inputs.size() == 1 && outputs.size() == 1, "LayerNorm2D expects 1 input and 1 output");
+ const auto& i = inputs[0];
+ auto& o = outputs[0];
+ MLLM_CHECK(i.ndim() == 4, "LayerNorm2D expects 4D NCHW input");
+ MLLM_CHECK(o.shape() == i.shape(), "LayerNorm2D output must match input shape");
auto i_shape = i.shape();
auto N = i_shape[0];
auto C = i_shape[1];
auto H = i_shape[2];
auto W = i_shape[3];
switch (i.dtype()) {
case kFloat32: {
#if defined(MLLM_HOST_ARCH_X86_64) || defined(MLLM_HOST_ARCH_X86)
- NYI("Not impl for x86 64");
+ NYI("LayerNorm2D kFloat32 is not implemented on x86 yet");
#elif defined(MLLM_HOST_ARCH_ARM64) || defined(MLLM_HOST_ARCH_ARM)
- arm::layernorm2d_fp32(i.ptr<mllm_fp32_t>(), weight_.ptr<mllm_fp32_t>(), bias_.ptr<mllm_fp32_t>(), o.ptr<mllm_fp32_t>(), N,
- C, H, W, options_.eps);
+ const mllm_fp32_t* gamma = weight_.empty() ? nullptr : weight_.ptr<mllm_fp32_t>();
+ const mllm_fp32_t* beta = bias_.empty() ? nullptr : bias_.ptr<mllm_fp32_t>();
+ MLLM_CHECK(options_.num_channels == 0 || options_.num_channels == C,
+ "LayerNorm2D: options_.num_channels={} mismatches input C={}", options_.num_channels, C);
+ MLLM_CHECK(options_.eps > 0.f, "LayerNorm2D: eps must be > 0");
+ arm::layernorm2d_fp32(i.ptr<mllm_fp32_t>(), gamma, beta, o.ptr<mllm_fp32_t>(), N, C, H, W, options_.eps);
#endif
break;
}
default: {
NYI("Not support data type");
}
}
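For context on what the kernel computes, LayerNorm2D here normalizes across the channel dimension at each (n, h, w) position and then applies per-channel gamma/beta. A scalar sketch of the assumed semantics, matching the (N, C, H, W) signature above (illustrative, not the NEON implementation):

```cpp
#include <cmath>
#include <cstddef>

// Scalar reference for layernorm2d_fp32; gamma/beta may be null (no affine).
static void layernorm2d_fp32_ref(const float* x, const float* gamma, const float* beta,
                                 float* y, int N, int C, int H, int W, float eps) {
  const int HW = H * W;
  for (int n = 0; n < N; ++n) {
    for (int p = 0; p < HW; ++p) {
      float mean = 0.f;
      for (int c = 0; c < C; ++c) mean += x[(n * C + c) * HW + p];
      mean /= C;
      float var = 0.f;
      for (int c = 0; c < C; ++c) {
        const float d = x[(n * C + c) * HW + p] - mean;
        var += d * d;
      }
      const float inv_std = 1.f / std::sqrt(var / C + eps);
      for (int c = 0; c < C; ++c) {
        const std::size_t idx = (static_cast<std::size_t>(n) * C + c) * HW + p;
        const float g = gamma ? gamma[c] : 1.f;
        const float b = beta ? beta[c] : 0.f;
        y[idx] = (x[idx] - mean) * inv_std * g + b;
      }
    }
  }
}
```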
mllm/core/aops/Conv2DOp.cpp

      if (options_.bias) { bias_ = ploader->pull(getName() + ".bias"); }
      break;
    }
    default: NYI("Unsupported model file version")
Missing semicolon after NYI macro in default case.
This will fail to compile if NYI doesn’t inject a trailing semicolon. Add it explicitly.
- default: NYI("Unsupported model file version")
+  default: NYI("Unsupported model file version");

📝 Committable suggestion
    default: NYI("Unsupported model file version");
🤖 Prompt for AI Agents
In mllm/core/aops/Conv2DOp.cpp around line 36, the default case uses
NYI("Unsupported model file version") without a trailing semicolon which can
cause a compile error if the NYI macro doesn't supply one; add an explicit
semicolon after the NYI(...) call in the default case so it reads as a complete
statement.
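As background, the usual way to make a macro like NYI semicolon-friendly is the do { ... } while (0) idiom, which forces every call site to supply its own trailing semicolon. A sketch (the actual NYI definition in mllm may differ):

```cpp
// Illustrative only; not mllm's actual definition of NYI.
#define NYI(msg) \
  do { MLLM_ERROR_EXIT(ExitCode::kCoreError, "Not yet implemented: " msg); } while (0)
```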
mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp

#include <vector>
#include <unordered_map>

#include "mllm/models/ARGeneration.hpp"
#include "mllm/preprocessor/tokenizers/BPE.hpp"
#include "mllm/preprocessor/tokenizers/Unicode.hpp"
#include "mllm/preprocessor/tokenizers/AutoTokenizer.hpp"
Include missing standard headers used by declarations.
Add `<string>` and `<cstdint>` for std::string and int64_t.
#include <vector>
#include <unordered_map>
+#include <string>
+#include <cstdint>

📝 Committable suggestion
#include <vector>
#include <unordered_map>
#include <string>
#include <cstdint>

#include "mllm/models/ARGeneration.hpp"
#include "mllm/preprocessor/tokenizers/BPE.hpp"
#include "mllm/preprocessor/tokenizers/Unicode.hpp"
#include "mllm/preprocessor/tokenizers/AutoTokenizer.hpp"
🧰 Tools
🪛 Clang (14.0.6)
[error] 11-11: 'vector' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
In mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp around lines 11 to 17,
the header is missing standard includes for types used in declarations; add
#include <string> and #include <cstdint> to the include block (place them
alongside <vector> and <unordered_map>) so std::string and int64_t are available
to the header.
mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp

class DpskOcrTokenizer final : public mllm::preprocessor::AutoTokenizer {
  explicit DpskOcrTokenizer(const std::string& file_path) { preprocessor::initLocal(); }

  std::vector<std::wstring> _tokenize(const std::string& str) override {
    // TODO
    return {};
  }

  std::vector<std::wstring> tokenize(const std::string& str) override {
    // TODO
    return {};
  }

  std::wstring _detokenize(int64_t pos_idx) override {
    // TODO
    return L"";
  }

  std::wstring detokenize(int64_t pos_idx) override {
    // TODO
    return _detokenize(pos_idx);
  }

  Tensor convert2Ids(const std::vector<std::wstring>& strs) override { return Tensor::nil(); }

 private:
  // For text
  preprocessor::BPE bpe_;
};
Constructor and overrides are private; add public:.
Without a public section, you can't instantiate the tokenizer and overrides become inaccessible.
-class DpskOcrTokenizer final : public mllm::preprocessor::AutoTokenizer {
- explicit DpskOcrTokenizer(const std::string& file_path) { preprocessor::initLocal(); }
+class DpskOcrTokenizer final : public mllm::preprocessor::AutoTokenizer {
+ public:
+ explicit DpskOcrTokenizer(const std::string& file_path)
+ : mllm::preprocessor::AutoTokenizer(), bpe_() {
+ preprocessor::initLocal();
+ (void)file_path; // TODO: load BPE from file_path
+ }
@@
- Tensor convert2Ids(const std::vector<std::wstring>& strs) override { return Tensor::nil(); }
+  Tensor convert2Ids(const std::vector<std::wstring>& strs) override { return Tensor::nil(); }

📝 Committable suggestion
class DpskOcrTokenizer final : public mllm::preprocessor::AutoTokenizer {
 public:
  explicit DpskOcrTokenizer(const std::string& file_path)
      : mllm::preprocessor::AutoTokenizer(), bpe_() {
    preprocessor::initLocal();
    (void)file_path;  // TODO: load BPE from file_path
  }

  std::vector<std::wstring> _tokenize(const std::string& str) override {
    // TODO
    return {};
  }

  std::vector<std::wstring> tokenize(const std::string& str) override {
    // TODO
    return {};
  }

  std::wstring _detokenize(int64_t pos_idx) override {
    // TODO
    return L"";
  }

  std::wstring detokenize(int64_t pos_idx) override {
    // TODO
    return _detokenize(pos_idx);
  }

  Tensor convert2Ids(const std::vector<std::wstring>& strs) override { return Tensor::nil(); }

 private:
  // For text
  preprocessor::BPE bpe_;
};
🧰 Tools
🪛 Clang (14.0.6)
[error] 23-23: constructor does not initialize these fields: bpe_
(cppcoreguidelines-pro-type-member-init,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/models/deepseek_ocr/tokenization_deepseek_ocr.hpp around lines 22 to 50,
the class members (constructor and virtual override methods) are declared under
the default private section, making the tokenizer uninstantiable and overrides
inaccessible; move the constructor and all override methods
(DpskOcrTokenizer(const std::string&), _tokenize, tokenize, _detokenize,
detokenize, convert2Ids) under a public: section so they are publicly accessible
while keeping bpe_ private.
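Once the members are public, a minimal smoke test would look like this (the file path and prompt are placeholders, and the methods are still TODO stubs, so results are empty for now):

```cpp
// Hypothetical usage; path and prompt are placeholders, not from the repository.
DpskOcrTokenizer tokenizer("./deepseek_ocr/tokenizer.json");
auto pieces = tokenizer.tokenize("<|grounding|>OCR this image.");  // returns {} until implemented
auto ids = tokenizer.convert2Ids(pieces);                          // returns Tensor::nil() until implemented
```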
mllm/nn/Functional.cpp

Tensor scaledDotProductAttention(const Tensor& Q, const Tensor& K, const Tensor& V, const Tensor& mask) {
  auto scale = Q.size(-1);
  scale = (1.f / sqrtf(scale));
  auto attn_weight = matmul(Q, K, false, true) * scale;
  if (mask) { attn_weight = attn_weight + mask; }
  attn_weight = softmax(attn_weight, -1);
  return matmul(attn_weight, V);
}
🛠️ Refactor suggestion | 🟠 Major
Improve type clarity in scale computation.
The variable scale is reused for two different purposes (dimension size, then scaling factor), which hurts readability. Worse, if Tensor::size() returns an integer type, auto deduces scale as an integer, and assigning 1.f / sqrtf(scale) back into it truncates the factor (which is below 1 for any head dim above 1) to zero; the conversion should be made explicit.
Apply this diff for clearer code:
Tensor scaledDotProductAttention(const Tensor& Q, const Tensor& K, const Tensor& V, const Tensor& mask) {
- auto scale = Q.size(-1);
- scale = (1.f / sqrtf(scale));
+ const auto d_k = static_cast<float>(Q.size(-1));
+ const float scale = 1.0f / sqrtf(d_k);
auto attn_weight = matmul(Q, K, false, true) * scale;
if (mask) { attn_weight = attn_weight + mask; }
attn_weight = softmax(attn_weight, -1);
return matmul(attn_weight, V);
 }

📝 Committable suggestion
Tensor scaledDotProductAttention(const Tensor& Q, const Tensor& K, const Tensor& V, const Tensor& mask) {
  const auto d_k = static_cast<float>(Q.size(-1));
  const float scale = 1.0f / sqrtf(d_k);
  auto attn_weight = matmul(Q, K, false, true) * scale;
  if (mask) { attn_weight = attn_weight + mask; }
  attn_weight = softmax(attn_weight, -1);
  return matmul(attn_weight, V);
}
🧰 Tools
🪛 Clang (14.0.6)
[error] 121-121: 4 adjacent parameters of 'scaledDotProductAttention' of similar type ('const int &') are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
🤖 Prompt for AI Agents
In mllm/nn/Functional.cpp around lines 121 to 128, the variable `scale` is
reused first as a dimension and then as the floating-point scaling factor which
reduces readability and relies on implicit int-to-float conversion; change this
by introducing an integer dimension variable (e.g., `int dim = Q.size(-1);`) and
then compute a separate float scaling factor (e.g., `float scale = 1.f /
sqrtf(static_cast<float>(dim));`), and use that `scale` in the attention
computation; keep the rest of the logic unchanged.
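The hazard is easy to reproduce in isolation, assuming Tensor::size() returns an integer type as the auto deduction suggests (standalone sketch, not mllm code):

```cpp
#include <cmath>
#include <cstdio>

int main() {
  auto scale = 64;               // deduced as int, like `auto scale = Q.size(-1)`
  scale = (1.f / sqrtf(scale));  // 0.125f truncated back into an int -> 0
  std::printf("%d\n", scale);    // prints 0: every attention weight would be scaled to zero
  const float fixed = 1.0f / sqrtf(static_cast<float>(64));
  std::printf("%f\n", fixed);    // prints 0.125000
  return 0;
}
```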
mllm/nn/layers/Conv2D.cpp

Tensor Conv2D::weight() const { return std::static_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp())->weight(); }

Tensor Conv2D::bias() const { return std::static_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp())->bias(); }
Use dynamic cast and check the underlying op type.
A static cast invokes undefined behavior if the instanced op is not an aops::Conv2DOp. Prefer a dynamic cast with an explicit check.
-Tensor Conv2D::weight() const { return std::static_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp())->weight(); }
-
-Tensor Conv2D::bias() const { return std::static_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp())->bias(); }
+Tensor Conv2D::weight() const {
+ auto op = std::dynamic_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp());
+ MLLM_CHECK(op, "Conv2D underlying op is not aops::Conv2DOp");
+ return op->weight();
+}
+
+Tensor Conv2D::bias() const {
+ auto op = std::dynamic_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp());
+ MLLM_CHECK(op, "Conv2D underlying op is not aops::Conv2DOp");
+ return op->bias();
+}

📝 Committable suggestion
Tensor Conv2D::weight() const {
  auto op = std::dynamic_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp());
  MLLM_CHECK(op, "Conv2D underlying op is not aops::Conv2DOp");
  return op->weight();
}

Tensor Conv2D::bias() const {
  auto op = std::dynamic_pointer_cast<aops::Conv2DOp>(impl()->getInstancedOp());
  MLLM_CHECK(op, "Conv2D underlying op is not aops::Conv2DOp");
  return op->bias();
}
🤖 Prompt for AI Agents
In mllm/nn/layers/Conv2D.cpp around lines 25 to 27, the code uses
std::static_pointer_cast to convert impl()->getInstancedOp() to aops::Conv2DOp
which can cause undefined behavior if the underlying op is not that type;
replace the static cast with std::dynamic_pointer_cast<aops::Conv2DOp>, check
the resulting pointer for null, and handle the error (e.g., throw a descriptive
std::runtime_error or assert) before calling weight() or bias() so you only
dereference when the cast succeeds.
mllm/nn/Module.hpp
Outdated
template<typename T>
class ModuleLists;
class ModuleList;

template<typename T>
class ModuleListsSuffix;
class ModuleListSuffix;
Fix the forward declaration mismatch.
The forward declaration at line 48 declares ModuleListSuffix, but the actual class definition at line 202 is named ModuleListSuffixed (with "ed" at the end). This naming inconsistency will cause compilation errors if the forward-declared type is referenced before its definition.
Apply this diff to align the forward declaration with the class name:
template<typename T>
-class ModuleListSuffix;
+class ModuleListSuffixed;

📝 Committable suggestion
template<typename T>
class ModuleList;

template<typename T>
class ModuleListSuffixed;
🤖 Prompt for AI Agents
In mllm/nn/Module.hpp around lines 44 to 48, the forward declaration declares
ModuleListSuffix but the actual class is named ModuleListSuffixed; update the
forward declaration to match the real class name (declare template<typename T>
class ModuleListSuffixed;) so the forward declaration and definition are
consistent and avoid compilation errors.
…ng utilities

- Add `Conversation` class to manage prompt templates and conversation history
- Support multiple separator styles: DeepSeek, DeepSeekV2, PLAIN, ALIGNMENT
- Implement methods for generating prompts, converting to Gradio/OpenAI formats
- Add global registry for conversation templates with initialization function
- Include image loading functionality from conversation messages
- Refactor NoTPAttention module with proper tensor operations and attention mechanism
- Implement padding logic in Block class for window-based processing
- Add cubic interpolation and absolute position embedding resizing in ImageEncoderViT

… modes and padding strategies

Added comprehensive interpolation and padding support for CPU backend:

- InterpolateOp with nearest, linear, bilinear, bicubic, and trilinear modes
- PadOp with constant, reflect, replicate, and circular padding modes
- Updated CPU backend to register new operations
- Fixed conv2d weight packing and bias addition logic
- Added documentation for new functional APIs

The implementation supports various tensor dimensions and aligns with PyTorch-style behavior for consistent results.

- Implement `dynamic_preprocess` to generate image tiles with aspect ratio handling
- Add `Image::crop` with PIL-style out-of-bounds padding
- Introduce `ImageTransform` system with composable transforms (Resize, CenterCrop, etc.)
- Support common preprocessing pipelines via `BasicImageTransform`
- Enable thumbnail generation and grid-based tiling for OCR models

feat(tokenizer): remove unused header include

- Remove unnecessary `ARGeneration.hpp` include in tokenization header

chore(deepseek_ocr): add missing `<limits>` header

- Include `<limits>` for `std::numeric_limits` usage in aspect ratio calculation

- Replace `keep_aspect_ratio` option with `antialias` in interpolate operations
- Implement Gaussian blur utility for antialiasing in bilinear and bicubic modes
- Update documentation to reflect the new `antialias` parameter
- Refactor internal interpolate functions to use new API with antialiasing
- Fix incorrect OpType in InterpolateOp constructor
- Simplify position embedding interpolation using new functional API
- Remove custom padding logic in favor of `nn::functional::pad`

- Add `mlp_projector_linear_impl_type` to `DpskOcrConfig` for configuring the linear implementation type in MLP projector
- Update conversation preprocessing templates:
  - Replace `<|sft▁begin|>` with `<|sft begin|>`
  - Adjust separator tags and role markers in deepseek and deepseekv2 templates
- Refactor `dynamic_preprocess` to `dynamicPreprocess`:
  - Return both processed images and target aspect ratio
  - Improve tiling logic using candidate aspect ratios within min/max limits
  - Support optional thumbnail generation based on block count
- Introduce `MlpProjector` module for mapping visual tokens to text token space
- Include necessary headers (`<set>`, `<algorithm>`, `<cmath>`) in preprocessing header

- Implement `formatMessages` function to process conversation JSON with trimming and role/content extraction
- Add `DeepseekOCRForCausalLM` class with basic inference logic and image handling
- Update tokenizer implementation to support special tokens and proper tokenization
- Remove unused `<unordered_map>` include in tokenizer header

- Add tokenizers-cpp as a submodule under mllm/ext/vendors
- Add opencv-mobile as a submodule under mllm/ext/vendors
- Update .gitmodules to include the new submodules
- Set update policy to 'none' for tokenizers-cpp
- Add README.md to document the purpose of mllm extension
- Add CMakeLists.txt for deepseek_ocr example
- Register deepseek_ocr example in examples/CMakeLists.txt

- Add llvm-project as a submodule with none update strategy
- Update conversation preprocessing with corrected special tokens
- Register conversation templates with optional end tokens
- Initialize templates in model inference and log processed prompt

… model

- Implement `StackOp` for CPU backend with optimized memory copy for contiguous tensors and fallback for non-contiguous cases
- Register `StackOp` in IR and core op system with proper RTTI and factory integration
- Update DeepSeek-OCR model preprocessing to use `stack` for image tensor batching
- Add `stack` functional API in `nn::functional`
- Extend tokenizer utilities and string splitting helpers for better text processing
- Introduce image padding utility that mirrors PIL's ImageOps.pad behavior
- Add extension options in CMake for future LLVM and tokenizers-cpp integrations
- Update example inference call with grounding prompt for OCR task

- Integrate utfcpp library for proper UTF-8 handling
- Replace wide string-based tokenization with UTF-8 compatible logic
- Update tokenizer to inherit from AutoTokenizerUTF8
- Add BPEUTF8 class for byte-level BPE tokenization with UTF-8 support
- Modify preprocessing logic to handle Unicode characters correctly
- Update CMakeLists.txt to include utfcpp headers and install rules

This change enables the DeepSeek OCR model to correctly process Unicode text inputs, including CJK characters and other multibyte UTF-8 sequences, improving internationalization support.

- Remove tokenizers-cpp submodule from .gitmodules
- Add new tokenizers submodule pointing to meta-pytorch/tokenizers
- Update README.md to reflect the change in tokenizer dependency
- Update subproject commit reference for the new tokenizers module

…support

- Add `DpskOcrTokenizer` with byte-level BPE tokenization
- Support encoding/decoding using UTF-8 code points
- Integrate regex-based pre-tokenizer from GPT-2 standard
- Update `AutoTokenizerUTF8` interface to include encode/decode methods
- Refactor `BPEUTF8` to work with UTF-32 code point vectors
- Add unicode processing utilities from llama.cpp
- Enable meta-torch-tokenizers extension in CMake

fix(tokenizer): replace tokenize with encode in deepseek-ocr model

- Use `encode` instead of `tokenize` for proper id conversion
- Adjust sequence mask generation accordingly

build(cmake): add extension support and tokenizer options

- Introduce `MLLM_EXT_ENABLE` option
- Rename `MLLM_EXT_ENABLE_TOKENIZERS_CPP` to `MLLM_EXT_ENABLE_META_TORCH_TOKENIZERS`
- Add subdirectory for external components when enabled

test(tokenizer): add print statements for tokenizer debugging

- Print tokenization and encoding results for various inputs
- Include space, newline, and unicode characters

chore(fmt): add formatter for vector<string> and vector<int64_t>

- Enable formatted printing of string and int64_t vectors
- Support cleaner debug output in examples

… tokens

- Add new special tokens including `<image>`, `<|grounding|>`, `<tr>`, and `</tr>` to `DpskOcrTokenizer`
- Implement `TrieUTF8` to properly handle UTF-8 encoded special tokens in tokenization
- Update `tokenize()` and `decode()` methods to correctly process and detokenize strings with special tokens
- Remove outdated debug print statements from example code
- Modify `fmt::format_to` to avoid extra quotes around vector elements
- Refactor tokenization logic to integrate regex and byte-level processing within special token handling

This change enhances the DeepSeek OCR tokenizer's ability to work with complex input sequences involving special tokens and improves overall string handling.
Summary by CodeRabbit
New Features
Tests