Inconsistent Embedding Array Structure for Posts vs. Terms Needs Normalization on Save #992

@faisal-alvi

Description

While working on #962, I noticed that AI suggestions for tags and categories were intermittently failing. After debugging, I found that the $embeddings variable was sometimes populated with a single array and other times with multiple nested arrays. I added a potential fix to normalize this behavior (5e1c38a and 0d2564e) but later removed it (3a20812) following the feedback in #962 (comment).

Current behavior:

  • Terms: $embeddings is always a single array.
  • Posts: $embeddings can be multiple arrays (depending on post length) or a single array.
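
To make the two shapes concrete, here is a minimal sketch. The values are made up for illustration, and `is_nested_embedding()` is a hypothetical helper, not a function from the plugin:

```php
<?php
// Illustrative values only; real embedding vectors are much longer.
$term_embeddings = [ 0.12, -0.48, 0.33 ];   // a single flat vector
$post_embeddings = [                         // assumed: one vector per chunk of a long post
    [ 0.10, -0.40, 0.30 ],
    [ 0.05, 0.22, -0.17 ],
];

/**
 * Hypothetical helper: returns true when $embeddings is a list of
 * vectors, false when it is a single flat vector (or empty).
 */
function is_nested_embedding( array $embeddings ): bool {
    $first = reset( $embeddings );
    return is_array( $first );
}

// Code that assumes one shape will misread the other.
foreach ( [ $term_embeddings, $post_embeddings ] as $embeddings ) {
    echo is_nested_embedding( $embeddings ) ? "nested\n" : "flat\n";
}
```

A consumer that loops over `$embeddings` expecting each element to be a vector would receive individual floats in the flat case, which matches the intermittent failures described above.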

This inconsistency appears to originate from how embeddings are generated in Smart404EPIntegration.php#L214.

Per feedback, normalization should happen at save time when storing embedding data in post/term meta, rather than at the point of use.

Proposed Fix:

  • Investigate the embedding save process so that embeddings are always stored in one predictable format, whether the source produces a single array or multiple arrays.
  • Determine a backward-compatibility strategy for sites that may already have mixed or legacy data in their database.
  • Add tests to confirm consistent structure for both terms and posts.
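
One possible shape for the normalization step is sketched below. It assumes the canonical stored format is a list of vectors. That is an assumption, since the issue does not yet settle which format to use. `normalize_embeddings()` is a hypothetical name, not an existing function in the plugin:

```php
<?php
/**
 * Hypothetical helper: always return a list of vectors.
 *
 * - A single flat vector becomes a one-element list.
 * - A list of vectors is returned re-indexed but otherwise unchanged.
 * - Empty or non-array input becomes an empty list.
 */
function normalize_embeddings( $embeddings ): array {
    if ( ! is_array( $embeddings ) || empty( $embeddings ) ) {
        return [];
    }

    $first = reset( $embeddings );

    if ( is_array( $first ) ) {
        // Already a list of vectors.
        return array_values( $embeddings );
    }

    // A single flat vector: wrap it so callers always get a list.
    return [ array_values( $embeddings ) ];
}

// At save time, the normalized value would be what gets stored, e.g.
// update_post_meta( $post_id, $meta_key, normalize_embeddings( $embeddings ) );
// where $meta_key stands in for whatever key the plugin actually uses.
```

Because the function is idempotent, the same helper could also be applied when reading existing meta, which is one way to tolerate legacy data without a migration. The tests mentioned above could assert that both a flat and a nested input come back as a list of arrays.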

Impact:

  • Prevents intermittent failures when consuming embeddings.
  • Ensures consistent data structure for downstream features.

Metadata

Assignees

No one assigned

    Labels

    help wanted (Extra attention is needed)
    needs:engineering (This requires engineering to resolve.)

    Projects

    Status

    Incoming

    Relationships

    None yet

    Development

    No branches or pull requests
