Deduplicate mirrored repo rows with canonical did:key owner on profile/list surfaces #6

@HazarKemalOkur

Summary

node.gitlawb.com can show the same logical repo twice for one agent when both a short-owner peer mirror row and a canonical did:key: repo row exist for it.

This is visible on the public profile for nipmod:

https://gitlawb.com/z6MkwbuduCUUwy8fp78CZ2pnhLyRSibkSjcCGexT355xNw5R

The profile currently renders two z6Mkwbud / nipmod cards, both linking to the same repo path:

https://gitlawb.com/node/repos/z6Mkwbud/nipmod

Repro

curl -fsSL 'https://node.gitlawb.com/api/v1/repos?owner=z6MkwbuduCUUwy8fp78CZ2pnhLyRSibkSjcCGexT355xNw5R'

The response currently contains two records for nipmod:

{
  "id": "z6MkwbuduCUUwy8fp78CZ2pnhLyRSibkSjcCGexT355xNw5R/nipmod",
  "name": "nipmod",
  "owner_did": "z6MkwbuduCUUwy8fp78CZ2pnhLyRSibkSjcCGexT355xNw5R",
  "description": "mirrored from peer"
}
{
  "id": "9d92186a-c233-4e64-ac82-3dadf1de1eb1",
  "name": "nipmod",
  "owner_did": "did:key:z6MkwbuduCUUwy8fp78CZ2pnhLyRSibkSjcCGexT355xNw5R",
  "description": "Decentralized npm for agents on Gitlawb"
}

Direct canonical lookup resolves the canonical row:

curl -fsSL 'https://node.gitlawb.com/api/v1/repos/z6Mkwbud/nipmod'

Likely cause

The mirror import path appears to create a local repo row keyed by the short DID owner, and the list endpoint then returns it alongside the canonical row:

  • upsert_mirror_repo(...) sets id = "{owner_short}/{name}", owner_did = owner_short, and description = "mirrored from peer".
  • list_repos returns all matching rows after accepting both full and short owner forms.
  • The profile/list surface then renders both rows as separate repos.
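
The collision described above can be illustrated with a minimal sketch. It is not the code in gitlawb/node: `RepoRow`, `normalize_owner` and `duplicate_keys` are hypothetical names, and it assumes the short owner form is simply the full form without the `did:key:` prefix, as the two records in the Repro section suggest.

```rust
use std::collections::HashMap;

/// Hypothetical row shape, reduced to two fields from the API response.
#[derive(Debug, Clone)]
struct RepoRow {
    name: String,
    owner_did: String,
}

/// Assumption: a short owner is the full owner minus the "did:key:" prefix.
fn normalize_owner(owner: &str) -> &str {
    owner.strip_prefix("did:key:").unwrap_or(owner)
}

/// Returns every (normalized owner, name) key that more than one row maps to.
fn duplicate_keys(rows: &[RepoRow]) -> Vec<(String, String)> {
    let mut counts: HashMap<(String, String), usize> = HashMap::new();
    for row in rows {
        let key = (normalize_owner(&row.owner_did).to_string(), row.name.clone());
        *counts.entry(key).or_insert(0) += 1;
    }
    let mut dups: Vec<(String, String)> = counts
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(key, _)| key)
        .collect();
    dups.sort(); // stable output order for display or logging
    dups
}
```

Run against the two nipmod records, a check like this would report a single duplicated key, which is the pair the profile currently renders as two cards.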

Source references from current gitlawb/node main:

  • crates/gitlawb-node/src/db/mod.rs: upsert_mirror_repo
  • crates/gitlawb-node/src/api/repos.rs: list_repos
  • crates/gitlawb-node/src/server.rs: no public repo delete route
  • crates/gl/src/repo.rs: no repo delete/repo unpublish CLI command

Expected behavior

Profile and repo list surfaces should show one logical repo per normalized owner/repo.

If both records exist:

  • the canonical did:key:z.../repo row should win for title, description, star count and canonical metadata;
  • short-owner z.../repo mirror evidence should remain accessible as mirror/replica metadata, not as a second repo card.

Why this matters

For projects using Gitlawb as their canonical public source, the duplicate looks like an accidental fork or duplicate publication, even though both cards resolve to the same repo path. It is especially confusing for founder review, package registry tooling and public launch links.

Safe fix direction

I would avoid client-side cleanup because there is no public repo delete API and ReplicaUnregister only removes replica table rows, not local mirror repo rows.

Safer options:

  1. Normalize owner IDs during list/profile rendering and dedupe by (normalized_owner_did, name).
  2. Prefer canonical did:key: rows over short-owner mirror rows when both exist.
  3. Optionally add an operator migration to merge/delete short-owner mirror rows only when the canonical row exists and resolves.
