Skip to content

Conversation

@JAORMX
Copy link
Contributor

@JAORMX JAORMX commented Nov 5, 2025

Summary

Implements the RegisterEntity gRPC endpoint to provide a unified, synchronous API for registering any entity type (repositories, releases, artifacts, pull requests) in Minder.

This PR extracts common entity creation logic into a reusable EntityCreator service that eliminates code duplication between synchronous and asynchronous entity registration flows.

Key Changes

Core Implementation

  • RegisterEntity RPC Handler - New generic endpoint at POST /api/v1/entity
  • EntityCreator Service - Unified entity creation service (internal/entities/service/entity_creator.go)
    • Handles property fetching, validation, provider registration, database persistence
    • Used by both sync (RegisterEntity) and async (webhook) flows
    • Implements cleanup on failure (webhook deregistration)
  • Pluggable Validator Framework - RepositoryValidator with extensible design
  • Proto Update - Changed identifier_property from string to google.protobuf.Struct for type safety

Refactoring

  • RepositoryService.CreateRepository() - Reduced from ~90 lines to ~30 lines
  • addOriginatingEntityStrategy.GetEntity() - Reduced from ~80 lines to ~30 lines
  • Eliminated ~170 lines of duplicated entity creation logic

Security Improvements

  • Input validation: max 100 properties, max 200 char keys
  • Context cancellation protection in cleanup operations
  • Improved error wrapping with context for debugging

Test Coverage

Added 27 new tests across 5 test files:

  • entity_creator_simple_test.go - Provider validation tests (4 tests)
  • repository_validator_test.go - Validator logic tests (6 tests)
  • handlers_entity_instances_test.go - RegisterEntity handler tests (12 tests)
  • service_integration_test.go - RepositoryService integration tests (5 tests)

All tests passing ✅

Benefits

  1. Unified API - Single endpoint for all entity types instead of entity-specific RPCs
  2. Code Simplification - Reduced duplication between entity-specific services
  3. Extensibility - Easy to add new entity types without new RPCs
  4. Consistency - Standardized entity creation patterns across the codebase
  5. Testability - Clearer separation of concerns enables better testing

Backward Compatibility

Fully backward compatible

  • Existing RegisterRepository RPC continues to work unchanged
  • All existing tests for other functionality pass
  • New functionality is additive, not breaking

Code Review Notes

Both automated code quality and security reviews were conducted:

  • ✅ Clean architecture with proper separation of concerns
  • ✅ No critical security vulnerabilities
  • ✅ Proper transaction management with cleanup
  • ✅ Authorization configured via proto options
  • ⚠️ Legacy RepositoryService tests will need updating (they test old implementation details)

🤖 Generated with Claude Code

@JAORMX JAORMX requested a review from a team as a code owner November 5, 2025 10:09
@JAORMX JAORMX force-pushed the feature/implement-register-entity branch 4 times, most recently from 2ed5dc3 to 5760cfa Compare November 5, 2025 12:39
@evankanderson evankanderson self-assigned this Nov 5, 2025
Copy link
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delay, it took a little while to go over the related code and understand what was going on here.

Despite the number of comments, I'm pretty bullish on this change -- thanks for doing it!

projectDeleter projects.ProjectDeleter,
projectCreator projects.ProjectCreator,
entityService entitySvc.EntityService,
entityCreator entitySvc.EntityCreator,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is entityCreator separate from entityService?

In particular, it seems like we might want sub-interfaces like EntityCreator and EntityReader for mocks / other parts, but it feels like there should be a rolled-up interface that can do all the CRUD operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They serve different purposes - EntityCreator orchestrates the full creation lifecycle (fetch → validate → register webhook → persist → events), while EntityService handles reads (list, get, delete). Creation is complex enough that separating them felt cleaner, but we could create a rolled-up interface if you prefer.

@JAORMX
Copy link
Contributor Author

JAORMX commented Dec 1, 2025

OK, I somehow missed that you had reviewed this! I'll get back to this. Sorry about that

@github-actions
Copy link
Contributor

github-actions bot commented Jan 1, 2026

This PR needs additional information before we can continue. It is now marked as stale because it has been open for 30 days with no activity. Please provide the necessary details to continue or it will be closed in 30 days.

@github-actions github-actions bot added the Stale label Jan 1, 2026
@JAORMX JAORMX force-pushed the feature/implement-register-entity branch from 5760cfa to b52ed29 Compare January 7, 2026 09:58
JAORMX added a commit to JAORMX/minder that referenced this pull request Jan 7, 2026
- Use proto.Size() check for identifying_properties validation
  (more robust than counting properties, catches large values)
- Remove CustomValidators from EntityCreationOptions
  (no planned use case, reduces complexity)
- Use ReplaceAllProperties instead of SaveAllProperties
  (ensures clean slate for entity creation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions github-actions bot removed the Stale label Jan 8, 2026
Copy link
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for getting back to this. I'll have a little research on whether properties needs to be a Struct and should have an answer tomorrow. I'll also see about guarding the RegisterEntity on existing providers so we can call it unconditionally, which seems like a bit of a nice simplicity win.

Comment on lines 113 to 107
// 2. Check if provider supports this entity type
if !prov.SupportsEntity(entityType) {
return nil, fmt.Errorf("provider %s does not support entity type %s",
provider.Name, entityType)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, if we move the validation into the provider, SupportsEntity could actually do the validation if we pass extra parameters to it. Right now, we're trying to cram both the GitHub and GitLab registration into a single validator, but I'm wondering if it makes more sense to have a common library that both call (so, for example, if there's another variation where registration doesn't make sense in forgejo, like "mirrored", that could go in the forgejo code rather than needing to extend the core for that case).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good evolution path. For now I've implemented the registry pattern you suggested, but having providers contribute their own validators during setup would let forgejo handle 'mirrored' repos without touching core. Could be a natural extension once we have more providers with unique validation needs.

JAORMX and others added 4 commits January 19, 2026 16:34
Implements the RegisterEntity gRPC endpoint to provide a unified,
synchronous API for registering any entity type (repositories, releases,
artifacts, pull requests) in Minder.

This change extracts common entity creation logic into a reusable
EntityCreator service that is used by both synchronous (RegisterEntity)
and asynchronous (webhook-based) entity registration flows.

Key changes:
- Add RegisterEntity RPC handler with generic entity creation
- Create EntityCreator service to unify entity creation logic
- Implement pluggable validator framework (RepositoryValidator)
- Refactor RepositoryService to use EntityCreator (reduced from ~90 to ~30 lines)
- Refactor async entity handler to use EntityCreator
- Update proto to use google.protobuf.Struct for type-safe properties
- Add comprehensive test coverage (27 new tests)

Security improvements:
- Input validation for property count (max 100) and key length (max 200)
- Context cancellation protection in cleanup operations
- Improved error wrapping for better debugging

The implementation maintains backward compatibility with existing
RegisterRepository RPC while providing a foundation for registering
other entity types through a single unified API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Use errors.As pattern for UserVisibleError passthrough in handlers
- Convert validator errors to util.UserVisibleError for proper gRPC responses
- Reduce cyclomatic complexity in CreateEntity by extracting helpers:
  - runValidators for validator loop
  - cleanupProviderRegistration for cleanup logic
- Remove unused functions:
  - upsertLegacyEntity in add_originating_entity.go
  - pushReconcilerEvent and persistRepository in service.go
  - Unused test helpers in handler_test.go and service_test.go
- Fix unused parameters in test files
- Update tests to mock EntityCreator instead of internal services
- Update expected error messages to match new UserVisibleError format

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use proto.Size() check for identifying_properties validation
  (more robust than counting properties, catches large values)
- Remove CustomValidators from EntityCreationOptions
  (no planned use case, reduces complexity)
- Use ReplaceAllProperties instead of SaveAllProperties
  (ensures clean slate for entity creation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change identifying_properties from Struct to map<string, Value>
  for better TypeScript codegen (per Evan's feedback)
- Implement ValidatorRegistry with AddValidator/RemoveValidator
  pattern for entity-type-specific validation
- Update RepositoryValidator to new interface (no entity type param)
- Add clarifying comments for transaction boundaries and child
  entity reconciliation in add_originating_entity.go
- Update all tests and mocks for new patterns

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@JAORMX JAORMX force-pushed the feature/implement-register-entity branch from 1e93e07 to 2ba6e8f Compare January 19, 2026 15:24
Copy link
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments; I'm happy to put together a PR for:

  • update SupportsEntity / add an EntityCreationOptions function to the providers to vary by type.
  • Ensure the same no-op DeregisterEntity behavior for providers that implement it if the resource isn't ever registered.
  • Add a DefaultProviderImpl base for existing providers.

If that would help with simplifying the logic here.

}

// Validate total size to prevent resource exhaustion
// We sum the proto.Size of each value since map itself isn't a proto message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but there are a couple ways to simplify at the cost of precision:

  1. You could check the size of the RegisterEntityRequest. Since proto doesn't provide a way to e.g. reference one field's content in another (like yaml &reference), this is an upper bound on the map's size.
  2. Additionally, gRPC has a default 4MB MaxRecvMsgSize option, which applies before the deserialization and will prevent truly enormous messages from exhausting the server memory. I agree that 32KB of properties is probably more than sufficient, but the gRPC limit is a nice backup.


// 5. Create entity using EntityCreator service
ewp, err := s.entityCreator.CreateEntity(ctx, provider, projectID,
in.GetEntityType(), identifyingProps, nil) // Use default options
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this path with nil options needs to work, can we remove the need for options from CreateEntity altogether?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Options are needed for originated entities (artifacts, pull requests) that need to pass the parent entity ID via OriginatingEntityID. The nil check provides sensible defaults for the common case (direct RegisterEntity calls), but internal code can override when needed (e.g., AddOriginatingEntityStrategy passes the parent repo ID).

Comment on lines +227 to +235
// If the error is already a UserVisibleError, pass it through directly.
// This allows providers and EntityCreator to add user-visible errors
// without needing to update this allow-list.
var userErr *util.NiceStatus
if errors.As(err, &userErr) {
return nil, err
}
return nil, util.UserVisibleError(codes.Internal,
"unable to register entity: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I found this pattern elsewhere, and I plan to fix it so that we can write this as:

return nil, fmt.Errorf("unable to register entity: %w", err)

And then the interceptor wrapper will use errors.As to extract a NiceStatus if it's in the wrapping chain.

There's nothing to do here at the moment, though.

cleanupCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

cleanupErr := prov.DeregisterEntity(cleanupCtx, entityType, registeredProps)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just give Deregister the same behavior requirement as Register? (I'll check that and PR that tomorrow-ish if needed.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be cleaner! If you're adding that requirement to Deregister, we can remove the nil check here. Deregister returning (nil, nil) for unsupported entities would be consistent with Register.

Comment on lines +198 to +201
// For now, only repositories have reconciliation events
if ewp.Entity.Type != pb.Entity_ENTITY_REPOSITORIES {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the check on line 181, do we need this one as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Alternatively, this could be a switch for now, but I'd love to figure out how to make this provider responsibility in the future...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both checks are needed - line 185 checks if the caller wants reconciliation (opts.PublishReconciliationEvent), line 203 enforces that only repos can actually trigger it. Without line 203, a caller could mistakenly set PublishReconciliationEvent: true for artifacts and we'd try to create a repo reconciler message with an artifact ID. Agree this should eventually be provider responsibility - providers could have a PublishReconciliationEvent(entity) method.

Comment on lines 104 to 111
entries := r.validators[handle.entityType]
for i, entry := range entries {
if entry.id == handle.id {
// Remove by creating a new slice without this element
r.validators[handle.entityType] = append(entries[:i], entries[i+1:]...)
return
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simplify this (in combination with the simplication of ValidatorHandle) by using slices.DeleteFunc:

Suggested change
entries := r.validators[handle.entityType]
for i, entry := range entries {
if entry.id == handle.id {
// Remove by creating a new slice without this element
r.validators[handle.entityType] = append(entries[:i], entries[i+1:]...)
return
}
}
entries := r.validators[handle.entityType]
for k, typeValidators := range r.validators {
r.validators[k] = slices.DeleteFunc(typeValidators, func(v validatorEntry) bool {
return v.id == handle.id
})
}

JAORMX and others added 7 commits January 21, 2026 13:22
Remove the entityType field from ValidatorHandle to reduce error surface
area. The handle now only contains the validator ID, and RemoveValidator
searches all entity types to find the validator. This simplifies the API
and prevents potential misuse where the entityType could point to the
wrong validator list.

Addresses feedback from Evan Anderson in PR mindersec#5959.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The RegisterEntity RPC now includes the entity properties in the
response, avoiding the need for a separate GetEntityById call to
fetch them. Since CreateEntity already fetches and returns properties,
this improves API ergonomics without additional cost.

Addresses feedback from Evan Anderson in PR mindersec#5959.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add CreationOptions() method to Provider interface that returns
default options for creating entities of each type. This moves the
decision about registration and reconciliation from core code to
providers, where it belongs.

- Providers now specify whether entities need webhook registration
- Providers control whether reconciliation events are published
- EntityCreator uses provider defaults and allows caller overrides
- Repositories: register webhooks and trigger reconciliation
- Other entities (PRs, artifacts, releases): no-op registration

This addresses Evan Anderson's feedback about letting providers
control their own behavior rather than hardcoding entity-type-specific
logic in core code.

Addresses feedback from Evan Anderson in PR mindersec#5959.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add CreationOptions to TestKit provider
- Remove unused receivers in dockerhub and ghcr providers
- Remove empty else branch in entity_creator

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Update generated code after adding CreationOptions() method to
Provider interface.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Make a copy of the validator entries slice while holding the read lock
to prevent data races when iterating after releasing the lock. The slice
header is just a pointer reference, so concurrent modifications by
AddValidator/RemoveValidator could cause races without this copy.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
@coveralls
Copy link

coveralls commented Jan 21, 2026

Coverage Status

coverage: 58.008% (-0.3%) from 58.311%
when pulling f279976 on JAORMX:feature/implement-register-entity
into 7c76bb2 on mindersec:main.

The test goroutine always removes the validator handle it adds,
not just sometimes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants