Add federated site search functionality #2534
…e_url is not available.
Thanks; I like the general approach. Please separate the unrelated dependency changes (poetry.lock / pyproject.toml) into another pull request. I think it'd probably be more manageable to define the models in the search app instead of the project app. It doesn't look like there's anything here that is specifically tied to the project app. We should think carefully about what stuff we are storing in the FederatedProject model. We don't want this to be too tightly coupled to the physionet-build data model.
What is your thinking behind retaining "stale" objects? If a foreign project no longer exists, why wouldn't we just want to delete it?
For sure, I'll revert these changes in the current branch.
The reasoning behind keeping it in the project folder is that we are storing projects. The search app implements search functionality, while the project app is the repository for all the project management code (local/federated).
Fully agreed; they are completely separate in terms of models. The only thing reused is the types, so that we can keep the same search UI and don't have to refactor for a string-based implementation. Since we are only federating across PhysioNet instances, this should not be an issue, and the types won't drift apart.
I kinda agree, and would like to add a UUID to the published project. The only reason I want to avoid the DOI is that not all projects might have one. Slug + version is critical because that is what the individual project pages are created from in the first place, so it is always going to be unique.
I was thinking of keeping the system version-aware, hence the stale objects (similar to how things are implemented for local projects). But I agree with your point. We might not want to keep stale objects, and should only return the latest versions in our search. I'll make the necessary implementation changes and update the PR.
As discussed, I'll update the implementation to put the models into the search app, add a UUID, and leave the refresh logic as-is.
As I think about this more, the issue is not so much about how the information is stored in the FederatedProject model; the issue is how the information is represented in the public JSON API. The API is not just for federating between sites running identical software - it's meant to be used by everybody. These fields aren't currently in the API at all, which means we would need to add them. If we add them to the API, they should be added in a way that is extensible and future-proof. I think perhaps it would be better to split this into two pull requests: one to add additional stuff to the API, and one to add the ability to sync/search federated sites.
Superseded by #2546
Enhance API responses by including resource type, access policy, and topics.

Implement federated search functionality with UUID support for project models.

## API Updates
- Expose public_project_uuid in PublishedProjectSerializer and PublishedProjectDetailSerializer
- Update API documentation in export/views.py to include UUID field

## Federation Models Migration
- Move FederatedSite, FederatedProject, and FederationSyncLog models from project app to search app
- Update FederatedProject model:
  * Add public_project_uuid field for stable identification
  * Change resource_type and access_policy to string fields (from integer codes)
  * Remove is_stale field (using full refresh strategy instead)
  * Update unique_together to use public_project_uuid instead of slug+version

## Search Functionality
- Create search/federation.py with federated search logic
- Update resource_type filtering to work with string values
- Remove is_stale filtering (full refresh approach)

## Management Commands
- Create sync_federated_sites command in search app
- Update sync logic to:
  * Use public_project_uuid for identification
  * Validate presence of UUID in API responses
  * Use full refresh (delete + recreate) instead of stale marking
  * Support string values for resource_type and access_policy

## Admin Interface
- Register federation models in search/admin.py with comprehensive admin classes
- Add list displays, filters, and fieldsets for all federation models

Based on PR #2534 feedback from @bemoody:
- UUID replaces slug+version as stable identifier
- String values for resource_type and access_policy for API extensibility
- Models moved to search app (proper separation of concerns)
- Removed stale objects logic in favor of batch delete+add
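For readers following the sync changes, here is a minimal sketch of what a full refresh keyed on public_project_uuid could look like. It is not the code from this PR: it uses a plain in-memory dict instead of the Django models, and the names `FederatedRecord`, `full_refresh`, and the `title` key are hypothetical. Only `public_project_uuid`, `resource_type`, and `access_policy` come from the commit message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedRecord:
    """Hypothetical stand-in for a FederatedProject row."""
    public_project_uuid: str
    title: str
    resource_type: str   # string value, not an integer code
    access_policy: str   # string value, not an integer code


def full_refresh(store, site, api_items):
    """Replace every record held for `site` with the items from one API fetch.

    `store` maps a site name to a dict of records keyed by UUID.
    Items without a UUID are skipped rather than stored.
    Returns a (kept, skipped) tuple of counts.
    """
    fresh = {}
    skipped = 0
    for item in api_items:
        uuid = item.get("public_project_uuid")
        if not uuid:
            # Validate presence of the UUID before accepting the item.
            skipped += 1
            continue
        fresh[uuid] = FederatedRecord(
            public_project_uuid=uuid,
            title=item.get("title", ""),
            resource_type=str(item.get("resource_type", "")),
            access_policy=str(item.get("access_policy", "")),
        )
    # Delete + recreate: the old records for this site are dropped wholesale,
    # so a project that disappeared upstream leaves nothing stale behind.
    store[site] = fresh
    return len(fresh), skipped
```

Because the whole per-site set is swapped in one assignment, no is_stale flag is needed. In a real Django command the equivalent delete and bulk create would presumably be wrapped in a transaction, which this sketch does not model.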
Implement a fully working federated site search feature, allowing users to include results from registered federated sites alongside local projects. This includes necessary backend changes, new forms, and UI components for managing federated sites.