Add federated search and enhance API responses #2546

Merged
Rutvikrj26 merged 24 commits into MIT-LCP:dev from T-CAIREM:Rutvikrj26/federated-site-search-implementation-v2
Feb 25, 2026

Contributor

Rutvikrj26 commented Dec 19, 2025

Enhance API responses by including resource type, access policy, and topics. Implement federated search functionality with UUID support for project models.

API Updates

  • Expose public_project_uuid in PublishedProjectSerializer and PublishedProjectDetailSerializer
  • Update API documentation in export/views.py to include UUID field

Federation Models Migration

  • Move FederatedSite, FederatedProject, and FederationSyncLog models from project app to search app
  • Update FederatedProject model:
    • Add public_project_uuid field for stable identification
    • Change resource_type and access_policy to string fields (from integer codes)
    • Remove is_stale field (using full refresh strategy instead)
    • Update unique_together to use public_project_uuid instead of slug+version

Search Functionality

  • Create search/federation.py with federated search logic
  • Update resource_type filtering to work with string values
  • Remove is_stale filtering (full refresh approach)

Management Commands

  • Create sync_federated_sites command in search app
  • Update sync logic to:
    • Use public_project_uuid for identification
    • Validate presence of UUID in API responses
    • Use full refresh (delete + recreate) instead of stale marking
    • Support string values for resource_type and access_policy
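
The sync steps listed above can be sketched in plain Python. This is a minimal illustration, not the code of the sync_federated_sites command: the in-memory `store` dict stands in for the FederatedProject table, and the function names, the sample records, and the string values "Database" and "Open" are assumptions made for the example.

```python
import uuid


def validate_records(records):
    """Split API records into those with a parseable UUID and those without."""
    valid, skipped = [], []
    for record in records:
        raw = record.get("public_project_uuid")
        try:
            record_uuid = uuid.UUID(str(raw))
        except ValueError:
            # Missing or malformed UUID: the record cannot be identified reliably.
            skipped.append(record)
            continue
        valid.append({**record, "public_project_uuid": str(record_uuid)})
    return valid, skipped


def full_refresh(store, site_identifier, records):
    """Replace everything stored for one site with the freshly fetched set."""
    valid, skipped = validate_records(records)
    # Delete step: drop all projects previously synced from this site.
    store[site_identifier] = {}
    # Recreate step: key each project by UUID rather than slug + version.
    for record in valid:
        store[site_identifier][record["public_project_uuid"]] = record
    return len(valid), len(skipped)


# Hypothetical data: one stale entry already stored, two records fetched.
store = {"site-a": {"old-uuid": {"title": "stale entry"}}}
records = [
    {
        "public_project_uuid": "12345678-1234-5678-1234-567812345678",
        "title": "ECG set",
        "resource_type": "Database",  # string value, not an integer code
        "access_policy": "Open",
    },
    {"title": "record without a UUID"},
]
created, skipped = full_refresh(store, "site-a", records)
```

Because the stored set is rebuilt on every run, projects that disappeared upstream vanish locally without any is_stale bookkeeping. A database-backed version would presumably wrap the delete and recreate steps in one transaction so a failed sync does not leave the site empty.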

Admin Interface

  • Register federation models in search/admin.py with comprehensive admin classes
  • Add list displays, filters, and fieldsets for all federation models

Based on PR #2534 feedback from @bemoody:

  • UUID replaces slug+version as stable identifier
  • String values for resource_type and access_policy for API extensibility
  • Models moved to search app (proper separation of concerns)
  • Removed stale objects logic in favor of batch delete+add

Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from 44cbc67 to d7c4158 on January 6, 2026 16:55
Rutvikrj26 marked this pull request as ready for review January 6, 2026 17:04
Copilot AI review requested due to automatic review settings January 6, 2026 17:04
Rutvikrj26 marked this pull request as draft January 6, 2026 17:06
Copilot AI left a comment

Pull request overview

This PR implements federated search functionality to enable cross-site project discovery across multiple PhysioNet instances, while also enhancing the public API with additional metadata fields.

Key Changes:

  • Add persistent UUID identifiers to published projects for stable cross-site references
  • Implement federation models (FederatedSite, FederatedProject, FederationSyncLog) in the search app
  • Enhance API responses to include resource_type, access_policy, and topics as human-readable strings
  • Create management command for synchronizing project metadata from federated sites

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| physionet-django/project/modelcomponents/publishedproject.py | Adds public_project_uuid field to PublishedProject model for persistent identification |
| physionet-django/project/migrations/0087_publishedproject_public_project_uuid.py | Creates nullable UUID field on PublishedProject |
| physionet-django/project/migrations/0088_backfill_public_project_uuid.py | Backfills UUIDs for existing published projects |
| physionet-django/project/migrations/0089_alter_publishedproject_public_project_uuid.py | Enforces unique constraint and default value on UUID field |
| physionet-django/project/migrations/0090_merge_20260106_1200.py | Merge migration resolving parallel development branches |
| physionet-django/search/modelcomponents/federation.py | Defines federation models for sites, projects, and sync logs |
| physionet-django/search/modelcomponents/__init__.py | Package initialization for federation model components |
| physionet-django/search/models.py | Imports and exposes federation models |
| physionet-django/search/migrations/0001_initial.py | Creates database tables for federation models |
| physionet-django/search/management/commands/sync_federated_sites.py | Management command to synchronize metadata from federated sites |
| physionet-django/search/management/commands/__init__.py | Package initialization for management commands |
| physionet-django/search/management/__init__.py | Package initialization for management module |
| physionet-django/search/federation.py | Implements federated search query logic and scoring |
| physionet-django/search/admin.py | Registers federation models with Django admin interface |
| physionet-django/export/serializers.py | Adds ProjectFieldsMixin and exposes new fields in API responses |
| physionet-django/export/views.py | Updates API documentation and adds context to detail view |
| physionet-django/export/tests/test_views.py | Adds comprehensive tests for new API field serialization |

Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from c2655ff to d545350 on January 20, 2026 15:56
Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from d545350 to 5617ea0 on January 20, 2026 16:16
Rutvikrj26 marked this pull request as ready for review January 20, 2026 16:26
Rutvikrj26 requested a review from tompollard January 21, 2026 18:06
Contributor Author

Rutvikrj26 commented Jan 21, 2026

@tompollard, @bemoody Please take a look. I think the reverse migration is failing because of the retroactive UUID addition for published projects. Apart from that, things seem to be good to go.

Collaborator

bemoody commented Jan 27, 2026

The 0090 migration needs to have MIGRATE_AFTER_INSTALL = True. That's how the two-step upgrade works: the first step (pre-install) adds a new column that is nullable and non-unique, then the second step (post-install) changes it to be unique and non-nullable.
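
The two-step pattern described here can be illustrated outside Django with the standard-library sqlite3 module. This is only a sketch of the idea: the table name, the sample slugs, and the index name are made up for the example, and the backfill is assumed to run before the constraint is tightened. How the project's MIGRATE_AFTER_INSTALL flag schedules the second step is not shown in this thread.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE published_project (id INTEGER PRIMARY KEY, slug TEXT)")
conn.executemany(
    "INSERT INTO published_project (slug) VALUES (?)",
    [("demo-a",), ("demo-b",)],
)

# Step 1 (pre-install): add the column as nullable and non-unique, so code
# that does not know about the column can keep inserting rows.
conn.execute("ALTER TABLE published_project ADD COLUMN public_project_uuid TEXT")

# Backfill: give every existing row its own UUID.
rows = conn.execute(
    "SELECT id FROM published_project WHERE public_project_uuid IS NULL"
).fetchall()
for (row_id,) in rows:
    conn.execute(
        "UPDATE published_project SET public_project_uuid = ? WHERE id = ?",
        (str(uuid.uuid4()), row_id),
    )

# Step 2 (post-install): tighten the column once no NULLs remain.
# SQLite cannot add NOT NULL to an existing column, so this sketch checks
# for NULLs explicitly and enforces uniqueness with an index.
null_count = conn.execute(
    "SELECT COUNT(*) FROM published_project WHERE public_project_uuid IS NULL"
).fetchone()[0]
assert null_count == 0
conn.execute(
    "CREATE UNIQUE INDEX published_project_uuid_unique "
    "ON published_project (public_project_uuid)"
)
```

Applying the unique, non-null constraint in the same step as the column addition would fail on tables that already contain rows, which is why the constraint is deferred.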

Member

tompollard left a comment

Very cool @Rutvikrj26. I successfully implemented federated search locally.

Image

I added some comments inline.

<small>Source: {{ published_project.source_site.site_name }}</small>
</p>
<div style="margin-bottom: 1rem;">
{{ published_project.abstract|safe|truncatechars_html:250 }}
Member

Should we remove "safe" here? I don't think we want unescaped HTML from external sites?

Contributor Author

I don't think it is a good idea.

  • This might break the visual hierarchy, since local projects would still use the safe filter but federated projects would not.
  • The federated projects come from trusted PhysioNet instances. The idea is for those projects to blend in seamlessly with the local projects, minimizing visible differences and making the search truly federated.

Contributor Author

Sanitize the incoming API response HTML content with bleach.

Collaborator

The abstract is defined in FederatedProject as a TextField. One approach to sanitizing would be:

(a) change TextField to SafeHTMLField, and

(b) invoke full_clean before creating FederatedProject.

FederatedProject.objects.create does not invoke full_clean; it bypasses all model/field validation.
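
To make the sanitizing step concrete, here is a minimal allowlist sanitizer built on the standard-library html.parser module. It is a stand-in for illustration only: it is not how bleach or SafeHTMLField work internally, and the allowlist and the names `AllowlistSanitizer` and `sanitize_html` are assumptions. Production code should rely on a maintained sanitizer rather than a hand-rolled one.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; the real field may permit a different set of tags.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li"}
DROPPED_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Keep allowlisted tags (without attributes) and escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_WITH_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            # Attributes such as onclick are discarded entirely.
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROPPED_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(escape(data))


def sanitize_html(raw):
    """Return raw with disallowed tags, attributes, and scripts removed."""
    parser = AllowlistSanitizer()
    parser.feed(raw)
    parser.close()
    return "".join(parser.parts)


cleaned = sanitize_html('<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>')
```

Sanitizing at write time, before the abstract is stored, means the template can keep rendering it the same way for local and federated projects.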

Contributor Author

Good catch, thanks!
Will open a quick PR for this.

bemoody previously requested changes Feb 5, 2026
Collaborator

bemoody left a comment

Haven't fully reviewed but here are some style things:

Put import statements at the top level; don't put them inside functions or classes. (console/forms.py, console/views.py)

Please don't do unrelated reformatting (project/modelcomponents/metadata.py, project/test_views.py, project/views.py, maybe other places?). Make those changes, if needed, in a separate pull request.

Don't use integer keys in URLs (console/urls.py). FederatedSite has a "site_identifier"; use that in the URL instead.

Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from a6917ae to 890ccbc on February 12, 2026 21:26
Collaborator

bemoody commented Feb 18, 2026

Since pull #2574 was merged, there will be a migration conflict. You should be able to fix this by editing the dependencies in 0088_publishedproject_public_project_uuid.py:

--- a/physionet-django/project/migrations/0088_publishedproject_public_project_uuid.py
+++ b/physionet-django/project/migrations/0088_publishedproject_public_project_uuid.py
@@ -5,7 +5,7 @@
 
 class Migration(migrations.Migration):
     dependencies = [
-        ("project", "0087_anonymousaccess_hide_authors"),
+        ("project", "0089_alter_activeproject_is_on_hold"),
     ]
 
     operations = [

@tompollard
Member

Hey @Rutvikrj26 please could you update the migrations when you get a chance?

Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from 8dcaf33 to 11142a7 on February 24, 2026 18:56
@Rutvikrj26
Contributor Author

@tompollard I've recreated migrations.

@tompollard
Member

@Rutvikrj26 Can you check the migrations? Looks like the upgrade test is failing.

@Rutvikrj26
Contributor Author

@tompollard just fixed it.

Member

tompollard left a comment

Thanks @Rutvikrj26!

Member

tompollard left a comment

Sorry, I notice a couple of issues.

@Rutvikrj26
Contributor Author

@tompollard Thanks for catching these.

Rutvikrj26 force-pushed the Rutvikrj26/federated-site-search-implementation-v2 branch from 4283e37 to daa8c4d on February 24, 2026 22:21
Rutvikrj26 enabled auto-merge February 25, 2026 04:41
Rutvikrj26 dismissed bemoody’s stale review February 25, 2026 04:42

Approved - merging.

Rutvikrj26 added this pull request to the merge queue Feb 25, 2026
Merged via the queue into MIT-LCP:dev with commit a031e5c Feb 25, 2026
7 checks passed
Collaborator

bemoody commented Feb 25, 2026

My comment here (and Tom's) was not addressed. #2546 (comment)

@tompollard
Member

Good catch @bemoody, sorry for missing this. @Rutvikrj26 please could you raise an issue and add a fix in a new PR?

github-merge-queue bot pushed a commit that referenced this pull request Mar 19, 2026
… database (#2599)

Implement SafeHTMLField for the project abstract to ensure HTML content
is sanitized when creating federated projects. This change enhances
security by preventing potential XSS vulnerabilities.

Resolving the comments left out in PR #2546

4 participants