feat: Create a composite unique key to remove duplicate data #4203
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| # Generated by Django 5.2.7 on 2025-10-16 03:21 | ||
| | ||
| from django.db import migrations | ||
| from django.db.models.functions import RowNumber | ||
| | ||
| | ||
| def remove_duplicates(apps, schema_editor): | ||
|     from django.db.models import Window, F | ||
|     workspace_user_resource_permission_model = apps.get_model('system_manage', 'WorkspaceUserResourcePermission') | ||
| | ||
|     duplicates = workspace_user_resource_permission_model.objects.annotate( | ||
|         row_num=Window( | ||
|             expression=RowNumber(), | ||
|             partition_by=[F('workspace_id'), F('user'), F('auth_target_type'), F('target')], | ||
|             order_by=[F('create_time').desc()], | ||
|         ) | ||
|     ).filter(row_num__gt=1) | ||
| | ||
|     ids_to_delete = list(duplicates.values_list('id', flat=True)) | ||
|     if ids_to_delete: | ||
|         workspace_user_resource_permission_model.objects.filter(id__in=ids_to_delete).delete() | ||
| | ||
| class Migration(migrations.Migration): | ||
| | ||
|     dependencies = [ | ||
|         ('system_manage', '0003_alter_workspaceuserresourcepermission_target'), | ||
|         ('users', '0001_initial'), | ||
|     ] | ||
| | ||
|     operations = [ | ||
|         migrations.RunPython(remove_duplicates, | ||
|         ), | ||
|         migrations.AlterUniqueTogether( | ||
|             name='workspaceuserresourcepermission', | ||
|             unique_together={('workspace_id', 'user', 'auth_target_type', 'target')}, | ||
|         ), | ||
|     ] | ||
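
To make the selection rule in `remove_duplicates` concrete, here is a minimal plain-Python sketch of what the window query appears to compute: rows are grouped by the four key fields, ordered newest first within each group, and every row after the first is marked for deletion. The `Row` tuple, the `ids_to_delete` helper, and the sample data are hypothetical stand-ins for illustration, not the project's model or API.

```python
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    # Hypothetical stand-in for a WorkspaceUserResourcePermission record.
    id: int
    workspace_id: str
    user: str
    auth_target_type: str
    target: str
    create_time: int  # any orderable timestamp works for this sketch


def ids_to_delete(rows):
    """Return ids of all rows except the newest one in each key group."""
    groups = defaultdict(list)
    for row in rows:
        key = (row.workspace_id, row.user, row.auth_target_type, row.target)
        groups[key].append(row)

    doomed = []
    for members in groups.values():
        # Newest first, mirroring order_by=[F('create_time').desc()].
        members.sort(key=lambda r: r.create_time, reverse=True)
        # Position 0 corresponds to row_num == 1 (kept); the rest have row_num > 1.
        doomed.extend(r.id for r in members[1:])
    return sorted(doomed)


rows = [
    Row(1, "default", "u1", "KNOWLEDGE", "t1", 100),
    Row(2, "default", "u1", "KNOWLEDGE", "t1", 200),  # newest duplicate, kept
    Row(3, "default", "u1", "KNOWLEDGE", "t1", 150),
    Row(4, "default", "u2", "KNOWLEDGE", "t1", 100),  # unique key, kept
]
print(ids_to_delete(rows))  # → [1, 3]
```

One caveat this sketch makes visible: if two duplicates share the same `create_time`, ordering by that field alone does not determine which one survives. Adding a tiebreaker such as the primary key to the ordering would make the choice deterministic.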

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -57,3 +57,4 @@ class WorkspaceUserResourcePermission(models.Model): | ||
| | ||
|     class Meta: | ||
|         db_table = "workspace_user_resource_permission" | ||
|         unique_together = ('workspace_id', 'user', 'auth_target_type', 'target') | ||
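
The reason the migration deletes duplicates before adding the constraint can be sketched with the standard-library `sqlite3` module. This is only an illustration under stated assumptions: the table name `perm`, its simplified columns, and the sample rows are hypothetical, and SQLite stands in for whatever database the project actually runs on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE perm ("
    " id INTEGER PRIMARY KEY, workspace_id TEXT, user_id TEXT,"
    " auth_target_type TEXT, target TEXT, create_time INTEGER)"
)
conn.executemany(
    "INSERT INTO perm VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "default", "u1", "KNOWLEDGE", "t1", 100),
        (2, "default", "u1", "KNOWLEDGE", "t1", 200),  # same key as row 1
        (3, "default", "u2", "KNOWLEDGE", "t1", 100),
    ],
)

UNIQUE_SQL = (
    "CREATE UNIQUE INDEX perm_uniq"
    " ON perm (workspace_id, user_id, auth_target_type, target)"
)

# With duplicates still present, the unique index cannot be built.
try:
    conn.execute(UNIQUE_SQL)
    failed_before_cleanup = False
except sqlite3.DatabaseError:
    failed_before_cleanup = True

# Remove all but the newest row per key (window functions need SQLite >= 3.25).
conn.execute(
    "DELETE FROM perm WHERE id IN ("
    " SELECT id FROM ("
    "  SELECT id, ROW_NUMBER() OVER ("
    "   PARTITION BY workspace_id, user_id, auth_target_type, target"
    "   ORDER BY create_time DESC) AS row_num"
    "  FROM perm)"
    " WHERE row_num > 1)"
)
conn.execute(UNIQUE_SQL)  # succeeds once the duplicates are gone

remaining = [r[0] for r in conn.execute("SELECT id FROM perm ORDER BY id")]

# A later insert that repeats an existing key is rejected.
try:
    conn.execute(
        "INSERT INTO perm VALUES (4, 'default', 'u1', 'KNOWLEDGE', 't1', 300)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(failed_before_cleanup, remaining, rejected)
```

The same ordering shows up in the migration's `operations` list, where `RunPython(remove_duplicates)` precedes `AlterUniqueTogether`.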
The provided code snippet is mostly correct for a Django model that manages permissions on workspace user resources. However, there are a few suggestions for improvement:
Here's an improved version with the suggested changes:

    from django.db import models


    class WorkspaceUserResourcePermission(models.Model):
        # Attributes
        workspace = models.ForeignKey('Workspaces', on_delete=models.CASCADE)
        user = models.ForeignKey('WorkspaceUsers', on_delete=models.CASCADE)
        auth_target_type = models.CharField(max_length=100)  # Assuming this should be specific like 'permission_group'
        target = models.CharField(max_length=255)

        # Metadata
        class Meta:
            db_table = "workspace_user_resource_permission"
            unique_together = ('workspace_id', 'user', 'auth_target_type', 'target')
            # Optional ordering
            # ordering = ['-created_at']  # For example, sort by creation time

        def __str__(self):
            return f"{self.user} {self.auth_target_type}.{self.target}"

Explanation:

If you plan to include additional fields or methods in your

The provided code checks for duplicate entries in the `WorkspaceUserResourcePermission` model based on four fields: `workspace_id`, `user`, `auth_target_type`, and `target`. It then deletes all but one occurrence of each duplicate. Here are some points to consider:

1. **Duplicate Check**: The query uses the `RowNumber()` function from `django.db.models.functions` inside a `Window` expression. This is a correct way to identify duplicates.
2. **Field Partitioning**: The partitioning criteria are `F('workspace_id')`, `F('user')`, `F('auth_target_type')`, and `F('target')`, which ensures that duplicates are identified within each group sharing those four values.
3. **Order By and Filter**: Rows are ordered by `create_time.desc()` within each partition, so the latest entry gets `row_num` 1 and remains, while older duplicates (`row_num > 1`) are selected for deletion.
4. **Run Python Operation**: The `migrations.RunPython` operation correctly calls the `remove_duplicates` function during migration.
5. **Alter Unique Together Constraint**: After the duplicates are removed, the unique-together constraint is added on the same four fields.
6. **Optimization**: The current implementation leaves little to optimize directly. The unique constraint on these four columns is typically backed by an index in the database, so a separate index is only worth considering for other query patterns if the table grows significantly.

Overall, the code appears well-structured and should work as intended for deduplicating entries according to the specified rules.