CC-2454: Migrate from django-tagging to django-taggit#240
CC-2454: Migrate from django-tagging to django-taggit#240joshmcarthur wants to merge 12 commits intomasterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Migrates Signbank’s tagging implementation from the vendored django-tagging package to django-taggit, introducing a signbank.tagging adapter layer and updating app code, migrations, and tests to use the new tagging backend while keeping template tag compatibility.
Changes:
- Added
signbank.taggingadapter, models exports, and django-tagging-compatible template tags. - Added data migrations to copy legacy tag data and AllowedTags M2M relationships into taggit tables.
- Replaced legacy tagging imports/usages across runtime code and updated affected tests; switched dependencies to
django-taggit.
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| signbank/tagging/templatetags/tagging_tags.py | Adds django-tagging-compatible template tags backed by the adapter. |
| signbank/tagging/templatetags/init.py | Enables template tag module discovery. |
| signbank/tagging/models.py | Provides legacy models for migration compatibility and exports taggit models for runtime. |
| signbank/tagging/migrations/init.py | Adds migrations package for the tagging migration module override. |
| signbank/tagging/migrations/0001_initial.py | Introduces legacy tagging schema migration (compatibility chain). |
| signbank/tagging/migrations/0002_on_delete.py | Aligns legacy migration history with on_delete changes. |
| signbank/tagging/migrations/0003_adapt_max_tag_length.py | Aligns legacy migration history with MAX_TAG_LENGTH behavior. |
| signbank/tagging/apps.py | Defines app config (label tagging). |
| signbank/tagging/adapter.py | Implements stable tagging API (add/remove/filter/usage) backed by taggit. |
| signbank/tagging/init.py | Documents tagging module purpose. |
| signbank/settings/base.py | Adds taggit + signbank.tagging, removes legacy app, overrides migrations via MIGRATION_MODULES. |
| signbank/dictionary/views.py | Switches gloss creation tagging to adapter-backed tagging. |
| signbank/dictionary/update.py | Replaces legacy TaggedItem/ContentType tagging logic with adapter calls. |
| signbank/dictionary/tests/test_csv_import.py | Updates tagging-related test setup/assertions to use adapter/helpers. |
| signbank/dictionary/tests/test_adminviews.py | Updates tagging-related test setup/assertions to use adapter. |
| signbank/dictionary/models.py | Removes django-tagging registry usage; updates relation tag retrieval to adapter. |
| signbank/dictionary/migrations/0051_migrate_tags_to_taggit.py | Copies legacy tags + tagged items into taggit tables (idempotent). |
| signbank/dictionary/migrations/0052_migrate_allowedtags_to_taggit.py | Migrates AllowedTags M2M from legacy Tag to taggit.Tag while preserving data. |
| signbank/dictionary/forms.py | Updates Tag import to the new tagging module export. |
| signbank/dictionary/csv_import.py | Updates tagging imports to new exports and adapter helpers. |
| signbank/dictionary/adminviews.py | Replaces legacy tag intersection queries with adapter-based filtering. |
| signbank/dictionary/admin.py | Replaces legacy usage_for_model with adapter tags_usage_for_model. |
| signbank/comments.py | Updates comment tagging flows and tag filtering to use adapter functions. |
| pyproject.toml | Replaces vendored django-tagging dependency with django-taggit. |
| poetry.lock | Locks django-taggit and updates dependency metadata. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| tag__in=tags | ||
| ).values('object_id').annotate( | ||
| tag_count=Count('tag', distinct=True) | ||
| ).filter(tag_count=len(tags)) | ||
|
|
There was a problem hiding this comment.
filter_queryset_with_all_tags can return incorrect results when some requested tag names don't exist: it filters TaggedItems by the tags that do exist, then compares tag_count to len(tags) (the existing tags), so requesting ['a','missing'] would match objects tagged only with 'a'. Ensure all requested normalized tag names exist first (e.g., compare tags.count() to len(tag_names)) and use the requested tag count for the intersection filter.
| # Get object IDs from the queryset | ||
| object_ids = list(queryset.values_list('pk', flat=True)) | ||
| if not object_ids: | ||
| return [] |
There was a problem hiding this comment.
tags_usage_for_model materializes all PKs from a filtered queryset into Python (list(queryset.values_list(...))), which can be very expensive for large querysets and can blow memory. Prefer keeping this as a subquery (use queryset.values('pk')) and filtering tags_qs with that, avoiding full evaluation.
| # Get object IDs from the queryset | |
| object_ids = list(queryset.values_list('pk', flat=True)) | |
| if not object_ids: | |
| return [] | |
| # Use a subquery for object IDs to avoid materializing them in Python | |
| object_ids = queryset.values('pk') |
| migrations.CreateModel( | ||
| name='Tag', | ||
| fields=[ | ||
| ('id', models.AutoField( | ||
| verbose_name='ID', serialize=False, | ||
| auto_created=True, primary_key=True)), |
There was a problem hiding this comment.
PR description says the signbank.tagging.migrations are "no-op" placeholders, but 0001_initial actually creates legacy tagging_tag/tagging_taggeditem tables on fresh installs. If the intent is truly no-op, these operations should be removed/replaced with empty migrations; otherwise please update the PR description to reflect that fresh installs will create (unused) legacy tables.
| # Check if legacy tagging tables exist | ||
| db_table_names = connection.introspection.table_names() | ||
| legacy_tag_table = 'tagging_tag' | ||
| legacy_taggeditem_table = 'tagging_taggeditem' |
There was a problem hiding this comment.
Migration uses the global django.db.connection for introspection; in migrations it's safer to use schema_editor.connection so the correct DB alias/connection is used (and the migration behaves correctly in multi-DB setups).
| from django.conf import settings | ||
| from django.db import connection | ||
|
|
||
| AllowedTags = apps.get_model('dictionary', 'AllowedTags') | ||
| # Use historical taggit.Tag model to get tag IDs | ||
| TaggitTagHistorical = apps.get_model('taggit', 'Tag') | ||
| mappings = _migration_data.get('mappings', {}) | ||
| force_lowercase = getattr(settings, 'FORCE_LOWERCASE_TAGS', False) | ||
|
|
||
| # Django creates M2M through tables with the pattern: {model_table}_{field_name} | ||
| # Get the model's table name dynamically | ||
| model_table = AllowedTags._meta.db_table | ||
| # Construct the through table name | ||
| through_table = f'{model_table}_allowed_tags' | ||
| # Field names follow: {model_name}_id and {related_model_name}_id | ||
| allowedtags_field = 'allowedtags_id' | ||
| tag_field = 'tag_id' |
There was a problem hiding this comment.
populate_allowedtags_data opens a cursor from the global django.db.connection and builds SQL with unquoted identifiers (table/column names). Use schema_editor.connection and schema_editor.quote_name(...) (or the DB ops quoting helpers) for identifiers, and consider inserting in bulk (e.g., executemany / ON CONFLICT DO NOTHING) to avoid per-row SELECT+INSERT loops.
| bits = token.contents.split() | ||
| if len(bits) != 5: | ||
| raise template.TemplateSyntaxError( | ||
| _('%s tag requires exactly four arguments') % bits[0]) | ||
| if bits[2] != 'in': | ||
| raise template.TemplateSyntaxError( | ||
| _("second argument to %s tag must be 'in'") % bits[0]) | ||
| if bits[4] != 'as': | ||
| raise template.TemplateSyntaxError( | ||
| _("fourth argument to %s tag must be 'as'") % bits[0]) | ||
| return TaggedObjectsNode(bits[1], bits[3], bits[5]) |
There was a problem hiding this comment.
tagged_objects template tag parsing is broken: the token for {% tagged_objects [tag] in [model] as [varname] %} splits into 6 bits, but the code checks len(bits) != 5 and then accesses bits[5], which will raise IndexError for valid usage. Update the argument count check and indices so as is validated at bits[4] and the context variable is taken from bits[5].
| from notifications.signals import notify | ||
|
|
||
| from .dictionary.models import AllowedTags | ||
| from signbank.tagging.adapter import add_tag, filter_queryset_with_all_tags, remove_tag |
There was a problem hiding this comment.
Import of 'remove_tag' is not used.
| from signbank.tagging.adapter import add_tag, filter_queryset_with_all_tags, remove_tag | |
| from signbank.tagging.adapter import add_tag, filter_queryset_with_all_tags |
| from django_comments.models import Comment | ||
| from guardian.shortcuts import get_objects_for_user, get_perms | ||
| from tagging.models import Tag, TaggedItem | ||
| from signbank.tagging.adapter import add_tag, filter_queryset_with_all_tags |
There was a problem hiding this comment.
Import of 'add_tag' is not used.
Import of 'filter_queryset_with_all_tags' is not used.
| from signbank.tagging.adapter import add_tag, filter_queryset_with_all_tags |
|
|
||
| from tagging.models import Tag | ||
| from signbank.tagging.adapter import add_tag | ||
| from signbank.tagging.models import Tag |
There was a problem hiding this comment.
Import of 'Tag' is not used.
| from signbank.tagging.models import Tag |
| from django.core.exceptions import ObjectDoesNotExist | ||
| from django.db.utils import OperationalError, ProgrammingError | ||
| from django.dispatch import receiver | ||
| from django.forms import ModelForm |
There was a problem hiding this comment.
Some of these imports I don't see actually used below, but I presume github knows what it's doing since it hasn't complained.
jonholdsworth
left a comment
There was a problem hiding this comment.
It's too much code for me to notice if there are any minor gotchas.
The overall principle looks sound.
I understand the keeping legacy Tag data in the classes while copying over.
I imagine this may take a few iterations.
Great work, let's see what it does!
This PR migrates NZSL Signbank from the vendored, unmaintained django-tagging package to django-taggit, a maintained alternative. The migration:
Architecture Changes
django-taggit. Application code now uses the adapter instead of directly importingtagging.modelsortaggit.models, allowing future backend swaps without changing business logic.signbank.tagging.migrationsthat match the original django-tagging migration names (0001_initial, 0002_on_delete, 0003_adapt_max_tag_length). These are configured viaMIGRATION_MODULESto satisfy existing migration dependencies without requiring the legacy dependency code.Database Migrations
Two migrations handle the data migration:
Migration 0051: migrate_tags_to_taggit
Migration 0052: migrate_allowedtags_to_taggit
Code Changes
Runtime code:
Updated all imports from tagging.models to signbank.tagging.adapter across:
Associated tests
Template tags:
Settings:
Backward Compatibility
Testing Considerations
Future Work
Note: The vendor/django-tagging/ directory still exists in the repository but is no longer referenced in pyproject.toml. It can be removed in a follow-up cleanup commit if desired. We should do this, but I advise not removing it in this PR in case we need to flip back.