Support setting-aware project identifier casing with case-insensitive uniqueness by akabiru · Pull Request #22406 · opf/openproject

akabiru · 2026-03-18T21:59:33Z

Ticket

https://community.openproject.org/wp/73205

What are you trying to accomplish?

Add case-insensitive uniqueness enforcement for Project.identifier, ensuring no two projects can have identifiers that differ only in case (e.g. "proj" and "PROJ").

Includes:

Implementation/73146 update admin flow for sematic work package identifiers against project historical identifiers model #22442

What approach did you choose and why?

The core of this change is a LOWER(identifier) functional index on the projects table, paired with a uniqueness: { case_sensitive: false } validation.

Merge checklist

Added/updated tests
Added/updated documentation in Lookbook (patterns, previews, etc)
Tested major browsers (Chrome, Firefox, Edge, ...)

Copilot

Pull request overview

Adds setting-aware normalization and case-insensitive uniqueness for Project.identifier, aligning storage casing with Setting::WorkPackageIdentifier and updating related autofix preview and MCP search behaviors.

Changes:

Normalize Project.identifier to uppercase (alphanumeric mode) or lowercase (numeric mode) and make uniqueness case-insensitive.
Add a LOWER(identifier) unique index (including a concurrent migration) and update initial table definition for fresh installs.
Extend identifier-autofix preview classification + translations, and update MCP tool schemas/specs for case-insensitive identifier searches.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
app/models/projects/identifier.rb	Implements casing normalization, case-insensitive uniqueness, FriendlyId input normalization, and updates historical identifier checks.
lib/open_project/acts_as_url/adapter/op_active_record.rb	Adds optional `post_process` support to apply controlled casing transforms to generated slugs.
db/migrate/20260319120000_add_case_insensitive_uniqueness_for_project_identifiers.rb	Adds concurrent migration switching to a `LOWER(identifier)` unique index.
db/migrate/tables/projects.rb	Updates initial schema definition to use the `LOWER(identifier)` unique index.
app/services/work_packages/identifier_autofix/preview_query.rb	Expands “problematic identifier” detection and error classification for semantic identifiers.
config/locales/en.yml	Adds new autofix preview error labels for additional classifications.
spec/services/work_packages/identifier_autofix/preview_query_spec.rb	Updates preview query specs to bypass normalization and cover new classifications.
spec/models/projects/identifier_spec.rb	Adds normalization and case-insensitive collision specs; switches to `rails_helper`.
spec/features/projects/create_spec.rb	Strengthens feature coverage for alphanumeric identifier auto-suggestion behavior.
app/services/mcp_tools/search_projects.rb	Updates tool schema description to case-insensitive identifier matching.
app/services/mcp_tools/search_programs.rb	Updates tool schema description to case-insensitive identifier matching.
app/services/mcp_tools/search_portfolios.rb	Updates tool schema description to case-insensitive identifier matching.
spec/requests/mcp/mcp_tools/search_projects_spec.rb	Updates request spec to expect case-insensitive identifier matching.
spec/requests/mcp/mcp_tools/search_programs_spec.rb	Updates request spec to expect case-insensitive identifier matching.
spec/requests/mcp/mcp_tools/search_portfolios_spec.rb	Updates request spec to expect case-insensitive identifier matching.
spec/models/project_spec.rb	Removes a stray blank line at EOF.

app/models/projects/identifier.rb

config/locales/en.yml

spec/requests/mcp/mcp_tools/search_projects_spec.rb

spec/requests/mcp/mcp_tools/search_programs_spec.rb

spec/requests/mcp/mcp_tools/search_portfolios_spec.rb

spec/services/work_packages/identifier_autofix/preview_query_spec.rb

Introduce case-insensitive handling for project identifiers: - Add `normalizes :identifier` to automatically upcase (alphanumeric mode) or downcase (numeric mode) identifiers on assignment - Add `parse_friendly_id` to normalize FriendlyId lookups for case-insensitive finder queries - Switch uniqueness validation to `case_sensitive: false` - Replace inline `exclusion:` validator with explicit `identifier_not_reserved` that checks case-insensitively - Consolidate alphanumeric format validators into a single `identifier_alphanumeric_format` method with early return to prevent cascading error messages - Use case-insensitive LOWER() comparison in historical identifier reservation check - Add `post_process` support to the OpActiveRecord acts_as_url adapter with an allowlist of safe transforms (upcase/downcase) - Add migration to replace the unique index on projects.identifier with a case-insensitive LOWER(identifier) index - Update table definition to match the new index Includes corresponding test additions for normalization, case- insensitive uniqueness, reserved identifier rejection, and the create_spec fixture fix for alphanumeric mode.

Expand PreviewQuery to detect additional alphanumeric identifier format violations: case-mismatch, underscores, leading digits, and purely numerical identifiers. Restructure error classification with FORMAT_RULES for clarity and add corresponding locale strings. Update MCP tool descriptions for search_projects, search_programs, and search_portfolios to reflect case-insensitive identifier matching, with updated test expectations.

Separate scope-building, identifier classification, and exclusion set logic into a reusable class that both PreviewQuery (admin UI) and the future ApplyHandlesJob (batch migration) can compose from. Key changes: - ProblematicIdentifiers owns FORMAT_RULES, problematic scope, error classification, and a dual-mode exclusion set (DB-backed ExclusionSet for preview, preloaded Set for batch) - PreviewQuery becomes a thin orchestrator delegating to ProblematicIdentifiers - Add deterministic ordering (.order(:id)) to preview results - Rename :not_uppercase → :not_fully_uppercased to match scope method naming, update corresponding en.yml translation key - ExclusionSet uses raw SQL to bypass Rails normalizes on :identifier which would transform query parameters based on current setting mode

Replace the TODO stub in ProblematicIdentifiers#reserved_identifiers with a real query against the friendly_id_slugs table. Historical slugs (identifiers a project used in the past but has since changed) are now excluded from suggestion generation and classified as :reserved in error_reason output. The query excludes slugs that match a current active project identifier (those are already covered by the in_use exclusion path).

Drop the DB-backed ExclusionSet in favour of a simple Set from pluck. This is a one-off admin migration — the brief memory cost of loading all non-problematic identifiers is not worth the added complexity of a custom duck-typed wrapper. Also adds performance notes documenting index considerations for the regex scope conditions and the eager-load trade-off.

app/models/projects/identifier.rb

The production spec explicitly allows underscores in alphanumeric identifiers. PreviewQuery's :special_characters rule and SQL scope incorrectly flagged them. Updated regex from [^a-zA-Z0-9] to [^a-zA-Z0-9_] in both the FORMAT_RULES lambda and the contains_non_alphanumeric scope. Also added a comment to set_raw_identifier explaining why Arel.sql is necessary (update_all applies normalizes in this Rails version).

…served LOWER(slug) = LOWER(?) is unnecessary because normalizes :identifier ensures consistent casing before FriendlyId records the slug. Plain equality uses the existing composite index on (slug, sluggable_type).

Corrected error_too_long message from "fewer than 5" to "10 or fewer" to match SEMANTIC_IDENTIFIER_MAX_LENGTH. Replaced be_present with specific assertions on item count and identifier in MCP search_{projects,programs,portfolios} specs to ensure the correct resource is returned, not just any resource.

Per code review: the model should not auto-transform identifier casing on every read/write. Format validators already enforce correct casing (lowercase for numeric, uppercase for alphanumeric), and the background job will handle migrating existing identifiers. Restores LOWER() in identifier_not_historically_reserved since without normalizes, slugs and identifiers may differ in case.

Without parse_friendly_id normalizing case, MCP search lookups are case-sensitive again. Reverts description changes and removes the case-variant identifier test contexts.

These were built on the normalizes premise which has been removed. In alphanumeric mode, SuggestionGenerator handles identifier generation rather than acts_as_url. The API PR (22417) will handle the proper wiring between modes.

The underscore fix (allowing _ in identifiers per spec) needs to target ProblematicIdentifiers instead of PreviewQuery after the extraction refactor on the target branch.

judithroth · 2026-03-23T15:01:30Z

db/migrate/20260319120000_add_case_insensitive_uniqueness_for_project_identifiers.rb

+    add_index :projects, "LOWER(identifier)",
+              unique: true,
+              name: "index_projects_on_lower_identifier",
+              algorithm: :concurrently,


I wonder about this. Do you think it's necessary to have concurrently built index here? I mean, the projects table should be rather small (<10.000), so index creation should be fast...

https://www.postgresql.org/docs/current/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY

Creating an index can interfere with regular operation of a database. Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table. Other transactions can still read the table, but if they try to insert, update, or delete rows in the table they will block until the index build is finished.

In a concurrent index build, the index is actually entered as an “invalid” index into the system catalogs in one transaction, then two table scans occur in two more transactions. [...]

If a problem arises while scanning the table, such as a deadlock or a uniqueness violation in a unique index, the CREATE INDEX command will fail but leave behind an “invalid” index. This index will be ignored for querying purposes because it might be incomplete; however it will still consume update overhead.

Not sure what the gain is but the pain could be potentially high (having an invalid index that does not work)

IIRC, there are instances with > 10k projects- I also see that we use concurrent indexes widely across historical migrations.

I asked Claude to do a risk analysis on this, and the takeaway is:

The risk of CONCURRENTLY leaving an invalid index on failure is real but manageable — you just drop the invalid index and rerun. The migration already uses if_not_exists / if_exists guards for re-runnability.

For small tables (<10k), the write-lock from a regular index build would be imperceptible, so CONCURRENTLY isn't strictly necessary there. But for larger instances (which do exist) it avoids any write-lock window, and it's consistent with the ~19 other migrations in the codebase that use this pattern.

That said -- you surfaced a much more important issue I'd missed: if any existing projects have case-colliding identifiers (e.g., MyProject and myproject), the unique index creation will fail regardless of CONCURRENTLY or not. I'll add a duplicate-cleanup step before the index creation, similar to what AddUniquenessForLdapAuthSourceNames does.

💡 TIL! Really nice catch.

judithroth · 2026-03-23T15:02:48Z

spec/models/projects/identifier_spec.rb

+
+      expect do
+        Project.where(id: create(:project).id).update_all(identifier: "EXISTING")
+      end.to raise_error(ActiveRecord::RecordNotUnique)


👍 thanks for adding this test!

judithroth · 2026-03-23T15:05:22Z

spec/services/work_packages/identifier_autofix/preview_query_spec.rb

  let(:display_count) { described_class::DISPLAY_COUNT }

+  # Store identifiers bypassing normalizes (which would downcase/upcase them)
+  def set_raw_identifier(project, identifier)


Do we still need that? At least the comment is outdated 😅

Good catch- updated. we still bypass validations via update_all in the units to reduce the need for explict work_packages_identifier: "numeric" | "alphanumeric" settings- when we need to simulate datasets in mixed modes.

judithroth · 2026-03-23T15:10:03Z

spec/models/projects/identifier_spec.rb

      expect(project).to be_valid
    end

+    it "rejects reserved identifiers case-insensitively" do


Also a nice catch! 🎇

thykel · 2026-03-23T14:59:46Z

app/models/projects/identifier.rb

                adapter: OpenProject::ActsAsUrl::Adapter::OpActiveRecord # use a custom adapter able to handle edge cases

-    ### Validators for the legacy underscored identifier format (e.g. "project_one")
    validates :identifier,


Nit: In order to make all this simpler, how about we also centralize all legacy slug validations into an identifier_slug_format?

Not sure I follow wym 👀

thykel · 2026-03-23T15:01:25Z

app/models/projects/identifier.rb

-              length: { maximum: SEMANTIC_IDENTIFIER_MAX_LENGTH },
-              if: ->(p) { p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.alphanumeric? }
+    # Validators for the alphanumeric identifier format (e.g. "PROJ1")
+    validate :identifier_alphanumeric_format,


Shall we already switch to semantic across the code? (We can still discuss the name further, but for now we should at least remain consistent.)

Of course this is probably better via a follow-up PR.

I know we haven't found naming scheme that would sit 100% well, but I would definitely at least get rid of "numeric/alphanumeric", since the "numeric" part does not really click within the context of this file.

I took the "numeric/alphanumeric" wording to be consistent with Setting::WorkPackageIdentifier.{numeric?,alphanumeric?}- which should make it easier to rename once we decide/adopt new names. I definitely prefer to keep the renaming isolated into a separate PR.

I agree. We should try to be consistent somehow. But renaming the setting probably requires a database migration 😅 so maybe it's easier if we settle on numeric / alphanumeric even if I don't like it as much 🥲

thykel · 2026-03-23T15:20:10Z

app/models/projects/identifier.rb


    already_existing = FriendlyId::Slug
-                         .where(slug: identifier, sluggable_type: self.class.to_s)
+                         .where("LOWER(slug) = LOWER(?)", identifier)


🤔 The reason for us leveraging that friendly_id history feature was that we will be able to resolve historical URLs.

If I am not mistaken, this lowercase conversion is addressing a case that shouldn't ever happen with links generated by OpenProject. Therefore, we're really only adding tolerance for typos in the URL. This sets an odd precedent, since the non-historical URLs do require proper casing

Unless there is a specific scenario we want to support, I suggest to drop ths.

This is only a validation. So this code is never involved when we are resolving values from URLs. This validation will only prevent clashes when an identifier is added / changed.

Consider the following scenario:
Project A with historic identifier "proj" (current identifier "sthelse")
Project B trying to set identifier "PROJ" should not work

just out of fear that it could be misleading for people if /projects/proj resolves to Project A but /projects/PROJ to Project B.

thykel · 2026-03-23T15:20:20Z

app/services/work_packages/identifier_autofix/preview_query.rb

-                    .select(:id, :name, :identifier)
-                    .limit(DISPLAY_COUNT)
-                    .to_a
+        analysis = ProblematicIdentifiers.new


Adds declarative specs for case-insensitive uniqueness, max length validation, and the LOWER(identifier) unique database index.

PostgreSQL enforces uniqueness constraints at index build time — when CREATE INDEX encounters duplicate key values, the entire operation fails. For concurrent index builds (algorithm: :concurrently), this is particularly problematic: a failed build leaves behind an "invalid" index entry in pg_class that is ignored for queries but still incurs write overhead on every INSERT/UPDATE, and must be explicitly dropped before retrying. Since this migration introduces a unique index on LOWER(identifier), any pre-existing case collisions (e.g. "MyProject" and "myproject") would cause the index build to fail. This adds a deduplication step that resolves collisions before the index is created: the oldest project (lowest id) keeps its identifier unchanged while newer duplicates receive a "-N" suffix via a window function partitioned over LOWER(identifier).

akabiru self-assigned this Mar 19, 2026

akabiru changed the title ~~[#73149] Support setting-aware identifier casing with case-insensitive uniqueness~~ [#73149] Support setting-aware project identifier casing with case-insensitive uniqueness Mar 19, 2026

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from dc0a497 to dbed01e Compare March 19, 2026 06:11

Base automatically changed from implementation/72917-save-the-previous-value-when-a-project-identifier-changes to dev March 19, 2026 08:56

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch 4 times, most recently from 855b75c to 36bacc6 Compare March 20, 2026 11:25

akabiru changed the base branch from dev to refactor/extract-projects-identifier-concern March 20, 2026 11:27

akabiru force-pushed the refactor/extract-projects-identifier-concern branch from 54a420a to e55e2c6 Compare March 20, 2026 12:48

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 36bacc6 to df753f1 Compare March 20, 2026 13:06

akabiru changed the base branch from refactor/extract-projects-identifier-concern to refactor/strip-ascii-control-chars-from-identifier March 20, 2026 13:06

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from df753f1 to 128ae0a Compare March 20, 2026 13:19

akabiru force-pushed the refactor/strip-ascii-control-chars-from-identifier branch from 7bb5773 to 1babb4b Compare March 20, 2026 13:54

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 128ae0a to 6541f9b Compare March 20, 2026 14:11

akabiru marked this pull request as ready for review March 20, 2026 14:13

akabiru requested a review from a team March 20, 2026 14:13

akabiru changed the title ~~[#73149] Support setting-aware project identifier casing with case-insensitive uniqueness~~ Support setting-aware project identifier casing with case-insensitive uniqueness Mar 20, 2026

opf deleted a comment from github-actions bot Mar 20, 2026

akabiru requested a review from Copilot March 20, 2026 14:27

Copilot started reviewing on behalf of akabiru March 20, 2026 14:28 View session

Copilot AI reviewed Mar 20, 2026

View reviewed changes

Base automatically changed from refactor/strip-ascii-control-chars-from-identifier to dev March 20, 2026 14:56

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 6541f9b to 6aedcab Compare March 20, 2026 15:01

opf deleted a comment from github-actions bot Mar 20, 2026

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch 2 times, most recently from a1d5166 to 2ca7187 Compare March 20, 2026 17:33

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 2ca7187 to 91362fb Compare March 20, 2026 18:35

akabiru added 3 commits March 20, 2026 22:55

akabiru mentioned this pull request Mar 23, 2026

[#73175] Generate default semantic identifier for Project #22417

Open

3 tasks

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 839cea0 to 79209e2 Compare March 23, 2026 06:22

judithroth reviewed Mar 23, 2026

View reviewed changes

app/models/projects/identifier.rb Outdated Show resolved Hide resolved

judithroth reviewed Mar 23, 2026

View reviewed changes

app/models/projects/identifier.rb Outdated Show resolved Hide resolved

app/models/projects/identifier.rb Show resolved Hide resolved

akabiru added 7 commits March 23, 2026 17:22

Revert MCP search tool case-insensitivity changes

f2f5caf

Without parse_friendly_id normalizing case, MCP search lookups are case-sensitive again. Reverts description changes and removes the case-variant identifier test contexts.

Apply underscore fix to ProblematicIdentifiers after rebase

5c7bed1

The underscore fix (allowing _ in identifiers per spec) needs to target ProblematicIdentifiers instead of PreviewQuery after the extraction refactor on the target branch.

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 0738f98 to 5c7bed1 Compare March 23, 2026 14:25

akabiru requested review from judithroth and thykel March 23, 2026 14:38

judithroth reviewed Mar 23, 2026

View reviewed changes

thykel reviewed Mar 23, 2026

View reviewed changes

Cleanup tests

2bf5740

akabiru force-pushed the open-point/73149-how-do-we-handle-project-identifiers-case-insensitive-storage branch from 7295e82 to 2bf5740 Compare March 23, 2026 15:21

akabiru added 2 commits March 23, 2026 18:53

Add shoulda-matcher assertions for identifier validations and indexes

d835427

Adds declarative specs for case-insensitive uniqueness, max length validation, and the LOWER(identifier) unique database index.

Conversation

akabiru commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Ticket

What are you trying to accomplish?

What approach did you choose and why?

Merge checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

akabiru Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thykel Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

judithroth Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants

akabiru commented Mar 18, 2026 •

edited

Loading

akabiru Mar 23, 2026 •

edited

Loading

thykel Mar 23, 2026 •

edited

Loading

judithroth Mar 23, 2026 •

edited

Loading