-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Add case-insensitive uniqueness enforcement for project identifier #22406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Changes from all commits
e369227
91362fb
69818ad
30f9c7c
6c84213
ac50f72
87750c5
14a84b2
bfeee22
f2f5caf
2318522
5c7bed1
2bf5740
d835427
674b5cd
db932c0
2803d2d
f0ff803
4812974
1648e53
d801728
243c168
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -61,28 +61,22 @@ module Projects::Identifier | |
| if: -> { Setting::WorkPackageIdentifier.alphanumeric? && identifier.blank? } | ||
|
|
||
| ### ID validators | ||
| # Validators for the legacy underscored identifier format (e.g. "project_one") | ||
| # Shared validators for all identifier formats | ||
| validates :identifier, | ||
| presence: true, | ||
| uniqueness: { case_sensitive: true }, | ||
| uniqueness: { case_sensitive: false }, | ||
| length: { maximum: IDENTIFIER_MAX_LENGTH }, | ||
| exclusion: RESERVED_IDENTIFIERS, | ||
| if: ->(p) { p.persisted? || p.identifier.present? } | ||
| # Contains only a-z, 0-9, dashes and underscores but cannot consist of numbers only as it would clash with the id. | ||
| validates :identifier, | ||
| format: { with: /\A(?!^\d+\z)[a-z0-9\-_]+\z/ }, | ||
| if: ->(p) { | ||
| p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.numeric? | ||
| } | ||
|
|
||
| # Validators for the semantic identifier format | ||
| validates :identifier, | ||
| format: { with: /\A[A-Z]/, message: :must_start_with_letter }, | ||
| if: ->(p) { p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.alphanumeric? } | ||
| validates :identifier, | ||
| format: { with: /\A[A-Z][A-Z0-9_]*\z/, message: :no_special_characters }, | ||
| length: { maximum: SEMANTIC_IDENTIFIER_MAX_LENGTH }, | ||
| if: ->(p) { p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.alphanumeric? } | ||
| # Validators for the numeric (legacy) identifier format (e.g. "my-project", "project_one") | ||
| validate :identifier_numeric_format, | ||
| if: ->(p) { p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.numeric? } | ||
|
|
||
| # Validators for the semantic (alphanumeric) identifier format (e.g. "PROJ1") | ||
| validate :identifier_alphanumeric_format, | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shall we already switch to Of course this is probably better via a follow-up PR.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I know we haven't found naming scheme that would sit 100% well, but I would definitely at least get rid of "numeric/alphanumeric", since the "numeric" part does not really click within the context of this file.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I took the "numeric/alphanumeric" wording to be consistent with
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree. We should try to be consistent somehow. But renaming the setting probably requires a database migration 😅 so maybe it's easier if we settle on numeric / alphanumeric even if I don't like it as much 🥲 |
||
| if: ->(p) { p.identifier_changed? && p.identifier.present? && Setting::WorkPackageIdentifier.alphanumeric? } | ||
|
|
||
| validate :identifier_not_reserved, if: -> { identifier.present? } | ||
|
|
||
| # Complements the uniqueness validation above: once an identifier has been used by a | ||
| # project, it remains reserved for that project even after the project moves to a new | ||
|
|
@@ -130,14 +124,42 @@ def validation_context | |
|
|
||
| private | ||
|
|
||
| # Contains only a-z, 0-9, dashes and underscores but cannot consist of numbers only | ||
| # as that would clash with the numeric id. | ||
| def identifier_numeric_format | ||
| unless identifier.match?(/\A(?!^\d+\z)[a-z0-9\-_]+\z/) | ||
| errors.add(:identifier, :invalid) | ||
| end | ||
| end | ||
|
|
||
| def identifier_alphanumeric_format | ||
| unless identifier.match?(/\A[A-Z]/) | ||
| errors.add(:identifier, :must_start_with_letter) | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| return | ||
| end | ||
|
|
||
| errors.add(:identifier, :no_special_characters) unless identifier.match?(/\A[A-Z][A-Z0-9_]*\z/) | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| if identifier.length > SEMANTIC_IDENTIFIER_MAX_LENGTH | ||
| errors.add(:identifier, :too_long, count: SEMANTIC_IDENTIFIER_MAX_LENGTH) | ||
| end | ||
| end | ||
|
|
||
| def identifier_not_reserved | ||
| if RESERVED_IDENTIFIERS.include?(identifier&.downcase) | ||
| errors.add(:identifier, :exclusion) | ||
| end | ||
| end | ||
|
|
||
| # Checks friendly_id_slugs for any project that previously used this identifier and | ||
| # has since changed it. It allows to switch back to an identifier the project itself | ||
| # has used before. | ||
| # has since changed it. It allows a project to switch back to an identifier it has | ||
| # used before. Uses LOWER() because slugs may be stored in a different case than the | ||
| # incoming identifier (e.g. old lowercase slug vs new uppercase alphanumeric identifier). | ||
| def identifier_not_historically_reserved | ||
| return if errors.any? { |error| error.attribute == :identifier && error.type == :taken } | ||
|
|
||
| already_existing = FriendlyId::Slug | ||
| .where(slug: identifier, sluggable_type: self.class.to_s) | ||
| .where("LOWER(slug) = LOWER(?)", identifier) | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| .where(sluggable_type: self.class.to_s) | ||
| .where.not(sluggable_id: id) | ||
| .exists? | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,118 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| #-- copyright | ||
| # OpenProject is an open source project management software. | ||
| # Copyright (C) the OpenProject GmbH | ||
| # | ||
| # This program is free software; you can redistribute it and/or | ||
| # modify it under the terms of the GNU General Public License version 3. | ||
| # | ||
| # OpenProject is a fork of ChiliProject, which is a fork of Redmine. The copyright follows: | ||
| # Copyright (C) 2006-2013 Jean-Philippe Lang | ||
| # Copyright (C) 2010-2013 the ChiliProject Team | ||
| # | ||
| # This program is free software; you can redistribute it and/or | ||
| # modify it under the terms of the GNU General Public License | ||
| # as published by the Free Software Foundation; either version 2 | ||
| # of the License, or (at your option) any later version. | ||
| # | ||
| # This program is distributed in the hope that it will be useful, | ||
| # but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
| # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
| # GNU General Public License for more details. | ||
| # | ||
| # You should have received a copy of the GNU General Public License | ||
| # along with this program; if not, write to the Free Software | ||
| # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. | ||
| # | ||
| # See COPYRIGHT and LICENSE files for more details. | ||
| #++ | ||
|
|
||
| module WorkPackages | ||
| module IdentifierAutofix | ||
| # Identifies projects whose identifiers violate the semantic identifier format | ||
| # and provides classification and exclusion sets for suggestion generation. | ||
| # | ||
| # For main use by admin UI preview and batch migration job. | ||
| # | ||
| # == Performance notes | ||
| # | ||
| # * +#exclusion_set+ loads all non-problematic identifiers and historical slugs | ||
| # into memory. Fine for a one-off admin migration; if this ever becomes a hot | ||
| # path, consider a DB-backed exclusion check instead. | ||
| # | ||
| # * The regex scope conditions (+identifier ~ ?+) and +UPPER(identifier)+ won't | ||
| # hit a regular index. If queries get slow on large tables, a functional index | ||
| # on +UPPER(identifier)+ or a +pg_trgm+ GIN index would help. | ||
| # | ||
| # | ||
| class ProblematicIdentifiers | ||
| # Priority-ordered format rules for identifier classification. | ||
| FORMAT_RULES = [ | ||
| [:too_long, ->(id, max) { id.length > max }], | ||
| [:numerical, ->(id, _) { id.match?(/\A\d+\z/) }], | ||
| [:starts_with_number, ->(id, _) { id.match?(/\A\d/) }], | ||
| [:special_characters, ->(id, _) { id.match?(/[^a-zA-Z0-9_]/) }], | ||
| [:not_fully_uppercased, ->(id, _) { id != id.upcase }] | ||
| ].freeze | ||
|
|
||
| def scope | ||
| @scope ||= exceeds_max_length | ||
| .or(contains_non_alphanumeric) | ||
| .or(starts_with_digit) | ||
| .or(not_fully_uppercased) | ||
| end | ||
|
|
||
| delegate :count, to: :scope | ||
|
|
||
| # Returns a symbol classifying why the identifier is problematic. | ||
| # Must handle all identifiers matched by #scope. | ||
| def error_reason(identifier) | ||
| format_error_reason(identifier) || collision_error_reason(identifier) || :unknown | ||
| end | ||
|
|
||
| # Returns a Set of identifiers that must not be suggested for new assignments. | ||
| # Combines currently active identifiers from non-problematic projects with | ||
| # historically reserved identifiers from FriendlyId slug history. | ||
| def exclusion_set | ||
| reserved_identifiers | in_use_identifiers | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def exceeds_max_length = Project.where("length(identifier) > ?", max_identifier_length) | ||
| def contains_non_alphanumeric = Project.where("identifier ~ ?", "[^a-zA-Z0-9_]") | ||
| def starts_with_digit = Project.where("identifier ~ ?", "^[0-9]") | ||
| def not_fully_uppercased = Project.where("identifier != UPPER(identifier)") | ||
|
|
||
| def max_identifier_length = ProjectIdentifierSuggestionGenerator::IDENTIFIER_LENGTH[:max] | ||
|
|
||
| def format_error_reason(identifier) | ||
| FORMAT_RULES.each do |reason, check| | ||
| return reason if check.call(identifier, max_identifier_length) | ||
| end | ||
| nil | ||
| end | ||
|
|
||
| def collision_error_reason(identifier) | ||
| if in_use_identifiers.include?(identifier) | ||
| :in_use | ||
| elsif reserved_identifiers.include?(identifier) | ||
| :reserved | ||
| end | ||
| end | ||
|
|
||
| def in_use_identifiers | ||
| @in_use_identifiers ||= Project.where.not(id: scope.select(:id)).pluck(:identifier).to_set | ||
| end | ||
|
|
||
| def reserved_identifiers | ||
| @reserved_identifiers ||= FriendlyId::Slug | ||
| .where(sluggable_type: Project.name) | ||
| .where("LOWER(slug) NOT IN (SELECT LOWER(identifier) FROM projects)") | ||
| .pluck(:slug) | ||
| .to_set | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| end | ||
| end | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| #-- copyright | ||
| # OpenProject is an open source project management software. | ||
| # Copyright (C) the OpenProject GmbH | ||
| # | ||
| # This program is free software; you can redistribute it and/or | ||
| # modify it under the terms of the GNU General Public License version 3. | ||
| # | ||
| # OpenProject is a fork of ChiliProject, which is a fork of Redmine. The copyright follows: | ||
| # Copyright (C) 2006-2013 Jean-Philippe Lang | ||
| # Copyright (C) 2010-2013 the ChiliProject Team | ||
| # | ||
| # This program is free software; you can redistribute it and/or | ||
| # modify it under the terms of the GNU General Public License | ||
| # as published by the Free Software Foundation; either version 2 | ||
| # of the License, or (at your option) any later version. | ||
| # | ||
| # This program is distributed in the hope that it will be useful, | ||
| # but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
| # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
| # GNU General Public License for more details. | ||
| # | ||
| # You should have received a copy of the GNU General Public License | ||
| # along with this program; if not, write to the Free Software | ||
| # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. | ||
| # | ||
| # See COPYRIGHT and LICENSE files for more details. | ||
| #++ | ||
|
|
||
| class AddCaseInsensitiveUniquenessForProjectIdentifiers < ActiveRecord::Migration[8.0] | ||
| disable_ddl_transaction! | ||
|
|
||
| def up | ||
| deduplicate_case_colliding_identifiers | ||
| remove_index :projects, :identifier, unique: true, algorithm: :concurrently, if_exists: true | ||
| add_index :projects, "LOWER(identifier)", | ||
| unique: true, | ||
| name: "index_projects_on_lower_identifier", | ||
| algorithm: :concurrently, | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| if_not_exists: true | ||
| end | ||
|
|
||
| # Note: does not undo identifier renames from deduplication. Suffixed identifiers | ||
| # (e.g. "FOO_2") remain valid and unique under the restored case-sensitive index. | ||
| def down | ||
| remove_index :projects, name: "index_projects_on_lower_identifier", algorithm: :concurrently, if_exists: true | ||
| add_index :projects, :identifier, unique: true, algorithm: :concurrently | ||
| end | ||
|
|
||
| private | ||
|
|
||
| # Resolves any existing case-colliding identifiers (e.g. "Foo" and "foo") so that | ||
| # the unique LOWER(identifier) index can be created without violation errors. | ||
| # The oldest project (by id) keeps its identifier; duplicates get a "_N" suffix. | ||
| # | ||
| # The NOT EXISTS guard skips rows where the suffixed identifier would itself collide. | ||
| # In practice this is extremely unlikely (requires both case-colliding identifiers | ||
| # AND a pre-existing "_N" variant). If it occurs, the subsequent index creation | ||
| # will fail, surfacing the issue for manual resolution. | ||
| def deduplicate_case_colliding_identifiers | ||
| execute <<~SQL.squish | ||
| UPDATE projects SET identifier = projects.identifier || '_' || counter.rn | ||
| FROM ( | ||
| SELECT id, row_number() OVER (PARTITION BY LOWER(identifier) ORDER BY id) AS rn | ||
| FROM projects | ||
| ) AS counter | ||
| WHERE projects.id = counter.id AND counter.rn > 1 | ||
| AND NOT EXISTS ( | ||
| SELECT 1 FROM projects p2 | ||
| WHERE LOWER(p2.identifier) = LOWER(projects.identifier || '_' || counter.rn) | ||
akabiru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ); | ||
| SQL | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -606,5 +606,4 @@ | |
| end | ||
| end | ||
| end | ||
|
|
||
| end | ||
Uh oh!
There was an error while loading. Please reload this page.