Commit 3c32d48

Setup Azure storage for ActiveStorage (#8531)
## Trello card URL

- https://trello.com/c/UtzBnbCV/2097-move-from-aws-s3-to-azure-storage

## Changes in this PR:

This PR sets up and backfills the synchronisation between AWS S3 and Azure Storage for all the service attachments.

### Sets up the connection to Azure storage services:

- Include the necessary gem (azure-blob).
- Set env variables from the Terraform module.
- Use the env variables to configure storage in Rails Active Storage.

### Sets up mirroring between S3 and Azure storage

- Define and point to [mirror services](https://guides.rubyonrails.org/active_storage_overview.html#mirror-service) so any new attachment is pushed to S3 and also to Azure, keeping all new uploads in sync.

### Add backfill task for existing attachments

- A rake task re-points all the existing attachments to the mirror services and then triggers a mirror request for each of them. This backfills Azure with all the existing attachments.

## What will need to come as a follow-up

After we are satisfied with the backfill, we will need to push a further PR:

- Removing the S3 and Mirror services and pointing to the Azure services instead.
- Adding a rake task to repoint all the existing blobs to the Azure services.

## Screenshots of UI changes:

After setting up mirrors and backfilling with the rake task, the blobs in both storages match:

### For documents

<img width="1559" height="733" alt="image" src="https://github.com/user-attachments/assets/2556e4ec-3cfd-47ef-9ef7-3581c70edeaa" />

### For images and logos

<img width="1397" height="1093" alt="image" src="https://github.com/user-attachments/assets/04ad0637-91a5-4569-b671-503fe6320fe3" />

## Checklists:

### Data & Schema Changes

If this PR modifies data structures or validations, check the following:

- [ ] Adds/removes model validations
- [ ] Adds/removes database fields
- [ ] Modifies Vacancy enumerables (phases, working patterns, job roles, key stages, etc.)

<details>
<summary>If any of the above options has changed then the author must check/resolve all of the following...</summary>

### Integration Impact

Does this change affect any of these integrations?

- [ ] DfE Analytics platform
- [ ] Legacy imports mappings
- [ ] DWP Find a Job export mappings
- [ ] Publisher ATS API (may require mapping updates or API versioning)

### User Experience & Data Integrity

Could this change impact:

- [ ] Existing subscription alerts (will legacy subscription search filters break?)
- [ ] Legacy vacancy copying (will copied vacancies fail new validations?)
- [ ] In-progress drafts for Vacancies or Job Applications

</details>
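The backfill rake task described above is not part of the hunks shown on this page, so the following is only a minimal sketch of the "re-point, then request a mirror" step, not the task from this PR. `SERVICE_REMAP`, `backfill_mirrors` and `FakeBlob` are hypothetical names introduced for illustration; the service names come from `config/storage/production.yml`. The `enqueue` callable is assumed to be something like `ActiveStorage::MirrorJob.method(:perform_later)` in the real application, matching the `perform(key, checksum:)` signature in this commit.

```ruby
# Hypothetical mapping from the old S3 service names to the mirror services
# defined in config/storage/production.yml.
SERVICE_REMAP = {
  "amazon_s3_documents" => "mirror_documents",
  "amazon_s3_images_and_logos" => "mirror_images_and_logos",
}.freeze

# Re-points each blob to its mirror service, then requests a mirror copy.
# `blobs` is any enumerable whose items respond to #service_name, #update!,
# #key and #checksum. `enqueue` is a callable taking (key, checksum:).
# Returns the number of blobs that were re-pointed.
def backfill_mirrors(blobs, enqueue:, remap: SERVICE_REMAP)
  repointed = 0
  blobs.each do |blob|
    target = remap[blob.service_name]
    next unless target # already re-pointed, or not one of the S3 services

    blob.update!(service_name: target)
    enqueue.call(blob.key, checksum: blob.checksum)
    repointed += 1
  end
  repointed
end

# Stand-in for ActiveStorage::Blob so the sketch runs without Rails.
FakeBlob = Struct.new(:key, :checksum, :service_name) do
  def update!(attrs)
    attrs.each { |name, value| self[name] = value }
  end
end

blobs = [
  FakeBlob.new("doc-1", "abc", "amazon_s3_documents"),
  FakeBlob.new("logo-1", "def", "amazon_s3_images_and_logos"),
  FakeBlob.new("doc-2", "ghi", "mirror_documents"),
]
enqueued = []
repointed = backfill_mirrors(blobs, enqueue: ->(key, checksum:) { enqueued << [key, checksum] })
```

Skipping blobs that already point at a mirror service keeps the re-pointing step safe to re-run. Whether such blobs should still be re-enqueued for mirroring is a design choice this sketch leaves open.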
1 parent 38ec388 commit 3c32d48

16 files changed: +402 -10 lines changed

Dockerfile

Lines changed: 2 additions & 0 deletions
@@ -37,6 +37,8 @@ COPY . .

 ENV DOCUMENTS_S3_BUCKET=throwaway_value
 ENV SCHOOLS_IMAGES_LOGOS_S3_BUCKET=throwaway_value
+ENV DOCUMENTS_AZURE_STORAGE_ACCESS_KEY=throwaway_value
+ENV IMAGES_LOGOS_AZURE_STORAGE_ACCESS_KEY=throwaway_value

 RUN --mount=type=secret,id=master_key,env=RAILS_MASTER_KEY RAILS_ENV=production SECRET_KEY_BASE=required-to-run-but-not-used RAILS_SERVE_STATIC_FILES=1 bundle exec rake assets:precompile

Gemfile

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ gem "active_storage_validations"
 gem "addressable"
 gem "array_enum"
 gem "aws-sdk-s3"
+gem "azure-blob", require: false
 gem "business_time"
 gem "chartkick"
 gem "devise"

Gemfile.lock

Lines changed: 5 additions & 0 deletions
@@ -154,6 +154,9 @@ GEM
       descendants_tracker (~> 0.0.4)
       ice_nine (~> 0.11.0)
       thread_safe (~> 0.3, >= 0.3.1)
+    azure-blob (0.8.0)
+      cgi
+      rexml
     backport (1.2.0)
     base64 (0.3.0)
     bcrypt (3.1.21)
@@ -991,6 +994,7 @@ DEPENDENCIES
   aws-sdk-ssm
   axe-core-capybara
   axe-core-rspec
+  azure-blob
   better_errors
   binding_of_caller
   brakeman
@@ -1153,6 +1157,7 @@ CHECKSUMS
   axe-core-capybara (4.11.1) sha256=5d374e6e08b1831520e2a24ca5d2158865b8ab86346060131bf773f7f9fd1449
   axe-core-rspec (4.11.1) sha256=dc6c0e166405cd3a28c4a0937f6521ee5b511c12c0ca1627144a1ee7d5014aec
   axiom-types (0.1.1) sha256=c1ff113f3de516fa195b2db7e0a9a95fd1b08475a502ff660d04507a09980383
+  azure-blob (0.8.0) sha256=5c8d50e5c8b4fa9228f6a8d6bf003aba9f33d03c15f525235adeca8ad7879c10
   backport (1.2.0) sha256=912c7dfdd9ee4625d013ddfccb6205c3f92da69a8990f65c440e40f5b2fc7f75
   base64 (0.3.0) sha256=27337aeabad6ffae05c265c450490628ef3ebd4b67be58257393227588f5a97b
   bcrypt (3.1.21) sha256=5964613d750a42c7ee5dc61f7b9336fb6caca429ba4ac9f2011609946e4a2dcf

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+class ActiveStorage::MirrorJob < ActiveStorage::BaseJob
+  queue_as { ActiveStorage.queues[:mirror] }
+
+  discard_on ActiveStorage::FileNotFoundError
+  retry_on ActiveStorage::IntegrityError, attempts: 10, wait: :polynomially_longer
+
+  # Override to fix Rails bug where MirrorJob always uses the default service
+  # instead of the blob's actual service. This causes mirroring to fail for
+  # attachments using non-default ActiveStorage services (e.g., images_and_logos).
+  #
+  # See: https://github.com/rails/rails/issues/46806
+  # Proposed fix: https://github.com/Sandgarden-Demo/rails/pull/31/changes
+  def perform(key, checksum:)
+    if (blob = ActiveStorage::Blob.find_by(key: key))
+      blob.service.try(:mirror, blob.key, checksum: blob.checksum)
+    else
+      ActiveStorage::Blob.service.try(:mirror, key, checksum: checksum)
+    end
+  end
+end
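
To illustrate why the override dispatches on the blob's own service, here is a small Rails-free sketch of the same decision. `RecordingService`, `StubBlob` and `mirror_with_blob_service` are hypothetical stand-ins, not Active Storage classes. The sketch calls `mirror` directly, whereas the job above uses `try` so that services without a `mirror` method are skipped.

```ruby
# Records every mirror request it receives; stands in for a Mirror service.
class RecordingService
  attr_reader :name, :mirrored

  def initialize(name)
    @name = name
    @mirrored = []
  end

  def mirror(key, checksum:)
    @mirrored << [key, checksum]
  end
end

# Stand-in for a blob row that knows which service it is stored on.
StubBlob = Struct.new(:key, :checksum, :service)

# Same decision as the overridden #perform: prefer the service recorded on
# the blob, and fall back to the default service only when no blob is found.
def mirror_with_blob_service(key, checksum:, blobs:, default_service:)
  blob = blobs.find { |candidate| candidate.key == key }
  if blob
    blob.service.mirror(blob.key, checksum: blob.checksum)
  else
    default_service.mirror(key, checksum: checksum)
  end
end

documents = RecordingService.new("mirror_documents") # plays the default service
images    = RecordingService.new("mirror_images_and_logos")
blobs     = [StubBlob.new("logo-key", "logo-sum", images)]

mirror_with_blob_service("logo-key", checksum: "logo-sum", blobs: blobs, default_service: documents)
mirror_with_blob_service("orphan-key", checksum: "orphan-sum", blobs: blobs, default_service: documents)
```

With the default-only behaviour described in the linked Rails issue, the `logo-key` request would have gone to the documents service, which does not hold that key.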

app/models/job_application.rb

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ class JobApplication < ApplicationRecord

   validates :email_address, email_address: true, if: -> { email_address_changed? } # Allows data created prior to validation to still be valid

-  has_one_attached :baptism_certificate, service: :amazon_s3_documents
+  has_one_attached :baptism_certificate, service: :mirror_documents

   validate :status_transition, if: -> { status_changed? }


app/models/organisation.rb

Lines changed: 2 additions & 2 deletions
@@ -12,8 +12,8 @@ class Organisation < ApplicationRecord

   friendly_id :slug_candidates, use: %i[slugged history]

-  has_one_attached :logo, service: :amazon_s3_images_and_logos
-  has_one_attached :photo, service: :amazon_s3_images_and_logos
+  has_one_attached :logo, service: :mirror_images_and_logos
+  has_one_attached :photo, service: :mirror_images_and_logos

   has_many :organisation_vacancies, dependent: :destroy
   has_many :vacancies, through: :organisation_vacancies

app/models/vacancy.rb

Lines changed: 2 additions & 2 deletions
@@ -87,12 +87,12 @@ class Vacancy < ApplicationRecord
     valid_file_types: %i[PDF DOC DOCX],
   }.freeze

-  has_many_attached :supporting_documents, service: :amazon_s3_documents
+  has_many_attached :supporting_documents, service: :mirror_documents

   validates :supporting_documents, content_type: DOCUMENT_CONTENT_TYPES,
                                    size: { less_than: DOCUMENT_FILE_SIZE_LIMIT }, virus_free: true, if: -> { include_additional_documents }

-  has_one_attached :application_form, service: :amazon_s3_documents
+  has_one_attached :application_form, service: :mirror_documents

   has_many :saved_jobs, dependent: :destroy
   has_many :saved_by, through: :saved_jobs, source: :jobseeker

config/environments/production.rb

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@

   # The application uses multiple services for storing files. This sets up a default value which gets overridden
   # in every specific use case.
-  config.active_storage.service = :amazon_s3_documents
+  config.active_storage.service = :mirror_documents

   # Configure the domains permitted to access coordinates API
   config.allowed_cors_origin = proc { "https://#{DOMAIN}" }

config/storage/development.yml

Lines changed: 16 additions & 0 deletions
@@ -6,6 +6,22 @@ amazon_s3_images_and_logos:
   service: Disk
   root: <%= Rails.root.join("storage") %>

+azure_storage_documents:
+  service: Disk
+  root: <%= Rails.root.join("tmp/storage") %>
+
+azure_storage_images_and_logos:
+  service: Disk
+  root: <%= Rails.root.join("tmp/storage") %>
+
+mirror_documents:
+  service: Disk
+  root: <%= Rails.root.join("tmp/storage") %>
+
+mirror_images_and_logos:
+  service: Disk
+  root: <%= Rails.root.join("tmp/storage") %>
+
 local:
   service: Disk
   root: <%= Rails.root.join("storage") %>

config/storage/production.yml

Lines changed: 24 additions & 0 deletions
@@ -11,3 +11,27 @@ amazon_s3_images_and_logos:
   secret_access_key: <%= ENV["SCHOOLS_IMAGES_LOGOS_ACCESS_KEY_SECRET"] %>
   bucket: <%= ENV["SCHOOLS_IMAGES_LOGOS_S3_BUCKET"] %>
   region: eu-west-2
+
+azure_storage_documents:
+  service: AzureBlob
+  storage_account_name: <%= ENV["DOCUMENTS_AZURE_STORAGE_ACCOUNT_NAME"] %>
+  storage_access_key: <%= ENV["DOCUMENTS_AZURE_STORAGE_ACCESS_KEY"] %>
+  container: documents
+
+azure_storage_images_and_logos:
+  service: AzureBlob
+  storage_account_name: <%= ENV["IMAGES_LOGOS_AZURE_STORAGE_ACCOUNT_NAME"] %>
+  storage_access_key: <%= ENV["IMAGES_LOGOS_AZURE_STORAGE_ACCESS_KEY"] %>
+  container: images-logos
+
+mirror_documents:
+  service: Mirror
+  primary: amazon_s3_documents
+  mirrors:
+    - azure_storage_documents
+
+mirror_images_and_logos:
+  service: Mirror
+  primary: amazon_s3_images_and_logos
+  mirrors:
+    - azure_storage_images_and_logos
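
As a rough mental model of what the `primary`/`mirrors` configuration above means, the sketch below models a mirror in memory: writes land on the primary, reads are served only from the primary, and a mirror request copies a key to any mirror that does not have it yet. `MemoryStore` and `SketchMirror` are hypothetical classes for illustration only. This is a simplification, not the Rails Mirror service (which, in recent Rails versions, hands the mirroring step to a background job instead of doing it inline).

```ruby
# In-memory stand-in for one storage backend (an S3 bucket or Azure container).
class MemoryStore
  def initialize
    @files = {}
  end

  def upload(key, data)
    @files[key] = data
  end

  def download(key)
    @files.fetch(key)
  end

  def exist?(key)
    @files.key?(key)
  end
end

# Simplified model of a mirror: the primary is the source of truth, and
# mirrors only ever receive copies of what the primary already holds.
class SketchMirror
  def initialize(primary:, mirrors:)
    @primary = primary
    @mirrors = mirrors
  end

  def upload(key, data)
    @primary.upload(key, data)
    mirror(key) # done inline here; assumed to be a background job in Rails
  end

  def download(key)
    @primary.download(key)
  end

  # Copies the key from the primary to every mirror that lacks it.
  def mirror(key)
    @mirrors.reject { |store| store.exist?(key) }
            .each { |store| store.upload(key, @primary.download(key)) }
  end
end

s3    = MemoryStore.new
azure = MemoryStore.new
s3.upload("old-doc", "uploaded before mirroring existed")

documents = SketchMirror.new(primary: s3, mirrors: [azure])
documents.upload("new-doc", "uploaded after mirroring")
documents.mirror("old-doc") # what a backfill would request for existing keys
```

In this model `azure` ends up holding both keys while S3 stays the source of truth, which is the state the follow-up PR relies on before switching the services over to Azure.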
