
THREESCALE-12408: Access tokens protection #4236

Open
jlledom wants to merge 14 commits into master from THREESCALE-12408-acess-tokens-protection

Conversation

@jlledom
Contributor

@jlledom jlledom commented Feb 26, 2026

What this PR does / why we need it:

This adds some protection to access tokens in our DB, so tokens are not exposed if the DB is leaked.

This implements two measures to increase protection:

  1. Increase the token length to 48 bytes (96 hex characters).
  2. Hash the tokens with SHA-384.

Since we want to make this backwards compatible, the current code still allows plain text tokens to exist and be used to authenticate.

The code is designed to be as simple as possible. The day we want to remove the migration code and the support for plain text tokens, we'll only have to remove a few lines in the find_from_value method.

A brief explanation of the new design:

  • The value attribute always holds whatever is in the DB column: plain text if unmigrated, the hash if migrated.
  • A new in-memory attribute plaintext_value holds the plain text value of a hashed token. It's not persisted, so it's lost when the instance is released. This attribute provides a way to return the original value once, in the request response, be it UI or API. The user has that single opportunity to store it somewhere else; after that, we don't hold the original value anywhere.
  • No DB migration: we reuse the same column to hold the new value.
  • Authentication flow:
    1. Hash the received token
    2. Find by value = hash
    3. If nothing is found, find by value = received value
    4. If found, migrate the old value to its hash
  • If the DB is leaked, the hashes could be used directly as tokens, and they would work because we support plain text tokens. To avoid that, we use the length to distinguish between original and hashed values. In our SaaS DB, all tokens are exactly 64 chars long, while from now on all migrated hashes will be 96 chars long.
  • If the received token is 96 chars long, we only look up by hash. Otherwise we perform both lookups, hash and original.
  • As a trade-off, if some on-premises client happens to have an access token exactly 96 characters long, that token will stop working after this is merged. But that's extremely unlikely, since they would have needed to edit the token manually.
  • The hashing algorithm is SHA-384. I discarded SHA-256 because it generates 64-character hashes, so we couldn't distinguish between original and hashed values.
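A minimal sketch of this lookup flow, using an in-memory Hash in place of the DB table (compute_digest, find_from_value and the token values are illustrative stand-ins, not porta's actual code):

```ruby
require "openssl"
require "securerandom"

HASHED_TOKEN_LENGTH = 96 # SHA-384 hex digest length (48 bytes)

def compute_digest(plaintext)
  OpenSSL::Digest::SHA384.hexdigest(plaintext)
end

# `store` maps the DB `value` column to a token record.
def find_from_value(received, store)
  received = received.to_s
  token = store[compute_digest(received)] # fast path: migrated/new tokens
  return token if token

  # Legacy tokens are 64 chars, never 96, so a leaked hash is rejected here
  return nil if received.length == HASHED_TOKEN_LENGTH

  store[received] # slow path: legacy plain text tokens
end

legacy = SecureRandom.hex(32)                 # 64 chars, stored as plain text
fresh  = SecureRandom.hex(48)                 # 96 chars, stored hashed
store  = { legacy => :legacy_token, compute_digest(fresh) => :new_token }

find_from_value(fresh, store)                 # => :new_token
find_from_value(legacy, store)                # => :legacy_token
find_from_value(compute_digest(fresh), store) # => nil (a leaked hash is useless)
```

The migrate-on-use step is omitted here; in the PR, the slow path additionally replaces the stored plain text with its digest after a successful lookup.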

Which issue(s) this PR fixes

https://issues.redhat.com/browse/THREESCALE-12408

Verification steps

  1. Use an old token to perform a request
    • It should work
    • The token should be hashed in DB.
  2. Use a new token
    • It should work
  3. You should be able to create a token via UI and API, and see the original value once.

Special notes for your reviewer:

How to review:

There are 90+ files edited in this PR, but most of them are just the result of replacing access_token.value with access_token.plaintext_value in tests. This PR is better reviewed commit by commit:

  • dc93ca0: Increase the token length to 48 bytes to better resist quantum attacks.
  • d166b4e: The actual migration code.
  • a8af378: Edit the UI so the user can see the token. We render instead of redirecting so we keep the plain text value in memory.
  • 901e603: Same for API endpoints.
  • d523dc2: The seed creates hashed tokens, that way we won't have to worry about this when we decide to remove the migration code.
  • d46fd2c: When receiving a token from apicast, store it hashed. Same reason: so we don't have to worry after removing the migration code.
  • e73ea5f: Add a few tests for the specific features added in this PR.
  • 9d65d4f: The big bunch of changes to make tests send the plain text token when performing requests.

@codecov

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 64.58333% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.30%. Comparing base (230cb62) to head (8df29cf).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
...rs/provider/admin/user/access_tokens_controller.rb 29.16% 17 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4236      +/-   ##
==========================================
+ Coverage   88.23%   88.30%   +0.07%     
==========================================
  Files        1765     1765              
  Lines       44344    44361      +17     
  Branches      686      686              
==========================================
+ Hits        39126    39175      +49     
+ Misses       5202     5170      -32     
  Partials       16       16              


@jlledom jlledom self-assigned this Feb 27, 2026
@jlledom jlledom force-pushed the THREESCALE-12408-acess-tokens-protection branch from b2037c0 to 9d65d4f on February 27, 2026 12:56
@jlledom
Contributor Author

jlledom commented Feb 27, 2026

❌ Patch coverage is 82.35294% with 6 lines in your changes missing coverage.

This is not true, those lines are covered by https://github.com/3scale/porta/blob/513293b4e74515fced4aaef41a07e5e442fb6720/features/provider/admin/user/access_tokens.feature. But that's somehow ignored by Codecov. Probably a misconfiguration.

Comment on lines +115 to +135
def self.find_from_value(plaintext_value)
  return nil if plaintext_value.blank?

  scrubbed = plaintext_value.to_s.scrub
  digest = compute_digest(scrubbed)

  # Fast path: find by digest (new/migrated tokens)
  token = find_by(value: digest)
  return token if token

  # Legacy tokens can't be exactly 96 chars (that's our hash length).
  # This prevents using a leaked hash directly as a token.
  return nil if scrubbed.length == HASHED_TOKEN_LENGTH

  # Slow path: find by plaintext (legacy tokens)
  token = find_by(value: scrubbed)
  return nil unless token

  # Migrate on use: replace plaintext with hash
  token.migrate_to_hashed!(scrubbed)
  token
Contributor

Why not make a database migration instead of having another piece of crazy logic to handle legacy and new data, and never being sure whether the data was eventually fully converted, and having to notify customers when we finally decide to remove the legacy support, wait for them, check they actually stopped using it, wait more, etc.?

Contributor Author

I agree on adding a DB migration or a migration script, that was my plan. But I don't think this is crazy logic, or that it can be replaced by a migration. Backwards compatibility is mandatory because otherwise we couldn't deploy without downtime, and the logic is really simple: just a few additional lines in one single method.

The safe path to proceed is to deploy code that is compatible with pre and post migration state, then run the migration at any moment, then in the next deploy, remove the code to support legacy tokens.

It's as simple as creating a jira issue to remove the legacy support and schedule it to the release after the one including this PR. I don't see any drama, just regular work.

Contributor

Backwards compatibility is mandatory because otherwise we couldn't deploy without downtime

if we start with this, then a pure database migration will not really be feasible, as far as I can tell.

I'm not talking about ruby code migration but pure query migration that can execute very fast on the DB server side.

A short downtime is always acceptable. For example, a couple of minutes is totally fine.

A pure Ruby migration might be super slow; on the other hand, it can be executed lazily.

Contributor

But this lazy execution can only reliably be done on clusters we maintain; I'm not sure we can rely on customers running it. That's why I think for on-prem a one-step migration without gimmicks would be much better.

Contributor Author
@jlledom jlledom Mar 2, 2026

A short downtime is always acceptable if it is short. For example a couple of minutes is totally fine.

Well, I think no downtime is even better.

I'm not talking about ruby code migration but pure query migration that can execute very fast on the DB server side.

Like an SQL query that hashes the values? Is there a way to do it reliably on all the DBs, including Internet Explor... I mean Oracle? Is it really so fast?

Everything is a trade-off. I did it this way because, for me, providing an easy and less risky way to migrate is worth the small extra complexity. Forcing the client to have downtime and execute a migration written in SQL that must work on all DBs and versions we support is more risky and complicated for them, I think. And the only advantage would be that it's allegedly less work for us, and maybe not even that.

@mayorova @Madnialihussain Please untie.

Contributor Author

From our discussion, I suggest this approach:

  1. New tokens are created hashed
  2. Hashed tokens are identified by a prefix naming the algorithm
  3. When receiving a request:
    1. Find by hash
    2. If that fails, find by clear text
    3. If that fails, reject
  4. Clear tokens are not migrated on the fly

Deploy paths:

  • SaaS:
    1. Deploy code
    2. Leave it running to ensure we don't rollback
    3. Run the migration
    • Two queries will only happen in the lapse between deploying code and running the migration
  • On premises:
    1. Have downtime
    2. Update the code
    3. Run the migration
    4. Start server
    • They will never run two queries per hash

Result:

One-time deploy, backwards compatibility, no downtime, rollback, leak-safe, no DDL migration, no corner cases.

WDYT?

Contributor Author

In fact, it's almost a one-time deploy, because the second phase is only a cleanup of dead code.

Contributor

I'm fine with whatever you find yourself comfortable with. I'm only not sure about the 2 queries, but provided the key is migrated on first use, so that 2 queries consistently run only for missing keys, it should perhaps be fine.

Contributor Author

In my last suggestion, I added:

Clear tokens are not migrated on the fly

This is to provide a way back. But two queries will only happen until they run the migration.

On the other hand, I think we can actually deploy in two phases for SaaS and a single phase for on-premises, releasing the two phases together in one release, since they will have downtime anyway.

Contributor

It is still not ultimately downgradable, because new tokens may be created while on the new version. Still, much higher downgradability.

@jlledom jlledom marked this pull request as ready for review March 4, 2026 08:34
def create
  @presenter = AccessTokensNewPresenter.new(current_account)
  create! do |success, failure|  # was: create! do |success, _failure|
Contributor

This syntax belongs to inherited_resources 😅

As we're trying to get rid of it, I'd suggest using a standard Rails approach to avoid conflicts.

Contributor Author

Exactly, this is how I learnt about that gem.

As we're trying to get rid of it, I'd suggest using a standard Rails approach to avoid conflicts.

Sure!


def self.random_id
  SecureRandom.hex(48)  # was: SecureRandom.hex(32)
Contributor

Hm, the existing 64-character tokens are represented as a 96-character hash, it seems. Why is it necessary to also increase the generated plain value size? 🤔

While we can do it, I think it's better to avoid it if we can: maybe some customers also have assumptions based on the token size (even though that might not be wise of them).

Contributor Author

Because it's more resistant to quantum computers, if those actually work some day.

Contributor
@akostadinov akostadinov Mar 4, 2026

+1 to generate tokens that contain at least as many random bits as the hash used to store them. This is not expensive, so it shouldn't be noticeable in performance.

@jlledom jlledom force-pushed the THREESCALE-12408-acess-tokens-protection branch from 9d65d4f to bae2dcc on March 10, 2026 13:09
jlledom and others added 11 commits March 10, 2026 14:10
* New tokens are generated already hashed
* Old tokens are hashed the first time they are used

Co-Authored-By: Claude <noreply@anthropic.com>
* Remove dead code
* Render on create instead of redirecting, so the user can see the token
  while still in memory
* New template to show the generated token. It was embedded in the index
  template before
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
@jlledom jlledom force-pushed the THREESCALE-12408-acess-tokens-protection branch from c3be649 to e58951c on March 10, 2026 13:11
@qltysh

qltysh bot commented Mar 10, 2026

❌ 55 blocking issues (55 total)

Tool Category Rule Count
reek Lint AccessToken#generate_value calls 'self.class' 2 times 49
rubocop Lint Avoid using update\_columns because it skips validations. 2
rubocop Lint Action edit should appear before create. 1
reek Lint AccessToken#self.find_from_value has approx 8 statements 1
rubocop Lint Block has too many lines. [44/25] 1
rubocop Lint Class has too many lines. [207/200] 1

@jlledom
Contributor Author

jlledom commented Mar 11, 2026

I made a few changes, my last proposal:

Changes:

  1. New tokens are created hashed
  2. Hashed tokens are identified by a prefix naming the algorithm
  3. When receiving a request:
    1. Find by hash
    2. If that fails, find by clear text
    3. If that fails, reject
  4. Clear tokens are NOT migrated on the fly
  5. Hashing is done via a Rails DB migration using the DB's native hashing functions
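A sketch of this variant, again with an in-memory Hash standing in for the table. The prefix string, the helper names, and the explicit rejection of prefixed input are my reading of the leak-safety requirement, not necessarily the PR's exact code:

```ruby
require "openssl"
require "securerandom"

DIGEST_PREFIX = "SHA384|" # illustrative; any unambiguous marker works

def store_value(plaintext)
  DIGEST_PREFIX + OpenSSL::Digest::SHA384.hexdigest(plaintext)
end

def find_from_value(received, store)
  token = store[store_value(received)] # 1. find by hash
  return token if token

  # A leaked stored value starts with the prefix and must not authenticate
  return nil if received.start_with?(DIGEST_PREFIX)

  store[received] # 2. find by clear text; no on-the-fly migration
end

clear  = SecureRandom.hex(32)
secret = SecureRandom.hex(48)
store  = { clear => :clear_token, store_value(secret) => :hashed_token }

find_from_value(secret, store)              # => :hashed_token
find_from_value(clear, store)               # => :clear_token
find_from_value(store_value(secret), store) # => nil (reject)
```

Because hashed values are self-identifying via the prefix, the migration can run at any time, and unmigrated rows simply keep taking the two-lookup path until it does.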

Deploy paths:

  • SaaS:

    1. Deploy code
    2. Leave it running to ensure we don't rollback
      • Since tokens are not migrated on the fly, we can run the migration or rollback when we want
    3. Run the migration
      • Two queries will only happen in the lapse between deploying code and running the migration
      • Two queries will still happen for invalid tokens, even after the migration
    4. Write a follow up PR to remove the second query, not needed after migration
    5. Merge, deploy again.
  • On premises:

    1. We'll release this PR and the follow-up PR altogether
      • No backwards compatibility.
    2. Have downtime
    3. Update the code
    4. Run the migration
    5. Start server
      • They will never execute two queries per request

Support table:

Environment One time deploy Backwards compatibility No downtime Rollback Leak-safe No DDL Migration No Corner cases
SaaS
On Premises

Performance tests:
I told Claude to write a few scripts to test performance. Take into account that I'm running this on a 4-year-old laptop.

  • SaaS has about 42K tokens in DB. In my benchmark, I insert 75K tokens in the table.
  • I tried this in all three DB engines multiple times.

Results:

  • About 25-28 requests per second before migrating, when all 75k tokens are plain text and two queries are needed to auth.
  • The migration takes about 10 seconds to migrate the 75k tokens.
  • During the migration, most of the time no requests fail. Twice I saw 1 or 2 requests failing; race conditions, probably.
  • A slightly higher rate, about 27-29 RPS, after the migration, when only one query is needed to auth.

@akostadinov @mayorova @Madnialihussain

akostadinov
akostadinov previously approved these changes Mar 12, 2026
Contributor
@akostadinov akostadinov left a comment

Awesome work!

I have a few nitpick comments but overall very solid, thanks!


def self.find_from_value(plaintext_value)
  scrubbed = plaintext_value.to_s.scrub  # was: find_by(value: value.to_s.scrub)
Contributor

do we need scrub?

Just a nitpick for waste of compute resources.

Contributor Author

yeah we need it because we re-use it for the legacy comparison. We can remove it in the follow-up PR.

section id="access-tokens"
h2 Access Tokens
p
' Access tokens are personal tokens that let you authenticate against the Account Management API, the Analytics API and the Billing API through HTTP Basic Auth. You can create multiple access tokens with custom scopes and permissions. We suggest you create tokens with the minimal scopes & permissions needed for the task at hand. Use Access Tokens from within the
Contributor

Can we use i18n for all strings? 👼

Not blocking merge though; it can be done later.


def new
@presenter = AccessTokensNewPresenter.new(current_account)
@access_token = access_tokens.build
Contributor

I assume we can't easily avoid instantiating an object here and generating a key, but then again, generating random data should be fast enough nowadays...

"WHERE value NOT LIKE '#{DIGEST_PREFIX}%' LIMIT #{BATCH_SIZE}"
elsif System::Database.postgres?
"UPDATE access_tokens SET value = '#{DIGEST_PREFIX}' || encode(sha384(value::bytea), 'hex') " \
"WHERE id IN (SELECT id FROM access_tokens WHERE value NOT LIKE '#{DIGEST_PREFIX}%' LIMIT #{BATCH_SIZE})"
Contributor

Now I'm thinking, postgres may potentially not have the crypto extension enabled. Maybe we should have some failsafe solution for this case 😬
But can be discussed after merge to see if we need one and what exactly.

Contributor Author

We don't use the extension; that's for old Postgres versions. In PG 14 there's a builtin alternative, the sha384() function I'm using here.

https://www.postgresql.org/docs/14/functions-binarystring.html
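As a quick sanity check that the Ruby side and the DB-side function agree, OpenSSL's SHA-384 over the well-known NIST test vector "abc" yields the same hex string documented for PG 14's encode(sha384('abc'::bytea), 'hex') (the Postgres equivalence follows from the docs linked above, not re-verified here):

```ruby
require "openssl"

digest = OpenSSL::Digest::SHA384.hexdigest("abc")
# SHA-384 test vector for "abc"
digest == "cb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7" # => true
digest.length # => 96
```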

assert_equal legacy_value, @token.reload.read_attribute(:value)
end

def test_find_from_value_rejects_leaked_hash_as_token
Contributor

this one seems to be a dup of `authentication with leaked database hash fails`

Contributor Author

Not duplicated: one is a unit test and the other is an integration test. They test different layers.

assert @token.reload.read_attribute(:value).start_with?(AccessToken::DIGEST_PREFIX)
end

def test_find_from_value_finds_legacy_token
Contributor

this seems to be a dup of `authentication with legacy unmigrated token succeeds`

Contributor Author

Same: they test at different levels.

@@ -1,10 +1,14 @@
class AccessToken < ApplicationRecord
DIGEST_PREFIX = 'SHA384|'.freeze
Contributor
@akostadinov akostadinov Mar 13, 2026

Overnight I had an insight that this prefix is not ideal. It will be a constant nuisance to use on the command line:

$ echo SHA384|asdasd
bash: asdasd: command not found...

The | character always needs to be quoted or escaped. It would be much more user-friendly to have a shorter prefix without special characters, for example s2_

$ echo s2_asdasd
s2_asdasd

Contributor Author

The _ isn't good either, because it's a wildcard in MySQL:

https://dev.mysql.com/doc/refman/8.4/en/string-comparison-functions.html#operator_like

Let's use $, which is what bcrypt uses and what we're already using in our DB for passwords. 65a75ea

I hope you can sleep now.
