-
Notifications
You must be signed in to change notification settings - Fork 71
Stripe rate limit handling #4216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,5 +18,18 @@ def message | |
|
|
||
| CreditCardPurchaseFailed = Class.new(GatewayError) | ||
|
|
||
| # Rate limit error - should be retried immediately, not treated as payment failure | ||
| class StripeRateLimitError < ActiveMerchant::ActiveMerchantError | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Even when this only happens in Stripe, I think it's better to not make it stripe specific, because the concept of rate limiting is more general. Also I don't see anything in this error definition that would force to limit it to Stripe. |
||
| attr_reader :response, :payment_metadata | ||
|
|
||
| def initialize(response, payment_metadata) | ||
| @response = response | ||
| @payment_metadata = payment_metadata | ||
| end | ||
|
|
||
| def message | ||
| response.try!(:message) || 'Rate limit exceeded - too many requests to payment gateway' | ||
| end | ||
| end | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -89,6 +89,10 @@ def self.daily(options = {}) | |
| results.start(billing_strategy) | ||
| ignoring_find_each_scope { billing_strategy.daily(now: now, buyer_ids: options[:buyer_ids], skip_notifications: skip_notifications) } | ||
| results.success(billing_strategy) | ||
| rescue Finance::Payment::StripeRateLimitError => e | ||
| # Rate limit errors should bubble up to Sidekiq for immediate retry with exponential backoff | ||
| # Don't treat these as payment failures - they are temporary gateway issues | ||
| raise e | ||
|
Comment on lines
+92
to
+95
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if, instead of having to reference |
||
| rescue => e | ||
| results.failure(billing_strategy) | ||
| name = billing_strategy.provider.try!(:name) | ||
|
|
@@ -352,6 +356,10 @@ def bill_and_charge_each(options = {}) | |
| buyer_accounts.find_each(:batch_size => 20) do |buyer| | ||
| begin | ||
| ignoring_find_each_scope { yield(buyer) } | ||
| rescue Finance::Payment::StripeRateLimitError => exception | ||
| # Rate limit errors should bubble up to trigger Sidekiq retry with exponential backoff | ||
| # Do NOT catch and swallow - let the job fail and retry immediately | ||
| raise exception | ||
| rescue => exception | ||
| name = buyer.name | ||
| buyer_id = buyer.id | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -451,6 +451,10 @@ def charge!(automatic = true) | |
| logger.info("Invoice #{id} (buyer #{buyer_account_id}) was not charged") | ||
| false | ||
| end | ||
| rescue Finance::Payment::StripeRateLimitError => e | ||
| # Rate limit errors should bubble up to Sidekiq for immediate retry with exponential backoff | ||
| # Don't treat these as payment failures - they are temporary gateway issues | ||
| raise e | ||
| rescue Finance::Payment::CreditCardError, ActiveMerchant::ActiveMerchantError | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if we add the exception here, in this block? then it will be subject to retries? |
||
| provider.billing_strategy&.error("Error when charging invoice #{id} (buyer #{buyer_account_id})", buyer) | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,18 +16,26 @@ def self.call(account_id, options = {}) | |
| new(account_id, options).call | ||
| end | ||
|
|
||
| attr_reader :account_id, :provider_account_id, :now, :skip_notifications | ||
| attr_reader :account_id, :provider_account_id, :now, :skip_notifications, :lock_service | ||
|
|
||
| # @param account_id [Integer] either a provider account, or a buyer account with :provider_account_id in options | ||
| def initialize(account_id, options = {}) | ||
| @account_id = account_id | ||
| @provider_account_id = options[:provider_account_id] || account_id | ||
| @now = Time.zone.parse(options[:now].to_s) || Time.zone.now | ||
| @skip_notifications = options[:skip_notifications] | ||
| @lock_service = Synchronization::BillingLockService.new(account_id.to_s) | ||
| end | ||
|
|
||
| def call! | ||
| with_lock { call } | ||
| lock_service.lock | ||
| call | ||
| rescue Finance::Payment::StripeRateLimitError => error | ||
| # Rate limit errors should retry immediately via Sidekiq | ||
| # Release the lock so retries can proceed without waiting 1 hour | ||
| lock_service.unlock | ||
| report_error(error) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's already reported from the strategy
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I removed (or I think I did) all reports and all log printing except here in the BillingService. |
||
| raise error | ||
| rescue LockBillingError, SpuriousBillingError => error | ||
| report_error(error) | ||
| nil | ||
|
|
@@ -64,13 +72,6 @@ def billing_options | |
| options | ||
| end | ||
|
|
||
| def with_lock | ||
| # intentionally skip unlocking, no further billing of account within 1 hour allowed | ||
| raise LockBillingError, "Concurrent billing job already running for account #{account_id}" unless Synchronization::NowaitLockService.call("billing:#{account_id}", timeout: 1.hour.in_milliseconds).result | ||
|
|
||
| yield | ||
| end | ||
|
|
||
| def report_error(error) | ||
| message = "Failed to perform billing job: #{error.message}" | ||
| System::ErrorReporting.report_error(error, error_message: message, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -20,7 +20,19 @@ def initialize(gateway, payment_method_id:, invoice: nil, gateway_options: {}) | |
|
|
||
| def charge(amount) | ||
| payment_intent = latest_pending_payment_intent | ||
| payment_intent ? confirm_payment_intent(payment_intent) : create_payment_intent(amount) | ||
| response = payment_intent ? confirm_payment_intent(payment_intent) : create_payment_intent(amount) | ||
|
|
||
| if rate_limit_error? response | ||
| payment_metadata = { | ||
| invoice_id: @invoice&.id, | ||
| buyer_id: @invoice&.buyer_account&.id, | ||
| payment_method_id: @payment_method_id, | ||
| gateway_options: @gateway_options | ||
| } | ||
| raise Finance::Payment::StripeRateLimitError.new(response, payment_metadata) | ||
| end | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like the idea that one exception we generate here and others in another place. If service has to generate exceptions, it has to do it for any failure. And if it is not supposed to generate exceptions, then it should not generate any. In this case it feels as if the "contract" is not to generate exceptions. In other languages like Java, whether a method raises (throws) or not is part of the definition of the method. Looking at the whole charging implementation though, it is rather convoluted. I think we should either:
My personal take would be to enable Stripe SDK native retries in the first place and do some refactor of how we classify errors in the future if needed. My second preference would be to both (which doesn't preclude enabling native SDK retries):
Last preference of mine is to classify the errors in the upper layers - as now but with more prominent structure that accounts for braintree and possible future gateways. But this will still require the final decision to be taken within P.S. I understand that it is very complicated to figure out the least shitty approach here besides heavily refactoring everything. That's why I prefer the least changes or at least to be as compact as possible. Not trying to complain for everything although all approaches have pros and cons. The above is just what I find most sensible but I'm very open to change my mind. |
||
|
|
||
| response | ||
| end | ||
|
|
||
| protected | ||
|
|
@@ -86,4 +98,19 @@ def default_charge_description | |
|
|
||
| "#{invoice.from.name} #{PAYMENT_DESCRIPTION} #{invoice.friendly_id}".strip | ||
| end | ||
|
|
||
| # Response body in case of Rate Limit error: | ||
| # { | ||
| # "error": { | ||
| # "message": "Request rate limit exceeded. Learn more about rate limits here https://stripe.com/docs/rate-limits.", | ||
| # "type": "invalid_request_error", | ||
| # "code": "rate_limit", | ||
| # "doc_url": "https://stripe.com/docs/error-codes/rate-limit" | ||
| # } | ||
| # } | ||
| def rate_limit_error?(response) | ||
| return false if response.success? | ||
|
|
||
| response.params.dig("error","code") == 'rate_limit' | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I repeated the tests, and while the Stripe API does return 429 status code, however the status code is "swallowed" by ActiveMerchant code, so we don't have the information on the HTTP status code in this point. See https://github.com/activemerchant/active_merchant/blob/v1.137.0/lib/active_merchant/billing/gateways/stripe.rb#L704-L710 So, the As this is, of course, specific to the gateway (Stripe in this case), I moved this detection here, and I also decided to make the exception gatway-specific too
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fine for me, just a nitpick, maybe the |
||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| class Synchronization::BillingLockService < Synchronization::NowaitLockService | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why splitting the logic into two classes? couldn't it be just one? |
||
|
|
||
| # Default timeout for billing | ||
| DEFAULT_TIMEOUT = 1.hour.in_milliseconds | ||
|
|
||
| # @param account_id [String] provider account ID for locking | ||
| # @param timeout [Integer] milliseconds lock timeout | ||
| # @yield [] if lock is acquired, then execute block without parameters and ensure lock is released afterwards | ||
| def initialize(account_id, timeout: DEFAULT_TIMEOUT, &block) | ||
| self.account_id = account_id | ||
|
|
||
| resource = "billing:#{account_id}" | ||
| super(resource, timeout: timeout, &block) | ||
| end | ||
|
|
||
| attr_accessor :lock_info, :account_id | ||
|
|
||
| def lock | ||
| @lock_info = manager.lock(lock_key, timeout) | ||
| raise Finance::LockBillingError, "Concurrent billing job already running for account #{account_id}" unless @lock_info | ||
| end | ||
|
|
||
| def unlock | ||
| manager.unlock(@lock_info) if @lock_info | ||
| @lock_info = nil | ||
| rescue => e | ||
| Rails.logger.warn("Failed to release billing lock for account #{account_id}: #{e.message}") | ||
| end | ||
| end | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,158 @@ | ||
| # Billing Flow Diagrams | ||
|
|
||
| This directory contains Mermaid sequence diagrams documenting the billing process. | ||
|
|
||
| ## Diagrams | ||
|
|
||
| 1. **`billing-flow-happy-path.mmd`** - Normal successful billing flow | ||
| 2. **`billing-flow-rate-limit.mmd`** - Rate limit error handling with immediate retry | ||
| 3. **`billing-flow-payment-error.mmd`** - Payment error handling (card declined) | ||
|
|
||
| ## Viewing Diagrams | ||
|
|
||
| ### Option 1: GitHub/GitLab (Easiest) | ||
|
|
||
| GitHub and GitLab render `.mmd` files automatically. Just view the files in your browser: | ||
| - Click on `billing-flow-happy-path.mmd` in GitHub | ||
| - The diagram renders automatically | ||
|
|
||
| ### Option 2: VS Code (Best for Editing) | ||
|
|
||
| Install the **Mermaid Preview** extension: | ||
|
|
||
| ```bash | ||
| # Install VS Code extension | ||
| code --install-extension bierner.markdown-mermaid | ||
| ``` | ||
|
|
||
| Then: | ||
| 1. Open any `.mmd` file in VS Code | ||
| 2. Right-click → "Open Preview" | ||
| 3. Or use `Ctrl+Shift+V` to preview | ||
|
|
||
| ### Option 3: Command Line (Linux) | ||
|
|
||
| Install Mermaid CLI: | ||
|
|
||
| ```bash | ||
| # Using npm (if you have Node.js installed) | ||
| npm install -g @mermaid-js/mermaid-cli | ||
|
|
||
| # Or using yarn | ||
| yarn global add @mermaid-js/mermaid-cli | ||
| ``` | ||
|
|
||
| Render to PNG/SVG: | ||
|
|
||
| ```bash | ||
| # Generate PNG | ||
| mmdc -i billing-flow-happy-path.mmd -o billing-flow-happy-path.png | ||
|
|
||
| # Generate SVG (better for zoom/print) | ||
| mmdc -i billing-flow-happy-path.mmd -o billing-flow-happy-path.svg | ||
|
|
||
| # Generate all diagrams | ||
| for file in billing-flow-*.mmd; do | ||
| mmdc -i "$file" -o "${file%.mmd}.png" | ||
| done | ||
| ``` | ||
|
|
||
| ### Option 4: Online Editor (No Installation) | ||
|
|
||
| Use the official Mermaid Live Editor: | ||
| - https://mermaid.live/ | ||
|
|
||
| 1. Copy the contents of any `.mmd` file | ||
| 2. Paste into the editor | ||
| 3. View/export the diagram | ||
|
|
||
| ### Option 5: IntelliJ/RubyMine | ||
|
|
||
| Install the **Mermaid** plugin: | ||
| 1. Go to Settings → Plugins | ||
| 2. Search for "Mermaid" | ||
| 3. Install and restart | ||
| 4. Open `.mmd` files to see live preview | ||
|
|
||
| ## Diagram Syntax | ||
|
|
||
| Mermaid uses a simple text-based syntax: | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| participant A as Component A | ||
| participant B as Component B | ||
|
|
||
| A->>B: method_call() | ||
| activate B | ||
| B-->>A: return value | ||
| deactivate B | ||
|
|
||
| Note over A,B: This is a note | ||
|
|
||
| rect rgb(200, 255, 200) | ||
| Note over A: Success block | ||
| A->>B: another_call() | ||
| end | ||
| ``` | ||
|
|
||
| ### Key Elements Used | ||
|
|
||
| - `participant X as Name` - Define components/actors | ||
| - `A->>B: message` - Solid arrow (synchronous call) | ||
| - `B-->>A: message` - Dashed arrow (return/response) | ||
| - `activate/deactivate` - Show when component is active | ||
| - `Note over A,B: text` - Add notes/comments | ||
| - `rect rgb(r,g,b)` - Colored background blocks | ||
| - `loop`, `alt`, `opt` - Control structures | ||
|
|
||
| ## Editing Tips | ||
|
|
||
| 1. **Keep it readable** - Don't make diagrams too complex | ||
| 2. **Use colors** - `rect rgb(...)` to highlight different scenarios: | ||
| - Green (200, 255, 200) - Success paths | ||
| - Red (255, 200, 200) - Error paths | ||
| - Yellow (255, 255, 200) - Important notes | ||
| - Blue (200, 200, 255) - Future/delayed actions | ||
| 3. **Add notes** - Explain WHY things happen, not just WHAT | ||
| 4. **Break into multiple diagrams** - One scenario per diagram | ||
|
|
||
| ## Comparing Rate Limit vs. Payment Error | ||
|
|
||
| | Aspect | Rate Limit Error | Payment Error | | ||
| |--------|------------------|---------------| | ||
| | **Exception** | `Finance::Payment::StripeRateLimitError` | `Finance::Payment::CreditCardPurchaseFailed` | | ||
| | **HTTP Code** | 429 | 402 | | ||
| | **Caught where?** | NOT caught in Invoice#charge! | Caught in Invoice#charge! | | ||
| | **Invoice state** | Remains "pending" | Changed to "unpaid" | | ||
| | **Retry counter** | NOT incremented | Incremented (1/3, 2/3, 3/3) | | ||
| | **Job behavior** | Fails → Sidekiq retry | Completes successfully | | ||
| | **Retry timing** | ~15s, ~45s, ~135s... | 3 days later | | ||
| | **Lock** | Released immediately | Held for 1 hour | | ||
| | **User notification** | None | Email sent immediately | | ||
| | **Re-raised?** | Yes (bubbles to BillingWorker) | No (handled in Invoice) | | ||
|
|
||
| ## Key Implementation Details | ||
|
|
||
| ### Critical Fixes | ||
|
|
||
| 1. **`bill_and_charge_each` (billing_strategy.rb:378-383)** | ||
| - Separate rescue for `RateLimitError` to prevent swallowing | ||
| - Must re-raise immediately | ||
|
|
||
| 2. **`BillingStrategy.daily` (billing_strategy.rb:92-112)** | ||
| - Separate rescue for better logging (WARN vs ERROR) | ||
| - Don't call `results.failure()` for rate limits | ||
|
|
||
| 3. **`BillingService.call!` (billing_service.rb:29-41)** | ||
| - Release lock on rate limit | ||
| - Re-raise error for Sidekiq | ||
|
|
||
| 4. **`BillingWorker` (billing_worker.rb:8-20)** | ||
| - Exponential backoff for rate limits: `(3 ** count) * 5 + jitter` | ||
| - Standard delay for other errors: `1.hour + 10` | ||
|
|
||
| ## See Also | ||
|
|
||
| - `STRIPE_RATE_LIMIT_HANDLING.md` - Detailed implementation documentation | ||
| - `test/integration/finance/stripe_rate_limit_handling_test.rb` - Comprehensive tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is useful for reviewing but shouldn't be merged.