Introduce duplication constraint on StripeRecord table#13193
Introduce duplication constraint on StripeRecord table#13193gregorbg wants to merge 2 commits intothewca:mainfrom
Conversation
|
As a small "bonus", I added a |
|
This also needs some form of intervention on database level, at the very least we need to clear the one duplicate record that caused this investigation. But my hunch tells me that there are also a small number of other cases which either went by unnoticed or had less unpleasant consequences. |
This query: SELECT `stripe_id`, COUNT(*) AS `occurrences` FROM `stripe_records` GROUP BY `stripe_id` HAVING COUNT(*) > 1 ORDER BY `occurrences` DESC; Yields the [redacted] list of duplicate stripe_id's |
|
Looks good! Do we want to introduce a validation at the StripeRecord level while we're at it, which asserts the uniqueness of The db index lets us make sure this doesn't happen in future, but the Rails code will catch 99.99...% of the instances, while letting us take advantage of |
|
I specifically addressed this in my wall of text above. We do not want a Rails validation error in this case, we want the hard SQL level error so that |
|
My bad for missing that! |
This is an attempted fix at an (apparent) race condition in our
StripeRecordstable.We have observed that two
StripeRecordentities exist with the same exactstripe_idandstripe_record_type. This normally should not happen, because the two aforementioned properties are exactly the two key elements inStripeRecord.create_or_update_from_api. This method internally usesfind_or_initialize_by, though, and since thecreated_attimestamps of the two duplicate rows are very close together, we suspect that the following happened:find_or_initialize_byand caused a race condition [footnote 1]StripeRecord, which (per our business logic) correspond to the same refund logically, but have two different primary key IDs technically.original_payment.refunding_registration_payments.where(receipt: stored_record).any?and happily concludes that the refund was not recorded yet because of the different primary key IDs [footnote 2]The suggested solution is to be more strict by introducing a uniqueness constraint on database level. This is the most certain, fool-proof approach to make sure that within the constraints of Stripe's business specific logic, duplicates never happen ("business-specific" means: PayPal may have completely different ways of identifying the uniqueness of a payment). The new
create_or_find_byapproach induces one more database write than before (once for creating, then immediately updating), but this is worth the reduced risk.The uniqueness constraint is handled on a low level directly by Rails. From the documentation of
create_or_find_by:Specifically, the internal Rails AR code catches
rescue ActiveRecord::RecordNotUnique, which is thrown by the database driver iff the uniqueness constraint is violated. Ironically enough, we do not want model-leveluniquenessvalidations in this case, because they would raise exceptions that we (as the application) would then need to catch/handle. (Also, it is unclear how to define this constraint: Should onestripe_idbe unique perstripe_record_type? Or should onestripe_record_typebe unique perstripe_id? These choices seem too arbitrary).Footnotes
1 -- The race condition happens because
find_or_initialize_bycreates new records in an unpersisted state if they are not found. Only the subsequentupdate!call actually writes them. This means that another thread from another request might have just barely enough time to trigger anotherfind_or_initialize_bywhile the original thread is waiting for theudpate!to actually write.2 -- Why not check for
stripe_idand/orstripe_record_typedirectly in the "has this refund already been issued" check? Because we (want to) support multiple, diverse payment systems. The relationrefund_receiptis intentionally polymorphic, and we don't know whether PayPal or WeChat Pay or IndiaPaymentGateway123 or YourCoolPaymentSolution identifies uniqueness of payments / payment identifiers in the same way that Stripe does. So we don't want to be Stripe-specific at this point in time. andStripeRecordshould be the one taking care of uniqueness in itself (see above)