-
Notifications
You must be signed in to change notification settings - Fork 47
Description
Describe the feature
audit_helper.compare_and_classify_query_results is great, but analyzing the diifs is very very time consuming
Describe alternatives you've considered
This query below gives an example of what I do to get the "diffs that I care about, above a threshold". If you could build a macro that does this, HUGE benefit.
The below is done on a model (table = theResults) materialized from audit_helper.compare_and_classify_query_results
--- get the rows not identical. This will be the driver cte at the bottom
WITH CTE_ AS (
SELECT * FROM theResults
WHERE
DBT_AUDIT_ROW_STATUS <> 'identical'
ORDER BY dbt_audit_surrogate_key
),
--- get the differing hashes on the primary key
cte_min_max AS
(
SELECT
dbt_audit_surrogate_key,
min(DBt_AUDIT_ROW_HASH) AS min_DBt_AUDIT_ROW_HASH,
max(DBt_AUDIT_ROW_HASH) AS max_DBt_AUDIT_ROW_HASH
FROM CTE_
GROUP BY all
)
SELECT
cte_min_max.Dbt_audit_surrogate_key
-- these are the measures that I needed for my example... they are the same used in
-- audit_helper.compare_and_classify_query_results
,cte_1.CONVERTED_AMOUNT AS CONVERTED_AMOUNT1
,cte_2.CONVERTED_AMOUNT AS CONVERTED_AMOUNT2
,(nvl(cte_1.CONVERTED_AMOUNT,0) - nvl(cte_2.CONVERTED_AMOUNT,0))::int
AS CONVERTED_AMOUNT_diff
,cte_1.CONVERTED_AMOUNT_DEBIT AS CONVERTED_AMOUNT_DEBIT1
,cte_2.CONVERTED_AMOUNT_DEBIT AS CONVERTED_AMOUNT_DEBIT2
,(nvl(cte_1.CONVERTED_AMOUNT_DEBIT,0) - nvl(cte_2.CONVERTED_AMOUNT_DEBIT,0))::int
AS CONVERTED_AMOUNT_DEBIT_diff
,cte_1.CONVERTED_AMOUNT_CREDIT AS CONVERTED_AMOUNT_CREDIT1
,cte_2.CONVERTED_AMOUNT_CREDIT AS CONVERTED_AMOUNT_CREDIT2
,(nvl(cte_1.CONVERTED_AMOUNT_CREDIT,0) - nvl(cte_2.CONVERTED_AMOUNT_CREDIT,0))::int
AS CONVERTED_AMOUNT_CREDIT_diff
FROM
cte_min_max
LEFT JOIN CTE_ AS cte_1 ON (cte_min_max.dbt_audit_surrogate_key = cte_1.dbt_audit_surrogate_key AND cte_min_max.min_DBt_AUDIT_ROW_HASH = cte_1.DBt_AUDIT_ROW_HASH)
LEFT JOIN CTE_ AS cte_2 ON (cte_min_max.dbt_audit_surrogate_key = cte_1.dbt_audit_surrogate_key AND cte_min_max.max_DBt_AUDIT_ROW_HASH = cte_2.DBt_AUDIT_ROW_HASH)
WHERE
--ignore identical hashed rows
cte_min_max.min_DBt_AUDIT_ROW_HASH <> cte_min_max.max_DBt_AUDIT_ROW_HASH
--just show the diffs > than a threshold. this could be a parameter for the macro
AND
(
abs(CONVERTED_AMOUNT_diff) > 1
or abs(CONVERTED_AMOUNT_DEBIT_diff) > 1
or abs(CONVERTED_AMOUNT_CREDIT_diff) > 1
)
ORDER BY cte_min_max.Dbt_audit_surrogate_key
Additional context
Is this feature database-specific? Which database(s) is/are relevant? Please include any other relevant context here.
No, will work on all databases
Who will this benefit?
This will benefit everyone using audit_helper.compare_and_classify_query_results
Are you interested in contributing this feature? No - I am not a commercial macro creator.
Sure - I have plenty to demo for this.