@@ -85,7 +85,7 @@ This table is partitioned by date (YYYY-MM-DD format).
 
 ## Metering Table
 
-The `metering` table captures detailed usage metrics for each document processing operation:
+The `metering` table captures detailed usage metrics and cost information for each document processing operation:
 
 | Column | Type | Description |
 |--------|------|-------------|
@@ -95,15 +95,52 @@ The `metering` table captures detailed usage metrics for each document processin
 | unit | string | Unit of measurement (pages, inputTokens, outputTokens, etc.) |
 | value | double | Quantity of the unit consumed |
 | number_of_pages | int | Number of pages in the document |
+| unit_cost | double | Cost per unit in USD (e.g., cost per token, cost per page) |
+| estimated_cost | double | Calculated total cost in USD (value × unit_cost) |
 | timestamp | timestamp | When the operation was performed |
 
 This table is partitioned by date (YYYY-MM-DD format).
 
+### Cost Calculation and Pricing
+
+The metering table calculates costs automatically for each record:
+
+- **unit_cost**: Retrieved from the pricing configuration for each service_api/unit combination
+- **estimated_cost**: Calculated automatically as value × unit_cost for each record
+- **Dynamic Pricing**: Prices are loaded from configuration and cached for performance
+- **Fallback Handling**: When pricing data is not available, unit_cost defaults to $0.0, so estimated_cost is also $0.0
+
+#### Pricing Configuration Format
+
+Pricing data is loaded from the system configuration in the following format:
+
+```yaml
+pricing:
+  - name: "bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0"
+    units:
+      - name: "inputTokens"
+        price: "3.0e-6"  # $0.000003 per input token
+      - name: "outputTokens"
+        price: "1.5e-5"  # $0.000015 per output token
+  - name: "textract/analyze_document"
+    units:
+      - name: "pages"
+        price: "0.0015"  # $0.0015 per page
+```
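As a reading aid for the configuration above, here is a minimal Python sketch of how such a structure could be flattened into a lookup keyed by service and unit. It assumes the YAML has already been parsed into plain dictionaries and lists; `pricing_config`, `build_price_table`, and `price_table` are hypothetical names for illustration, not identifiers from this change. The quoted prices are strings in the config, so the sketch converts them with `float()`.

```python
# Hypothetical parsed form of the pricing YAML shown above (assumption: a YAML
# loader returns nested dicts/lists and keeps the quoted prices as strings).
pricing_config = {
    "pricing": [
        {
            "name": "bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0",
            "units": [
                {"name": "inputTokens", "price": "3.0e-6"},
                {"name": "outputTokens", "price": "1.5e-5"},
            ],
        },
        {
            "name": "textract/analyze_document",
            "units": [{"name": "pages", "price": "0.0015"}],
        },
    ]
}


def build_price_table(config):
    """Flatten the config into {(service_api, unit): unit_cost_in_usd}."""
    table = {}
    for service in config.get("pricing", []):
        for unit in service.get("units", []):
            # Prices are quoted strings in the config, so convert explicitly.
            table[(service["name"], unit["name"])] = float(unit["price"])
    return table


price_table = build_price_table(pricing_config)
```

Keying on the `(service_api, unit)` pair mirrors the way `unit_cost` is described as being resolved per service_api/unit combination.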
+
+#### Cost Calculation Process
+
+1. **Service/Unit Matching**: The system first attempts an exact match on the service_api/unit combination
+2. **Partial Matching**: If no exact match is found, it falls back to fuzzy matching for common patterns
+3. **Cost Calculation**: estimated_cost = value × unit_cost
+4. **Caching**: Pricing data is cached to avoid repeated configuration lookups
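The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation in this change: `PRICE_TABLE`, `get_unit_cost`, and `estimate_cost` are hypothetical names, the partial match is assumed to be a case-insensitive substring test (the actual fuzzy-matching rules are not described here), and caching is approximated with `functools.lru_cache` on the lookup function.

```python
from functools import lru_cache

# Hypothetical price table keyed by (service_api, unit); the values come from
# the sample pricing configuration in this document.
PRICE_TABLE = {
    ("bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0", "inputTokens"): 3.0e-6,
    ("bedrock/us.anthropic.claude-3-sonnet-20240229-v1:0", "outputTokens"): 1.5e-5,
    ("textract/analyze_document", "pages"): 0.0015,
}


@lru_cache(maxsize=None)  # Step 4: cache results per (service_api, unit)
def get_unit_cost(service_api: str, unit: str) -> float:
    # Step 1: exact match on the service_api/unit combination.
    exact = PRICE_TABLE.get((service_api, unit))
    if exact is not None:
        return exact
    # Step 2: partial match. ASSUMPTION: case-insensitive substring test in
    # either direction; the real matching rules may differ.
    needle = service_api.lower()
    for (name, unit_name), price in PRICE_TABLE.items():
        candidate = name.lower()
        if unit_name == unit and needle and (needle in candidate or candidate in needle):
            return price
    # Fallback: no pricing data available, so unit_cost defaults to 0.0.
    return 0.0


def estimate_cost(service_api: str, unit: str, value: float) -> float:
    # Step 3: estimated_cost = value × unit_cost.
    return value * get_unit_cost(service_api, unit)
```

With the sample prices, `estimate_cost("textract/analyze_document", "pages", 10)` works out to roughly 0.015 USD, and an unknown service falls through to a cost of 0.0.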
+
 The metering table is particularly valuable for:
-- Cost analysis and allocation
-- Usage pattern identification
-- Resource optimization
-- Performance benchmarking across different document types and sizes
+- **Cost analysis and allocation** - Track spending by document type, service, or time period
+- **Usage pattern identification** - Analyze consumption patterns across different models
+- **Resource optimization** - Identify cost-effective processing approaches
+- **Performance benchmarking** - Compare cost efficiency across different document types and sizes
+- **Budget monitoring** - Track actual costs against budgets and forecasts
 
 ## Document Sections Tables
 
@@ -319,6 +356,108 @@ ORDER BY
   month;
 ```
 
+**Cost analysis queries:**
+```sql
+-- Total estimated costs by service API
+SELECT
+  service_api,
+  SUM(estimated_cost) as total_cost,
+  AVG(estimated_cost) as avg_cost_per_operation,
+  COUNT(*) as operation_count,
+  COUNT(DISTINCT document_id) as document_count
+FROM
+  metering
+WHERE
+  date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  service_api
+ORDER BY
+  total_cost DESC;
+
+-- Cost per page by section type (approximate: number_of_pages repeats on every metering row and the join can duplicate rows)
+SELECT
+  se.section_type,
+  SUM(m.estimated_cost) / SUM(m.number_of_pages) as cost_per_page,
+  SUM(m.estimated_cost) as total_cost,
+  SUM(m.number_of_pages) as total_pages,
+  COUNT(DISTINCT m.document_id) as document_count
+FROM
+  metering m
+JOIN
+  section_evaluations se ON m.document_id = se.document_id
+WHERE
+  m.number_of_pages > 0
+  AND m.date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  se.section_type
+ORDER BY
+  cost_per_page DESC;
+
+-- Daily cost trends
+SELECT
+  date,
+  SUM(estimated_cost) as daily_cost,
+  COUNT(DISTINCT document_id) as documents_processed,
+  SUM(estimated_cost) / COUNT(DISTINCT document_id) as avg_cost_per_document
+FROM
+  metering
+WHERE
+  date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  date
+ORDER BY
+  date;
+
+-- Most expensive documents
+SELECT
+  document_id,
+  SUM(estimated_cost) as total_document_cost,
+  SUM(value) as total_units_consumed,
+  COUNT(*) as operations_count,
+  MAX(number_of_pages) as page_count
+FROM
+  metering
+WHERE
+  date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  document_id
+ORDER BY
+  total_document_cost DESC
+LIMIT 10;
+
+-- Cost efficiency by model (cost per token)
+SELECT
+  service_api,
+  SUM(estimated_cost) / SUM(value) as cost_per_token,
+  SUM(estimated_cost) as total_cost,
+  SUM(value) as total_tokens,
+  COUNT(DISTINCT document_id) as document_count
+FROM
+  metering
+WHERE
+  unit IN ('inputTokens', 'outputTokens', 'totalTokens')
+  AND date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  service_api
+ORDER BY
+  cost_per_token ASC;
+
+-- Cost breakdown by processing context
+SELECT
+  context,
+  SUM(estimated_cost) as total_cost,
+  COUNT(DISTINCT document_id) as document_count,
+  SUM(estimated_cost) / COUNT(DISTINCT document_id) as avg_cost_per_document
+FROM
+  metering
+WHERE
+  date BETWEEN '2024-01-01' AND '2024-01-31'
+GROUP BY
+  context
+ORDER BY
+  total_cost DESC;
+```
+
 ### Creating Dashboards
 
 For more advanced visualization and dashboarding: