aws-solutions-library-samples
diff --git a/assets/docs/ANALYTICS.md b/assets/docs/ANALYTICS.md
Lines changed: 79 additions & 53 deletions
diff --git a/assets/docs/CLI_REFERENCE.md b/assets/docs/CLI_REFERENCE.md
Lines changed: 24 additions & 2 deletions
diff --git a/assets/docs/MONITORING.md b/assets/docs/MONITORING.md
Lines changed: 13 additions & 0 deletions
@@ -53,74 +53,85 @@ aws cloudformation deploy \
 1. Navigate to the Athena console URL provided in the stack outputs
 2. Select the workgroup created by the stack (e.g., `claude-code-analytics-workgroup`)
 3. Select the database (e.g., `claude_code_analytics_analytics`)
+4. Access the saved queries from the "Saved queries" tab in the Athena console
 
-### Sample Queries
+### Pre-Built Named Queries
 
-The stack creates several named queries that you can use as starting points:
+The stack automatically creates 10 named queries in your workgroup. Together they cover usage, cost, session, and rate-limit analysis:
 
-#### Top Users by Token Usage (Last 7 Days)
+#### 1. Top Users by Token Usage
+Identifies your top 10 users by token consumption over the last 7 days, including user email, organization, session count, and estimated costs.
 
-```sql
-WITH user_totals AS (
-    SELECT
-        user_id,
-        SUM(token_usage) as total_tokens,
-        COUNT(DISTINCT session_id) as session_count,
-        COUNT(DISTINCT DATE(from_unixtime(timestamp/1000))) as active_days
-    FROM metrics
-    WHERE year >= YEAR(CURRENT_DATE - INTERVAL '7' DAY)
-        AND from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
-    GROUP BY user_id
-)
-SELECT
-    SUBSTR(user_id, 1, 8) || '...' as user_id_short,
-    total_tokens,
-    session_count,
-    active_days,
-    ROUND(total_tokens * 0.000015, 2) as estimated_cost_usd
-FROM user_totals
-ORDER BY total_tokens DESC
-LIMIT 10;
-```
+**Use Case:** Understand who your power users are and track usage patterns.
 
-#### Token Usage by Model
+#### 2. Token Usage by Model and Type
+Analyzes token usage patterns across different models (Opus, Sonnet, Haiku) and token types (input/output) with cost estimates.
 
-```sql
-SELECT
-    model,
-    type as token_type,
-    SUM(token_usage) as total_tokens,
-    COUNT(DISTINCT user_id) as unique_users,
-    ROUND(SUM(token_usage) * 0.000015, 2) as estimated_cost_usd
-FROM metrics
-WHERE year >= YEAR(CURRENT_DATE - INTERVAL '30' DAY)
-    AND from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
-GROUP BY model, type
-ORDER BY total_tokens DESC;
-```
+**Use Case:** Optimize model selection and understand cost distribution.
 
-#### User Activity by Hour
+#### 3. User Activity Pattern
+Shows user activity patterns by hour of day to identify peak usage times.
 
-```sql
-SELECT
-    HOUR(from_unixtime(timestamp/1000)) as hour_of_day,
-    COUNT(DISTINCT user_id) as active_users,
-    SUM(token_usage) as total_tokens
-FROM metrics
-WHERE year >= YEAR(CURRENT_DATE - INTERVAL '7' DAY)
-    AND from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
-GROUP BY HOUR(from_unixtime(timestamp/1000))
-ORDER BY hour_of_day;
-```
+**Use Case:** Capacity planning and understanding when your users are most active.
+
+#### 4. Token Usage by Organization
+Tracks token usage across different organizations with user counts and cost attribution.
+
+**Use Case:** Organizational billing and chargeback.
+
+#### 5. Token Usage by Email Domain
+Analyzes usage patterns by email domain to understand user demographics.
+
+**Use Case:** Identify which teams or departments are using the service.
+
+#### 6. Detailed TPM and RPM Analysis
+Calculates tokens per minute (TPM) and requests per minute (RPM) metrics for rate limit monitoring.
+
+**Use Case:** Monitor API usage patterns and prevent rate limiting issues.
+
+#### 7. User Session Analysis
+Analyzes user sessions including duration, intensity, models used, and per-session costs.
+
+**Use Case:** Understand user behavior and session patterns.
+
+#### 8. Detailed Cost Attribution
+Provides precise cost calculations by user, organization, and model with cumulative tracking.
 
-### Custom Time Ranges
+**Use Case:** Accurate billing and cost management.
 
-To query different time ranges, modify the WHERE clause:
+#### 9. Peak Usage and Rate Limit Analysis
+Identifies peak usage periods and highlights when you're approaching rate limits.
+
+**Use Case:** Proactive monitoring to prevent service disruptions.
+
+#### 10. Usage Analysis by Identity Provider
+Compares usage patterns across different identity providers (Okta, Auth0, Cognito).
+
+**Use Case:** Understand usage by authentication method.
+
+### Working with the Queries
+
+Once you've selected your workgroup and database in the Athena console:
+
+1. **Access Saved Queries**: Click on the "Saved queries" tab
+2. **Load a Query**: Select any of the 10 pre-built queries to load it into the query editor
+3. **Run the Query**: Click "Run" to execute the query with your current data
+4. **Export Results**: Download results as CSV for further analysis
+
+### Customizing Queries
+
+#### Adjusting Time Ranges
+
+Modify the WHERE clause in any query to change the time range:
 
 ```sql
 -- Last 24 hours
 WHERE from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
 
+-- Last 7 days
+WHERE year >= YEAR(CURRENT_DATE - INTERVAL '7' DAY)
+    AND from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
+
 -- Last 30 days
 WHERE year >= YEAR(CURRENT_DATE - INTERVAL '30' DAY)
     AND from_unixtime(timestamp/1000) >= CURRENT_TIMESTAMP - INTERVAL '30' DAY
@@ -129,6 +140,21 @@ WHERE year >= YEAR(CURRENT_DATE - INTERVAL '30' DAY)
 WHERE from_unixtime(timestamp/1000) BETWEEN TIMESTAMP '2024-01-01' AND TIMESTAMP '2024-01-31'
 ```
 
+#### Filtering by Specific Users or Organizations
+
+Append further conditions to the WHERE clause to narrow results to specific users, organizations, or models:
+
+```sql
+-- Filter by email domain
+AND user_email LIKE '%@example.com'
+
+-- Filter by organization
+AND organization_id = 'your-org-id'
+
+-- Filter by specific model
+AND model LIKE '%opus%'
+```
+
 ## Data Retention
 
 - **S3 Standard**: 90 days (configurable via `DataRetentionDays` parameter)
 
@@ -80,7 +80,7 @@ poetry run ccwb deploy [stack] [options]
 
 **Arguments:**
 
-- `stack` - Specific stack to deploy: auth, networking, monitoring, dashboard, or analytics (optional)
+- `stack` - Specific stack to deploy: auth, networking, monitoring, dashboard, analytics, or quota (optional)
 
 **Options:**
 
@@ -102,7 +102,29 @@ poetry run ccwb deploy [stack] [options]
 3. **monitoring** - OpenTelemetry collector on ECS Fargate (optional)
 4. **dashboard** - CloudWatch dashboard for usage metrics (optional)
 5. **analytics** - Kinesis Firehose and Athena for analytics (optional)
-6. **codebuild** - AWS CodeBuild for Windows binary builds (optional, only if enabled during init)
+6. **quota** - Per-user token quota monitoring and alerts (optional, requires dashboard)
+7. **codebuild** - AWS CodeBuild for Windows binary builds (optional, only if enabled during init)
+
+**Examples:**
+
+```bash
+# Deploy all configured stacks
+poetry run ccwb deploy
+
+# Deploy only authentication
+poetry run ccwb deploy auth
+
+# Deploy quota monitoring (requires dashboard stack first)
+poetry run ccwb deploy quota
+
+# Show commands without executing
+poetry run ccwb deploy --show-commands
+
+# Dry run to see what would be deployed
+poetry run ccwb deploy --dry-run
+```
+
+> **Note**: Quota monitoring requires the dashboard stack to be deployed first. See the [Quota Monitoring Guide](QUOTA_MONITORING.md) for details.
 
 ### `test` - Test Package
 
 
@@ -32,6 +32,15 @@ Claude Code sends several metric types that the collector processes:
 - `claude_code.cost.usage` - Estimates costs based on token usage
 - `claude_code.code_edit_tool.decision` - Records code editing decisions
 
+## Usage Quota Monitoring
+
+The monitoring system supports optional quota tracking to alert administrators when users approach or exceed token usage limits. This helps manage costs and prevent unexpected overages.
+
+Quota monitoring deploys as a separate CloudFormation stack that integrates with the dashboard infrastructure. When enabled, it tracks monthly token consumption per user and sends automated alerts through Amazon SNS when usage thresholds are exceeded.
+
+> **Detailed Information**: For complete quota monitoring setup, configuration, and usage instructions, see the [Quota Monitoring Guide](QUOTA_MONITORING.md).
+
+
 ## Analytics Pipeline (Optional)
 
 Beyond real-time monitoring through CloudWatch, you can enable an analytics pipeline for advanced reporting and historical analysis. The analytics stack creates a data lake for long-term metric storage and analysis.
@@ -54,6 +63,8 @@ The deployment creates the complete monitoring infrastructure: a VPC with public
 
 If you provide a custom domain name and hosted zone ID during setup, the system automatically provisions an ACM certificate and configures HTTPS. This ensures encrypted transmission of metrics from Claude Code to your collector.
 
+The dashboard stack creates the metrics aggregation infrastructure that supports quota monitoring. If you choose to deploy quota monitoring as a separate stack, it integrates with the dashboard's metrics table to track user consumption.
+
 ## Claude Code Configuration
 
 The package command generates a `claude-settings/settings.json` file in the distribution package that configures Claude Code for telemetry collection. During installation, this file gets copied to `~/.claude/settings.json` in the user's home directory and contains all the settings needed for monitoring.
@@ -131,3 +142,5 @@ The Application Load Balancer is internet-facing to receive metrics from Claude
 The monitoring system provides comprehensive visibility into Claude Code usage across your organization. Deployment is automated through the `ccwb` CLI tools, creating all necessary infrastructure with minimal configuration. The OTEL Collector on ECS Fargate handles metric collection and transformation, while CloudWatch provides storage and visualization.
 
 User attribution happens automatically through the OTEL helper binary that extracts information from authentication tokens. This enables detailed usage tracking by user, department, team, and other organizational dimensions without requiring manual configuration.
+
+Quota monitoring provides proactive alerts when users approach or exceed token usage limits. The system sends detailed notifications through SNS, allowing organizations to manage costs and usage patterns effectively.
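
Reviewer sketch (not part of the diff): the MONITORING.md changes describe alerts when users "approach or exceed" a monthly token limit, but do not show the threshold logic. The Python below is a minimal illustration of one way such a check could work. Every name in it (`QuotaStatus`, `classify_usage`, `users_to_alert`) and the 80% warning ratio are assumptions made for illustration, not the quota stack's actual implementation. It deliberately stops short of publishing to SNS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaStatus:
    """Hypothetical record of one user's monthly usage against a limit."""
    user: str
    tokens_used: int
    monthly_limit: int
    level: str  # "ok", "approaching", or "exceeded"


def classify_usage(user, tokens_used, monthly_limit, warn_ratio=0.8):
    """Classify usage; warn_ratio=0.8 is an assumed warning threshold."""
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    if tokens_used >= monthly_limit:
        level = "exceeded"
    elif tokens_used >= warn_ratio * monthly_limit:
        level = "approaching"
    else:
        level = "ok"
    return QuotaStatus(user, tokens_used, monthly_limit, level)


def users_to_alert(usage_by_user, monthly_limit, warn_ratio=0.8):
    """Return the statuses that would warrant a notification."""
    statuses = (
        classify_usage(user, tokens, monthly_limit, warn_ratio)
        for user, tokens in usage_by_user.items()
    )
    return [status for status in statuses if status.level != "ok"]


if __name__ == "__main__":
    # Made-up usage figures, for demonstration only.
    sample = {"alice": 100_000, "bob": 900_000, "carol": 1_200_000}
    for status in users_to_alert(sample, monthly_limit=1_000_000):
        print(f"{status.user}: {status.level} ({status.tokens_used} tokens)")
```

In a real deployment the list returned by `users_to_alert` would feed whatever notification path the quota stack uses. The thresholds the stack actually applies are documented in the Quota Monitoring Guide, not here.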