
Commit 697620a

Merge pull request #2255 from sandeepyadav-lt/atx-6202-prod
Replace and update AI RCA images and documentation. Added new category management section and refined analysis scope instructions. Removed outdated images and improved clarity in configuration steps.
2 parents 6363452 + e5ab35d commit 697620a

6 files changed: +107 / -58 lines
Binary file (101 KB) not shown.
Binary file (43.4 KB) not shown.
Binary file (-95.4 KB) not shown.

docs/analytics-ai-root-cause-analysis.md

Lines changed: 107 additions & 58 deletions
@@ -72,65 +72,18 @@ AI RCA is an intelligent feature that uses advanced machine learning algorithms

### Step 2: Enable AI RCA

-1. **Toggle the Feature**: Use the blue toggle switch to enable "Automatic AI RCA"
-2. **Configure Analysis Scope**: Choose which types of test failures to analyze:
-- **All failures**: Analyze every failed test, regardless of previous status
-- **New failures**: Analyze only tests that have failed recently after having passed at least 10 consecutive times previously.
-- **Always failing**: Analyze only tests that have failed in all of their previous 5 runs to identify persistent issues.
+**Toggle the Feature**: Use the blue toggle switch to enable "Automatic AI RCA"

-### Step 3: Set Special Instructions (Optional)
+### Step 3: Configure Analysis Scope

-Provide context or specific guidance for the AI to consider during analysis:
-
-1. Click on the **Special Instructions** section
-2. Enter any special instructions or context that should be considered during AI root cause analysis
-3. Use the "Show examples" link for guidance on effective instruction writing
-
-**Example Instructions:**
-
-:::tip
-Our CRM application has specific failure patterns to watch for:
-
-**PRIORITY CATEGORIES**
-1. **Database Connection Issues** - Our PostgreSQL connection pool is limited to 20 connections. Look for connection timeouts, pool exhaustion, or slow query performance.
-
-2. **Third-party API Failures** - We integrate with Salesforce, HubSpot, and Mailchimp. These external APIs often have rate limits and intermittent failures that cause our tests to fail.
-
-3. **File Upload/Processing Issues** - Contact import via CSV files often fails due to file size limits (10MB max) or malformed data. Check for upload timeouts and validation errors.
-
-4. **Authentication/Authorization** - We use OAuth 2.0 with multiple providers. Token expiration and permission changes frequently cause test failures.
-
-5. **UI Element Timing Issues** - Our CRM uses dynamic loading for contact lists and reports. Elements may not be ready when tests try to interact with them.
-
-**SPECIFIC CONTEXT**
-- Our test environment has limited resources compared to production
-- We run tests during business hours when external APIs are under heavy load
-- Focus on identifying whether failures are environment-specific or application bugs
-- Prioritize failures that affect core CRM functionality (contact management, lead tracking, reporting)
-- Consider our custom error handling - we log all errors to Sentry and show user-friendly messages
-
-**IGNORE THESE COMMON FALSE POSITIVES**
-- Browser console warnings that don't affect functionality
-- Network requests to analytics services (Google Analytics, Hotjar)
-- Minor UI layout shifts that don't break functionality
-:::
-
-**Possible Categories and Descriptions:**
-
-| Category | Description |
-|----------|-------------|
-| **Database Issues** | Connection timeouts, query performance, data integrity problems |
-| **API Integration** | Third-party service failures, rate limiting, authentication issues |
-| **UI/UX Problems** | Element not found, timing issues, responsive design failures |
-| **Performance Issues** | Slow page loads, memory leaks, resource exhaustion |
-| **Environment Issues** | Test data problems, configuration mismatches, infrastructure failures |
-| **Authentication/Authorization** | Login failures, permission errors, session timeouts |
-| **File Processing** | Upload failures, format validation, processing timeouts |
-| **Network Issues** | Connectivity problems, DNS failures, proxy issues |
+In the **Analysis Scope** section, choose which types of test failures to analyze:
+- **All failures**: Analyze every failed test, regardless of previous status
+- **New failures**: Analyze only tests that have failed recently after having passed at least 10 consecutive times previously.
+- **Consistent Failures**: Analyze only tests that have failed in all of their previous 5 runs to identify persistent issues.
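The two narrower scopes rest on concrete thresholds: at least 10 consecutive prior passes for "New failures" and 5 consecutive prior failures for "Consistent Failures". A minimal sketch of how a failed test's recent run history could be checked against each scope, assuming a hypothetical list of statuses ordered newest first:

```python
# Illustrative sketch only: the run-history representation below is hypothetical,
# not LambdaTest's internal data model.

def matches_scope(scope: str, history: list[str]) -> bool:
    """history holds recent run statuses, newest first, e.g. ["failed", "passed", ...]."""
    if not history or history[0] != "failed":
        return False  # only failed tests are candidates for AI RCA
    prior = history[1:]
    if scope == "all_failures":
        return True  # every failed test, regardless of previous status
    if scope == "new_failures":
        # failed now after passing at least 10 consecutive times previously
        return len(prior) >= 10 and all(s == "passed" for s in prior[:10])
    if scope == "consistent_failures":
        # has also failed in all of its previous 5 runs
        return len(prior) >= 5 and all(s == "failed" for s in prior[:5])
    raise ValueError(f"unknown scope: {scope}")

# A test that just failed after a long passing streak qualifies as a "new failure".
print(matches_scope("new_failures", ["failed"] + ["passed"] * 12))  # True
print(matches_scope("consistent_failures", ["failed"] * 6))         # True
```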

### Step 4: Configure Intelligent Targeting

-Configure intelligent targeting rules to precisely control which tests, builds, tags, or jobs are included in AI-powered analysis:
+Configure intelligent targeting rules to precisely control which tests, builds, tags, projects, or jobs are included in AI-powered analysis:

1. **Add Targeting Rules**: Enter regex patterns in the input field
2. **Click Include (+) or Exclude (-)**: Choose whether to include or exclude matching tests
@@ -139,6 +92,7 @@ Configure intelligent targeting rules to precisely control which tests, builds,
- **Build Names**: Include or exclude builds with specific names (e.g., hourly, nightly)
- **Test Tags**: Include or exclude tests with specific tags (e.g., playwright_test, atxHyperexecute_test)
- **Build Tags**: Include or exclude builds with specific tags (e.g., hourly, nightly)
+- **Project Names**: Include or exclude tests from specific projects using regex patterns
- **Job Labels**: Include tests with specific job labels or tags

#### Rule Logic and Application
@@ -148,7 +102,7 @@ The intelligent targeting system applies rules using the following logic:
**Rule Evaluation Process:**
1. **Include Rules (AND Logic)**: All Include rules within the same category must match for a test to be considered
2. **Exclude Rules (OR Logic)**: Any Exclude rule that matches will immediately exclude the test from analysis
-3. **Cross-Category Logic**: Include rules across different categories (Test Names, Build Tags, etc.) must ALL match
+3. **Cross-Category Logic**: Include rules across different categories (Test Names, Build Tags, Project Names, etc.) must ALL match
4. **Exclusion Precedence**: Exclude rules take priority over Include rules - if any exclude rule matches, the test is excluded regardless of include matches
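Taken together, these four rules describe a short decision procedure. A minimal sketch of that logic, assuming hypothetical rule and test-attribute structures (the category keys, field names, and the `^prod-` pattern are illustrative, not the product's API):

```python
import re

# Illustrative sketch of the documented include/exclude logic. `rules` maps a
# category (e.g. "test_name", "build_tag", "project_name") to include/exclude
# regex lists; `attrs` holds the test's actual values for the same categories.
def is_selected(rules: dict, attrs: dict) -> bool:
    for category, rule in rules.items():
        value = attrs.get(category, "")
        # Exclusion precedence: any matching Exclude rule drops the test (OR logic).
        if any(re.search(p, value) for p in rule.get("exclude", [])):
            return False
        # Include rules within the same category must all match (AND logic) ...
        includes = rule.get("include", [])
        if includes and not all(re.search(p, value) for p in includes):
            return False
    # ... and Include rules across categories must all have matched as well.
    return True

rules = {
    "test_name": {"include": [r"^prod-"], "exclude": [r".*smoke.*"]},
    "project_name": {"include": [r"^ecommerce|^payment"], "exclude": [r".*staging.*"]},
}
print(is_selected(rules, {"test_name": "prod-checkout", "project_name": "payment-api"}))     # True
print(is_selected(rules, {"test_name": "prod-smoke-login", "project_name": "payment-api"}))  # False
```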

**Best Practices for Rule Configuration:**
@@ -173,11 +127,103 @@ The intelligent targeting system applies rules using the following logic:
- **Include**: `playwright_test|atxHyperexecute_test` - Focus on specific test frameworks
- **Exclude**: `.*smoke.*` - Skip smoke tests

-**Result**: AI-powered analysis will run only on production tests (excluding non-critical ones) from hourly builds, focusing on Playwright or HyperExecute test tags, while excluding smoke tests. This configuration helps narrow down analysis to the most critical test scenarios.
+**Project Names:**
+- **Include**: `^ecommerce|^payment` - Only analyze tests from projects starting with "ecommerce" or "payment"
+- **Exclude**: `.*staging.*` - Skip tests from staging projects
+
+**Result**: AI-powered analysis will run only on production tests (excluding non-critical ones) from hourly builds, focusing on Playwright or HyperExecute test tags, while excluding smoke tests. The analysis will target ecommerce and payment projects, excluding staging projects. This configuration helps narrow down analysis to the most critical test scenarios.
:::

+### Step 5: Manage Custom RCA Categories (Optional)
+
+Custom RCA Categories allow you to define intelligent classification categories that automatically categorize and organize test failure analysis results. This helps you group similar failures together, track trends, and prioritize fixes more effectively.
+
+<img loading="lazy" src={require('../assets/images/analytics/test-intelligence-ai-test-rca-category.webp').default} alt="Custom RCA Categories Management" width="800" height="400" className="doc_img"/>
+
+#### Managing Categories
+
+1. In the **Automatic AI RCA** configuration page, locate the **Custom RCA Categories** section
+2. Click the **Manage** button to open the category management drawer
+3. **Create**: Click **Add Category**, enter a name and description, select **Active** or **Inactive** status, then click **Create RCA Category**
+4. **Edit**: Click the edit icon on any category card to modify its details
+5. **Delete**: Click the delete icon and confirm to remove a category
+6. **Search**: Use the search box to filter categories by name or description
+
+**Category Status:**
+- **Active**: Used by AI for automatic classification and appears in RCA results
+- **Inactive**: Saved but not used for classification; can be reactivated later
+
+**Best Practices:**
+
+:::tip
+- **Be Specific**: Create distinct categories (e.g., "Database Connection Timeouts" vs "Database Issues")
+- **Use Clear Names**: Choose names your team understands immediately
+- **Start Small**: Begin with 5-10 active categories for your most common failure types
+- **Review Regularly**: Periodically refine categories based on your failure patterns
+:::
+
+**Example Custom RCA Categories:**
+
+| Category Name | Description |
+|--------------|-------------|
+| **UI Element Not Found** | Failures where tests cannot locate expected UI elements due to timing issues, selector changes, or DOM modifications |
+| **API Timeout Errors** | Failures caused by API requests exceeding timeout thresholds, often related to third-party service reliability |
+| **Database Connection Issues** | Failures due to database connection pool exhaustion, connection timeouts, or query performance problems |
+| **Authentication Token Expiration** | Failures related to expired or invalid authentication tokens, session timeouts, or OAuth refresh issues |
+| **Network Connectivity Issues** | Failures caused by network interruptions, DNS failures, proxy issues, or unstable network connections |
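To make the Active/Inactive status concrete, here is a purely hypothetical sketch of how such categories might be represented, with only the active ones offered to the classifier (the third category is invented for illustration; this is not LambdaTest's actual data model):

```python
from dataclasses import dataclass

# Purely hypothetical representation of custom RCA categories; the product's
# actual storage and API are not described here.
@dataclass
class RcaCategory:
    name: str
    description: str
    active: bool = True

categories = [
    RcaCategory("UI Element Not Found",
                "Tests cannot locate expected UI elements (timing, selectors, DOM changes)"),
    RcaCategory("API Timeout Errors",
                "API requests exceeding timeout thresholds, often third-party reliability"),
    RcaCategory("Legacy Checkout Issues",
                "Kept for historical trends; not used for new classifications", active=False),
]

# Only Active categories are used for automatic classification and appear in RCA results.
classification_labels = [c.name for c in categories if c.active]
print(classification_labels)  # ['UI Element Not Found', 'API Timeout Errors']
```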
+
+### Step 6: Set Special Instructions (Optional)
+
+Provide context or specific guidance for the AI to consider during analysis:
+
+1. Click on the **Special Instructions** section
+2. Enter any special instructions or context that should be considered during AI root cause analysis
+3. Use the "Show examples" link for guidance on effective instruction writing
+
+**Example Instructions:**
+
+:::tip
+**Environment-Specific Context:**
+- Running on Staging environment with test data
+- Database may have lag issues during peak hours (9 AM - 5 PM EST)
+- Test environment has limited resources compared to production (2GB RAM vs 8GB)
+- Network latency is higher in test environment (average 150ms vs 50ms in production)
+
+**Known Issues & Patterns:**
+- Payment gateway timeouts during high traffic periods (especially between 2-4 PM)
+- Cache invalidation issues occur immediately after deployments
+- Third-party API rate limits: Salesforce (1000 requests/hour), HubSpot (500 requests/hour)
+- Database connection pool is limited to 20 connections - look for pool exhaustion patterns
+- OAuth token expiration happens every 24 hours - failures around token refresh time are expected
+
+**Analysis Preferences:**
+- Focus on recent failures over recurring issues when prioritizing
+- Consider browser compatibility differences (Chrome vs Firefox behavior variations)
+- Check for timing-related failures (elements loading asynchronously)
+- Distinguish between environment-specific issues vs application bugs
+- Prioritize failures affecting core user journeys: Login, Checkout, Dashboard, Profile Management
+
+**Business Context:**
+- Critical user journeys: Login, Checkout, Dashboard, Profile Management
+- Performance thresholds: Page load < 3s, API response < 500ms
+- Peak usage hours: 10 AM - 2 PM and 6 PM - 9 PM EST
+- High-value features: Payment processing, Order management, Customer support portal
+
+**Technical Constraints:**
+- Flaky network connections in mobile tests (use retry logic)
+- Third-party service dependencies may be unstable (payment gateway, email service)
+- Custom error handling: All errors logged to Sentry, user-friendly messages displayed
+- Test data cleanup runs nightly - some data may be stale during day
+
+**Ignore These Common False Positives:**
+- Browser console warnings that don't affect functionality
+- Network requests to analytics services (Google Analytics, Hotjar, Mixpanel)
+- Minor UI layout shifts that don't break functionality (< 5px)
+- Expected 404s for optional resources (favicon, tracking pixels)
+- Third-party script loading delays that don't impact core functionality
+:::

-### Step 5: Save Configuration
+### Step 7: Save Configuration

1. Click **Save Configuration** to apply your settings
2. The settings will be applied to all users in your organization and cannot be modified by individual users without admin-level privileges.
@@ -192,7 +238,7 @@ AI RCA results are available in multiple locations across the LambdaTest platform
2. **HyperExecute Dashboard**: Access detailed RCA analysis for HyperExecute jobs
3. **Insights Dashboard**: Comprehensive RCA analytics and trend analysis

-<img loading="lazy" src={require('../assets/images/analytics/test-intelligence-ai-test-rca-insights.webp').default} alt="cmd" width="800" height="400" className="doc_img"/>
+<img loading="lazy" src={require('../assets/images/analytics/test-intelligence-ai-test-rca-insights.png').default} alt="cmd" width="800" height="400" className="doc_img"/>

### Understanding RCA Output
@@ -263,6 +309,7 @@ The RCA Category Trends widget in Insights enables you to:
- **Start with "All failures"** to get comprehensive coverage, then refine based on your needs
- **Use specific special instructions** to guide the AI toward your most critical issues
- **Set up intelligent targeting** to focus on relevant test suites and exclude noise
+- **Create custom RCA categories** to organize and track failure patterns systematically

### 2. Interpreting Results
@@ -276,6 +323,7 @@ The RCA Category Trends widget in Insights enables you to:
- **Review RCA accuracy** and provide feedback when possible
- **Monitor trend analysis** to identify recurring patterns
- **Update special instructions** based on new insights and requirements
+- **Refine custom RCA categories** to better match your failure patterns and organizational needs
- **Share RCA results** with your team to improve collective understanding

<!-- ### 4. Integration with Workflow
@@ -304,6 +352,7 @@ The RCA Category Trends widget in Insights enables you to:
- **Refine special instructions**: Provide more specific context about your application
- **Update intelligent targeting**: Exclude irrelevant tests that might confuse the analysis
- **Review error categorization**: Ensure test failures are properly categorized
+- **Refine custom RCA categories**: Update category descriptions to better match your failure patterns
- **Provide feedback**: Use any available feedback mechanisms to improve accuracy

</details>
