forked from open-telemetry/opentelemetry-java-contrib
    
        
 
Debug flaky OpAMP client tests in GitHub Actions #85
    
  
  
    
edb606e to 2c3ef47
  
    
              
trasktest approved these changes on Aug 12, 2025
                  
              
              
            
            
testing
              
trasktest previously approved these changes on Aug 12, 2025
                  
              
              
            
            
- Added comprehensive logging to OpampClientImplTest to track timing issues
- Increased timeouts from 1s to 10s with detailed polling info
- Added thread names and timestamps to all debug output
- Added stress test to reproduce flakiness (10 iterations)
- Enhanced tearDown to properly clean up resources
- Added GitHub Actions workflow to run tests multiple times with different JVM options
- Tests now show detailed callback timing and state changes
This will help identify race conditions and timing issues in CI that aren't reproducible locally.
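The polling and logging approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in OpampClientImplTest: the class name PollingDebugSketch and the helpers debugLine and awaitCondition are hypothetical, and only the 10s timeout, thread names, and timestamps come from the commit message.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: names here are illustrative and do not come from the PR.
public class PollingDebugSketch {

  // Builds one debug line carrying the current thread name and a wall-clock timestamp.
  public static String debugLine(String message) {
    return "[DEBUG] " + message
        + ", Thread: " + Thread.currentThread().getName()
        + ", Time: " + System.currentTimeMillis();
  }

  // Polls the condition until it holds or the timeout elapses.
  // Returns true if the condition held, false on timeout or interruption.
  public static boolean awaitCondition(
      BooleanSupplier condition, Duration timeout, Duration interval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    int polls = 0;
    while (true) {
      polls++;
      if (condition.getAsBoolean()) {
        System.out.println(debugLine("Condition met after " + polls + " poll(s)"));
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        System.out.println(debugLine("Timed out after " + polls + " poll(s)"));
        return false;
      }
      try {
        Thread.sleep(interval.toMillis());
      } catch (InterruptedException e) {
        // Restore the interrupt flag so callers can still observe it.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger onMessageCalls = new AtomicInteger();

    // Simulates a callback arriving on another thread after a short delay.
    Thread callbackThread = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      onMessageCalls.incrementAndGet();
    }, "callback-thread");
    callbackThread.start();

    boolean received = awaitCondition(
        () -> onMessageCalls.get() == 1, Duration.ofSeconds(10), Duration.ofMillis(10));
    System.out.println(debugLine("Callback received: " + received));
    callbackThread.join();
  }
}
```

A generous timeout combined with a short polling interval keeps the test fast when the callback arrives promptly, while the poll count and timestamps show how long a slow CI run actually waited.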
- Added @SuppressWarnings for SystemOut, CatchingUnchecked, and InterruptedExceptionSwallowed
- This allows the debug logging code to compile in CI environments with -Werror
The whenServerProvidesNewInstanceUid_useIt test was using reference comparison
(!=) instead of content comparison (Arrays.equals) for byte arrays. This caused
the test to be flaky and sometimes fail in CI environments.
Changes:
- Fixed array comparison from '!=' to '!Arrays.equals()'
- Changed server-provided UID from {1,2,3} to {4,5,6} to ensure it differs from initial UID
- Cleaned up debug logging added during investigation
Fixes the flaky test behavior reported in CI runs.
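The difference between the two comparisons can be illustrated with a small standalone sketch. It is not the test code from this PR: the class name ByteArrayComparisonSketch, the helper methods, and the choice of {1,2,3} as the initial UID are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: class and method names are illustrative only.
public class ByteArrayComparisonSketch {

  // Reference comparison: true whenever the variables point at different array objects,
  // regardless of what the arrays contain.
  public static boolean differsByReference(byte[] a, byte[] b) {
    return a != b;
  }

  // Content comparison: true only when the lengths or some element differ.
  public static boolean differsByContent(byte[] a, byte[] b) {
    return !Arrays.equals(a, b);
  }

  public static void main(String[] args) {
    byte[] initialUid = {1, 2, 3};      // assumed initial value, for illustration
    byte[] sameContentCopy = {1, 2, 3}; // distinct object, identical bytes
    byte[] serverUid = {4, 5, 6};

    System.out.println(differsByReference(initialUid, sameContentCopy)); // true
    System.out.println(differsByContent(initialUid, sameContentCopy));   // false
    System.out.println(differsByContent(initialUid, serverUid));         // true
  }
}
```

With `!=`, two distinct array objects always count as different even when their bytes match, so an assertion built on it says nothing about whether the UID value actually changed.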
This workflow tests the specific fixed test across:
- Java 17 and 21
- 10 iterations each (20 total runs)
- Only the whenServerProvidesNewInstanceUid_useIt test
This will prove the array comparison fix eliminates flakiness.
              
trasktest approved these changes on Aug 12, 2025
                  
              
              
            
            
e9fb107 to 09047ad
2a3699a to e4093f4
  
    
  
  
    
  
    
Problem
The ./gradlew opamp-client:test task is flaky in GitHub Actions but not reproducible locally. This PR adds comprehensive debugging to identify and fix the issue.
Changes
Enhanced Test Debugging
GitHub Actions Debugging Workflow
Debug Output Examples
Tests now show:
[DEBUG] Test setUp - Thread: main, Time: 1704067200000
[DEBUG] TestCallbacks.onConnect() called - count: 1, Thread: pool-1-thread-1
[DEBUG] Current onMessageCalls: 0, Thread: main, Time: 1704067201000
Expected Outcomes
Next Steps
After this PR shows the issue pattern, I'll create a follow-up PR with the actual fix.
This is specifically for debugging the flaky test issue reported in GitHub Actions.