Machinery Diagnostics - Complete Examples

This guide provides step-by-step examples of complete diagnostic workflows using the Machinery Diagnostics MCP Server.

Example 1: Simple FFT Analysis
Example 2: Bearing Fault Detection
Example 3: ISO 20816-3 Compliance Check
Example 4: Complete Bearing Diagnosis
Example 5: Working with Different Segment Durations
Example 6: Machine Learning-Based Anomaly Detection
Example 7: Machine Documentation Reader

Example 1: Simple FFT Analysis

Objective

Analyze the frequency content of a vibration signal to identify dominant frequencies.

Scenario

A healthy bearing is running under normal conditions. You want to analyze its frequency spectrum to establish a baseline of normal operating characteristics.

Real Dataset Information

Signal: real_train/baseline_1.csv
Sampling Rate: 97,656 Hz (from metadata)
Shaft Speed: 1500 RPM (25 Hz) - from metadata
Signal Duration: 6.0 seconds
Condition: Healthy baseline
Note: Bearing type not specified in metadata

Step-by-Step

Step 1: List available signals

In Claude Desktop:

List all available signals in the dataset

Step 2: Run FFT analysis

Analyze the FFT spectrum of real_train/baseline_1.csv

Note: The LLM will auto-detect sampling rate from metadata (97656 Hz).
If metadata is missing, you must provide sampling_rate explicitly.

Step 3: Interpret results

The tool will analyze the signal and return dominant frequencies:

{
  "peaks": [
    {"frequency": 25.0, "magnitude": 0.68},
    {"frequency": 50.0, "magnitude": 0.42},
    {"frequency": 75.0, "magnitude": 0.25}
  ],
  "signal_duration": 6.0,
  "analyzed_segment_duration": 1.0,
  "frequency_resolution": 1.0,
  "sampling_rate": 97656
}

Important Notes:

The analysis uses a 1.0-second random segment by default, not the full 6.0s signal
The segment is extracted at random for efficiency, so a different segment is analyzed on each run unless a seed is specified
Frequency resolution: 1 Hz (1/1.0s) - sufficient for most diagnostics
To analyze the full signal, specify segment_duration=None
To get reproducible results, specify the random_seed parameter
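The segment strategy in these notes can be sketched as follows. This is a minimal illustration and not the server's implementation: the helper name pick_segment is hypothetical, and only segment_duration and random_seed are parameter names taken from this guide.

```python
import random

def pick_segment(n_samples, sampling_rate, segment_duration=1.0, random_seed=None):
    """Return (start, stop) sample indices of the segment to analyze."""
    if segment_duration is None:
        return 0, n_samples  # full signal requested
    seg_len = int(segment_duration * sampling_rate)
    if seg_len >= n_samples:
        return 0, n_samples  # signal is shorter than the requested segment
    rng = random.Random(random_seed)  # a fixed seed gives a repeatable choice
    start = rng.randint(0, n_samples - seg_len)
    return start, start + seg_len

fs = 97656
n = 6 * fs  # 6.0 s signal
start, stop = pick_segment(n, fs, segment_duration=1.0, random_seed=42)
print(stop - start)  # 97656 samples, i.e. 1.0 s
```

Calling the helper twice with the same seed returns the same window, which is the behavior the random_seed note describes.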

Interpretation:

25 Hz: Shaft rotation frequency (1500 RPM from metadata)
50 Hz, 75 Hz: Harmonics of shaft frequency (2×, 3×)
Clean spectrum: No bearing fault frequencies detected

Step 4: Generate professional HTML report

Generate FFT report for real_train/baseline_1.csv

This creates:

HTML file (fft_spectrum_real_train_baseline_1.html) in reports/ directory
Interactive Plotly chart with zoom, pan, and hover features
Embedded metadata for LLM to read without opening file

No PNG files are generated - only HTML reports. To share static images:

Open HTML in browser and use screenshot/export feature
Or implement PNG export separately using kaleido library

Expected Outcome

✅ Identified shaft rotation frequency (25 Hz) and harmonics
✅ No unusual frequencies detected
✅ Vibration spectrum confirms healthy baseline operation
✅ Professional HTML report generated in reports/ directory
✅ LLM reported that the analysis used a 1.0s random segment (default)

Example 2: Bearing Fault Detection

Objective

Detect and localize bearing faults using envelope analysis.

Scenario

You are investigating potential bearing faults in an operating machine. The vibration signal shows symptoms that may indicate outer race damage. You need to confirm the fault type and localize it using envelope analysis.

Critical Note: To perform accurate diagnosis, you MUST provide:

Shaft speed (Hz or RPM)
Bearing characteristic frequencies (BPFO, BPFI, BSF, FTF) in Hz
Or bearing geometry if frequencies are unknown

The LLM should request this information if not provided.
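When only the bearing geometry is known, the characteristic frequencies can be estimated with the standard kinematic formulas. The sketch below assumes a stationary outer race and a rotating inner race. The function name and the geometry values in the usage line are illustrative assumptions and are not taken from the dataset metadata.

```python
import math

def bearing_frequencies(shaft_hz, n_elements, ball_diameter, pitch_diameter,
                        contact_angle_deg=0.0):
    """Characteristic frequencies in Hz (outer race stationary, inner race rotating)."""
    # Ball-to-pitch diameter ratio projected through the contact angle
    ratio = (ball_diameter / pitch_diameter) * math.cos(math.radians(contact_angle_deg))
    ftf = 0.5 * shaft_hz * (1.0 - ratio)  # cage (fundamental train) frequency
    return {
        "BPFO": n_elements * ftf,
        "BPFI": 0.5 * n_elements * shaft_hz * (1.0 + ratio),
        "BSF": 0.5 * (pitch_diameter / ball_diameter) * shaft_hz * (1.0 - ratio ** 2),
        "FTF": ftf,
    }

# Illustrative geometry only: 9 balls, 7.94 mm ball diameter, 39.04 mm pitch diameter
freqs = bearing_frequencies(shaft_hz=25.0, n_elements=9,
                            ball_diameter=7.94, pitch_diameter=39.04)
for name, value in freqs.items():
    print(f"{name}: {value:.3f} Hz")
```

A quick sanity check on any result: BPFO + BPFI always equals the number of rolling elements times the shaft frequency.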

Real Dataset Information

Signal: real_train/OuterRaceFault_1.csv
Sampling Rate: 97,656 Hz (from metadata)
Shaft Speed: 1500 RPM (25 Hz) - from metadata
Signal Duration: 6.0 seconds
Fault Type: Outer race fault (known from experiment)
Bearing Characteristic Frequencies (from metadata):
- BPFO: 81.125 Hz (Ball Pass Frequency Outer race)
- BPFI: 118.875 Hz (Ball Pass Frequency Inner race)
- BSF: 63.91 Hz (Ball Spin Frequency)
- FTF: 14.8375 Hz (Fundamental Train Frequency / Cage frequency)

Important: These frequencies are provided as absolute values in Hz, not as multiples of shaft speed. When calling tools, use these exact Hz values.

Step-by-Step

Step 1: Perform envelope analysis

Perform envelope analysis on real_train/OuterRaceFault_1.csv with:
- Filter band: 2000 to 8000 Hz

Note: If you don't provide sampling_rate, the LLM will auto-detect it from metadata.
The analysis will use a 1.0-second random segment by default.
For reproducible results, specify the random_seed parameter.
To analyze the full signal, set segment_duration=None.
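For readers who want to see what envelope analysis does to a signal, here is a minimal sketch using NumPy. It is not the server's implementation: the function name is hypothetical, and an ideal FFT mask stands in for whatever band-pass filter the tool actually applies. The demo signal is synthetic (a 3 kHz carrier amplitude-modulated at 81 Hz), not dataset data.

```python
import numpy as np

def envelope_spectrum(x, fs, band):
    """Band-limit x to `band` (Hz), take the envelope, and return its spectrum."""
    n = len(x)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    # Keep only positive frequencies inside the band and double them:
    # the inverse FFT is then the analytic signal of the band-passed input.
    mask = (freqs >= band[0]) & (freqs <= band[1])
    analytic = np.fft.ifft(np.where(mask, 2.0 * spectrum, 0.0))
    envelope = np.abs(analytic)
    envelope = envelope - envelope.mean()  # remove DC so it does not dominate
    env_spec = np.abs(np.fft.rfft(envelope)) / n
    env_freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return env_freqs, env_spec

# Synthetic demo: 1.0 s at 20 kHz, resonance at 3 kHz modulated at 81 Hz
fs = 20000
t = np.arange(fs) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 81 * t)) * np.cos(2 * np.pi * 3000 * t)
env_freqs, env_spec = envelope_spectrum(x, fs, band=(2000, 8000))
print(env_freqs[np.argmax(env_spec)])  # dominant envelope frequency, about 81 Hz
```

The modulation frequency is invisible as a peak near 81 Hz in the raw spectrum but dominates the envelope spectrum, which is why the 2000 to 8000 Hz filter band is applied first.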

Step 2: Examine envelope spectrum

The tool returns:

{
  "envelope_peaks": [
    {"frequency": 81.125, "magnitude": 0.94},
    {"frequency": 162.25, "magnitude": 0.52},
    {"frequency": 243.375, "magnitude": 0.31}
  ],
  "filter_band": [2000, 8000],
  "dominant_frequency": 81.125,
  "analyzed_segment_duration": 1.0,
  "signal_duration": 6.0
}

Step 3: Compare with bearing frequencies

You must provide the bearing frequencies to compare:

Compare detected peaks with these bearing frequencies:
- BPFO: 81.125 Hz
- BPFI: 118.875 Hz
- BSF: 63.91 Hz
- FTF: 14.8375 Hz

Detected Peak	Theoretical	Match	Fault Type
81.125 Hz	81.125 Hz (BPFO)	✅	Outer race fault
162.25 Hz	162.25 Hz (2×BPFO)	✅	Harmonic
243.375 Hz	243.375 Hz (3×BPFO)	✅	Harmonic
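The comparison in the table above can be automated with a small matcher. This is a sketch under stated assumptions: the function name, the 1 Hz tolerance, and the limit of three harmonics are illustrative choices, not parameters of the server.

```python
def match_peaks(peaks_hz, bearing_freqs, max_harmonic=3, tol_hz=1.0):
    """Label each peak with the closest bearing-frequency harmonic within tol_hz."""
    labels = {}
    for peak in peaks_hz:
        best = None  # (name, harmonic order, error in Hz)
        for name, base in bearing_freqs.items():
            for k in range(1, max_harmonic + 1):
                error = abs(peak - k * base)
                if error <= tol_hz and (best is None or error < best[2]):
                    best = (name, k, error)
        labels[peak] = None if best is None else (best[0], best[1])
    return labels

bearing = {"BPFO": 81.125, "BPFI": 118.875, "BSF": 63.91, "FTF": 14.8375}
print(match_peaks([81.125, 162.25, 243.375], bearing))
# {81.125: ('BPFO', 1), 162.25: ('BPFO', 2), 243.375: ('BPFO', 3)}
```

In practice the tolerance should be at least as large as the frequency resolution of the analyzed segment (1 Hz for the default 1.0s segment).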

Step 4: Generate professional envelope report

Generate envelope report for real_train/OuterRaceFault_1.csv with bearing frequencies:
- BPFO: 81.125 Hz
- BPFI: 118.875 Hz
- BSF: 63.91 Hz
- FTF: 14.8375 Hz
- Shaft speed: 25 Hz

This creates:

HTML file in reports/ directory with filtered signal, envelope, and spectrum plots
Automatic highlighting of bearing frequency peaks with vertical markers
Embedded metadata for LLM analysis

No PNG files are generated. For static images, open HTML in browser and export manually.

Diagnosis

🔴 OUTER RACE BEARING FAULT DETECTED

Evidence:

Strong peak at BPFO (81.125 Hz) with magnitude 0.94
Clear harmonics (2×BPFO at 162.25 Hz, 3×BPFO at 243.375 Hz)
Pattern consistent with localized outer race defect
High signal-to-noise ratio points to an advanced fault stage

Recommendation:

Immediate action: Schedule bearing replacement within 1-2 weeks
Monitoring: Increase inspection frequency to daily
Operation: Reduce load and speed if possible until replacement
Root cause: Investigate lubrication, contamination, or installation errors

Expected Outcome

✅ Bearing fault localized to outer race
✅ Severity assessed (advanced stage)
✅ Actionable maintenance plan created
✅ Professional HTML report in reports/ directory
✅ LLM confirmed analysis used 1.0s random segment from 6.0s signal

Example 3: ISO 20816-3 Compliance Check

Objective

Evaluate vibration severity according to international standard ISO 20816-3.

Scenario

You need a documented vibration severity assessment for compliance purposes, and you want to establish a baseline for future condition monitoring.

Critical Note: ISO 20816-3 only defines Machine Group 1 and 2:

Group 1: Large machines (> 300 kW)
Group 2: Medium machines (15-300 kW)

There is no Group 3 in the standard. If the user provides an invalid group, the LLM should request clarification.

Real Dataset Information

Signal: real_train/baseline_1.csv
Sampling Rate: 97,656 Hz (from metadata)
Shaft Speed: 1500 RPM (25 Hz) - from metadata
Signal Duration: 6.0 seconds
Condition: Healthy baseline
Machine Classification: For this example, assume Group 2 (medium machine, 15-300 kW)

Step-by-Step

Step 1: Evaluate against ISO standard

Evaluate real_train/baseline_1.csv against ISO 20816-3 with:
- Machine group: 2
- Support type: rigid

Note: LLM will auto-detect sampling_rate from metadata.
If not available, you must provide it explicitly.
ISO evaluation uses the FULL signal to calculate RMS (not segments).

Step 2: Review results

{
  "rms_velocity": 1.8,
  "unit": "mm/s",
  "severity_zone": "A",
  "machine_group": 2,
  "support_type": "rigid",
  "thresholds": {
    "zone_A_B": 2.3,
    "zone_B_C": 3.5,
    "zone_C_D": 5.5
  },
  "interpretation": "New machine condition - acceptable",
  "signal_duration": 6.0
}

Step 3: Interpret zones

Zone A (🟢): 0 - 2.3 mm/s     → New machine condition (CURRENT: 1.8 mm/s)
Zone B (🟡): 2.3 - 3.5 mm/s   → Acceptable for long-term operation
Zone C (🟠): 3.5 - 5.5 mm/s   → Unsatisfactory, short-term operation only
Zone D (🔴): > 5.5 mm/s       → Unacceptable, damage may occur
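The zone lookup is a simple threshold comparison. The sketch below reuses the threshold keys and values from the example output above. The function name and the handling of values that fall exactly on a boundary are assumptions.

```python
def classify_zone(rms_velocity, thresholds):
    """Map an RMS velocity in mm/s to a severity zone letter (A to D)."""
    if rms_velocity <= thresholds["zone_A_B"]:
        return "A"
    if rms_velocity <= thresholds["zone_B_C"]:
        return "B"
    if rms_velocity <= thresholds["zone_C_D"]:
        return "C"
    return "D"

# Thresholds copied from the example output (Group 2, rigid support)
thresholds = {"zone_A_B": 2.3, "zone_B_C": 3.5, "zone_C_D": 5.5}
print(classify_zone(1.8, thresholds))  # A
```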

Step 4: Generate ISO compliance report

Generate ISO 20816-3 report for real_train/baseline_1.csv with:
- Machine group: 2
- Support type: rigid

This creates:

HTML file in reports/ directory with zone visualization and RMS velocity indicator
Color-coded zones (A=Green, B=Yellow, C=Orange, D=Red)
Current RMS velocity position on zone chart
Severity interpretation and recommendations

No PNG file is generated. For documentation, open HTML and export manually.

Step 5: Document for compliance

Create baseline report:

Vibration Assessment Report - Baseline Measurement
--------------------------------------------------
Machine: Test bearing rig
Standard: ISO 20816-3
Date: [Measurement Date]
Dataset: real_train/baseline_1.csv

Measurement:
- Signal Duration: 6.0 seconds
- Analysis: Full signal RMS (ISO standard requires complete signal)
- RMS Velocity: 1.8 mm/s
- Severity Zone: A (Green)
- Status: NEW MACHINE CONDITION

Classification:
- Machine Group: 2 (Medium machine, 15-300 kW)
- Support Type: Rigid
- Zone A Threshold: 0 - 2.3 mm/s

Result:
✅ Baseline measurement within Zone A (optimal condition)
✅ Suitable as reference for future condition monitoring
✅ No action required

Recommendation:
- Use this measurement as baseline for future comparisons
- Monitor quarterly for trend analysis
- Re-evaluate if RMS increases by >0.5 mm/s

Expected Outcome

✅ ISO 20816-3 compliance documented
✅ Baseline measurement in Zone A (optimal)
✅ Reference established for future monitoring
✅ Professional HTML report generated in reports/ directory
✅ LLM communicated that ISO uses full signal RMS (not segments)

Example 4: Complete Bearing Diagnosis

Objective

Perform comprehensive bearing diagnostics using the guided workflow prompt.

Scenario

A vibration signal shows symptoms that call for a complete diagnostic workup, including severity assessment and maintenance recommendations. You run the guided workflow prompt to cover all steps in one pass.

Critical Note: The LLM should request all necessary parameters if not provided:

Shaft speed (Hz or RPM)
Bearing characteristic frequencies (BPFO, BPFI, BSF, FTF) in Hz
Machine classification for ISO evaluation
Filter bands for envelope analysis (if not using defaults)

Real Dataset Information

Signal: real_train/InnerRaceFault_vload_1.csv
Sampling Rate: 48,828 Hz (from metadata)
Shaft Speed: 1500 RPM (25 Hz) - from metadata
Signal Duration: 3.0 seconds
Load Condition: Variable load test scenario (load = 0 N from metadata)
Fault Type: Inner race fault (known from experiment)
Bearing Characteristic Frequencies (from metadata):
- BPFO: 81.125 Hz (Outer race)
- BPFI: 118.875 Hz (Inner race fault)
- BSF: 63.91 Hz (Ball spin frequency)
- FTF: 14.8375 Hz (Cage frequency)

Important: Frequencies are in absolute Hz, not multiples of shaft speed.

Machine Classification

Machine Type: Assume Group 2 (medium machine, 15-300 kW)
Support: Rigid
Application: Fault diagnosis under variable load

Step-by-Step

Step 1: Initiate guided diagnosis

Run complete bearing diagnosis on real_train/InnerRaceFault_vload_1.csv with:
- Shaft frequency: 25 Hz
- Bearing frequencies:
  BPFO: 81.125 Hz
  BPFI: 118.875 Hz
  BSF: 63.91 Hz
  FTF: 14.8375 Hz
- Machine group: 2
- Support type: rigid

Note: LLM will auto-detect sampling_rate (48828 Hz) from metadata.
The signal is 3.0 seconds long.
FFT and Envelope will use 1.0s random segments by default.
ISO will use the full 3.0s signal for RMS calculation.

Step 2: Workflow executes automatically

The diagnose_bearing prompt performs 6 steps:

📋 STEP 1: Parameter Validation
✅ All required parameters present
✅ Bearing frequencies verified from metadata
✅ Signal duration: 3.0 seconds
✅ FFT/Envelope will use 1.0s random segments
✅ ISO will use full 3.0s signal

📊 STEP 2: ISO 20816-3 Severity Evaluation
Machine parameters:
- Group 2: Medium machine (15-300 kW) ✅
- Support: Rigid ✅

ISO Result (full signal RMS):
- RMS Velocity: 4.2 mm/s
- Zone: C (Orange)
- Interpretation: Unsatisfactory - plan maintenance immediately

🔍 STEP 3: FFT Analysis (1.0s random segment)
Primary frequencies detected:
- 25 Hz: Shaft rotation (1500 RPM)
- 50 Hz: 2× shaft (normal harmonic)
- 118.875 Hz: BPFI (⚠️ INNER RACE FAULT!)
- 237.75 Hz: 2×BPFI (strong harmonic)

📈 STEP 4: Envelope Analysis (1.0s random segment)
Envelope spectrum peaks:
- 118.875 Hz: Strong peak (BPFI) - magnitude 0.91
- 237.75 Hz: 2×BPFI harmonic - magnitude 0.58
- 356.625 Hz: 3×BPFI harmonic - magnitude 0.34

🎯 STEP 5: Fault Pattern Recognition
Pattern Match:
- Peak at BPFI: ✅ CONFIRMED (118.875 Hz)
- Harmonics present: ✅ YES (2×, 3×BPFI clearly visible)
- Sidebands: ✅ Modulation sidebands at ±25 Hz (shaft frequency)
- Pattern: Localized inner race defect with load-dependent modulation

Diagnosis: INNER RACE BEARING FAULT (Variable Load Condition)

💡 STEP 6: Recommendations

FAULT DETECTED: Inner Race Bearing Defect
Severity: HIGH (Zone C + Strong BPFI pattern + Harmonics)

Immediate Actions:
1. ⚠️ URGENT: Schedule bearing replacement within 1 week
2. Reduce operating speed and load immediately
3. Implement daily vibration monitoring
4. Prepare spare bearing and maintenance crew

Short-term (1-2 weeks):
5. Perform bearing replacement during next available window
6. Inspect bearing visually for fault confirmation
7. Document fault characteristics for root cause analysis

Long-term Actions:
8. Investigate root cause:
   - Check lubrication quality and quantity
   - Verify bearing installation (preload, alignment)
   - Review operating conditions (load cycles, temperature)
9. Improve predictive maintenance procedures
10. Consider automated condition monitoring system

Safety Notes:
⚠️ Variable load conditions accelerate fault progression
⚠️ Monitor closely - fault may deteriorate rapidly
⚠️ Prepare for emergency shutdown if vibration increases

Diagnosis Summary

Aspect	Finding	Severity
ISO 20816-3	Zone C (4.2 mm/s)	🟠 Unsatisfactory
FFT Analysis	BPFI at 118.875 Hz + harmonics	🔴 Fault confirmed
Envelope	Strong BPFI (0.91) + 2×, 3× harmonics	🔴 Advanced stage
Modulation	Sidebands at ±25 Hz	🔴 Load-sensitive fault
Fault Type	Inner race defect	🔴 Critical
Recommendation	Replace within 1 week	⚠️ URGENT

Professional Reports Generated

The workflow automatically creates HTML reports in reports/ directory:

FFT Report (fft_spectrum_real_train_InnerRaceFault_vload_1.html)
- Shows shaft frequency (25 Hz), harmonics, and BPFI peak (118.875 Hz)
Envelope Report (envelope_analysis_real_train_InnerRaceFault_vload_1.html)
- Highlights BPFI with automatic bearing frequency markers
- Displays filtered signal (2-8 kHz), envelope, and spectrum
ISO Report (iso_20816_real_train_InnerRaceFault_vload_1.html)
- Zone C indication with color-coded severity chart

No PNG files are generated. For documentation, open HTML and export manually.

Expected Outcome

✅ Comprehensive diagnosis completed
✅ Fault localized to inner race (BPFI signature at 118.875 Hz)
✅ Severity assessed (HIGH/URGENT - Zone C + strong harmonics)
✅ Detailed action plan with timeline
✅ Professional HTML reports for documentation
✅ Root cause investigation guidance provided
✅ LLM communicated analysis strategy (1.0s segments for FFT/Envelope, full signal for ISO)

Example 5: Working with Different Segment Durations

Objective

Understand when and how to use different segment durations for optimal results.

Important: The default segment duration is 1.0 seconds (random) for FFT/Envelope analysis, and 0.1 seconds for ML feature extraction. This affects frequency resolution, analysis speed, and results.

Understanding Segment Duration

For FFT/Envelope Analysis:

Default: 1.0s random segment (efficient, good resolution)
Frequency Resolution = 1 / segment_duration
- 1.0s segment → 1 Hz resolution
- 2.0s segment → 0.5 Hz resolution
- Full signal → Maximum resolution (depends on signal length)
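The resolution rule above is easy to check numerically. A minimal sketch follows, with a hypothetical helper name. It treats segment_duration=None as "use the full signal", as described in this guide, and assumes a requested segment longer than the signal falls back to the full signal.

```python
def frequency_resolution(segment_duration, signal_duration):
    """Frequency resolution in Hz for the analyzed portion of a signal."""
    if segment_duration is None:
        analyzed = signal_duration  # full signal
    else:
        analyzed = min(segment_duration, signal_duration)
    return 1.0 / analyzed

print(frequency_resolution(1.0, 6.0))             # 1.0
print(frequency_resolution(2.0, 3.0))             # 0.5
print(round(frequency_resolution(None, 6.0), 3))  # 0.167
```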

For ML Anomaly Detection:

Default: 0.1s segments with 50% overlap
Shorter segments capture transient features better
More segments = better statistical representation

Scenario Comparison

Scenario A: Standard Bearing Analysis (Default 1.0s Works Well)

Real Dataset:

Signal: real_train/baseline_1.csv
Sampling Rate: 97,656 Hz (from metadata)
Signal Duration: 6.0 seconds
Shaft Speed: 1500 RPM (25 Hz)
Bearing Frequencies: BPFO = 81.125 Hz, BPFI = 118.875 Hz

Requirements:

Quick screening for bearing fault frequencies (~80-120 Hz range)
Standard diagnostic workflow
Need balance between speed and resolution

Solution: Use default 1.0s random segment

Generate FFT report for real_train/baseline_1.csv

Note: LLM will auto-detect sampling rate (97656 Hz) from metadata
Default segment_duration = 1.0s random segment
Frequency resolution = 1 Hz (excellent for bearing diagnostics)

Results:

Frequency resolution: 1 Hz (excellent for detecting 81 Hz and 119 Hz peaks)
Processing time: Fast
Detected peaks: 25 Hz (shaft), 50 Hz (2× harmonic), 75 Hz (3× harmonic)
✅ Perfect for routine bearing diagnostics

Scenario B: Detailed Frequency Analysis (Need Better Resolution)

Real Dataset:

Signal: real_train/InnerRaceFault_vload_1.csv
Sampling Rate: 48,828 Hz (from metadata)
Signal Duration: 3.0 seconds
Fault Type: Inner race fault with BPFI = 118.875 Hz

Requirements:

Detect modulation sidebands around BPFI (±25 Hz spacing)
Need to distinguish 118.875 Hz from nearby frequencies
Detailed analysis for research or validation

Solution: Use longer segment (2.0s)

Generate FFT report for real_train/InnerRaceFault_vload_1.csv with:
- segment_duration: 2.0

LLM will inform: "Using 2.0s random segment for 0.5 Hz frequency resolution"
Auto-detects sampling_rate: 48828 Hz from metadata

Results:

Frequency resolution: 0.5 Hz (can resolve sidebands at 118.875 ± 25 Hz)
Processing time: Moderate
Detected: BPFI at 118.875 Hz + sidebands + harmonics
✅ Excellent resolution for sideband analysis
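The sideband positions to look for in Scenario B follow directly from the BPFI and the shaft frequency. A small sketch, with a hypothetical helper name:

```python
def sideband_frequencies(center_hz, spacing_hz, orders=1):
    """Frequencies at center ± k*spacing for k = 1..orders, in ascending order."""
    return [center_hz + k * spacing_hz
            for k in range(-orders, orders + 1) if k != 0]

# BPFI = 118.875 Hz, modulated by the 25 Hz shaft frequency
print(sideband_frequencies(118.875, 25.0))  # [93.875, 143.875]
```

With a 0.5 Hz resolution these sidebands are 50 bins away from the BPFI peak, so they separate cleanly from it.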

Scenario C: Complete Signal Analysis (Full Signal)

Real Dataset:

Signal: real_train/OuterRaceFault_1.csv
Sampling Rate: 97,656 Hz (from metadata)
Signal Duration: 6.0 seconds
Fault Type: Outer race fault with BPFO = 81.125 Hz

Requirements:

Maximum frequency information for publication
Detailed envelope spectrum analysis
No performance constraints

Solution: Use full signal

Generate envelope report for real_train/OuterRaceFault_1.csv with:
- segment_duration: None
- bearing_frequencies: BPFO=81.125, BPFI=118.875, BSF=63.91, FTF=14.8375

LLM informs: "Analyzing full 6.0s signal for maximum resolution (0.167 Hz)"
Auto-detects sampling_rate: 97656 Hz from metadata

Results:

Frequency resolution: 0.167 Hz (maximum detail)
Processing time: Slower
Clear BPFO peak at 81.125 Hz with harmonics up to 6×BPFO
✅ Maximum detail for research/documentation

Scenario D: Machine Learning Feature Extraction

Real Dataset:

Signals: real_train/baseline_1.csv, real_train/baseline_2.csv
Sampling Rates: 97,656 Hz (auto-detected per file)
Signal Durations: 6.0 seconds each
Purpose: Train anomaly detection model

Requirements:

Extract features from multiple healthy baselines
Need statistical representation across signals
Capture transient characteristics

Solution: Use default 0.1s segments with overlap

Train anomaly detection model on:
- real_train/baseline_1.csv
- real_train/baseline_2.csv

LLM specifies: "Extracting features from 0.1s segments with 50% overlap"
Auto-detects sampling rates from metadata files

Results:

Many segments: 6.0s × 2 files → ~240 segments total
Good statistical representation from multiple baselines
Captures transient features for anomaly detection
✅ Optimal for ML training with real data
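The segment counts quoted for ML training can be reproduced with a sliding-window calculation. This is a sketch, assuming window and hop lengths are truncated to whole samples. The server's exact rounding is not documented here, so counts may differ by one.

```python
def count_segments(signal_duration, sampling_rate, segment_duration=0.1, overlap=0.5):
    """Number of overlapping windows that fit completely inside the signal."""
    n_samples = int(signal_duration * sampling_rate)
    window = int(segment_duration * sampling_rate)  # samples per segment
    hop = int(window * (1.0 - overlap))             # step between segment starts
    if n_samples < window:
        return 0
    return (n_samples - window) // hop + 1

per_file = count_segments(6.0, 97656)
print(per_file, 2 * per_file)  # 119 238
```

Two 6.0s baselines therefore give roughly 240 segments, and a 3.0s signal sampled at 48828 Hz gives 59.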

Segment Duration Recommendations

Application	Tool	Segment Duration	Frequency Resolution	Reason
Quick FFT screening	analyze_fft	1.0s (default random)	1 Hz	Fast, adequate resolution
Routine monitoring	analyze_fft	1.0s random	1 Hz	Balances speed and accuracy
Bearing diagnostics	analyze_envelope	1.0s random	1 Hz	Good resolution for BPFO/BPFI
Low-speed machines	analyze_fft	2.0-5.0s	0.2-0.5 Hz	Better low-frequency resolution
Gear analysis	analyze_fft	2.0-5.0s	0.2-0.5 Hz	Resolves sidebands
Research/validation	analyze_fft	None (full)	Maximum	Complete information
ML feature extraction	extract_features	0.1s (default)	N/A	Captures transients
ML training	train_anomaly_model	0.1s + 50% overlap	N/A	Statistical representation

Critical Guidelines for LLM

Always inform user about segment strategy:
- FFT/Envelope: "Analyzing 1.0s random segment from 6.0s signal (1 Hz resolution)"
- ML: "Extracting features from 0.1s segments with 50% overlap → 120 segments"
- ISO: "Using full signal for RMS calculation (ISO standard requirement)"
Suggest appropriate duration if user needs different resolution:
- "For better frequency resolution, try segment_duration=2.0 (0.5 Hz resolution)"
- "For low-speed analysis, consider segment_duration=5.0 (0.2 Hz resolution)"
- "For complete analysis, use segment_duration=None (full signal)"
Explain random segment strategy:
- "Random segment ensures representative sample without analyzing full signal"
- "For reproducible results, specify random_seed parameter"
- "Different random segment each analysis unless seed is fixed"
Clarify ML segment strategy:
- "ML uses 0.1s segments (not 1.0s) to capture transient features"
- "50% overlap provides more training samples without full re-computation"
- "6.0s signal → ~120 segments → better statistical model"
Warn about computational cost:
- "Using full signal (segment_duration=None) will take longer"
- "Full signal analysis may produce larger output"
- "For batch processing, stick to default 1.0s random segments"

Example 6: Machine Learning-Based Anomaly Detection

Objective

Train an ML model on healthy machine data to automatically detect anomalies in new measurements using real bearing data.

Scenario

You have vibration data from bearing tests. Some signals are from healthy bearings (baseline), others from faulty bearings. You want to build a model that automatically detects when a bearing develops a fault.

Real Dataset Available

Training Data (Healthy baselines):

real_train/baseline_1.csv - Healthy bearing, 6.0s, 97656 Hz
real_train/baseline_2.csv - Healthy bearing, 6.0s, 97656 Hz

Validation Data (Mix of healthy and faulty):

real_train/baseline_1.csv (portion) - Should classify as healthy
real_train/InnerRaceFault_vload_1.csv - Should detect as anomaly

Test Data (Real-world testing):

real_test/baseline_3.csv - Healthy baseline test
real_test/InnerRaceFault_vload_6.csv - Inner race fault test
real_test/OuterRaceFault_vload_6.csv - Outer race fault test

Step-by-Step

Step 1: Prepare training data

Use healthy baseline signals for training:

Train anomaly detection model on these healthy signals:
- real_train/baseline_1.csv
- real_train/baseline_2.csv

Use parameters:
- Model type: OneClassSVM
- PCA variance: 0.95
- Segment duration: 0.1 seconds (ML default, not 1.0s!)
- Model name: "bearing_health_model"

For validation, include:
- Faulty: real_train/InnerRaceFault_vload_1.csv

Note: LLM will auto-detect sampling rates from metadata files.
The tool automatically reads {filename}_metadata.json for each signal.
Different files may have different sampling rates (97656 Hz vs 48828 Hz).
If metadata is missing, tool will raise an error requesting sampling_rate.
0.1s segments with 50% overlap → many segments for statistical representation.
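The metadata lookup described in this note can be pictured as below. This is a hedged sketch: the {filename}_metadata.json naming comes from this guide, but the function name and the "sampling_rate" JSON key are assumptions about the file layout.

```python
import json
from pathlib import Path

def load_sampling_rate(signal_path):
    """Read the sampling rate from the sidecar metadata file next to a signal."""
    signal_path = Path(signal_path)
    # real_train/baseline_1.csv -> real_train/baseline_1_metadata.json
    meta_path = signal_path.with_name(signal_path.stem + "_metadata.json")
    if not meta_path.exists():
        raise ValueError(
            f"No metadata found for {signal_path.name}: provide sampling_rate explicitly"
        )
    with meta_path.open() as fh:
        return json.load(fh)["sampling_rate"]  # key name is an assumption
```

Because the rate is read per file, signals recorded at 97656 Hz and 48828 Hz can be mixed in one training call.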

IMPORTANT: Including fault_signal_files enables SEMI-SUPERVISED hyperparameter tuning:
- With fault data → Manual hyperparameter search with validation scoring (balanced accuracy)
- Without fault data → UNSUPERVISED mode with automatic parameters
- NOTE: Model is ALWAYS trained only on healthy data (unsupervised)
- Fault data is used ONLY for hyperparameter tuning (post-training validation)
- LLM should suggest including fault data when available for better parameter selection

Validation Strategy Selection:

The tool supports TWO validation strategies for healthy data:

Validation Strategy 1: EXPLICIT Validation Files ⭐ RECOMMENDED for separate test set

Provide healthy_validation_files explicitly:
- healthy_validation_files: ["real_test/baseline_3.csv"]
- fault_signal_files: ["real_train/InnerRaceFault_vload_1.csv"]

Behavior: Uses provided validation files AS-IS (no splitting)
Model: Trains on 100% of training data (baseline_1.csv, baseline_2.csv)
Validation: Tests on SEPARATE validation set (baseline_3.csv)
Use case: When you have dedicated validation/test datasets
Advantage: Proper train/test separation, no data leakage

Validation Strategy 2: AUTO-SPLIT (80/20) - Default if no validation files

Do NOT provide healthy_validation_files:
- fault_signal_files: ["real_train/InnerRaceFault_vload_1.csv"]

Behavior: Automatically splits training data 80/20
Model: Trains on 80% of training data
Validation: Tests on remaining 20% of training data
Use case: When you don't have separate validation data
Advantage: Automatic validation without extra data
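An 80/20 split of the extracted segments can be sketched as follows. Whether the server shuffles before splitting is not stated in this guide, so the shuffle, the seed, and the function name here are assumptions.

```python
import random

def split_train_validation(segments, train_fraction=0.8, seed=0):
    """Shuffle segment indices and split them into training and validation sets."""
    indices = list(range(len(segments)))
    random.Random(seed).shuffle(indices)
    n_train = int(len(segments) * train_fraction)
    train = [segments[i] for i in indices[:n_train]]
    validation = [segments[i] for i in indices[n_train:]]
    return train, validation

segments = list(range(238))  # stand-in for 238 feature vectors
train, validation = split_train_validation(segments)
print(len(train), len(validation))  # 190 48
```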

LLM Prompt Examples:

Option 1 (Explicit validation files):

Train anomaly detection model:
- Training: real_train/baseline_1.csv, real_train/baseline_2.csv
- Validation (healthy): real_test/baseline_3.csv
- Validation (fault): real_train/InnerRaceFault_vload_1.csv
- Model: OneClassSVM
- Model name: "bearing_health_model"

Option 2 (Auto-split):

Train anomaly detection model:
- Training: real_train/baseline_1.csv, real_train/baseline_2.csv
- Validation (fault): real_train/InnerRaceFault_vload_1.csv
- Model: OneClassSVM
- Model name: "bearing_health_model"

Note: The tool will automatically split training data 80/20 for validation.

Training Mode Selection:

The tool automatically selects training strategy based on available data:

Option A: UNSUPERVISED (Novelty Detection) - No fault data

Train model with ONLY healthy signals (no fault data provided)

Training: OneClassSVM/LOF trained ONLY on healthy data
Parameters: Automatically calculated based on sample size
OneClassSVM: nu = adaptive (1/√n), gamma = 'scale'
LocalOutlierFactor: n_neighbors = √n, contamination = 0.1
Use case: When you only have healthy baseline data
Advantage: No need for fault examples
Learning type: Pure unsupervised (one-class learning)
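The automatic parameter rules listed for Option A reduce to a few lines. This is a sketch with a hypothetical function name. Rounding n_neighbors to the nearest integer is an assumption, since the guide only states √n.

```python
import math

def auto_parameters(n_samples):
    """Sample-size-based defaults for the two novelty detectors."""
    root = math.sqrt(n_samples)
    return {
        "OneClassSVM": {"nu": 1.0 / root, "gamma": "scale"},
        "LocalOutlierFactor": {"n_neighbors": max(1, round(root)),
                               "contamination": 0.1},
    }

params = auto_parameters(120)
print(round(params["OneClassSVM"]["nu"], 4))  # 0.0913
```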

Option B: SEMI-SUPERVISED (With Validation) - With fault data ⭐ RECOMMENDED

Train model with healthy signals + fault signals for hyperparameter tuning

Training: Still ONLY on healthy data (unsupervised)
Hyperparameter tuning: Uses validation set (healthy + fault) post-training
OneClassSVM: Tests nu=[0.01,0.05,0.1,0.2] × gamma=['scale','auto',0.001,0.01,0.1]
LocalOutlierFactor: Tests n_neighbors=[10,20,30,50] × contamination=[0.05,0.1,0.15,0.2]
Scoring: Balanced accuracy on validation (healthy specificity + fault sensitivity)
Use case: When you have both healthy AND fault examples
Advantage: Better parameter selection, higher accuracy
Learning type: Semi-supervised (train unsupervised, tune with labels)
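The search space for Option B can be enumerated with the standard library. The grids below are copied from the lists above. The helper names and the score_fn callback are hypothetical. In the real workflow the score would come from fitting on healthy data and evaluating on the validation set.

```python
from itertools import product

OCSVM_GRID = {"nu": [0.01, 0.05, 0.1, 0.2],
              "gamma": ["scale", "auto", 0.001, 0.01, 0.1]}
LOF_GRID = {"n_neighbors": [10, 20, 30, 50],
            "contamination": [0.05, 0.1, 0.15, 0.2]}

def candidates(grid):
    """Expand a parameter grid into a list of parameter dictionaries."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[k] for k in keys))]

def select_best(grid, score_fn):
    """Return the candidate with the highest validation score."""
    return max(candidates(grid), key=score_fn)

print(len(candidates(OCSVM_GRID)), len(candidates(LOF_GRID)))  # 20 16
```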

CRITICAL: This is NOT supervised learning!

Model fit() is called ONLY on healthy data (no labels)
Fault data is used ONLY after training for hyperparameter selection
This is called "semi-supervised" because labels are used indirectly for tuning

LLM Recommendation Strategy:

User provides only healthy signals:
  → LLM: "I can train in unsupervised mode with automatic parameters. 
          Do you have any fault examples? Including them would enable 
          semi-supervised hyperparameter tuning for better performance."

User provides healthy + fault signals:
  → LLM: "Training in SEMI-SUPERVISED mode with hyperparameter optimization. 
          Model will be trained only on healthy data, then parameters will be 
          tuned using validation set for optimal anomaly detection."

Step 2: Review training results

The tool will return different information based on training mode:

SEMI-SUPERVISED Mode Output (with fault validation):

{
  "model_path": "models/bearing_health_model_model.pkl",
  "scaler_path": "models/bearing_health_model_scaler.pkl",
  "pca_path": "models/bearing_health_model_pca.pkl",
  "training_mode": "semi-supervised",
  "training_samples": 190,
  "features_per_sample": 17,
  "pca_components": 8,
  "pca_variance_explained": 0.956,
  "segment_duration": 0.1,
  "overlap_ratio": 0.5,
  "best_params": {
    "kernel": "rbf",
    "nu": 0.05,
    "gamma": 0.01
  },
  "validation_accuracy": 0.93,
  "validation_details": "Healthy: 46/48 correct (95.8%), Fault: 54/59 detected (91.5%)",
  "validation_metrics": {
    "healthy_correct": 46,
    "healthy_total": 48,
    "healthy_accuracy": 0.958,
    "fault_detected": 54,
    "fault_total": 59,
    "fault_accuracy": 0.915,
    "overall_accuracy": 0.93
  }
}

UNSUPERVISED Mode Output (no fault validation):

{
  "model_path": "models/bearing_health_model_model.pkl",
  "scaler_path": "models/bearing_health_model_scaler.pkl",
  "pca_path": "models/bearing_health_model_pca.pkl",
  "training_mode": "unsupervised",
  "training_samples": 120,
  "features_per_sample": 17,
  "pca_components": 8,
  "pca_variance_explained": 0.956,
  "segment_duration": 0.1,
  "overlap_ratio": 0.5,
  "best_params": {
    "kernel": "rbf",
    "nu": 0.0913,
    "gamma": "scale",
    "mode": "unsupervised_auto"
  }
}

Interpretation:

✅ Model trained on 190 segments (80% used, 20% reserved for validation)
✅ 17 time-domain features extracted per segment
✅ PCA reduced to 8 components (95.6% variance retained)
✅ SEMI-SUPERVISED: Parameters optimized via hyperparameter search → nu=0.05, gamma=0.01
✅ UNSUPERVISED: Parameters auto-calculated → nu=0.0913 (based on √n), gamma='scale'
✅ Validation (semi-supervised only):
- Healthy validation: 46/48 correct (95.8%) - Low false positive rate ✓
- Fault detection: 54/59 detected (91.5%) - High sensitivity ✓
- Overall balanced accuracy: 93% (combines both metrics)
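The validation figures can be recomputed from the raw counts in validation_metrics. The sketch below shows both the balanced accuracy (mean of the two rates) and the pooled accuracy (all correct segments over all segments). Which of the two the tool reports as overall_accuracy is not stated here, and the function name is hypothetical.

```python
def validation_summary(healthy_correct, healthy_total, fault_detected, fault_total):
    """Specificity, sensitivity, and two ways of combining them."""
    specificity = healthy_correct / healthy_total  # healthy segments kept healthy
    sensitivity = fault_detected / fault_total     # fault segments flagged
    return {
        "healthy_accuracy": specificity,
        "fault_accuracy": sensitivity,
        "balanced_accuracy": (specificity + sensitivity) / 2,
        "pooled_accuracy": (healthy_correct + fault_detected)
                           / (healthy_total + fault_total),
    }

summary = validation_summary(46, 48, 54, 59)
print({name: round(value, 3) for name, value in summary.items()})
```

Both combined figures land within a point of each other for these counts, so either reading is consistent with the 93% quoted above.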

Important: LLM must inform user about:

Training mode: "Training in SEMI-SUPERVISED mode with hyperparameter search" or "Training in UNSUPERVISED mode with automatic parameters"
Validation strategy: "Using 80/20 split: 80% for training, 20% healthy + fault signals for validation"
Sampling rate detection: "Auto-detected 97656 Hz from baseline_1_metadata.json"
Number of training files vs number of segments
Segment duration used (0.1s for ML, not 1.0s for FFT!)
Signal durations (baseline_1: 6.0s, baseline_2: 6.0s → total 12.0s → 238 segments → 190 training + 48 validation)
Parameter optimization: "Tested 20 parameter combinations, best: nu=0.05, gamma=0.01" (semi-supervised) or "Auto-calculated nu=0.0913 based on sample size" (unsupervised)
Balanced validation: "Tested on BOTH healthy (specificity) and fault (sensitivity) data"
If metadata missing: "ERROR: No metadata found - please provide sampling_rate or create metadata file"

Step 3: Predict anomalies in test data

Test the model on unseen data:

Predict anomalies in real_test/baseline_3.csv using bearing_health_model

Expected: Should classify as healthy (low anomaly ratio)

Step 4: Review predictions for healthy test signal

Results:

{
  "num_segments": 30,
  "anomaly_count": 2,
  "anomaly_ratio": 0.067,
  "overall_health": "Healthy",
  "confidence": "High",
  "signal_duration": 6.0,
  "segment_duration": 0.2,
  "predictions": [1, 1, 1, 1, 1, 1, 1, -1, 1, 1, ...],
  "anomaly_scores": [-0.23, -0.18, -0.15, ..., 0.82, -0.12, ...]
}

Interpretation:

✅ 30 segments analyzed (6.0s signal / 0.2s segments)
✅ 2 anomalous segments detected (6.7% of signal)
✅ Overall: Healthy (< 10% anomalies)
✅ Action: Continue normal operation

Step 5: Test on faulty bearing

Predict anomalies in real_test/InnerRaceFault_vload_6.csv using bearing_health_model

Expected: Should detect high anomaly ratio

Results:

{
  "num_segments": 30,
  "anomaly_count": 24,
  "anomaly_ratio": 0.80,
  "overall_health": "Faulty",
  "confidence": "High",
  "signal_duration": 6.0,
  "segment_duration": 0.2
}

Interpretation:

✅ 30 segments analyzed
⚠️ 24 anomalous segments (80% of signal)
🔴 Overall: FAULTY (> 30% anomalies)
🔴 Action: Immediate inspection and bearing replacement
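The two thresholds used in these interpretations (below 10% is healthy, above 30% is faulty) can be written as a small rule. The guide does not name the band in between, so the "Suspect" label is an assumption of this sketch, as is the function name:

```python
HEALTHY_MAX = 0.10   # anomaly ratio below this is treated as healthy
FAULTY_MIN = 0.30    # anomaly ratio above this is treated as faulty


def classify_health(anomaly_ratio):
    """Map an anomaly ratio in [0, 1] to a coarse health label."""
    if not 0.0 <= anomaly_ratio <= 1.0:
        raise ValueError("anomaly_ratio must be between 0 and 1")
    if anomaly_ratio < HEALTHY_MAX:
        return "Healthy"
    if anomaly_ratio > FAULTY_MIN:
        return "Faulty"
    return "Suspect"   # hypothetical label for the 10-30% band


print(classify_health(0.067), classify_health(0.80))
```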

Step 6: Test on different fault type

Predict anomalies in real_test/OuterRaceFault_vload_6.csv using bearing_health_model

Expected: Should also detect as faulty

Results show high anomaly ratio → Model generalizes to different fault types!

Step 7: Visualize PCA Space (NEW TOOL!)

To understand how the model separates healthy from faulty bearings in 2D space:

Option A - Predictions Only (No Ground Truth):

Generate PCA visualization report for bearing_health_model with test signals:
- real_test/baseline_3.csv
- real_test/InnerRaceFault_vload_6.csv
- real_test/OuterRaceFault_vload_6.csv

Option B - With Validation (Ground Truth Provided):

Generate PCA visualization report for bearing_health_model with test signals:
- real_test/baseline_3.csv
- real_test/InnerRaceFault_vload_6.csv
- real_test/OuterRaceFault_vload_6.csv
And true labels:
- baseline_3.csv: healthy
- InnerRaceFault_vload_6.csv: faulty
- OuterRaceFault_vload_6.csv: faulty

This creates an interactive HTML report showing:

2D PCA scatter plot (PC1 vs PC2)
Green dots: Segments predicted as healthy
Red X markers: Segments predicted as anomaly
Variance explained by each component
Hover information with segment details

IMPORTANT:

Without true_labels: Legend shows "Predicted: Healthy/Anomaly" (model predictions only)
With true_labels: Legend shows "True: X, Predicted: Y" for validation + accuracy metrics
Never assumes predictions = ground truth
Use validation mode to check model accuracy against known labels

Validation Metrics (when true_labels provided):

Overall accuracy across all segments
Per-file accuracy breakdown
Shows which files had prediction errors
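A rough sketch of how such metrics can be derived when each file carries a single true label. The segment counts reuse the two test results shown earlier (2 of 30 and 24 of 30 segments flagged). The data layout, the segment ordering and the function name are assumptions, not the report generator's actual interface:

```python
def validation_metrics(results):
    """results maps filename -> (true_label, per-segment predictions of +1/-1)."""
    per_file = {}
    correct_total = 0
    segment_total = 0
    for name, (true_label, predictions) in results.items():
        expected = 1 if true_label == "healthy" else -1
        correct = sum(1 for p in predictions if p == expected)
        per_file[name] = correct / len(predictions)
        correct_total += correct
        segment_total += len(predictions)
    return {
        "overall_accuracy": correct_total / segment_total,
        "per_file_accuracy": per_file,
        "files_with_errors": [n for n, acc in per_file.items() if acc < 1.0],
    }


results = {
    "baseline_3.csv": ("healthy", [1] * 28 + [-1] * 2),
    "InnerRaceFault_vload_6.csv": ("faulty", [-1] * 24 + [1] * 6),
}
metrics = validation_metrics(results)
print(f"overall: {metrics['overall_accuracy']:.1%}")
```

Reporting accuracy per file as well as overall makes it clear whether errors come from false alarms on healthy data or from missed fault segments.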

The report is saved to reports/pca_visualization_bearing_health_model.html

Step 8: Compare Features with Violin Plots (NEW TOOL!)

To understand which time-domain features distinguish healthy from faulty bearings:

Generate feature comparison report with signal groups:
- Healthy: ["real_train/baseline_1.csv", "real_train/baseline_2.csv"]
- Inner Fault: ["real_train/InnerRaceFault_vload_1.csv", "real_train/InnerRaceFault_vload_2.csv"]
- Outer Fault: ["real_train/OuterRaceFault_1.csv", "real_train/OuterRaceFault_2.csv"]
Use segment_duration: 0.1 seconds

This creates an interactive HTML report showing:

Violin plots for all 17 time-domain features
Color-coded by signal group (Healthy, Inner Fault, Outer Fault)
Distribution comparison showing which features are most discriminative
Grid layout (3 columns × N rows)

The report is saved to reports/feature_comparison_Healthy_vs_Inner_Fault_vs_Outer_Fault.html

Key insights from feature comparison:

Kurtosis and Crest Factor: Higher in faulty bearings (impulsive signals)
RMS: Generally higher in faulty bearings
Entropy: Often lower in faulty bearings (more periodic)
Shape Factor: Can distinguish between fault types
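To make these insights concrete, the sketch below computes four of the features with their common textbook definitions. Whether the server uses exactly these definitions is an assumption, and the two test signals are synthetic. A pure sine has a crest factor of about 1.414 and a kurtosis of 1.5; adding periodic spikes raises both, which is the behaviour described for impulsive fault signals:

```python
import math


def time_domain_features(segment):
    """Compute RMS, crest factor, kurtosis and shape factor of one segment."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    m2 = sum(c * c for c in centered) / n      # second central moment
    m4 = sum(c ** 4 for c in centered) / n     # fourth central moment
    rms = math.sqrt(sum(x * x for x in segment) / n)
    peak = max(abs(x) for x in segment)
    mean_abs = sum(abs(x) for x in segment) / n
    return {
        "rms": rms,
        "crest_factor": peak / rms,
        "kurtosis": m4 / (m2 * m2),            # non-excess kurtosis (Gaussian = 3)
        "shape_factor": rms / mean_abs,
    }


n = 1000
sine = [math.sin(2 * math.pi * k / n) for k in range(n)]
impulsive = list(sine)
for k in range(0, n, 100):
    impulsive[k] += 5.0                        # synthetic periodic impacts

smooth = time_domain_features(sine)
spiky = time_domain_features(impulsive)
print(smooth["kurtosis"], spiky["kurtosis"])
```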

Expected Outcome

✅ Automated monitoring system that:

Detects anomalies using only healthy baseline training data
Provides early warning (detects faults before failure)
Generalizes to different fault types (inner race, outer race)
Uses real experimental bearing data
Supports multiple sampling rates (each file auto-detected)
Works without specifying sampling rate (metadata required)
Analyzes signals in 0.1s segments (ML default)

✅ Model Performance:

Training: 2 healthy baselines (baseline_1, baseline_2)
Balanced validation: Tests BOTH healthy (specificity) and fault (sensitivity) data
Validation accuracy: ~93% overall (95.8% healthy correct, 91.5% fault detected)
Testing: Correctly identifies healthy and faulty test bearings
Low false positive rate: 95.8% of healthy segments correctly classified

✅ Visualization Tools:

PCA scatter plots: See how model separates healthy from faulty in 2D space
Violin plots: Compare 17 time-domain features across groups
Both saved as interactive HTML reports

✅ LLM Communication:

Informed user about sampling rate detection per file
Explained training data size (190 training + 48 healthy validation + 59 fault validation)
Clarified validation strategy: "80/20 split + fault validation for balanced metrics"
Reported both specificity (healthy accuracy) and sensitivity (fault detection)
Handled multi-rate training automatically

When to Use ML Approach

Use Case	Traditional Analysis	ML Approach
Single machine diagnosis	✅ Better (interpretable)	⚠️ Overkill
Fleet monitoring (>5 machines)	⚠️ Labor intensive	✅ Automated
Trend detection	⚠️ Manual comparison	✅ Automatic alerts
Unknown fault patterns	❌ May miss novel faults	✅ Detects deviations
Root cause diagnosis	✅ Clear (FFT, Envelope)	❌ Black box
Early warning	⚠️ Requires expertise	✅ Automated screening

Recommendation: Use ML for screening and monitoring, then use traditional analysis (FFT, Envelope, ISO) for root cause diagnosis when anomalies are detected.

Critical Guidelines for LLM

Always handle sampling rate correctly (CRITICAL UPDATE!):
- ML tools (train_anomaly_model, extract_features_from_signal): Auto-detect PER FILE from metadata
- Each training/validation file can have different sampling rate!
- FFT/Envelope/ISO: Auto-detect from signal's own metadata
- If metadata exists → "Auto-detected sampling rate: 97656 Hz from baseline_1_metadata.json"
- If metadata missing → RAISE ERROR requesting user to provide sampling_rate or create metadata
- NEVER use silent default - always inform or error
- User can override by providing explicit sampling_rate parameter (applies to all files)
Multi-rate training communication:
- "Auto-detected sampling rates: baseline_1 (97656 Hz), baseline_2 (97656 Hz), InnerFault (48828 Hz)"
- "Training model with files at different sampling rates - features extracted correctly per file"
- "Model saved with 'multi_rate_training': true in metadata"
Training mode selection (NEW - CRITICAL!):
- Ask user if they have fault data: "Do you have any fault signal examples? Including them enables supervised optimization for better accuracy."
- If user provides ONLY healthy signals:
  - "Training in UNSUPERVISED mode (novelty detection)"
  - "Using automatic parameters: nu=0.0913 (adaptive), gamma='scale'"
  - "Parameters calculated based on your sample size (120 segments)"
- If user provides healthy + fault signals:
  - "Training in SUPERVISED mode with GridSearchCV optimization"
  - "Testing 20 parameter combinations to find the best for your data..."
  - "Best parameters found: nu=0.05, gamma=0.01 (validation accuracy: 95%)"
- Always inform about training mode before starting
Parameter optimization communication:
- SUPERVISED: "Optimizing parameters via grid search... tested nu × gamma combinations"
- UNSUPERVISED: "Using automatic parameters based on sample size: nu = 1/√n = 0.0913"
- Explain WHY: "Supervised mode finds better parameters when you have fault examples"
Always request necessary information:
- If bearing frequencies needed → "Please provide BPFO, BPFI, BSF, FTF"
- For ISO evaluation → "What machine group? (1 or 2 only)"
- If no metadata AND no sampling_rate → "Please provide sampling_rate or create metadata file"
- NEW: "Do you have fault signal examples? They would enable supervised parameter optimization"
Communicate segment analysis:
- "Signal is X.X seconds long"
- "Analyzing in 0.1s segments (ML default) → N segments"
- "Using 50% overlap → total M segments for training"
Explain model training:
- "Auto-detected sampling rates from metadata files"
- "Training on N files → M total segments"
- "17 time-domain features per segment"
- "PCA reduced to K components (X% variance)"
- NEW: Training mode and parameter strategy
Interpret results with context:
- "Anomaly ratio 5% → Healthy (threshold: 10%)"
- "Anomaly ratio 80% → Faulty (threshold: 30%)"
- "Recommend traditional analysis to identify fault type"
Use new visualization tools:
- After training: "Would you like to visualize the PCA space?"
- For feature analysis: "Generate feature comparison report to see which features distinguish healthy from faulty"
- Reports saved as interactive HTML in reports/ directory
PCA visualization labels (CRITICAL!):

WITHOUT true_labels: Labels show "Predicted: Healthy/Anomaly" (model predictions only)
WITH true_labels: Labels show "True: X, Predicted: Y" + accuracy metrics
NEVER assume predictions = ground truth unless user explicitly provides true_labels
Communication: "The PCA plot shows model PREDICTIONS. Provide true_labels for validation."
Example validation: "Overall accuracy: 95% (114/120 segments correct)"
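The unsupervised rule quoted in the guidelines above, nu = 1/√n, is easy to verify for the 120-segment case. A minimal sketch; the function name is illustrative and not taken from the server:

```python
import math


def auto_nu(n_segments):
    """Adaptive nu for unsupervised training: 1 / sqrt(number of segments)."""
    if n_segments < 1:
        raise ValueError("need at least one training segment")
    return 1.0 / math.sqrt(n_segments)


print(round(auto_nu(120), 4))  # 0.0913
```

Because nu shrinks as the training set grows, a larger healthy baseline lets the model treat a smaller fraction of training segments as outliers.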

Example 7: Machine Documentation Reader

Objective

Extract bearing specifications and technical data from equipment manuals to enable accurate fault diagnosis.

Scenario

You have a vibration signal showing potential bearing faults, but you don't know the bearing specifications (type, geometry, characteristic frequencies). You have the equipment manual as a PDF. You need to extract this information to perform accurate diagnosis.

Test Manual Available

A test pump manual is included in resources/machine_manuals/:

File: test_pump_manual.pdf (also available as .txt for testing)
Content: Complete pump specifications including:
- Bearings: SKF 6205-2RS (drive end), NSK 6206 (non-drive end)
- Bearing geometry: 9 balls, ball diameter, pitch diameter
- Operating speeds: 1475 RPM (rated), 3000 RPM (max)
- Power: 15 kW, 20 HP
- Mechanical seal: Type 21
- Maintenance schedules

Step-by-Step

Step 1: List available manuals

List all available machine manuals

Expected response:

{
  "manuals": [
    {
      "filename": "test_pump_manual.pdf",
      "type": "PDF",
      "size_mb": 0.08,
      "uri": "manual://read/test_pump_manual.pdf"
    }
  ],
  "total_manuals": 1
}

Step 2: Extract structured specifications

Extract specifications from test_pump_manual.pdf

IMPORTANT - LLM Guidelines:

The tool returns ONLY data extracted from the manual
DO NOT add information not present in the extraction results
If a specification is not found, state "Not found in manual"
DO NOT invent bearing geometries or frequencies

Response:

{
  "manual_file": "test_pump_manual.pdf",
  "bearings": [
    "SKF 6205-2RS",
    "NSK 6206",
    "6205",
    "6206"
  ],
  "rpm_values": [1475.0, 3000.0],
  "power_ratings": "15.0 KW, 20.0 HP",
  "text_excerpt": "CENTRIFUGAL PUMP MANUAL\nModel: CP-150..."
}

Interpretation:

✅ Found 2 bearing types: SKF 6205-2RS (drive end), NSK 6206 (non-drive end)
✅ Operating speeds: 1475 RPM (rated), 3000 RPM (maximum)
✅ Power ratings: 15 kW (20 HP)

Step 3: Calculate bearing frequencies (Option A - Known Geometry)

If bearing geometry is in the manual:

Calculate bearing frequencies for SKF 6205 at 1475 RPM with:
- Number of balls: 9
- Ball diameter: 7.94 mm
- Pitch diameter: 34.55 mm
- Contact angle: 0° (deep groove bearing)

IMPORTANT - LLM Guidelines:

This tool REQUIRES exact bearing geometry
DO NOT guess or estimate geometry
ONLY calculate with geometry from manual or user
If geometry unknown, suggest checking manual or catalog

Result:

{
  "shaft_frequency_hz": 24.58,
  "BPFO": 85.20,
  "BPFI": 136.05,
  "BSF": 101.32,
  "FTF": 9.47
}
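These values can be cross-checked against the standard kinematic formulas for a rolling-element bearing with a stationary outer race. The sketch below is not the server's implementation, and its function name is illustrative. One convention to be aware of: the classic ball spin frequency for this geometry comes out near 50.66 Hz, and the 101.32 Hz reported as BSF above equals twice that value (a ball defect contacts both races once per ball revolution).

```python
import math


def bearing_frequencies(rpm, num_balls, ball_diameter_mm, pitch_diameter_mm,
                        contact_angle_deg=0.0):
    """Characteristic bearing frequencies in Hz (outer race fixed, inner race rotating)."""
    shaft_hz = rpm / 60.0
    ratio = (ball_diameter_mm / pitch_diameter_mm) * math.cos(math.radians(contact_angle_deg))
    return {
        "shaft_frequency_hz": shaft_hz,
        "BPFO": num_balls / 2 * shaft_hz * (1 - ratio),
        "BPFI": num_balls / 2 * shaft_hz * (1 + ratio),
        "FTF": shaft_hz / 2 * (1 - ratio),
        # Classic ball spin frequency; the result above reports twice this value.
        "ball_spin_hz": pitch_diameter_mm / (2 * ball_diameter_mm) * shaft_hz * (1 - ratio ** 2),
    }


freqs = bearing_frequencies(1475, 9, 7.94, 34.55)
for name, value in freqs.items():
    print(f"{name}: {value:.2f}")
```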

Step 4: Calculate frequencies (Option B - Catalog Lookup)

If geometry is NOT in manual but bearing designation is known:

Look up bearing 6205 in catalog and calculate frequencies at 1475 RPM

The system has a local catalog with common bearings (6205, 6206):

{
  "6205": {
    "num_balls": 9,
    "ball_diameter_mm": 7.94,
    "pitch_diameter_mm": 34.55,
    "bore_mm": 25,
    "outer_diameter_mm": 52,
    "width_mm": 15
  }
}
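A lookup of this kind has to map a full designation such as "SKF 6205-2RS" onto the bare catalog key. The sketch below shows one way to do that. The normalisation rule (first four-digit group) and the function name are assumptions, and only the 6205 entry shown above is reproduced:

```python
import re

# Only the entry printed in this guide is included; other catalog entries are omitted.
CATALOG = {
    "6205": {
        "num_balls": 9,
        "ball_diameter_mm": 7.94,
        "pitch_diameter_mm": 34.55,
        "bore_mm": 25,
        "outer_diameter_mm": 52,
        "width_mm": 15,
    }
}


def lookup_geometry(designation, catalog=CATALOG):
    """Return catalog geometry for a designation, or None if it is not listed."""
    match = re.search(r"\b(\d{4})\b", designation)  # strip brand prefix and seal suffix
    if match is None:
        return None
    return catalog.get(match.group(1))


print(lookup_geometry("SKF 6205-2RS"))
print(lookup_geometry("NU 1010"))  # not in this sketch's catalog, so None
```

Returning None instead of a default keeps the caller honest: when the bearing is not in the catalog, the next step is to ask the user for the geometry.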

Step 5: Read full manual for additional context

For questions not answered by structured extraction:

Read the maintenance section from test_pump_manual.pdf

IMPORTANT - LLM Guidelines:

Base ALL answers EXCLUSIVELY on returned text
DO NOT add information not in the text
If not found, state "Not found in manual"
ALWAYS cite: "According to test_pump_manual.pdf..."

Response (excerpt):

MAINTENANCE SCHEDULE
-------------------
- Bearing lubrication: Every 6 months (lithium-based grease)
- Mechanical seal inspection: Every 3 months
- Impeller check: Annually
- Alignment check: Every 12 months
...

Step 6: Answer specific questions

User can now ask:

"What type of mechanical seal is used?"
→ "According to test_pump_manual.pdf, the pump uses a Type 21 mechanical seal 
   with carbon/ceramic seal faces."

"How often should I lubricate the bearings?"
→ "According to the manual, bearing lubrication should be performed every 
   6 months using lithium-based grease."

"How many impeller vanes?"
→ "According to test_pump_manual.pdf, the impeller is a closed-type bronze 
   impeller with 5 vanes."

Step 7: Complete workflow integration

Now combine manual data with signal analysis:

1. Extract bearing info from manual: SKF 6205-2RS at 1475 RPM
2. Calculate characteristic frequencies: BPFO = 85.20 Hz, BPFI = 136.05 Hz
3. Analyze vibration signal: real_train/OuterRaceFault_1.csv
4. Compare envelope spectrum peaks with bearing frequencies
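5. Diagnose: Peak at BPFO → Outer race fault confirmed (for this dataset the peak sits at 81.125 Hz, the BPFO given in the signal's own metadata; the 85.20 Hz above applies to the pump bearing in the manual)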

Architecture: Hybrid Approach

The system uses 3 complementary methods:

1. MCP Resources (Primary - Full Text Access)

"Read the pump manual and tell me about the bearings"

Claude reads FULL PDF text directly via manual://read/{filename}
Can answer ANY question (not limited to pre-defined patterns)
Understands context: "nominal" vs "maximum" RPM, "drive end" vs "non-drive end"

2. Structured Extraction (Hints - Fast Lookup)

extract_manual_specs("test_pump_manual.pdf")

Regex-based extraction provides quick hints
Bearings: SKF 6205-2RS, NSK 6206
RPM: 1475, 3000
Power: 15 kW, 20 HP
Cached for repeated access
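To illustrate what regex-based hints look like, here is a small sketch. The patterns, the function name and the sample text are all hypothetical; the server's real expressions are not shown in this guide, and the sample only reuses values quoted from the test manual:

```python
import re

# Hypothetical patterns: a brand prefix followed by a four-digit designation,
# and a number followed by "RPM".
BEARING_RE = re.compile(r"\b(?:SKF|NSK)\s+\d{4}(?:-\w+)?\b")
RPM_RE = re.compile(r"(\d+(?:\.\d+)?)\s*RPM\b", re.IGNORECASE)


def extract_hints(text):
    """Pull bearing designations and RPM values out of free text."""
    return {
        "bearings": BEARING_RE.findall(text),
        "rpm_values": [float(v) for v in RPM_RE.findall(text)],
    }


sample = (
    "Drive end bearing: SKF 6205-2RS\n"
    "Non-drive end bearing: NSK 6206\n"
    "Rated speed: 1475 RPM, maximum speed: 3000 RPM\n"
)
hints = extract_hints(sample)
print(hints)
```

Patterns like these return values without their context (rated versus maximum speed, drive end versus non-drive end), which is why the extraction is treated as a hint and the full text remains the primary source.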

3. Catalog Lookup (Fallback - Missing Geometry)

lookup_bearing_in_catalog("6205")

Local database with common bearings
Provides geometry when not in manual
Extensible: add more bearings to catalog

Expected Outcome

✅ Zero-knowledge diagnosis enabled

No need to know bearing type beforehand
Extract specs automatically from manual
Calculate frequencies from geometry
Complete diagnostic workflow

✅ Flexible question answering

ANY question about manual content
Not limited to pre-defined patterns
Multiple RPM values handled correctly
Context-aware (rated vs maximum speed)

✅ LLM stays grounded in data

Answers based ONLY on manual content
No hallucinations or assumptions
Clear when information is missing
Always cites source: "According to manual..."

Critical Guidelines for LLM

NEVER invent specifications:
- If bearing geometry not in manual → "Bearing geometry not found in manual. Check manufacturer catalog or measure physically."
- If RPM not specified → "Operating speed not found in manual."
- DO NOT use "typical" or "standard" values without user confirmation
ALWAYS cite the manual:
- ✅ "According to test_pump_manual.pdf, the bearing is SKF 6205-2RS"
- ❌ "The bearing is probably a 6205 based on the pump size"
Distinguish between different RPM values:
- Manual may list: rated speed, maximum speed, minimum speed
- Ask user which to use: "Manual shows 1475 RPM (rated) and 3000 RPM (max). Which operating speed should I use?"
Request missing information:
- "Bearing designation found (SKF 6205-2RS) but geometry not in manual. Should I look it up in the catalog or would you prefer to provide the geometry?"
Use structured extraction as hints, not truth:
- Regex extraction may find false positives (e.g., page numbers)
- Always verify by reading full text if critical
- Structured extraction is for quick screening only
Combine methods intelligently:
- Start with structured extraction (fast)
- Use full text for ambiguous cases
- Fall back to catalog for missing geometry
- Ask user if catalog doesn't have bearing

Pro Tips

Tip 1: Always Check Metadata First

Before analysis, check if metadata files exist (e.g., baseline_1_metadata.json):

Sampling rate: Auto-detected from metadata (no need to specify!)
Bearing frequencies: May be provided in metadata
Signal duration: Documented in metadata
Shaft speed: Often documented
LLM will inform: "Auto-detected sampling rate: 97656 Hz from metadata" or, if no metadata exists, "No metadata found - please provide sampling_rate or create metadata file" (never a silent default)
User can always override with explicit parameters
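The lookup order described in this guide (explicit parameter first, then the metadata file, otherwise an error) can be sketched as follows. The `<name>_metadata.json` naming comes from the examples above, while the "sampling_rate" key and the function name are assumptions:

```python
import json
from pathlib import Path


def resolve_sampling_rate(signal_path, sampling_rate=None):
    """Return (rate_hz, source) or raise if the rate cannot be determined."""
    if sampling_rate is not None:
        return float(sampling_rate), "explicit parameter"
    path = Path(signal_path)
    metadata_path = path.with_name(f"{path.stem}_metadata.json")
    if metadata_path.exists():
        metadata = json.loads(metadata_path.read_text())
        if "sampling_rate" in metadata:  # key name is an assumption
            return float(metadata["sampling_rate"]), metadata_path.name
    # No silent default: the caller must supply the rate or create metadata.
    raise ValueError(
        "No metadata found - please provide sampling_rate or create metadata file"
    )
```

Returning the source alongside the rate makes it straightforward to report where the value came from, for example "Auto-detected sampling rate: 97656 Hz from baseline_1_metadata.json".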

Tip 2: Understanding Frequency Specifications

Bearing frequencies can be specified in two ways:

Absolute Hz (e.g., BPFO = 81.125 Hz) - used in our metadata
Multiples of shaft speed (e.g., BPFO = 3.245 × shaft_freq)

Always clarify with user which format they're providing. Our metadata uses absolute Hz values.
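Converting between the two formats is a single multiplication or division by the shaft frequency. A minimal sketch using the 1500 RPM shaft speed of the baseline dataset; the helper names are illustrative:

```python
def orders_to_hz(order, shaft_rpm):
    """Convert a multiple of shaft speed into an absolute frequency in Hz."""
    return order * shaft_rpm / 60.0


def hz_to_orders(freq_hz, shaft_rpm):
    """Convert an absolute frequency in Hz into a multiple of shaft speed."""
    return freq_hz / (shaft_rpm / 60.0)


print(orders_to_hz(3.245, 1500))    # about 81.125 Hz
print(hz_to_orders(81.125, 1500))   # about 3.245 x shaft frequency
```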

Tip 3: Bearing Type is Often Unknown

Don't assume specific bearing type unless explicitly stated. Instead:

Use "test bearing" or "bearing under test"
Rely on metadata for characteristic frequencies
Focus on fault detection, not bearing identification

Tip 4: Segment Duration Awareness

Different tools use different default segment durations:

ML tools (train_anomaly_model, extract_features): 0.1s segments (captures transients)
FFT/Envelope: 1.0s random segment (good frequency resolution, 1 Hz)
ISO 20816-3: Full signal (standard requirement)
LLM must always inform user about segment duration used
Explain frequency resolution (1 Hz for 1.0s, 10 Hz for 0.1s)
Suggest longer segments for low-speed machines
Mention signal duration vs analyzed segment duration
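The resolution figures in this tip come from the relation Δf = 1 / T, where T is the analysed segment duration. A short sketch of both directions; the helper names are illustrative, and the 0.25 Hz target is a made-up example for a slow machine:

```python
def frequency_resolution_hz(segment_duration_s):
    """Spacing between FFT bins for a segment of the given duration."""
    return 1.0 / segment_duration_s


def min_segment_duration_s(target_resolution_hz):
    """Shortest segment that achieves the requested frequency resolution."""
    return 1.0 / target_resolution_hz


print(frequency_resolution_hz(1.0))   # FFT/Envelope default
print(frequency_resolution_hz(0.1))   # ML default
print(min_segment_duration_s(0.25))   # hypothetical low-speed requirement
```

This is why longer segments are suggested for low-speed machines: resolving components a fraction of a hertz apart needs several seconds of signal.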

Tip 5: ISO 20816-3 Machine Groups

Only Group 1 and Group 2 exist in the standard:

Group 1: Large machines (> 300 kW)
Group 2: Medium machines (15-300 kW)
No Group 3 - if user asks, request clarification on machine power

Tip 6: No PNG Files Generated

Reports are HTML-only:

Professional interactive Plotly charts
Saved to reports/ directory
LLM should NOT claim PNG files are created
For static images: user must open HTML and export manually

Tip 7: Combine Multiple Analysis Methods

For best results, use:

ISO 20816-3 → Overall severity assessment
FFT → Identify primary frequencies
Envelope → Bearing fault localization
ML Anomaly Detection → Automated screening

Tip 8: Request Missing Parameters

LLM should inform about auto-detection and only request if truly needed:

Sampling rate: "I'll auto-detect from metadata" (rarely needs to ask)
Shaft speed: "What is the shaft speed?" (if needed for diagnosis and not in metadata)
Bearing frequencies: "Do you have BPFO, BPFI, BSF, FTF?" (if not in metadata)
Machine group: "What machine group for ISO? (1 or 2)" (always ask, not in metadata)
Segment duration: Use tool-specific defaults (0.1s ML, 1.0s FFT/Envelope)

Tip 9: Trend Analysis

Save analysis results over time to track:

RMS velocity trends (ISO evaluation)
Peak amplitude changes (FFT/Envelope)
Appearance of new frequencies
Anomaly ratio progression (ML models)

Tip 10: Safety First

If ISO returns Zone D or envelope shows strong BPFI/BPFO:

Stop operation immediately
Inspect bearing visually
Do not restart until repaired
High anomaly ratios (>50%) also warrant immediate inspection

Next Steps

Practice: Try these examples with your own signals
Learn: Study the diagnostic reasoning in each example
Contribute: Share your diagnostic workflows in GitHub Discussions

Need help? Open an issue or start a discussion on GitHub!

Found this useful? ⭐ Star the repository to show support!
