
Troubleshooting

SRIJA DE CHOWDHURY edited this page Jan 4, 2026 · 1 revision

🔍 Troubleshooting

Solutions to Common Problems and Errors

Debug Help

🚨 Common Issues

  • 🐛 Code Errors: syntax & runtime
  • 📉 Poor Performance: low accuracy
  • ⚠️ Training Issues: won't converge
  • 💾 Data Problems: loading/processing

1️⃣ Installation & Setup Issues

❌ Problem: ModuleNotFoundError

ModuleNotFoundError: No module named 'numpy'

✅ Solution:

# Install missing package
pip install numpy

# Or install all requirements
pip install numpy pandas matplotlib seaborn jupyter scikit-learn

# Check installed packages
pip list | grep numpy

❌ Problem: Jupyter Notebook Won't Start

'jupyter' is not recognized as an internal or external command

✅ Solution:

# Reinstall Jupyter
pip uninstall jupyter
pip install jupyter

# Or use JupyterLab
pip install jupyterlab
jupyter lab

# Check installation
jupyter --version

❌ Problem: Import Errors in Notebook

ImportError: cannot import name 'LogisticRegression' from 'logistic_regression'

✅ Solution:

# Restart the kernel: Kernel → Restart

# Re-run all cells in order

# Check if the module file exists
import os
print(os.path.exists('logistic_regression.py'))

# Add the parent directory to the path
import sys
sys.path.append('..')

2️⃣ Data Loading Issues

❌ Problem: File Not Found

FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'

✅ Solution:

import os

# Check current directory
print("Current directory:", os.getcwd())

# List files in directory
print("Files:", os.listdir('.'))

# Use absolute path
import pandas as pd
file_path = os.path.abspath('data.csv')
data = pd.read_csv(file_path)

# Or navigate to correct directory
os.chdir('/path/to/your/data')

❌ Problem: Data Shape Mismatch

ValueError: X has 100 samples but y has 90 samples

✅ Solution:

# Check shapes
print(f"X shape: {X.shape}")
print(f"y shape: {y.shape}")

# Ensure same number of samples
assert X.shape[0] == y.shape[0], "Sample mismatch!"

# Remove NaN rows
mask = ~(np.isnan(X).any(axis=1) | np.isnan(y))
X = X[mask]
y = y[mask]

print(f"After cleaning - X: {X.shape}, y: {y.shape}")

❌ Problem: Missing Values

ValueError: Input contains NaN, infinity or a value too large

✅ Solution:

import numpy as np
import pandas as pd

# Check for NaN values
print("NaN in X:", np.isnan(X).sum())
print("NaN in y:", np.isnan(y).sum())

# Option 1: Remove rows with NaN
X_clean = X[~np.isnan(X).any(axis=1)]
y_clean = y[~np.isnan(X).any(axis=1)]

# Option 2: Fill with mean (for continuous features)
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='mean')
X_filled = imputer.fit_transform(X)

# Option 3: Fill with median (more robust to outliers)
imputer = SimpleImputer(strategy='median')
X_filled = imputer.fit_transform(X)

# Check infinity values
print("Inf in X:", np.isinf(X).sum())

# Replace infinity with NaN, then impute
X[np.isinf(X)] = np.nan
X = imputer.fit_transform(X)
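If scikit-learn isn't available, the same NaN/Inf cleanup can be done with NumPy alone. A minimal sketch (the `impute_median` helper is illustrative, not part of this repo):

```python
import numpy as np

def impute_median(X):
    """Replace NaN/Inf entries with each column's median of the finite values."""
    X = X.astype(float).copy()
    X[np.isinf(X)] = np.nan                # treat Inf like a missing value
    medians = np.nanmedian(X, axis=0)      # per-column median, ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = medians[cols]          # fill each hole with its column median
    return X

X = np.array([[1.0, np.nan], [3.0, 2.0], [np.inf, 4.0]])
X_clean = impute_median(X)
print(X_clean)  # no NaN or Inf remains
```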

3️⃣ Training Problems

❌ Problem: Cost is NaN

Iteration 0:  Cost = nan

✅ Solution:

# Cause 1: Learning rate too high
model = LogisticRegression(learning_rate=0.001)  # Reduce LR

# Cause 2: Overflow in sigmoid
def _sigmoid(self, z):
    # Clip z to prevent overflow
    z_clipped = np.clip(z, -500, 500)
    return 1 / (1 + np.exp(-z_clipped))

# Cause 3: Log of zero in cost function
def _compute_cost(self, y_true, y_pred):
    m = len(y_true)  # number of samples
    epsilon = 1e-15  # small constant to avoid log(0)
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    cost = -1/m * np.sum(
        y_true * np.log(y_pred) + 
        (1 - y_true) * np.log(1 - y_pred)
    )
    return cost

# Cause 4: Unscaled features
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

❌ Problem: Cost Increasing

Iteration 0: Cost = 0.693
Iteration 100: Cost = 1.234
Iteration 200: Cost = 2.456  # Getting worse! 

✅ Solution:

# Problem: Learning rate too high
# Visualize the problem
plt.plot(model.cost_history)
plt.title('Cost is Increasing - Learning Rate Too High!')
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.show()

# Solution: Reduce learning rate
learning_rates = [1.0, 0.1, 0.01, 0.001]

for lr in learning_rates: 
    model = LogisticRegression(learning_rate=lr, n_iterations=100)
    model.fit(X_train_scaled, y_train)
    
    plt.plot(model.cost_history, label=f'LR={lr}')

plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.title('Finding Optimal Learning Rate')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Usually 0.01 or 0.001 works well

❌ Problem: Model Not Converging

# Cost still decreasing after 1000 iterations
Iteration 1000: Cost = 0.234  (still decreasing...)

✅ Solution:

# Solution 1: Increase iterations
model = LogisticRegression(learning_rate=0.01, n_iterations=5000)

# Solution 2: Use early stopping
class LogisticRegressionEarlyStopping(LogisticRegression):
    def fit(self, X, y, tolerance=1e-4):
        # ... (forward pass) ...
        
        # Check convergence
        if i > 0: 
            cost_change = abs(self.cost_history[-1] - self.cost_history[-2])
            if cost_change < tolerance:
                print(f"✅ Converged at iteration {i}")
                break
        
        # ... (backward pass and update) ...

# Solution 3: Check if cost is decreasing
if len(model.cost_history) > 10:
    recent_costs = model.cost_history[-10:]
    if all(recent_costs[i] <= recent_costs[i+1] for i in range(len(recent_costs)-1)):
        print("⚠️ Cost not decreasing - check your implementation!")
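Solution 2 can be made concrete as a complete loop. A minimal, self-contained NumPy sketch of gradient descent with early stopping (`fit_with_early_stopping` is illustrative, not the repo's `LogisticRegression` class):

```python
import numpy as np

def fit_with_early_stopping(X, y, lr=0.1, max_iters=5000, tolerance=1e-6):
    """Gradient descent on logistic loss; stop when the cost change is tiny."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    cost_history = []
    for i in range(max_iters):
        z = X @ w + b
        p = 1 / (1 + np.exp(-np.clip(z, -500, 500)))  # sigmoid with overflow guard
        eps = 1e-15
        cost = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        cost_history.append(cost)
        if i > 0 and abs(cost_history[-2] - cost_history[-1]) < tolerance:
            break  # converged: cost barely changed
        w -= lr * X.T @ (p - y) / m
        b -= lr * np.mean(p - y)
    return w, b, cost_history

# Tiny linearly separable problem
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
w, b, history = fit_with_early_stopping(X, y)
print(f"Stopped after {len(history)} iterations")
```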

❌ Problem: Oscillating Cost

# Cost goes up and down
Iteration 0: Cost = 0.693
Iteration 100: Cost = 0.234
Iteration 200: Cost = 0.456  # Went up! 
Iteration 300: Cost = 0.123  # Went down

✅ Solution:

# Visualize oscillation
plt.plot(model.cost_history)
plt.title('Oscillating Cost - Learning Rate Too High')
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.show()

# Solution 1: Reduce learning rate
model = LogisticRegression(learning_rate=0.001, n_iterations=1000)

# Solution 2: Use adaptive learning rate
model = LogisticRegressionAdam(learning_rate=0.01, n_iterations=1000)

# Solution 3: Add momentum
model = LogisticRegressionMomentum(
    learning_rate=0.01, 
    n_iterations=1000, 
    momentum=0.9
)
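The momentum idea behind Solution 3 can be shown on a toy quadratic: the velocity term averages past gradients, which damps oscillation. A minimal sketch (the `momentum_step` helper is hypothetical, not the repo's `LogisticRegressionMomentum`):

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One momentum update: velocity accumulates past gradients."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Minimize f(w) = w^2 (gradient is 2w), starting far from the optimum
w, v = 5.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, 2 * w, v)
print(f"final w ≈ {w:.6f}")  # close to the optimum at 0
```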

4️⃣ Performance Issues

❌ Problem: Low Accuracy (Random Guessing)

Training Accuracy: 0.52
Test Accuracy: 0.48  # Barely better than random! 

✅ Solution:

# Debug checklist: 

# 1. Check data quality
print("Class distribution:", np.bincount(y))
print("Feature statistics:\n", pd.DataFrame(X).describe())

# 2. Check if features are scaled
print("Feature mean:", X.mean())
print("Feature std:", X.std())

# Scale if needed
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# 3. Check implementation
# Test sigmoid function
test_sigmoid = model._sigmoid(0)
assert abs(test_sigmoid - 0.5) < 1e-10, "Sigmoid broken!"

# 4. Try different hyperparameters
param_grid = {
    'learning_rate': [0.001, 0.01, 0.1],
    'n_iterations': [500, 1000, 2000]
}

best_score = 0
for lr in param_grid['learning_rate']: 
    for iters in param_grid['n_iterations']:
        model = LogisticRegression(learning_rate=lr, n_iterations=iters)
        model.fit(X_train_scaled, y_train)
        score = model.score(X_test_scaled, y_test)
        
        if score > best_score: 
            best_score = score
            print(f"✅ New best: LR={lr}, Iters={iters}, Accuracy={score:.4f}")

# 5. Check for label errors
print("Unique labels:", np.unique(y))
assert set(np.unique(y)) == {0, 1}, "Labels should be 0 and 1!"

❌ Problem: Overfitting

Training Accuracy: 0.98
Test Accuracy: 0.65  # Big gap!

✅ Solution:

# Visualize overfitting
train_acc = []
test_acc = []

for i in range(100, 2000, 100):
    model = LogisticRegression(learning_rate=0.01, n_iterations=i)
    model.fit(X_train_scaled, y_train)
    
    train_acc.append(model.score(X_train_scaled, y_train))
    test_acc.append(model.score(X_test_scaled, y_test))

plt.plot(range(100, 2000, 100), train_acc, label='Train', marker='o')
plt.plot(range(100, 2000, 100), test_acc, label='Test', marker='s')
plt.xlabel('Iterations')
plt.ylabel('Accuracy')
plt.title('Overfitting Detection')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Solution 1: Add L2 regularization
model = LogisticRegressionL2(
    learning_rate=0.01,
    n_iterations=1000,
    lambda_reg=0.1  # Start with 0.1
)

# Solution 2: Reduce model complexity
# Use feature selection
from sklearn.feature_selection import SelectKBest, f_classif
selector = SelectKBest(f_classif, k=10)
X_train_selected = selector.fit_transform(X_train_scaled, y_train)
X_test_selected = selector.transform(X_test_scaled)

# Solution 3: Get more data
# Or use data augmentation techniques

# Solution 4: Early stopping
model.fit(X_train_scaled, y_train, X_val_scaled, y_val, patience=50)

❌ Problem: Underfitting

Training Accuracy: 0.62
Test Accuracy: 0.60  # Both low!

✅ Solution:

# Solution 1: Increase model complexity
# Add polynomial features
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train_scaled)
X_test_poly = poly.transform(X_test_scaled)

# Solution 2: Train longer
model = LogisticRegression(learning_rate=0.01, n_iterations=5000)

# Solution 3: Increase learning rate
model = LogisticRegression(learning_rate=0.1, n_iterations=1000)

# Solution 4: Add more features
# Engineer new features from existing ones

# Solution 5: Check if problem is linearly separable
# Try non-linear model if needed
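Solutions 4 and 5 can be demonstrated together on the classic XOR problem: no line separates the classes, but one engineered interaction feature makes them separable. A minimal NumPy sketch (`train_acc` is an illustrative helper, not the repo's model):

```python
import numpy as np

# XOR: no linear boundary exists, so a linear model is stuck at chance level
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Engineered interaction feature x1*x2 makes XOR linearly separable
X_feat = np.column_stack([X, X[:, 0] * X[:, 1]])

def train_acc(X, y, lr=0.5, iters=5000):
    """Plain logistic regression via gradient descent; returns training accuracy."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    p = 1 / (1 + np.exp(-(X @ w + b)))
    return np.mean((p > 0.5) == y)

acc_raw = train_acc(X, y)
acc_feat = train_acc(X_feat, y)
print("Raw features:", acc_raw)   # stuck at chance level
print("With x1*x2:  ", acc_feat)  # separable now
```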

5️⃣ Prediction Issues

❌ Problem: All Predictions Same Class

predictions = model.predict(X_test)
print(np.unique(predictions))  # Output: [0]  or [1]
# All predictions are the same!

✅ Solution:

# Check predicted probabilities
probabilities = model.predict_proba(X_test)
print("Min prob:", probabilities.min())
print("Max prob:", probabilities.max())
print("Mean prob:", probabilities.mean())

# Visualize probability distribution
plt.hist(probabilities, bins=50)
plt.xlabel('Predicted Probability')
plt.ylabel('Frequency')
plt.title('Probability Distribution')
plt.axvline(x=0.5, color='red', linestyle='--', label='Threshold')
plt.legend()
plt.show()

# Possible causes and solutions: 

# Cause 1: Imbalanced dataset
print("Class distribution:", np.bincount(y_train))

# Use class weights
model = LogisticRegressionWeighted(
    learning_rate=0.01,
    n_iterations=1000,
    class_weight='balanced'
)

# Cause 2: Features not scaled
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Cause 3: Threshold too high/low
# Try different thresholds
thresholds = [0.3, 0.4, 0.5, 0.6, 0.7]
for threshold in thresholds:
    preds = (probabilities >= threshold).astype(int)
    print(f"Threshold {threshold}: {np.unique(preds, return_counts=True)}")

❌ Problem: Predictions Don't Make Sense

# Model predicts opposite of what's expected
# High-risk patients predicted as low-risk, etc. 

✅ Solution:

# Check if labels are flipped
print("Label 0 count:", np.sum(y == 0))
print("Label 1 count:", np.sum(y == 1))

# Manually inspect some predictions
for i in range(10):
    print(f"Sample {i}:")
    print(f"  Features: {X_test[i]}")
    print(f"  True label: {y_test[i]}")
    print(f"  Predicted: {model.predict(X_test[i:i+1])[0]}")
    print(f"  Probability: {model.predict_proba(X_test[i:i+1])[0]:.4f}")
    print()

# Check feature importance
importance = np.abs(model.weights)
feature_names = ['Feature ' + str(i) for i in range(len(importance))]

plt.figure(figsize=(10, 6))
plt.barh(feature_names, importance)
plt.xlabel('Importance')
plt.title('Feature Importance')
plt.tight_layout()
plt.show()

# Look for unexpected patterns

6️⃣ Memory & Performance Issues

❌ Problem: Training Too Slow

# Training takes hours on small dataset

✅ Solution:

# Solution 1: Use vectorization (check your code)
# ❌ BAD: Loops
for i in range(m):
    z[i] = np.dot(X[i], weights) + bias

# ✅ GOOD: Vectorized
z = np.dot(X, weights) + bias

# Solution 2: Use mini-batch gradient descent
model = LogisticRegressionMiniBatch(
    learning_rate=0.01,
    n_iterations=1000,
    batch_size=32
)

# Solution 3: Reduce iterations
# Use early stopping instead of fixed iterations

# Solution 4: Profile your code
import time

start = time.time()
model.fit(X_train, y_train)
end = time.time()

print(f"Training time: {end - start:.2f} seconds")

# Find bottlenecks
import cProfile
cProfile.run('model.fit(X_train, y_train)')
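The loop-vs-vectorized difference from Solution 1 can be measured directly. A minimal sketch using `time.perf_counter` (array sizes are arbitrary):

```python
import time
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 20))
weights = rng.normal(size=20)
bias = 0.5

# Loop version: one dot product per sample
start = time.perf_counter()
z_loop = np.empty(len(X))
for i in range(len(X)):
    z_loop[i] = np.dot(X[i], weights) + bias
loop_time = time.perf_counter() - start

# Vectorized version: one matrix-vector product
start = time.perf_counter()
z_vec = X @ weights + bias
vec_time = time.perf_counter() - start

print(f"Loop: {loop_time:.4f}s  Vectorized: {vec_time:.4f}s")
```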

❌ Problem: Memory Error

MemoryError: Unable to allocate array

✅ Solution:

# Solution 1: Use mini-batch processing
model = LogisticRegressionMiniBatch(batch_size=32)

# Solution 2: Reduce data size
# Use sampling for large datasets
from sklearn.model_selection import train_test_split
X_sample, _, y_sample, _ = train_test_split(
    X, y, train_size=0.1, random_state=42
)

# Solution 3: Use float32 instead of float64
X = X.astype(np.float32)

# Solution 4: Process in chunks
def train_in_chunks(X, y, chunk_size=1000):
    n_samples = len(y)
    for start in range(0, n_samples, chunk_size):
        end = min(start + chunk_size, n_samples)
        X_chunk = X[start:end]
        y_chunk = y[start:end]
        # Process chunk... 

# Solution 5: Clear memory
import gc
gc.collect()

7️⃣ Visualization Issues

❌ Problem: Plots Not Showing

plt.plot(model.cost_history)
# Nothing appears! 

✅ Solution:

import matplotlib.pyplot as plt

# Solution 1: Add plt.show()
plt.plot(model.cost_history)
plt.show()  # Don't forget this!

# Solution 2: Use inline mode for Jupyter
%matplotlib inline

# Solution 3: Use interactive mode
plt.ion()

# Solution 4: Check backend
import matplotlib
print("Backend:", matplotlib.get_backend())

# Change backend if needed
matplotlib.use('TkAgg')  # or 'Qt5Agg', 'nbAgg'

❌ Problem: Matplotlib Warnings

UserWarning: Matplotlib is currently using agg, which is a non-GUI backend

✅ Solution:

# For Jupyter Notebook
%matplotlib inline

# For standalone scripts
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

8️⃣ Debugging Toolkit

Comprehensive Debug Function

def debug_model(model, X_train, y_train, X_test, y_test):
    """
    Comprehensive model debugging
    
    Checks for common issues and provides diagnostics
    """
    print("=" * 60)
    print("🔍 MODEL DEBUGGING REPORT")
    print("=" * 60)
    
    # 1. Data checks
    print("\n1️⃣ DATA CHECKS:")
    print(f"   Training samples: {X_train.shape[0]}")
    print(f"   Test samples: {X_test.shape[0]}")
    print(f"   Features: {X_train.shape[1]}")
    print(f"   Class distribution (train): {np.bincount(y_train)}")
    print(f"   Class distribution (test): {np.bincount(y_test)}")
    
    # Check for NaN/Inf
    if np.isnan(X_train).any():
        print("   ⚠️ WARNING: NaN values in training data!")
    if np.isinf(X_train).any():
        print("   ⚠️ WARNING: Inf values in training data!")
    
    # Check scaling
    mean_val = np.abs(X_train.mean())
    std_val = X_train.std()
    if mean_val > 1 or std_val > 10:
        print(f"   ⚠️ WARNING: Data may need scaling (mean={mean_val:.2f}, std={std_val:.2f})")
    else:
        print(f"   ✅ Data scaling looks good")
    
    # 2. Training checks
    print("\n2️⃣ TRAINING CHECKS:")
    
    if not hasattr(model, 'cost_history') or len(model.cost_history) == 0:
        print("   ❌ Model not trained yet!")
        return
    
    # Check convergence
    final_costs = model.cost_history[-10:]
    if np.isnan(final_costs).any():
        print("   ❌ CRITICAL: Cost is NaN!")
        print("      → Try reducing learning rate")
        print("      → Check for overflow in sigmoid")
        print("      → Ensure features are scaled")
    elif len(final_costs) > 1 and all(final_costs[i] >= final_costs[i+1] 
                                       for i in range(len(final_costs)-1)):
        print("   ✅ Cost is decreasing")
    else:
        print("   ⚠️ Cost is oscillating - consider reducing learning rate")
    
    # Plot cost history
    plt.figure(figsize=(10, 4))
    plt.subplot(1, 2, 1)
    plt.plot(model.cost_history)
    plt.xlabel('Iterations')
    plt.ylabel('Cost')
    plt.title('Cost History')
    plt.grid(True, alpha=0.3)
    
    # 3. Performance checks
    print("\n3️⃣ PERFORMANCE CHECKS:")
    
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    gap = train_acc - test_acc
    
    print(f"   Training accuracy: {train_acc:.4f}")
    print(f"   Test accuracy: {test_acc:.4f}")
    print(f"   Gap: {gap:.4f}")
    
    if gap > 0.15:
        print("   ❌ OVERFITTING detected!")
        print("      → Add L2 regularization")
        print("      → Reduce model complexity")
        print("      → Get more training data")
    elif gap < -0.05:
        print("   ⚠️ Test accuracy > Train accuracy (unusual)")
        print("      → Check for data leakage")
    elif train_acc < 0.6 and test_acc < 0.6:
        print("   ❌ UNDERFITTING detected!")
        print("      → Increase iterations")
        print("      → Add polynomial features")
        print("      → Check if problem is linearly separable")
    else:
        print("   ✅ Good generalization")
    
    # 4. Prediction checks
    print("\n4️⃣ PREDICTION CHECKS:")
    
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)
    
    unique_preds = np.unique(y_pred)
    print(f"   Unique predictions: {unique_preds}")
    
    if len(unique_preds) == 1:
        print("   ❌ CRITICAL: All predictions are the same class!")
        print("      → Check class imbalance")
        print("      → Use class weights")
        print("      → Adjust decision threshold")
    
    # Plot probability distribution
    plt.subplot(1, 2, 2)
    plt.hist(y_prob[y_test == 0], bins=30, alpha=0.6, color='red', label='Class 0')
    plt.hist(y_prob[y_test == 1], bins=30, alpha=0.6, color='blue', label='Class 1')
    plt.axvline(x=0.5, color='green', linestyle='--', linewidth=2)
    plt.xlabel('Predicted Probability')
    plt.ylabel('Frequency')
    plt.title('Probability Distribution')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # 5. Model parameters
    print("\n5️⃣ MODEL PARAMETERS:")
    print(f"   Learning rate: {model.learning_rate}")
    print(f"   Iterations: {model.n_iterations}")
    print(f"   Weight range: [{model.weights.min():.4f}, {model.weights.max():.4f}]")
    print(f"   Bias: {model.bias:.4f}")
    
    # Check for extreme weights
    if np.abs(model.weights).max() > 100:
        print("   ⚠️ WARNING: Very large weights detected!")
        print("      → May indicate scaling issues")
        print("      → Consider adding regularization")
    
    print("\n" + "=" * 60)
    print("🏁 DEBUGGING COMPLETE")
    print("=" * 60)

# Usage
debug_model(model, X_train_scaled, y_train, X_test_scaled, y_test)

9️⃣ Quick Fixes Checklist

Before Training

  • ✅ Features are scaled (StandardScaler)
  • ✅ No NaN or Inf values
  • ✅ Labels are 0 and 1 (not -1 and 1)
  • ✅ Train/test split done correctly
  • ✅ Random seed set for reproducibility
  • ✅ Data shapes match (X and y same length)
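The "Before Training" checklist above can be automated. A minimal sketch (the `preflight_check` helper is illustrative, not part of this repo):

```python
import numpy as np

def preflight_check(X, y):
    """Run the 'Before Training' checks; raises AssertionError on failure."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    assert X.shape[0] == y.shape[0], "X and y have different numbers of samples"
    assert not np.isnan(X).any() and not np.isinf(X).any(), "NaN/Inf in X"
    assert set(np.unique(y)) <= {0, 1}, "labels must be 0 and 1"
    if abs(X.mean()) > 1 or X.std() > 10:
        print("⚠️ features look unscaled - consider StandardScaler")
    else:
        print("✅ preflight checks passed")

X = np.array([[0.1, -0.2], [0.3, 0.4]])
y = np.array([0, 1])
preflight_check(X, y)
```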

During Training

  • ✅ Cost is decreasing
  • ✅ Cost is not NaN
  • ✅ Cost is not oscillating wildly
  • ✅ Learning rate is appropriate (0.001-0.1)
  • ✅ Sufficient iterations (500-2000)

After Training

  • ✅ Training accuracy > 50%
  • ✅ Test accuracy > 50%
  • ✅ Train-test gap < 15%
  • ✅ Predictions include both classes
  • ✅ Probabilities are well-distributed

🆘 Getting Help

Where to Find Help

| Resource | Type | Link |
|:---------|:-----|:-----|
| GitHub Issues | Bug reports | Open Issue |
| Stack Overflow | Q&A | Ask Question |
| Documentation | Reference | Wiki Home |
| FAQ | Common questions | FAQ Page |

Creating a Good Bug Report

## Bug Description
Brief description of the problem

## Environment
- Python version: 3.8.5
- NumPy version: 1.21.0
- OS: Windows 10

## Code to Reproduce
```python
# Minimal code that reproduces the issue
model = LogisticRegression(learning_rate=0.01, n_iterations=1000)
model.fit(X_train, y_train)
```

## Expected Behavior
What you expected to happen

## Actual Behavior
What actually happened

## Error Message
Full error traceback here

## What I've Tried

  • Tried reducing learning rate
  • Checked data for NaN values
  • ...

---

<div align="center">

[← Optimization Techniques](./Optimization-Techniques) | [FAQ →](./FAQ)

</div>
