# Troubleshooting
SRIJA DE CHOWDHURY edited this page Jan 4, 2026
Problem categories covered on this page:

- Syntax & Runtime
- Low accuracy
- Won't converge
- Loading/Processing
### ModuleNotFoundError: No module named 'numpy'

✅ Solution:

```bash
# Install the missing package
pip install numpy

# Or install all requirements
pip install numpy pandas matplotlib seaborn jupyter scikit-learn

# Check installed packages
pip list | grep numpy
```

### 'jupyter' is not recognized as an internal or external command

✅ Solution:
```bash
# Reinstall Jupyter
pip uninstall jupyter
pip install jupyter

# Or use JupyterLab
pip install jupyterlab
jupyter lab

# Check the installation
jupyter --version
```

### ImportError: cannot import name 'LogisticRegression' from 'logistic_regression'

✅ Solution:
First restart the kernel (Kernel → Restart) and re-run all cells in order. If the error persists:

```python
# Check that the file exists
import os
print(os.path.exists('logistic_regression.py'))

# Add the parent directory to the module search path
import sys
sys.path.append('..')
```

### FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'

✅ Solution:
```python
import os
import pandas as pd

# Check the current working directory
print("Current directory:", os.getcwd())

# List files in the directory
print("Files:", os.listdir('.'))

# Use an absolute path
file_path = os.path.abspath('data.csv')
data = pd.read_csv(file_path)

# Or navigate to the correct directory
os.chdir('/path/to/your/data')
```

### ValueError: X has 100 samples but y has 90 samples

✅ Solution:
```python
# Check shapes
print(f"X shape: {X.shape}")
print(f"y shape: {y.shape}")

# Ensure the same number of samples
assert X.shape[0] == y.shape[0], "Sample mismatch!"

# Remove rows containing NaN
mask = ~(np.isnan(X).any(axis=1) | np.isnan(y))
X = X[mask]
y = y[mask]
print(f"After cleaning - X: {X.shape}, y: {y.shape}")
```

### ValueError: Input contains NaN, infinity or a value too large

✅ Solution:
```python
import numpy as np
import pandas as pd

# Check for NaN values
print("NaN in X:", np.isnan(X).sum())
print("NaN in y:", np.isnan(y).sum())

# Option 1: Remove rows with NaN
row_has_nan = np.isnan(X).any(axis=1)
X_clean = X[~row_has_nan]
y_clean = y[~row_has_nan]

# Option 2: Fill with the mean (for continuous features)
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='mean')
X_filled = imputer.fit_transform(X)

# Option 3: Fill with the median (more robust to outliers)
imputer = SimpleImputer(strategy='median')
X_filled = imputer.fit_transform(X)

# Check for infinite values
print("Inf in X:", np.isinf(X).sum())

# Replace infinities with NaN, then impute them
X[np.isinf(X)] = np.nan
X = imputer.fit_transform(X)
```

### Cost is NaN during training

```
Iteration 0: Cost = nan
```

✅ Solution:
```python
# Cause 1: Learning rate too high
model = LogisticRegression(learning_rate=0.001)  # reduce the learning rate

# Cause 2: Overflow in sigmoid
def _sigmoid(self, z):
    # Clip z to prevent overflow in np.exp
    z_clipped = np.clip(z, -500, 500)
    return 1 / (1 + np.exp(-z_clipped))

# Cause 3: Log of zero in the cost function
def _compute_cost(self, y_true, y_pred):
    m = len(y_true)
    epsilon = 1e-15  # small constant to avoid log(0)
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    cost = -1/m * np.sum(
        y_true * np.log(y_pred) +
        (1 - y_true) * np.log(1 - y_pred)
    )
    return cost

# Cause 4: Unscaled features
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

### Cost increases during training
```
Iteration 0:   Cost = 0.693
Iteration 100: Cost = 1.234
Iteration 200: Cost = 2.456  # Getting worse!
```

✅ Solution:
```python
# Problem: learning rate too high

# Visualize the problem
plt.plot(model.cost_history)
plt.title('Cost is Increasing - Learning Rate Too High!')
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.show()

# Solution: reduce the learning rate
learning_rates = [1.0, 0.1, 0.01, 0.001]
for lr in learning_rates:
    model = LogisticRegression(learning_rate=lr, n_iterations=100)
    model.fit(X_train_scaled, y_train)
    plt.plot(model.cost_history, label=f'LR={lr}')
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.title('Finding Optimal Learning Rate')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Usually 0.01 or 0.001 works well
```

### Model hasn't converged yet
```
# Cost still decreasing after 1000 iterations
Iteration 1000: Cost = 0.234 (still decreasing...)
```

✅ Solution:
```python
# Solution 1: Increase iterations
model = LogisticRegression(learning_rate=0.01, n_iterations=5000)

# Solution 2: Use early stopping
class LogisticRegressionEarlyStopping(LogisticRegression):
    def fit(self, X, y, tolerance=1e-4):
        # ... (forward pass) ...
        # Check convergence
        if i > 0:
            cost_change = abs(self.cost_history[-1] - self.cost_history[-2])
            if cost_change < tolerance:
                print(f"✅ Converged at iteration {i}")
                break
        # ... (backward pass and update) ...

# Solution 3: Check that the cost is actually decreasing
if len(model.cost_history) > 10:
    recent_costs = model.cost_history[-10:]
    if all(recent_costs[i] <= recent_costs[i+1] for i in range(len(recent_costs)-1)):
        print("⚠️ Cost not decreasing - check your implementation!")
```

### Cost oscillates up and down
```
Iteration 0:   Cost = 0.693
Iteration 100: Cost = 0.234
Iteration 200: Cost = 0.456  # Went up!
Iteration 300: Cost = 0.123  # Went down
```

✅ Solution:
```python
# Visualize the oscillation
plt.plot(model.cost_history)
plt.title('Oscillating Cost - Learning Rate Too High')
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.show()

# Solution 1: Reduce the learning rate
model = LogisticRegression(learning_rate=0.001, n_iterations=1000)

# Solution 2: Use an adaptive learning rate
model = LogisticRegressionAdam(learning_rate=0.01, n_iterations=1000)

# Solution 3: Add momentum
model = LogisticRegressionMomentum(
    learning_rate=0.01,
    n_iterations=1000,
    momentum=0.9
)
```
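`LogisticRegressionAdam` and `LogisticRegressionMomentum` are extensions, not part of the base class shown on this wiki. As a rough sketch of the update rule a momentum variant would implement (toy quadratic objective, assumed parameter names):

```python
import numpy as np

def momentum_step(w, v, grad, learning_rate=0.01, momentum=0.9):
    """One momentum update: v accumulates an exponentially decaying
    average of past gradients, which damps oscillation."""
    v = momentum * v - learning_rate * grad
    return w + v, v

# Toy example: minimize f(w) = w^2 (gradient 2w) starting from w = 5.0
w, v = np.array([5.0]), np.array([0.0])
for _ in range(200):
    w, v = momentum_step(w, v, grad=2 * w, learning_rate=0.05, momentum=0.9)
print(w)  # close to 0
```

The velocity term lets the update keep moving in a consistent direction while averaging out the back-and-forth steps that cause an oscillating cost curve.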
### Accuracy is barely better than random

```
Training Accuracy: 0.52
Test Accuracy: 0.48  # Barely better than random!
```

✅ Solution:
```python
# Debug checklist:

# 1. Check data quality
print("Class distribution:", np.bincount(y))
print("Feature statistics:\n", pd.DataFrame(X).describe())

# 2. Check whether features are scaled
print("Feature mean:", X.mean())
print("Feature std:", X.std())
# Scale if needed
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# 3. Check the implementation: test the sigmoid function
test_sigmoid = model._sigmoid(0)
assert abs(test_sigmoid - 0.5) < 1e-10, "Sigmoid broken!"

# 4. Try different hyperparameters
param_grid = {
    'learning_rate': [0.001, 0.01, 0.1],
    'n_iterations': [500, 1000, 2000]
}
best_score = 0
for lr in param_grid['learning_rate']:
    for iters in param_grid['n_iterations']:
        model = LogisticRegression(learning_rate=lr, n_iterations=iters)
        model.fit(X_train_scaled, y_train)
        score = model.score(X_test_scaled, y_test)
        if score > best_score:
            best_score = score
            print(f"✅ New best: LR={lr}, Iters={iters}, Accuracy={score:.4f}")

# 5. Check for label errors
print("Unique labels:", np.unique(y))
assert set(np.unique(y)) == {0, 1}, "Labels should be 0 and 1!"
```

### Overfitting: large gap between training and test accuracy
```
Training Accuracy: 0.98
Test Accuracy: 0.65  # Big gap!
```

✅ Solution:
```python
# Visualize overfitting
train_acc = []
test_acc = []
for i in range(100, 2000, 100):
    model = LogisticRegression(learning_rate=0.01, n_iterations=i)
    model.fit(X_train_scaled, y_train)
    train_acc.append(model.score(X_train_scaled, y_train))
    test_acc.append(model.score(X_test_scaled, y_test))
plt.plot(range(100, 2000, 100), train_acc, label='Train', marker='o')
plt.plot(range(100, 2000, 100), test_acc, label='Test', marker='s')
plt.xlabel('Iterations')
plt.ylabel('Accuracy')
plt.title('Overfitting Detection')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Solution 1: Add L2 regularization
model = LogisticRegressionL2(
    learning_rate=0.01,
    n_iterations=1000,
    lambda_reg=0.1  # start with 0.1
)

# Solution 2: Reduce model complexity via feature selection
from sklearn.feature_selection import SelectKBest, f_classif
selector = SelectKBest(f_classif, k=10)
X_train_selected = selector.fit_transform(X_train_scaled, y_train)
X_test_selected = selector.transform(X_test_scaled)

# Solution 3: Get more data, or use data augmentation techniques

# Solution 4: Early stopping on a validation set
model.fit(X_train_scaled, y_train, X_val_scaled, y_val, patience=50)
```
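The `patience` argument in Solution 4 is a hypothetical extension of `fit`, not part of the base class. A minimal, library-agnostic sketch of what validation-based early stopping with patience looks like (the helper names here are illustrative):

```python
import numpy as np

def fit_with_early_stopping(train_step, val_loss_fn, max_iter=1000, patience=50):
    """Run train_step() up to max_iter times; stop once the validation
    loss hasn't improved for `patience` consecutive iterations."""
    best_loss, best_iter = np.inf, 0
    for i in range(max_iter):
        train_step()
        loss = val_loss_fn()
        if loss < best_loss:
            best_loss, best_iter = loss, i
        elif i - best_iter >= patience:
            print(f"Stopping at iteration {i} (best was {best_iter})")
            break
    return best_iter

# Toy demo: validation loss improves for 100 steps, then plateaus
losses = iter(list(np.linspace(1.0, 0.1, 100)) + [0.1] * 900)
stopped_at = fit_with_early_stopping(lambda: None, lambda: next(losses), patience=50)
```

Stopping at the best validation iteration, rather than a fixed count, prevents the extra iterations that widen the train/test gap.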
### Underfitting: both training and test accuracy are low

```
Training Accuracy: 0.62
Test Accuracy: 0.60  # Both low!
```

✅ Solution:
```python
# Solution 1: Increase model complexity with polynomial features
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train_scaled)
X_test_poly = poly.transform(X_test_scaled)

# Solution 2: Train longer
model = LogisticRegression(learning_rate=0.01, n_iterations=5000)

# Solution 3: Increase the learning rate
model = LogisticRegression(learning_rate=0.1, n_iterations=1000)

# Solution 4: Add more features - engineer new ones from existing features

# Solution 5: Check whether the problem is linearly separable;
# try a non-linear model if needed
```

### All predictions are the same class

```python
predictions = model.predict(X_test)
print(np.unique(predictions))  # Output: [0] or [1]
# All predictions are the same!
```

✅ Solution:
```python
# Check the predicted probabilities
probabilities = model.predict_proba(X_test)
print("Min prob:", probabilities.min())
print("Max prob:", probabilities.max())
print("Mean prob:", probabilities.mean())

# Visualize the probability distribution
plt.hist(probabilities, bins=50)
plt.xlabel('Predicted Probability')
plt.ylabel('Frequency')
plt.title('Probability Distribution')
plt.axvline(x=0.5, color='red', linestyle='--', label='Threshold')
plt.legend()
plt.show()

# Possible causes and solutions:

# Cause 1: Imbalanced dataset
print("Class distribution:", np.bincount(y_train))
# Use class weights
model = LogisticRegressionWeighted(
    learning_rate=0.01,
    n_iterations=1000,
    class_weight='balanced'
)

# Cause 2: Features not scaled
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Cause 3: Threshold too high or too low - try different thresholds
thresholds = [0.3, 0.4, 0.5, 0.6, 0.7]
for threshold in thresholds:
    preds = (probabilities >= threshold).astype(int)
    print(f"Threshold {threshold}: {np.unique(preds, return_counts=True)}")
```
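The threshold sweep above can be folded into a small helper that picks the threshold maximizing F1, which is usually more informative than accuracy on an imbalanced dataset. A sketch using scikit-learn's `f1_score` (assumes `probabilities` is a 1-D array of class-1 probabilities):

```python
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(y_true, probabilities, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Return the decision threshold with the highest F1 score."""
    scores = [f1_score(y_true, (probabilities >= t).astype(int)) for t in thresholds]
    return thresholds[int(np.argmax(scores))]

# Toy example: class 1 is rare, but well separated above 0.5
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
probs = np.array([0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.6, 0.9])
print(best_threshold(y_true, probs))  # → 0.5
```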
### Model predicts the opposite of what's expected

High-risk patients are predicted as low-risk, and so on.

✅ Solution:
```python
# Check whether the labels are flipped
print("Label 0 count:", np.sum(y == 0))
print("Label 1 count:", np.sum(y == 1))

# Manually inspect some predictions
for i in range(10):
    print(f"Sample {i}:")
    print(f"  Features: {X_test[i]}")
    print(f"  True label: {y_test[i]}")
    print(f"  Predicted: {model.predict(X_test[i:i+1])[0]}")
    print(f"  Probability: {model.predict_proba(X_test[i:i+1])[0]:.4f}")
    print()

# Check feature importance and look for unexpected patterns
importance = np.abs(model.weights)
feature_names = ['Feature ' + str(i) for i in range(len(importance))]
plt.figure(figsize=(10, 6))
plt.barh(feature_names, importance)
plt.xlabel('Importance')
plt.title('Feature Importance')
plt.tight_layout()
plt.show()
```

### Training takes hours on a small dataset

✅ Solution:
```python
# Solution 1: Vectorize your code
# ❌ BAD: Python loop over samples
for i in range(m):
    z[i] = np.dot(X[i], weights) + bias
# ✅ GOOD: Vectorized
z = np.dot(X, weights) + bias

# Solution 2: Use mini-batch gradient descent
model = LogisticRegressionMiniBatch(
    learning_rate=0.01,
    n_iterations=1000,
    batch_size=32
)

# Solution 3: Reduce iterations - use early stopping instead of a fixed count

# Solution 4: Profile your code
import time
start = time.time()
model.fit(X_train, y_train)
end = time.time()
print(f"Training time: {end - start:.2f} seconds")

# Find bottlenecks
import cProfile
cProfile.run('model.fit(X_train, y_train)')
```

### MemoryError: Unable to allocate array

✅ Solution:
```python
# Solution 1: Use mini-batch processing
model = LogisticRegressionMiniBatch(batch_size=32)

# Solution 2: Reduce the data size by sampling large datasets
from sklearn.model_selection import train_test_split
X_sample, _, y_sample, _ = train_test_split(
    X, y, train_size=0.1, random_state=42
)

# Solution 3: Use float32 instead of float64
X = X.astype(np.float32)

# Solution 4: Process in chunks
def train_in_chunks(X, y, chunk_size=1000):
    n_samples = len(y)
    for start in range(0, n_samples, chunk_size):
        end = min(start + chunk_size, n_samples)
        X_chunk = X[start:end]
        y_chunk = y[start:end]
        # Process chunk...

# Solution 5: Free unused memory
import gc
gc.collect()
```
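`LogisticRegressionMiniBatch` is referenced here as a hypothetical class; the core idea is simply updating on shuffled slices of the data instead of the full array, so only one batch's activations live in memory at a time. A self-contained sketch of one epoch (function and parameter names are assumptions, not this repo's API):

```python
import numpy as np

def minibatch_sgd_epoch(X, y, weights, bias, learning_rate=0.01, batch_size=32, rng=None):
    """One epoch of mini-batch gradient descent for logistic regression."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(y))  # shuffle so batches differ each epoch
    for start in range(0, len(y), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        preds = 1 / (1 + np.exp(-(Xb @ weights + bias)))  # sigmoid
        error = preds - yb                                # gradient of log loss
        weights -= learning_rate * Xb.T @ error / len(batch)
        bias -= learning_rate * error.mean()
    return weights, bias

# Toy data: one informative feature
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 1))
y = (X[:, 0] > 0).astype(float)
w, b = np.zeros(1), 0.0
for _ in range(50):
    w, b = minibatch_sgd_epoch(X, y, w, b, learning_rate=0.1, rng=rng)
acc = (((1 / (1 + np.exp(-(X @ w + b)))) >= 0.5) == y).mean()
print(f"Accuracy: {acc:.2f}")  # high on this separable toy data
```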
### Nothing appears when plotting

```python
plt.plot(model.cost_history)
# Nothing appears!
```

✅ Solution:
```python
import matplotlib.pyplot as plt

# Solution 1: Add plt.show()
plt.plot(model.cost_history)
plt.show()  # don't forget this!

# Solution 2: Use inline mode in Jupyter
%matplotlib inline

# Solution 3: Use interactive mode
plt.ion()

# Solution 4: Check the backend
import matplotlib
print("Backend:", matplotlib.get_backend())
# Change the backend if needed
matplotlib.use('TkAgg')  # or 'Qt5Agg', 'nbAgg'
```

### UserWarning: Matplotlib is currently using agg, which is a non-GUI backend

✅ Solution:
```python
# For Jupyter Notebook
%matplotlib inline

# For standalone scripts
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')
```

### Comprehensive debugging function

When nothing obvious is wrong, run a full diagnostic:

```python
def debug_model(model, X_train, y_train, X_test, y_test):
    """
    Comprehensive model debugging.
    Checks for common issues and provides diagnostics.
    """
    print("=" * 60)
    print("🔍 MODEL DEBUGGING REPORT")
    print("=" * 60)

    # 1. Data checks
    print("\n1️⃣ DATA CHECKS:")
    print(f"   Training samples: {X_train.shape[0]}")
    print(f"   Test samples: {X_test.shape[0]}")
    print(f"   Features: {X_train.shape[1]}")
    print(f"   Class distribution (train): {np.bincount(y_train)}")
    print(f"   Class distribution (test): {np.bincount(y_test)}")

    # Check for NaN/Inf
    if np.isnan(X_train).any():
        print("   ⚠️ WARNING: NaN values in training data!")
    if np.isinf(X_train).any():
        print("   ⚠️ WARNING: Inf values in training data!")

    # Check scaling
    mean_val = np.abs(X_train.mean())
    std_val = X_train.std()
    if mean_val > 1 or std_val > 10:
        print(f"   ⚠️ WARNING: Data may need scaling (mean={mean_val:.2f}, std={std_val:.2f})")
    else:
        print("   ✅ Data scaling looks good")

    # 2. Training checks
    print("\n2️⃣ TRAINING CHECKS:")
    if not hasattr(model, 'cost_history') or len(model.cost_history) == 0:
        print("   ❌ Model not trained yet!")
        return

    # Check convergence
    final_costs = model.cost_history[-10:]
    if np.isnan(final_costs).any():
        print("   ❌ CRITICAL: Cost is NaN!")
        print("   → Try reducing learning rate")
        print("   → Check for overflow in sigmoid")
        print("   → Ensure features are scaled")
    elif len(final_costs) > 1 and all(final_costs[i] >= final_costs[i+1]
                                      for i in range(len(final_costs)-1)):
        print("   ✅ Cost is decreasing")
    else:
        print("   ⚠️ Cost is oscillating - consider reducing learning rate")

    # Plot cost history
    plt.figure(figsize=(10, 4))
    plt.subplot(1, 2, 1)
    plt.plot(model.cost_history)
    plt.xlabel('Iterations')
    plt.ylabel('Cost')
    plt.title('Cost History')
    plt.grid(True, alpha=0.3)

    # 3. Performance checks
    print("\n3️⃣ PERFORMANCE CHECKS:")
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    gap = train_acc - test_acc
    print(f"   Training accuracy: {train_acc:.4f}")
    print(f"   Test accuracy: {test_acc:.4f}")
    print(f"   Gap: {gap:.4f}")
    if gap > 0.15:
        print("   ❌ OVERFITTING detected!")
        print("   → Add L2 regularization")
        print("   → Reduce model complexity")
        print("   → Get more training data")
    elif gap < -0.05:
        print("   ⚠️ Test accuracy > Train accuracy (unusual)")
        print("   → Check for data leakage")
    elif train_acc < 0.6 and test_acc < 0.6:
        print("   ❌ UNDERFITTING detected!")
        print("   → Increase iterations")
        print("   → Add polynomial features")
        print("   → Check if problem is linearly separable")
    else:
        print("   ✅ Good generalization")

    # 4. Prediction checks
    print("\n4️⃣ PREDICTION CHECKS:")
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)
    unique_preds = np.unique(y_pred)
    print(f"   Unique predictions: {unique_preds}")
    if len(unique_preds) == 1:
        print("   ❌ CRITICAL: All predictions are the same class!")
        print("   → Check class imbalance")
        print("   → Use class weights")
        print("   → Adjust decision threshold")

    # Plot probability distribution
    plt.subplot(1, 2, 2)
    plt.hist(y_prob[y_test == 0], bins=30, alpha=0.6, color='red', label='Class 0')
    plt.hist(y_prob[y_test == 1], bins=30, alpha=0.6, color='blue', label='Class 1')
    plt.axvline(x=0.5, color='green', linestyle='--', linewidth=2)
    plt.xlabel('Predicted Probability')
    plt.ylabel('Frequency')
    plt.title('Probability Distribution')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

    # 5. Model parameters
    print("\n5️⃣ MODEL PARAMETERS:")
    print(f"   Learning rate: {model.learning_rate}")
    print(f"   Iterations: {model.n_iterations}")
    print(f"   Weight range: [{model.weights.min():.4f}, {model.weights.max():.4f}]")
    print(f"   Bias: {model.bias:.4f}")

    # Check for extreme weights
    if np.abs(model.weights).max() > 100:
        print("   ⚠️ WARNING: Very large weights detected!")
        print("   → May indicate scaling issues")
        print("   → Consider adding regularization")

    print("\n" + "=" * 60)
    print("🏁 DEBUGGING COMPLETE")
    print("=" * 60)

# Usage
debug_model(model, X_train_scaled, y_train, X_test_scaled, y_test)
```

### Quick checklist

- ✅ Features are scaled (StandardScaler)
- ✅ No NaN or Inf values
- ✅ Labels are 0 and 1 (not -1 and 1)
- ✅ Train/test split done correctly
- ✅ Random seed set for reproducibility
- ✅ Data shapes match (X and y same length)
- ✅ Cost is decreasing
- ✅ Cost is not NaN
- ✅ Cost is not oscillating wildly
- ✅ Learning rate is appropriate (0.001-0.1)
- ✅ Sufficient iterations (500-2000)
- ✅ Training accuracy > 50%
- ✅ Test accuracy > 50%
- ✅ Train-test gap < 15%
- ✅ Predictions include both classes
- ✅ Probabilities are well-distributed
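Most of the data items on this checklist can be automated in a few lines. A sketch, assuming `X` and `y` are NumPy arrays (the helper name is illustrative):

```python
import numpy as np

def preflight_check(X, y):
    """Run the basic data checks from the checklist above."""
    problems = []
    if X.shape[0] != y.shape[0]:
        problems.append("X and y have different numbers of samples")
    if np.isnan(X).any() or np.isinf(X).any():
        problems.append("NaN or Inf values in X")
    if set(np.unique(y)) != {0, 1}:
        problems.append("labels are not 0/1")
    if abs(X.mean()) > 1 or X.std() > 10:
        problems.append("features look unscaled")
    return problems

X = np.array([[0.5, -0.2], [1.2, 0.3], [-0.7, 0.9]])
y = np.array([0, 1, 1])
print(preflight_check(X, y))  # → [] (all checks pass)
```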
### Getting help

| Resource | Type | Link |
|:---------|:-----|:-----|
| GitHub Issues | Bug reports | Open Issue |
| Stack Overflow | Q&A | Ask Question |
| Documentation | Reference | Wiki Home |
| FAQ | Common questions | FAQ Page |

When opening an issue, please use the following template:
## Bug Description
Brief description of the problem
## Environment
- Python version: 3.8.5
- NumPy version: 1.21.0
- OS: Windows 10
## Code to Reproduce
```python
# Minimal code that reproduces the issue
model = LogisticRegression(learning_rate=0.01, n_iterations=1000)
model.fit(X_train, y_train)
```

## Expected Behavior
What you expected to happen

## Actual Behavior
What actually happened

## Error Message
```
Full error traceback here
```

## What I've Tried
- Tried reducing learning rate
- Checked data for NaN values
- ...
---
<div align="center">
[← Optimization Techniques](./Optimization-Techniques) | [FAQ →](./FAQ)
</div>