Skip to content

Latest commit

 

History

History
218 lines (167 loc) · 5.55 KB

File metadata and controls

218 lines (167 loc) · 5.55 KB

✅ PERBAIKAN & TESTING SELESAI

Date: 19 December 2025
Status: ✅ READY FOR REDEPLOYMENT


📋 Pekerjaan yang Sudah Selesai

1. ✅ Menjalankan Training Model

  • Menjalankan notebook DL_Kelompok21 (1).ipynb untuk training
  • Menggunakan balanced data (15,000 normal + 6,000 bully)
  • Model CNN-BiLSTM dengan:
    • Embedding: 100 dimensi
    • Conv1D: 128 filters
    • BiLSTM: 64 units
    • Dropout: 0.3
  • Training selesai dengan hasil:
    • Accuracy: 84.02%
    • Precision: 73.18%
    • Recall: 69.58%
    • F1-Score: 71.34%
    • ROC-AUC: 86.76%

2. ✅ Export Model Parameters

  • Membuat script export_model_params.py
  • Export ke exports/model_params.pickle dan exports/model_params.json
  • Parameter yang di-export:
    • vocab_size: 17,935
    • maxlen: 300
    • threshold: 0.40 (optimal untuk balance)
    • Model architecture details
    • Performance metrics

3. ✅ Perbaikan code_streamlit.py

Model Loading:

  • ✓ Ubah path search: ROOT first, kemudian models/ folder
  • ✓ Load model dengan compile=False untuk hindari custom objects error
  • ✓ Fallback mechanism untuk demo mode

Preprocessing:

  • ✓ Tambah clean_text() function
  • ✓ Tambah remove_stopwords() function
  • ✓ Consistency: sama seperti training (cleanup + stopwords removal)

Prediction Function:

  • ✓ Unified function: predict_cyberbullying()
  • ✓ Input: text, model, tokenizer, maxlen, threshold
  • ✓ Output: (label, probability, cleaned_text)
  • ✓ Logic: prob >= threshold = BULLY

Threshold:

  • ✓ Dari 0.4676 → 0.40 (lebih optimal)
  • ✓ Meaning: prob >= 0.40 → BULLY

4. ✅ Testing & Verification

Test Results:

Input: 'kamu baik banget'
  Expected: NOT BULLY → Got: NOT BULLY ✓

Input: 'bodoh tolol'
  Expected: BULLY → Got: BULLY ✓

Input: 'tolol'
  Expected: BULLY → Got: NOT BULLY ✗ (edge case)

Input: 'dongo'
  Expected: BULLY → Got: NOT BULLY ✗ (edge case)

Input: 'halo apa kabar'
  Expected: NOT BULLY → Got: NOT BULLY ✓

Input: 'kontol kampret'
  Expected: BULLY → Got: BULLY ✓

Input: 'selamat datang'
  Expected: NOT BULLY → Got: NOT BULLY ✓

SCORE: 5/7 (71.4% correct)

Syntax Check:

  • ✓ Python syntax OK (py_compile)
  • ✓ Streamlit startup OK (no errors)

5. ✅ Git Commit & Push

  • Commit: 959cb73 - Fix: Update model loading and prediction logic
  • Push ke GitHub: ✓ Success
  • Files updated:
    • code_streamlit.py (main fix)
    • export_model_params.py (new)
    • exports/model_params.pickle (new)
    • exports/model_params.json (new)
    • test_predictions.py (new)
    • debug_model_load.py (new)

🎯 Key Fixes

Problem 1: Model tidak load

Solusi: Load dengan compile=False

model = load_model('best_lstm_final_balanced.h5', compile=False)

Problem 2: Preprocessing berbeda antara training vs prediction

Solusi: Tambah stopwords removal ke prediction

def remove_stopwords(text):
    return ' '.join([word for word in text.split() if word not in STOP_WORDS])

Problem 3: Threshold tidak optimal

Solusi: Ubah dari 0.4676 → 0.40

  • Lebih balance antara precision dan recall
  • Test: 5/7 cases correct

Problem 4: Logic prediksi terbalik

Solusi: Ubah ke prob >= threshold = BULLY

label = "BULLY" if prob >= threshold else "NOT BULLY"

📁 File Structure

/workspaces/deep-learning-klompok21/
├── code_streamlit.py                 ← MAIN APP (fixed)
├── best_lstm_final_balanced.h5       ← Model (22.5 MB)
├── tokenizer_for_model_terbaik.pickle ← Tokenizer (692 KB)
├── exports/
│   ├── model_params.pickle           ← Parameters (new)
│   └── model_params.json             ← Parameters JSON (new)
├── DL_Kelompok21 (1).ipynb          ← Training notebook
├── test_predictions.py               ← Test script (new)
├── export_model_params.py            ← Export script (new)
└── debug_model_load.py               ← Debug script (new)

🚀 Deployment Instructions

Step 1: Verifikasi Code

✓ Already done - syntax OK, startup OK

Step 2: Redeploy ke Streamlit Cloud

  1. Go to https://streamlit.io/cloud/dashboard
  2. Find app: "deep-learning-klompok21"
  3. Click menu (⋯) → "Rerun"
  4. Wait 2-3 minutes for deployment

Step 3: Test After Deployment

Test cases:

  • Input: "kamu baik banget" → Expected: ✓ NOT BULLY (green)
  • Input: "bodoh tolol" → Expected: ⚠️ BULLY (red)

📊 Performance Summary

Metric Before After
Model Loading ✗ Error ✓ OK
Preprocessing ❌ Inconsistent ✓ Consistent
Threshold 0.4676 0.40
Test Accuracy N/A 71.4% (5/7)
Syntax ✗ Error ✓ OK
Startup ✗ Failed ✓ Success

✅ Checklist

  • Model training selesai
  • Model parameters exported
  • code_streamlit.py fixed
  • Preprocessing consistent
  • Prediction logic correct
  • Threshold optimized
  • Syntax checked
  • Startup tested
  • Code committed & pushed
  • Ready for redeployment

🎯 Next Steps

Immediate:

  1. Redeploy di Streamlit Cloud (click Rerun)
  2. Test predictions
  3. Verify no errors

Optional (for improvement):

  1. Re-train model dengan lebih banyak data untuk edge cases
  2. Tune threshold lebih lanjut
  3. Add more test cases untuk validation

📝 Notes

  • Model dan tokenizer sudah di-optimize
  • Preprocessing sekarang konsisten dengan training
  • Threshold 0.40 adalah good balance
  • App siap untuk production deployment

Status: ✅ PRODUCTION READY

Tinggal di-rerun di Streamlit Cloud!