This project is a web application dedicated to the history of the Titanic and the 1912 tragedy. Its key feature is an interactive survival predictor: a machine learning model trained on real Titanic passenger data.
- Detailed information about Titanic history and timeline of events
- Interactive survival chance calculator based on a machine learning model
- Statistics and facts about the disaster
- Responsive design in vintage Titanic-era style
Titanik/
│
├── app.py                    # Main Flask application file
├── save_model.py             # Script for saving the model
├── titanic_model.py          # Complete data analysis and model creation
├── titanic_model.joblib      # Saved machine learning model
├── feature_info.joblib       # Model feature information
│
├── templates/                # HTML templates
│   ├── layout.html           # Base template
│   ├── index.html            # Home page
│   ├── predict.html          # Survival prediction page
│   └── about.html            # About page
│
├── static/                   # Static files
│   ├── css/
│   │   └── style.css         # Site styles
│   │
│   └── images/               # Images
│       ├── titanic.jpg       # Titanic photo
│       ├── header_bg.jpg     # Header background
│       └── paper_texture.jpg # Background texture
│
├── train.csv                 # Training data about passengers
├── test.csv                  # Test data about passengers
├── gender_submission.csv     # Sample prediction submission file
│
└── README.md                 # Project documentation
- Python
- pandas and NumPy for data processing
- scikit-learn for model building
- Gradient Boosting (GradientBoostingClassifier) for predictions
- Feature engineering for improved accuracy
- Flask
- HTML/CSS
- Responsive design
- Vintage Titanic-era style design
- Clone the repository:
cd titanic-project
- Install required dependencies:
pip install flask pandas numpy scikit-learn xgboost joblib
- Train and save the model (if model file is missing):
python save_model.py
- Run the web application:
python app.py
- Open browser and navigate to:
http://127.0.0.1:5000
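
The training step above only needs to run when the saved model is missing. A minimal sketch of that check, assuming the two `.joblib` artifacts listed in the project structure sit next to `app.py`; the helper name `missing_artifacts` is hypothetical and not part of the project:

```python
from pathlib import Path

# Artifact names are taken from the project structure; the helper itself is hypothetical.
MODEL_ARTIFACTS = ("titanic_model.joblib", "feature_info.joblib")


def missing_artifacts(project_dir="."):
    """Return the names of model artifacts that are not present in project_dir."""
    root = Path(project_dir)
    return [name for name in MODEL_ARTIFACTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing:", ", ".join(missing), "-> run: python save_model.py")
    else:
        print("Model artifacts found; run: python app.py")
```

If the list comes back empty, the training step can be skipped and the application started directly.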
The project uses a gradient boosting algorithm to predict survival probability. The model considers the following factors:
- Passenger gender (most significant factor)
- Ticket class (1st, 2nd, or 3rd class)
- Age
- Number of relatives on board
- Ticket fare
- Port of embarkation
- Passenger title (indicating social status)
- Derived features (e.g., family size, whether passenger is alone)
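
The title and derived features listed above can be illustrated with a short sketch. It assumes the Kaggle column names `Name`, `SibSp`, and `Parch`, and uses plain dictionaries so that it runs without dependencies. The output names `Title`, `FamilySize`, and `IsAlone` and the helper `add_derived_features` are illustrative; what `titanic_model.py` actually computes may differ.

```python
import re

# Kaggle names look like "Braund, Mr. Owen Harris": the title sits between
# the comma and the first period.
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def add_derived_features(passenger):
    """Return a copy of a passenger record with Title, FamilySize and IsAlone added."""
    row = dict(passenger)
    match = TITLE_PATTERN.search(row.get("Name", ""))
    row["Title"] = match.group(1).strip() if match else "Unknown"
    # SibSp = siblings/spouses aboard, Parch = parents/children aboard.
    row["FamilySize"] = int(row["SibSp"]) + int(row["Parch"]) + 1  # +1 for the passenger
    row["IsAlone"] = int(row["FamilySize"] == 1)
    return row


example = add_derived_features(
    {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0}
)
print(example["Title"], example["FamilySize"], example["IsAlone"])  # prints: Mr 2 0
```

Keeping the original record untouched and returning a copy makes the function safe to apply both to training rows and to a single record built from the web form.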
The model achieves approximately 84% accuracy on the validation dataset.
To check survival chances, enter data in the form:
- Ticket Class: select class (1, 2, or 3)
- Gender: male or female
- Age: enter the passenger's age in years
- Relatives on Board: number of siblings/spouses and parents/children
- Ticket Fare: approximate cost in pounds (from 7 to 250)
- Port of Embarkation: Southampton (S), Cherbourg (C), or Queenstown (Q)
- Title: select appropriate title
After entering the data, click the "Calculate Chances" button to see the predicted survival probability as a percentage.
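
To make the form-to-result flow more concrete, here is a hedged sketch of how submitted values might be validated and how a probability could be rendered as a percentage. The field names (`pclass`, `sex`, `embarked`, and so on) and both helpers are hypothetical; the actual handling lives in `app.py` and may differ.

```python
# Allowed values mirror the form options described above.
VALID_CLASSES = {1, 2, 3}
VALID_SEXES = {"male", "female"}
VALID_PORTS = {"S", "C", "Q"}


def parse_form(form):
    """Validate raw form strings and convert them to typed values.

    Raises ValueError when a value falls outside the options the form offers.
    """
    pclass = int(form["pclass"])
    sex = form["sex"].strip().lower()
    port = form["embarked"].strip().upper()
    age = float(form["age"])
    fare = float(form["fare"])
    sibsp = int(form["sibsp"])
    parch = int(form["parch"])

    if pclass not in VALID_CLASSES:
        raise ValueError("ticket class must be 1, 2, or 3")
    if sex not in VALID_SEXES:
        raise ValueError("gender must be male or female")
    if port not in VALID_PORTS:
        raise ValueError("port must be S, C, or Q")
    if age < 0 or fare < 0 or sibsp < 0 or parch < 0:
        raise ValueError("numeric fields must be non-negative")

    return {"Pclass": pclass, "Sex": sex, "Age": age, "SibSp": sibsp,
            "Parch": parch, "Fare": fare, "Embarked": port,
            "Title": form["title"].strip()}


def format_chance(probability):
    """Render a probability in [0, 1] as a percentage string."""
    return f"{probability * 100:.1f}%"
```

With a scikit-learn classifier, the probability would typically come from `predict_proba` once the parsed values have been encoded the way the saved model expects.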
The project uses the Titanic passenger dataset from the Kaggle competition "Titanic: Machine Learning from Disaster":
- train.csv: contains data about 891 passengers with survival information
- test.csv: contains data about 418 passengers without survival information
- gender_submission.csv: sample prediction submission file
- Data: Kaggle - Titanic: Machine Learning from Disaster
- Images: Wikimedia Commons
- Historical facts: various historical sources about Titanic
This project is distributed under the MIT License. You are free to use, modify, and distribute the code, provided the copyright and license notice are retained.
© 2024 Titanic Project
