Click here to view the fully working Streamlit app.
Click here to view the blog.
- This project represents the culmination of my Data Science mentorship program, completed under the guidance of our mentor, Sir Nitish.
- Objective: Gain practical experience in the end-to-end data analysis lifecycle in the real estate domain.
1. 🌐 Web Scraping Challenges 🌐
Blocked IPs 🚫 The website blocked our IP addresses, which limited the pages we could access.
Solution ✅
Slow Down Requests: Reduced the frequency of requests to the website to avoid being blocked.
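A minimal sketch of what slowing down requests can look like, assuming pages are fetched one at a time. The function name `fetch_politely`, the `fetch` callable, and the delay values are hypothetical and not taken from the project's scraper; any HTTP client call can be passed in as `fetch`.

```python
import random
import time
from typing import Callable, Iterable


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    base_delay: float = 2.0,  # assumed pause in seconds; tune per site
    jitter: float = 1.0,      # assumed extra random pause in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, str]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    pages: dict[str, str] = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait a randomised interval so requests are spread out in time.
            sleep(base_delay + random.uniform(0, jitter))
        pages[url] = fetch(url)
    return pages
```

Passing `fetch` and `sleep` as parameters keeps the throttling logic independent of the HTTP library and easy to try out without hitting a real website.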
2. 🧹 Data Cleaning 🧹
Manual cleaning: Fixed spelling mistakes and focused on major areas.
Addressed Inconsistencies: Converted numeric data stored as strings, standardized inconsistent units of measurement, converted all text to lower case, etc.
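The kinds of fixes listed above could be sketched with Pandas roughly as follows. The column names (`sector`, `price`, `area`), the unit labels, and the conversion table are assumptions made for illustration; the actual dataset may use different names and units.

```python
import pandas as pd

# Hypothetical unit labels and their conversion factors to square feet.
TO_SQFT = {"sq.ft.": 1.0, "sq.yd.": 9.0, "sq.m.": 10.7639}


def clean_listings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a listings table (assumed columns)."""
    out = df.copy()

    # Text: trim whitespace and convert to lower case.
    out["sector"] = out["sector"].str.strip().str.lower()

    # Numbers stored as strings: drop thousands separators, then convert.
    out["price"] = pd.to_numeric(
        out["price"].str.replace(",", "", regex=False), errors="coerce"
    )

    # Mixed units: split "100 sq.yd." into a number and a unit label,
    # then convert everything to square feet.
    parts = out["area"].str.lower().str.extract(r"^\s*([\d,.]+)\s*(.+?)\s*$")
    value = pd.to_numeric(parts[0].str.replace(",", "", regex=False), errors="coerce")
    out["area_sqft"] = value * parts[1].map(TO_SQFT)

    return out.drop(columns="area")
```

Values that cannot be parsed, or units missing from the table, end up as `NaN` rather than raising, which makes them easy to review by hand afterwards.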
3. 🛠️ Feature Engineering 🛠️
Resolved errors in the built-up area column.
Modified the luxury score and furnishing details columns for clarity and accuracy.
4. 📊 Outlier Detection & Removal 📊
Did some manual work in an Excel worksheet.
Used tailored approaches for dealing with outliers in specific cases.
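The project does not say which rules were used for each case. As one common starting point, and not necessarily the method applied here, the interquartile range (IQR) rule can flag extreme values in a numeric column. The function names and the `price` column in the usage note are hypothetical.

```python
import pandas as pd


def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return the (low, high) limits of the IQR rule for a numeric series."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_iqr_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep only the rows whose value in `column` lies within the IQR limits."""
    low, high = iqr_bounds(df[column], k)
    return df[df[column].between(low, high)]
```

For example, `drop_iqr_outliers(df, "price")` would remove rows whose price falls outside the limits. A larger `k` keeps more rows, which can be a better fit for heavily skewed columns such as property prices.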
5. 📈 Exploratory Data Analysis 📈
Conducted exploratory data analysis (EDA) to uncover insights and patterns in the dataset, revealing key findings for informed decision-making.
I have developed a fully functional Streamlit web application as part of this project. The app allows users to interactively explore the data analysis results and visualize key insights.
To run this dashboard locally, follow these steps:
- Clone the `streamlit_app` repository.
- Download `data_viz1..csv`.
- Run the Streamlit app using `streamlit run app.py`.
Python 3.x
Streamlit
Pandas
matplotlib
We welcome contributions from the community. If you find any bugs or have ideas for new features, please open an issue or submit a pull request.
For any questions or feedback, feel free to reach out to [email protected].