You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: projects/predict-home-prices-with-python-and-linear-regression/predict-home-prices-with-python-and-linear-regression.mdx
@@ -21,10 +25,10 @@ Have you ever wanted to know how much an apartment or house costs to rent or own
21
25
22
26
Machine learning is a major field that lets computers take in data and learn patterns to make predictions and decisions. We will be using Python to learn about one of the fundamentals of predictive modeling in machine learning and, more specifically, linear regression. To help us out, we will use a few data science libraries:
23
27
24
-
- Scikit-learn
25
-
- Pandas
26
28
- NumPy
29
+
- Pandas
27
30
- Matplotlib
31
+
- Scikit-learn
28
32
29
33
## Linear Regression
30
34
@@ -34,7 +38,7 @@ In this tutorial, we will create a linear regression model to predict house pric
@@ -57,7 +61,7 @@ Next, sign into your account to get to the homepage. Then, select "Launch Notebo
57
61
Libraries! Gotta love 'em. In this tutorial, we will be using the following Python libraries for data analysis, data visualization, and machine learning:
58
62
59
63
- 🔢 [NumPy](https://numpy.org) offers a robust foundation for numerical operations and data analysis.
60
-
-📖[Pandas](https://pandas.pydata.org/) lets you to analyze, clean, explore, and manipulate data from different sources.
64
+
-🐼[Pandas](https://pandas.pydata.org/) lets you to analyze, clean, explore, and manipulate data from different sources.
61
65
- 📈 [Matplotlib](https://matplotlib.org) transforms your data into compelling visuals like 2D graphs and bar charts.
62
66
- 🤖 [Scikit-learn](https://scikit-learn.org), commonly known as Sklearn, provides a user-friendly interface for all kinds of machine learning.
63
67
@@ -72,7 +76,7 @@ After launching Notebook on Anaconda, open a new project:
72
76
73
77
## Getting Started
74
78
75
-
In this tutorial, we will be using a [data set](https://drive.google.com/file/d/1cNtzy7IwR753aXvpaYQRuEoxMfUNnYHl/view?usp=sharing) that compares house size and house prices of properties recently sold in Brooklyn's Dumbo neighborhood to predict the cost of houses based on size.
79
+
In this tutorial, we will be using a [dataset](https://drive.google.com/file/d/1cNtzy7IwR753aXvpaYQRuEoxMfUNnYHl/view?usp=sharing) that compares house size and house prices of properties recently sold in Brooklyn's Dumbo neighborhood to predict the cost of houses based on size.
76
80
77
81
**Note:** This data was gathered from [Zillow](https://bit.ly/3Sawp7f).
78
82
@@ -94,7 +98,7 @@ For a quick refresher, we imported the following libraries:
94
98
-`train_test_split` from `sklearn.model_selection` for training and testing the model.
95
99
- The `LinearRegression` class from `sklearn.linear_model` for implementing linear regression.
96
100
97
-
Next, let's import some data. You will need to download the [data set](https://drive.google.com/file/d/1cNtzy7IwR753aXvpaYQRuEoxMfUNnYHl/view?usp=sharing) and upload it to Anaconda.
101
+
Next, let's import some data. You will need to download the [dataset](https://drive.google.com/file/d/1cNtzy7IwR753aXvpaYQRuEoxMfUNnYHl/view?usp=sharing) and upload it to Anaconda.
@@ -140,7 +144,7 @@ Now that we have extracted our data and plotted on a graph, it's time for the li
140
144
141
145
A [train-test](https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets) splits the generated data into training and testing sets using the `train_test_split()` function from Sklearn.
142
146
143
-
The train test split is a model validation procedure commonly used in predictive machine learning to simulate how a model will work with new/unknown data. It is commonly used with large data sets or when you need a quick estimate.
147
+
The train test split is a model validation procedure commonly used in predictive machine learning to simulate how a model will work with new/unknown data. It is commonly used with large datasets or when you need a quick estimate.
144
148
145
149
Let's now add the following:
146
150
@@ -199,7 +203,7 @@ With our `plt` object, we use a scatter plot to visualize the actual and predict
199
203
200
204
That's it, you have now officially trained and visualized a predictive modeling algorithm!
201
205
202
-
To take your skills to the next level, consider sourcing data and utilizing linear regression for visualization. [Kaggle](https://www.kaggle.com/) is a great place to find data sets that have already been proven but you can also import any data. You can use linear regression to predict so many things:
206
+
To take your skills to the next level, consider sourcing data and utilizing linear regression for visualization. [Kaggle](https://www.kaggle.com/) is a great place to find datasets that have already been proven but you can also import any data. You can use linear regression to predict so many things:
203
207
204
208
- Video game sales based on reviews.
205
209
- Social media engagement based on follower growth.
0 commit comments