Commit 5f1c361

fixing pandas-profiling by removing it
1 parent 9da5e2a commit 5f1c361

2 files changed: +2 -38 lines changed

exploratory-data-analysis.ipynb

Lines changed: 1 addition & 37 deletions
@@ -35,7 +35,6 @@
  "outputs": [],
  "source": [
  "from skimpy import skim\n",
- "from pandas_profiling import ProfileReport\n",
  "import pandas as pd\n",
  "from pandas.api.types import CategoricalDtype\n",
  "from lets_plot import *\n",
@@ -1081,41 +1080,6 @@
  "skim(taxis)"
  ]
  },
- {
- "cell_type": "markdown",
- "id": "0a1fc099",
- "metadata": {},
- "source": [
- "### The **pandas-profiling** package\n",
- "\n",
- "The EDA we did using the built-in **pandas** functions was a bit limited and user-input heavy. The [**pandas-profiling**](https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/) library aims to automate the legwork of EDA for you. It generates 'profile' reports from a pandas DataFrame. For each column, many statistics are computed and then relayed in an interactive HTML report. To install it, run `pip install pandas-profiling` in the terminal.\n",
- "\n",
- "Let's generate a report on our dataset. If you are using a large dataset, you may wish to employ the `minimal=True` setting that cuts out a lot of computationally expensive extras:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "39739f74",
- "metadata": {},
- "outputs": [],
- "source": [
- "profile = ProfileReport(taxis, minimal=True, title=\"Profiling Report: Taxis Dataset\")\n",
- "profile.to_notebook_iframe()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f2494069",
- "metadata": {},
- "source": [
- "This is a full on report about everything in our dataset! We can see, for instance, that we have 14 variables and what kind each of them are.\n",
- "\n",
- "The alerts page shows where **pandas-profiling** really shines. It flags *potential* issues with the data that should be taken into account in any subsequent analysis. For example, although not relevant here, the report will say if there are very unbalanced classes in a low cardinality categorical variable.\n",
- "\n",
- "Another good package for automated EDA is [dataprep](https://dataprep.ai/)."
- ]
- },
  {
  "cell_type": "markdown",
  "id": "f2810c9e",
@@ -1156,7 +1120,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.10.12"
+ "version": "3.10.13"
  },
  "toc-showtags": true
  },

webscraping-and-apis.ipynb

Lines changed: 1 addition & 1 deletion
@@ -547,7 +547,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.10.12"
+ "version": "3.10.13"
  },
  "toc-showtags": true
  },
