Commit f72327b

Merge branch 'master' into python-repl-314
2 parents 1a7f913 + 8e4a5c8 commit f72327b

28 files changed: +231 -12 lines changed
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# How to Drop Null Values in pandas
+
+The materials contained in this download are designed to complement the Real Python tutorial [How to Drop Null Values in pandas](https://realpython.com/how-to-drop-null-values-in-pandas/).
+
+Consider creating a [Python virtual environment](https://realpython.com/python-virtual-environments-a-primer/) before installing the dependencies:
+
+```shell
+$ python3 -m venv .venv/ --prompt pandas-nulls
+$ source .venv/bin/activate
+(pandas-nulls) $ python -m pip install -r requirements.txt
+```
+
+Your download bundle contains the following four files:
+
+1. `drop_null_rows.py`
+2. `drop_null_columns.py`
+3. `drop_a_subset.py`
+4. `exercise_solutions.py`
+
+The first three files contain the code from different tutorial sections, while the fourth contains the solutions to the exercise.
+
+There are also two data files containing the data used throughout the tutorial:
+
+1. `sales_data_with_missing_values.csv`
+2. `grades.csv`
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+import pandas as pd
+
+pd.set_option("display.max_columns", None)
+
+sales_data = pd.read_csv(
+    "sales_data_with_missing_values.csv",
+    parse_dates=["order_date"],
+    date_format="%d/%m/%Y",
+).convert_dtypes(dtype_backend="pyarrow")
+
+print(sales_data.dropna(axis=0, subset=["discount", "sale_price"]))
+print(sales_data.dropna(how="all"))
+print(sales_data.dropna(thresh=5))
+print(sales_data.dropna(thresh=5, ignore_index=True))
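As a reading aid for the four `.dropna()` calls in this script, here is a minimal sketch on a hypothetical four-row frame. The name `toy` and all of its values are invented for illustration and are not the tutorial's CSV. It assumes pandas 2.0 or later, because `ignore_index` was added to `.dropna()` in that release.

```python
import pandas as pd

# Hypothetical data, not taken from sales_data_with_missing_values.csv.
toy = pd.DataFrame(
    {
        "order_number": [1, None, 3, 4],
        "discount": [True, None, None, None],
        "sale_price": [10.0, None, None, 20.0],
    }
)

# subset: drop rows that are null in any of the listed columns.
by_subset = toy.dropna(axis=0, subset=["discount", "sale_price"])

# how="all": drop only the rows in which every value is null.
not_all_null = toy.dropna(how="all")

# thresh=2: keep the rows that hold at least two non-null values.
at_least_two = toy.dropna(thresh=2)

# ignore_index=True: same rows, but the index is renumbered from zero.
renumbered = toy.dropna(thresh=2, ignore_index=True)

print(list(by_subset.index))
print(list(not_all_null.index))
print(list(at_least_two.index))
print(list(renumbered.index))
```

Comparing the last two index lists shows what `ignore_index=True` changes: the surviving rows are identical, and only their labels differ.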
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+import pandas as pd
+
+sales_data = pd.read_csv(
+    "sales_data_with_missing_values.csv",
+    parse_dates=["order_date"],
+    date_format="%d/%m/%Y",
+).convert_dtypes(dtype_backend="pyarrow")
+
+print(sales_data.dropna(axis="columns"))
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+import pandas as pd
+
+pd.set_option("display.max_columns", None)
+
+sales_data = pd.read_csv(
+    "sales_data_with_missing_values.csv",
+    parse_dates=["order_date"],
+    date_format="%d/%m/%Y",
+).convert_dtypes(dtype_backend="pyarrow")
+
+print(sales_data)
+print(sales_data.isna().sum())
+print(sales_data.dropna())
+
+clean_sales_data = sales_data.dropna()
+print(clean_sales_data)
+
+sales_data.dropna(inplace=True)
+print(sales_data)
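The script uses both the assignment form and the `inplace=True` form of `.dropna()`. The sketch below contrasts them on a hypothetical three-row frame; `frame` and its values are invented for illustration. It relies only on documented behavior: the default call returns a new DataFrame, while `inplace=True` modifies the caller and returns `None`.

```python
import pandas as pd

# Hypothetical data, not taken from the tutorial's CSV.
frame = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

# Count the nulls per column before dropping anything.
null_counts = frame.isna().sum()

# Default form: returns a cleaned copy and leaves `frame` untouched.
cleaned = frame.dropna()
rows_before = len(frame)

# In-place form: mutates `frame` itself and returns None.
result = frame.dropna(inplace=True)
rows_after = len(frame)

print(null_counts.to_dict())
print(len(cleaned), rows_before, rows_after, result)
```

Because the in-place form returns `None`, assigning its result back to the original name would discard the data.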
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+import pandas as pd
+
+grades = pd.read_csv(
+    "grades.csv",
+).convert_dtypes(dtype_backend="pyarrow")
+
+# 1. Use `.dropna()` in such a way that it permanently drops the row in the DataFrame containing only null values.
+grades.dropna(how="all", inplace=True)
+
+# 2. Display the rows for the exams that all students have completed.
+grades.dropna()
+
+# 3. Display any columns with no missing data.
+grades.dropna(axis=1)
+
+# 4. Display the exams sat by at least five students.
+grades.dropna(axis=0, thresh=6)  # Remember there are seven columns.
+
+# 5. Who else was in every exam that both S2 and S4 sat?
+grades.dropna(subset=["S2", "S4"]).dropna(axis=1, ignore_index=True)
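The `thresh=6` in solution 4 is one higher than the five students asked for because the `Subject` label is itself a non-null value in each row. A minimal sketch of that arithmetic, using a hypothetical three-student table (`mini` and `min_students` are invented names, and the data is not `grades.csv`):

```python
import pandas as pd

# Hypothetical miniature of a grades table: one label column, three students.
mini = pd.DataFrame(
    {
        "Subject": ["math", "art", "music"],
        "S1": [18, 15, None],
        "S2": [None, 17, None],
        "S3": [15, 9, 12],
    }
)

min_students = 2

# The Subject label counts toward thresh, so add one to the student count.
sat_by_two = mini.dropna(axis=0, thresh=min_students + 1)

# Alternative: move the label into the index so thresh counts only grades.
sat_by_two_indexed = mini.set_index("Subject").dropna(thresh=min_students)

print(list(sat_by_two["Subject"]))
print(list(sat_by_two_indexed.index))
```

Both calls keep the same exams. The second form avoids the off-by-one adjustment at the cost of changing the frame's index.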
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Subject,S1,S2,S3,S4,S5,S6
+math,18,,15,20,17,18
+science,26,35,19,,33,
+art,15,,9,17,18,14
+music,14,20,12,20,13,18
+history,18,19,,17,,18
+sport,20,17,20,17,18
+,,,,,
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+numpy==2.3.2
+pandas==2.3.2
+pyarrow==21.0.0
+python-dateutil==2.9.0.post0
+pytz==2025.2
+six==1.17.0
+tzdata==2025.2
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+order_number,order_date,customer_name,product_purchased,discount,sale_price
+,09/02/2025,Skipton Fealty,Chili Extra Virgin Olive Oil,TRUE,135.00
+70041,,Carmine Priestnall,,,150.00
+70042,09/02/2025,,Rosemary Olive Oil Candle,FALSE,78.00
+70043,10/02/2025,Lanni D'Ambrogi,,TRUE,19.50
+70044,10/02/2025,Tann Angear,Vanilla and Olive Oil Candle,,13.98
+70045,10/02/2025,Skipton Fealty,Basil Extra Virgin Olive Oil,TRUE,
+70046,11/02/2025,Far Pow,Chili Extra Virgin Olive Oil,FALSE,150.00
+70047,11/02/2025,Hill Group,Chili Extra Virgin Olive Oil,TRUE,135.00
+70048,11/02/2025,Devlin Nock,Lavender and Olive Oil Lotion,FALSE,39.96
+,,,,,
+70049,12/02/2025,Swift Inc,Garlic Extra Virgin Olive Oil,TRUE,936.00
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-from multi_line_docstring import get_harry_potter_books
+from multiline_docstring import get_harry_potter_books
 
 print(get_harry_potter_books.__doc__)

python-docstrings/google_style.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ def get_magic_items(user_id, include_potions=False):
 
     Args:
         user_id (int): The ID of the user whose items should be retrieved.
-        include_potions (bool, optional): Whether to include potions in the result.
+        include_potions (bool, optional): Whether to include potions.
 
     Returns:
         list[str]: A list of item names associated with the user.
