source/intro.md: 10 additions & 10 deletions
@@ -579,13 +579,13 @@ and wrote `pd.read_csv`. The dot means that the thing on the left (`pd`, i.e., t
 thing on the right (the `read_csv` function). In the case of `can_lang.loc[]`, the thing on the left (the `can_lang` data frame)
 *provides* the thing on the right (the `loc[]` operation). In Python,
 both packages (like `pandas`) *and* objects (like our `can_lang` data frame) can provide functions
-and other objects that we access using the dot syntax.
+and other objects that we access using the dot syntax.

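As a rough illustration of the dot syntax described in this hunk, the sketch below uses only the Python standard library rather than `pandas`, so it runs anywhere. The `math` and `datetime` modules and the variable names (`some_day`, `root`, `year`, `label`) are stand-ins chosen for this example and do not appear in the text being changed.

```python
import datetime
import math

# A package (module) on the left of the dot provides a function on the right.
# This would usually just be called a "function".
root = math.sqrt(16)

# An arbitrary date object, used only to have something to put left of the dot.
some_day = datetime.date(2016, 1, 1)

# An object provides another object: an *attribute* (no parentheses).
year = some_day.year

# An object provides a function: a *method* (called with parentheses).
label = some_day.isoformat()

print(root, year, label)
```

The same pattern carries over to `pandas`: `pd.read_csv` is a function provided by a package, while anything reached through a data frame with the dot is a method or an attribute of that data frame.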
 ```{note}
 A note on terminology: when an object `obj` provides a function `f` with the
 dot syntax (as in `obj.f()`), sometimes we call that function `f` a *method* of `obj` or an *operation* on `obj`.
-Similarly, when an object `obj` provides another object `x` with the dot syntax (as in `obj.x`), sometimes we call the object `x` an *attribute* of `obj`.
-We will use all of these terms throughout the book, as you will see them used commonly in the community.
+Similarly, when an object `obj` provides another object `x` with the dot syntax (as in `obj.x`), sometimes we call the object `x` an *attribute* of `obj`.
+We will use all of these terms throughout the book, as you will see them used commonly in the community.
 And just because we programmers like to be confusing for no apparent reason: we *don't* use the "method", "operation", or "attribute" terminology
 when referring to functions and objects from packages, like `pandas`. So for example, `pd.read_csv`
 would typically just be referred to as a function, but not as a method or operation, even though it uses the dot syntax.
@@ -665,18 +665,18 @@ a first one—so fear not and explore! To answer this small
 question-along-the-way, we need to divide each count in the `mother_tongue`
 column by the total Canadian population according to the 2016
 census—i.e., 35,151,728—and multiply it by 100. We can perform
-this computation using the code `100 * ten_lang["mother_tongue"] / canadian_population`.
+this computation using the code `100 * ten_lang["mother_tongue"] / canadian_population`.
 Then to store the result in a new column (or
 overwrite an existing column), we specify the name of the new
-column to create (or old column to modify), then the assignment symbol `=`,
+column to create (or old column to modify), then the assignment symbol `=`,
 and then the computation to store in that column. In this case, we will opt to
-create a new column called `mother_tongue_percent`.
+create a new column called `mother_tongue_percent`.

 ```{note}
 You will see below that we write the Canadian population in
 Python as `35_151_728`. The underscores (`_`) are just there for readability,
-and do not affect how Python interprets the number. In other words,
-`35151728` and `35_151_728` are treated identically in Python,
+and do not affect how Python interprets the number. In other words,
+`35151728` and `35_151_728` are treated identically in Python,
 although the latter is much clearer!
 ```

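The column-creation step and the underscore note in this hunk can be sketched as follows. This is a minimal example, not the book's actual code cell: the `toy_lang` data frame and its counts are made-up placeholders (not census values), and only the population figure `35_151_728`, the `mother_tongue` column name, and the `mother_tongue_percent` column name come from the text.

```python
import pandas as pd

# Hypothetical toy data: names and counts are placeholders, not real census values.
toy_lang = pd.DataFrame(
    {
        "language": ["Language A", "Language B", "Language C"],
        "mother_tongue": [1_000, 25_000, 64_000],
    }
)

# Underscores in a numeric literal are ignored by Python; they only aid readability.
canadian_population = 35_151_728
same_number = canadian_population == 35151728

# Name of the new column, then `=`, then the computation to store in it.
toy_lang["mother_tongue_percent"] = (
    100 * toy_lang["mother_tongue"] / canadian_population
)

print(same_number)
print(toy_lang)
```

Assigning to a column name that already exists would overwrite that column instead of creating a new one, which is the "old column to modify" case mentioned in the text.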
@@ -695,7 +695,7 @@ ten_lang
 ```

 The `ten_lang_percent` data frame shows that
-the ten Aboriginal languages in the `ten_lang` data frame were spoken
+the ten Aboriginal languages in the `ten_lang` data frame were spoken
 as a mother tongue by between 0.008% and 0.18% of the Canadian population.

 ## Combining analysis steps with chaining and multiline expressions
@@ -831,7 +831,7 @@ each language. When you move on to more complicated analyses, this issue only
 gets worse. In contrast, a *visualization* would convey this information in a much
 more easily understood format.
 Visualizations are a great tool for summarizing information to help you
-effectively communicate with your audience, and creating effective data visualizations
+effectively communicate with your audience, and creating effective data visualizations
 is an essential component of any data
 analysis. In this section we will develop a visualization of the
 ten Aboriginal languages that were most often reported in 2016 as mother tongues in
@@ -127,20 +127,20 @@ Note that there is no forward slash at the beginning of a relative path; if we a
 Python would look for a folder named `data` in the root folder of the computer—but that doesn't exist!

 Aside from specifying places to go in a path using folder names (like `data` and `worksheet_02`), we can also specify two additional
-special places: the *current directory* and the *previous directory*. We indicate the current working directory with a single dot `.`, and
+special places: the *current directory* and the *previous directory*. We indicate the current working directory with a single dot `.`, and
 the previous directory with two dots `..`. So for instance, if we wanted to reach the `bike_share.csv` file from the `worksheet_02` folder, we could
 use the relative path `../tutorial_01/bike_share.csv`. We can even combine these two; for example, we could reach the `bike_share.csv` file using
-the (very silly) path `../tutorial_01/../tutorial_01/./bike_share.csv` with quite a few redundant directions: it says to go back a folder, then open `tutorial_01`,
+the (very silly) path `../tutorial_01/../tutorial_01/./bike_share.csv` with quite a few redundant directions: it says to go back a folder, then open `tutorial_01`,
 then go back a folder again, then open `tutorial_01` again, then stay in the current directory, then finally get to `bike_share.csv`. Whew, what a long trip!

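One way to check that the "very silly" path in this hunk really points to the same file as the short one is to normalize both. The sketch below uses `posixpath.normpath` from the standard library, which simplifies `.` and `..` purely as text (it does not look at the disk or follow symbolic links). The starting folder `/course/worksheet_02` is a hypothetical location invented for this example; the text does not say where `worksheet_02` lives.

```python
import posixpath

silly = "../tutorial_01/../tutorial_01/./bike_share.csv"
direct = "../tutorial_01/bike_share.csv"

# Collapse the redundant "." and "tutorial_01/.." steps in the silly path.
simplified = posixpath.normpath(silly)

# Hypothetical absolute location of the worksheet_02 folder (an assumption).
start = "/course/worksheet_02"

# Follow each relative path starting from that folder.
resolved_silly = posixpath.normpath(posixpath.join(start, silly))
resolved_direct = posixpath.normpath(posixpath.join(start, direct))

print(simplified)
print(resolved_silly)
print(resolved_direct)
```

Both relative paths end up at the same file, so the shorter one is the better choice in practice.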
-So which kind of path should you use: relative, or absolute? Generally speaking, you should use relative paths.
-Using a relative path helps ensure that your code can be run
+So which kind of path should you use: relative, or absolute? Generally speaking, you should use relative paths.
+Using a relative path helps ensure that your code can be run
 on a different computer (and as an added bonus, relative paths are often shorter—easier to type!).
 This is because a file's relative path is often the same across different computers, while a
-file's absolute path (the names of
-all of the folders between the computer's root, represented by `/`, and the file) isn't usually the same
-across different computers. For example, suppose Fatima and Jayden are working on a
-project together on the `happiness_report.csv` data. Fatima's file is stored at
+file's absolute path (the names of
+all of the folders between the computer's root, represented by `/`, and the file) isn't usually the same
+across different computers. For example, suppose Fatima and Jayden are working on a
+project together on the `happiness_report.csv` data. Fatima's file is stored at

 ```
 /home/Fatima/project/data/happiness_report.csv
@@ -158,7 +158,7 @@ their different usernames. If Jayden has code that loads the
 `happiness_report.csv` data using an absolute path, the code won't work on
 Fatima's computer. But the relative path from inside the `project` folder
 (`data/happiness_report.csv`) is the same on both computers; any code that uses
-relative paths will work on both! In the additional resources section,
+relative paths will work on both! In the additional resources section,