LICENSE.md (8 additions & 10 deletions)
@@ -1,23 +1,21 @@
1
-
**CC BY 2.5 CA**
1
+
# License
2
2
3
-
An Introduction to Data Science is
4
-
made available under the **Attribution-NonCommercial-ShareAlike 4.0 International** ([CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)).
3
+
This textbook is made available under the **Attribution-NonCommercial-ShareAlike 4.0 International** ([CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)).
5
4
6
5
This is a human-readable summary of (and not a substitute for) the [license](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).
7
6
8
7
## You are free to:
9
-
**Share** — copy and redistribute the material in any medium or format
10
-
**Adapt** — remix, transform, and build upon the material for any purpose, even commercially.
8
+
9
+
-**Share** — copy and redistribute the material in any medium or format
10
+
-**Adapt** — remix, transform, and build upon the material
11
11
12
12
The licensor cannot revoke these freedoms as long as you follow the license terms.
13
13
14
14
## Under the following terms:
15
15
16
-
**Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
17
-
18
-
**NonCommercial** — You may not use the material for commercial purposes.
19
-
20
-
**ShareAlike** — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
16
+
-**Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
17
+
-**NonCommercial** — You may not use the material for commercial purposes.
18
+
-**ShareAlike** — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
21
19
22
20
**No additional restrictions** — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
README.md (35 additions & 15 deletions)
@@ -1,5 +1,11 @@
1
-
## Introduction to Data Science
2
-
This is the source for the Introduction to Data Science textbook.
1
+
## Data Science: A First Introduction
2
+
This is the source for the *Data Science: A First Introduction* textbook.
3
+
4
+
## License Information
5
+
6
+
This textbook is offered under
7
+
the [Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License](https://creativecommons.org/licenses/by-nc-sa/4.0/).
8
+
See [the license file](LICENSE.md) for more information.
3
9
4
10
## Setup and Build
5
11
@@ -20,22 +26,19 @@ We provide instructions for both methods here.
20
26
21
27
To build the **html version** of the book, navigate to the repository root folder and run
22
28
```
23
-
./build.sh
29
+
./build_html.sh
24
30
```
25
31
from the command line. This command automatically spawns a docker container
26
-
with the `ubcdsci/intro-to-ds` image, runs the script `build.R` from within the container,
27
-
and then stops the container. It may ask you for a password; this is the password for the
28
-
`sudo` command on your computer. Typically this is just your usual computer user account password.
29
-
But if your setup doesn't require you to use `sudo` to start a docker container, you can just
30
-
open `build.sh` and delete the word `sudo` at the start of the script.
32
+
with the `ubcdsci/intro-to-ds` image, runs the script `_build_html.r` from within the container,
33
+
and then stops the container.
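
A wrapper of this kind might look roughly like the sketch below. It is not the repository's actual `build_html.sh`: the mount point `/home/rstudio/book`, the use of `Rscript`, and the helper name `build_cmd` are assumptions made for illustration. The function only prints the `docker run` command, so it can be inspected without Docker installed.

```shell
#!/bin/sh
# Hypothetical sketch of a build_html.sh-style wrapper; the real script may differ.

IMAGE="ubcdsci/intro-to-ds"          # image name taken from the README
MOUNT_POINT="/home/rstudio/book"     # assumed path inside the container

# build_cmd REPO_DIR: print the docker command that would build the HTML book.
# --rm removes the container once the script exits,
# -v   mounts the repository into the container,
# -w   sets the working directory to the mounted repository.
build_cmd() {
  printf 'docker run --rm -v %s:%s -w %s %s Rscript _build_html.r\n' \
    "$1" "$MOUNT_POINT" "$MOUNT_POINT" "$IMAGE"
}

# Dry run: show the command for the current directory.
build_cmd "$(pwd)"
```

Piping the printed line to `sh` would start the build, provided the repository path contains no spaces. A PDF variant would presumably differ only in the script name (`pdf/_build_pdf.r`).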
31
34
32
35
To build the **PDF version** of the book, instead run
33
36
```
34
-
./pdfbuild.sh
37
+
./build_pdf.sh
35
38
```
36
-
The same comments regarding passwords and `sudo` as above apply here.
39
+
This command again spawns a docker container and runs `pdf/_build_pdf.r` inside the container.
37
40
38
-
### With RStudio
41
+
### With RStudio (HTML only)
39
42
40
43
1. Run RStudio inside the `ubcdsci/intro-to-ds` docker container:
41
44
- in terminal, navigate to the root of this project repo
@@ -120,23 +123,39 @@ bookdown::gitbook:
120
123
- when saying that students will do things in code, always say "in R"
121
124
- "you will be able to" (not "students will be able to", "the reader will be able to")
122
125
126
+
#### Captions
127
+
- captions should be sentence formatted and end with a period
128
+
- If you have special characters (particularly underscores, quotation marks, plus signs, other LaTeX math symbols) make sure to separate
129
+
the caption out of the code chunk like so
130
+
```
131
+
(ref:blah)
132
+
133
+
\`\`\`
134
+
{r blah, other_options}
135
+
code here
136
+
\`\`\`
137
+
```
138
+
123
139
#### Equations
124
140
- make sure all equations get capitalized labels ("Equation \\@ref(blah)", not "equation below" or "equation above")
125
141
126
142
#### Figures
127
143
- make sure all figures get (capitalized) labels ("Figure \\@ref(blah)", not "figure below" or "figure above")
128
144
- make sure all figures get captions
129
145
- specify image widths in terms of linewidth percent (e.g. `out.width="70%"`)
130
-
- center align all images
146
+
- center align all images via `fig.align = "center"`
131
147
- make sure we have permission for every figure/logo that we use
132
148
- Make sure all figures follow the visualization principles in Chapter 4
133
149
- Make sure axes are set appropriately to not inflate/deflate differences artificially *where it does not compromise clarity* (e.g. in the classification
134
150
chapter there are a few examples where zoomed-in accuracy axes are better than using the full range 0 to 1)
151
+
-
135
152
136
153
#### Tables
137
154
- make sure all tables get capitalized labels ("Table \\@ref(blah)", not "table below" or "table above")
138
155
- make sure all tables get captions
139
156
- make sure the row + column spacing is reasonable
157
+
- Do not put links in table captions; they break PDF rendering
158
+
- Do not put underscores in table captions; they break PDF rendering
140
159
141
160
#### Note boxes
142
161
- note boxes should be typeset as quote boxes using `>` and start with **Note:**
@@ -178,6 +197,10 @@ Generally the book uses American spelling. Some common British vs American and C
178
197
- c vs s: defense (not defence)
179
198
- er vs re: center (not centre)
180
199
200
+
#### Whitespace
201
+
We need a line of whitespace before and after code fences (code surrounded by three backticks above and below). This is for readability,
202
+
and it is essential for figure captions.
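
The whitespace rule could be checked mechanically. The function below is a minimal sketch, not part of the repository: the name `check_fences` is made up, and it only recognizes fences that start at the beginning of a line, so indented fences are not handled.

```shell
# check_fences FILE: report code fences that are missing a blank line
# before the opening fence or after the closing fence.
check_fences() {
  awk '
    /^```/ {
      if (in_fence) {
        # Closing fence: remember its line number so the next line can be checked.
        in_fence = 0
        just_closed = NR
      } else {
        # Opening fence: the previous line must be blank (unless this is line 1).
        if (NR > 1 && prev !~ /^[ \t]*$/)
          print FILENAME ":" NR ": no blank line before opening fence"
        in_fence = 1
        just_closed = 0
      }
      prev = $0
      next
    }
    {
      # First line after a closing fence must be blank.
      if (just_closed && $0 !~ /^[ \t]*$/)
        print FILENAME ":" just_closed ": no blank line after closing fence"
      just_closed = 0
      prev = $0
    }
  ' "$1"
}
```

Calling `check_fences README.md` (or any chapter source file) prints one line per violation and nothing when every fence is surrounded by blank lines.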
203
+
181
204
#### PDF Output
182
205
These are absolute last steps when rendering the PDF output:
183
206
- Look for and fix bad line breaks (e.g. with only one word on the next line, orphans, and widows)
- `data/` stores datasets processed during compile
212
235
- `docs/.nojekyll` tells github's static site builder not to run [Jekyll](https://jekyllrb.com/). This avoids Jekyll deleting the folder `docs/_main_files` (as it starts with an underscore)
213
236
214
-
## License Information
215
-
216
-
[Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/)
Tiffany Timbers is an Assistant Professor of Teaching in the Department of Statistics and Co-Director for the Master of Data Science program (Vancouver Option) at the University of British Columbia. In these roles she teaches and develops curriculum around the responsible application of Data Science to solve real-world problems. One of her favorite courses to teach is a graduate course on collaborative software development, which focuses on teaching how to create R and Python packages using modern tools and workflows.
4
+
5
+
6
+
Trevor Campbell is an Assistant Professor in the Department of Statistics at the University of British Columbia. His research focuses on automated, scalable Bayesian inference algorithms, Bayesian nonparametrics, streaming data, and Bayesian theory. He was previously a postdoctoral associate advised by Tamara Broderick in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and Institute for Data, Systems, and Society (IDSS) at MIT, a Ph.D. candidate under Jonathan How in the Laboratory for Information and Decision Systems (LIDS) at MIT, and before that he was in the Engineering Science program at the University of Toronto.
7
+
8
+
9
+
Melissa Lee is an Assistant Professor of Teaching in the Department of Statistics at the University of British Columbia. She teaches and develops curriculum for undergraduate statistics and data science courses. Her work focuses on student-centered approaches to teaching, developing and assessing open educational resources, and promoting equity, diversity, and inclusion initiatives.