Commit 44c73d3

Merge pull request #373 from UBC-DSCI/dev
merge dev into master
2 parents c7cb0b4 + 24040a4 commit 44c73d3

258 files changed, with 8593 additions and 3294 deletions

README.md

Lines changed: 5 additions & 2 deletions
@@ -142,13 +142,16 @@ bookdown::gitbook:
 #### Figures
 - make sure all figures get (capitalized) labels ("Figure \\@ref(blah)", not "figure below" or "figure above")
 - make sure all figures get captions
-- specify image widths in terms of linewidth percent (e.g. `out.width="70%"`)
+- specify image widths of pngs and jpegs in terms of linewidth percent
+(e.g. `out.width="70%"`),
+for plots we create in R use `fig.width` and `fig.height`.
 - center align all images via `fig.align = "center"`
 - make sure we have permission for every figure/logo that we use
 - Make sure all figures follow the visualization principles in Chapter 4
 - Make sure axes are set appropriately to not inflate/deflate differences artificially *where it does not compromise clarity* (e.g. in the classification
 chapter there are a few examples where zoomed-in accuracy axes are better than using the full range 0 to 1)
--
+- Fig size for bar charts should be: `fig.width=5, fig.height=3` (an exception are figs 1.7 & 1.8 so that we can read the axis labels)
+- cropping width for syntax diagrams is 1625 (done using `image_crop`)
 
 #### Tables
 - make sure all tables get capitalized labels ("Table \\@ref(blah)", not "table below" or "table above")

_bookdown.yml

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ language:
 ui:
 chapter_name: "Chapter "
 delete_merged_file: true
-rmd_files: ["index.Rmd", "intro.Rmd", "reading.Rmd", "wrangling.Rmd", "viz.Rmd", "classification1.Rmd", "classification2.Rmd", "regression1.Rmd", "regression2.Rmd", "clustering.Rmd", "inference.Rmd", "jupyter.Rmd", "version-control.Rmd", "setup.Rmd", "appendixA.Rmd", "references.Rmd"]
+rmd_files: ["index.Rmd", "authors.Rmd", "intro.Rmd", "reading.Rmd", "wrangling.Rmd", "viz.Rmd", "classification1.Rmd", "classification2.Rmd", "regression1.Rmd", "regression2.Rmd", "clustering.Rmd", "inference.Rmd", "jupyter.Rmd", "version-control.Rmd", "setup.Rmd", "appendixA.Rmd", "references.Rmd"]

acknowledgements.Rmd

Lines changed: 21 additions & 0 deletions
@@ -1,2 +1,23 @@
 # Acknowledgments {-}
 
+We'd like to thank everyone that has contributed to the development of
+[*Data Science: A First Introduction*](https://ubc-dsci.github.io/introduction-to-datascience/).
+This is an open source textbook that began as a collection of course readings
+for DSCI 100, a new introductory data science course
+at the University of British Columbia (UBC).
+Several faculty members in the UBC Department of Statistics
+were pivotal in shaping the direction of that course,
+and as such contributed greatly to the broad structure and
+list of topics in this book. We would especially like to thank Matías
+Salibían-Barrera for his mentorship during the initial development and roll-out
+of both DSCI 100 and this book. His door was always open when
+we needed to chat about how to
+best introduce and teach data science our first year students.
+
+We also owe a debt of gratitude to all of the students of DSCI 100 over the past
+few years. They provided invaluable feedback on the book and worksheets;
+they found bugs for us (and stood by very patiently in class while
+we frantically fixed those bugs); and they brought a level of enthusiasm to the class
+that sustained us during the hard work of creating a new course and writing a textbook.
+Our interactions with them taught us how to teach data science, and that learning
+is reflected in the content of this book.

authors.Rmd

Lines changed: 22 additions & 3 deletions
@@ -1,9 +1,28 @@
 # About the authors {-}
 
-Tiffany Timbers is an Assistant Professor of Teaching in the Department of Statistics and Co-Director for the Master of Data Science program (Vancouver Option) at the University of British Columbia. In these roles she teaches and develops curriculum around the responsible application of Data Science to solve real-world problems. One of her favorite courses she teaches is a graduate course on collaborative software development, which focuses on teaching how to create R and Python packages using modern tools and workflows.
+Tiffany Timbers is an Assistant Professor of Teaching in the Department of
+Statistics and Co-Director for the Master of Data Science program (Vancouver
+Option) at the University of British Columbia. In these roles she teaches and
+develops curriculum around the responsible application of Data Science to solve
+real-world problems. One of her favorite courses she teaches is a graduate
+course on collaborative software development, which focuses on teaching how to
+create R and Python packages using modern tools and workflows.
 
 
-Trevor Campbell is an Assistant Professor in the Department of Statistics at the University of British Columbia. His research focuses on automated, scalable Bayesian inference algorithms, Bayesian nonparametrics, streaming data, and Bayesian theory. He was previously a postdoctoral associate advised by Tamara Broderick in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and Institute for Data, Systems, and Society (IDSS) at MIT, a Ph.D. candidate under Jonathan How in the Laboratory for Information and Decision Systems (LIDS) at MIT, and before that he was in the Engineering Science program at the University of Toronto.
+Trevor Campbell is an Assistant Professor in the Department of Statistics at
+the University of British Columbia. His research focuses on automated, scalable
+Bayesian inference algorithms, Bayesian nonparametrics, streaming data, and
+Bayesian theory. He was previously a postdoctoral associate advised by Tamara
+Broderick in the Computer Science and Artificial Intelligence Laboratory
+(CSAIL) and Institute for Data, Systems, and Society (IDSS) at MIT, a Ph.D.
+candidate under Jonathan How in the Laboratory for Information and Decision
+Systems (LIDS) at MIT, and before that he was in the Engineering Science
+program at the University of Toronto.
 
 
-Melissa Lee is an Assistant Professor of Teaching in the Department of Statistics at the University of British Columbia. She teaches and develops curriculum for undergraduate statistics and data science courses. Her work focuses on student-centered approaches to teaching, developing and assessing open educational resources, and promoting equity, diversity, and inclusion initiatives.
+Melissa Lee is an Assistant Professor of Teaching in the Department of
+Statistics at the University of British Columbia. She teaches and develops
+curriculum for undergraduate statistics and data science courses. Her work
+focuses on student-centered approaches to teaching, developing and assessing
+open educational resources, and promoting equity, diversity, and inclusion
+initiatives.

build_html.sh

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 # Script to generate HTML book
-docker run --rm -m 5g -v $(pwd):/home/rstudio/introduction-to-datascience ubcdsci/intro-to-ds:v0.12.0 /bin/bash -c "cd /home/rstudio/introduction-to-datascience; Rscript _build_html.r"
+docker run --rm -m 5g -v $(pwd):/home/rstudio/introduction-to-datascience ubcdsci/intro-to-ds:v0.21.0 /bin/bash -c "cd /home/rstudio/introduction-to-datascience; Rscript _build_html.r"
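
The image tag is hard-coded in each build script, so this commit has to bump it from v0.12.0 to v0.21.0 in both build_html.sh and build_pdf.sh. A minimal sketch of one way to keep the tag in a single place follows. `IMAGE_TAG`, `BOOK_DIR`, and `book_build_cmd` are hypothetical names that do not appear in the repository, and the function only prints the command (a dry run) rather than invoking Docker.

```shell
#!/bin/sh
# Hypothetical sketch: IMAGE_TAG, BOOK_DIR and book_build_cmd are illustrative
# names, not part of the repository.
IMAGE_TAG="${IMAGE_TAG:-v0.21.0}"   # tag this commit switches to
BOOK_DIR="/home/rstudio/introduction-to-datascience"

# Print (do not run) the docker command for a working directory ($1)
# and an R build script ($2).
book_build_cmd() {
  printf 'docker run --rm -m 5g -v %s:%s ubcdsci/intro-to-ds:%s /bin/bash -c "cd %s; Rscript %s"\n' \
    "$(pwd)" "$BOOK_DIR" "$IMAGE_TAG" "$1" "$2"
}

book_build_cmd "$BOOK_DIR" _build_html.r       # HTML build
book_build_cmd "$BOOK_DIR/pdf" _build_pdf.r    # PDF build
```

With this layout a later version bump would touch one line, or could be supplied from the environment as `IMAGE_TAG=... sh build.sh`, where `build.sh` is again an assumed file name.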

build_pdf.sh

Lines changed: 6 additions & 2 deletions
@@ -2,7 +2,9 @@
 
 # Copy files
 cp references.bib pdf/
+cp authors.Rmd pdf/
 cp preface-text.Rmd pdf/
+cp acknowledgements.Rmd pdf/
 cp intro.Rmd pdf/
 cp reading.Rmd pdf/
 cp wrangling.Rmd pdf/
@@ -22,11 +24,13 @@ cp -r data/ pdf/data
 cp -r img/ pdf/img
 
 # Build the book with bookdown
-docker run --rm -m 5g -v $(pwd):/home/rstudio/introduction-to-datascience ubcdsci/intro-to-ds:v0.12.0 /bin/bash -c "cd /home/rstudio/introduction-to-datascience/pdf; Rscript _build_pdf.r"
+docker run --rm -m 5g -v $(pwd):/home/rstudio/introduction-to-datascience ubcdsci/intro-to-ds:v0.21.0 /bin/bash -c "cd /home/rstudio/introduction-to-datascience/pdf; Rscript _build_pdf.r"
 
 # clean files in pdf dir
 rm -rf pdf/references.bib
-rm -rf pdf/preface-text.Rmd
+rm -rf pdf/authors.Rmd
+rm -rf pdf/preface-text.Rmd
+rm -rf pdf/acknowledgements.Rmd
 rm -rf pdf/intro.Rmd
 rm -rf pdf/reading.Rmd
 rm -rf pdf/wrangling.Rmd
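
In build_pdf.sh every file that is copied into `pdf/` has a matching `rm -rf` line, which is why adding `authors.Rmd` and `acknowledgements.Rmd` requires edits in both the copy block and the cleanup block. As a sketch only, the two steps could be driven from one list so they cannot drift apart. `PDF_INPUTS`, `stage_pdf_inputs`, and `clean_pdf_inputs` are hypothetical names, and the list holds just the file names visible in this diff, not the lines hidden between the two hunks.

```shell
#!/bin/sh
# Hypothetical sketch: PDF_INPUTS, stage_pdf_inputs and clean_pdf_inputs are
# illustrative names. Only file names visible in the diff are listed.
PDF_INPUTS="references.bib authors.Rmd preface-text.Rmd acknowledgements.Rmd intro.Rmd reading.Rmd wrangling.Rmd"

# Copy every listed input into the destination directory ($1).
stage_pdf_inputs() {
  mkdir -p "$1"
  for f in $PDF_INPUTS; do
    cp "$f" "$1/" || return 1
  done
}

# Remove the staged copies from the destination directory ($1) after the build.
clean_pdf_inputs() {
  for f in $PDF_INPUTS; do
    rm -f "$1/$f"
  done
}
```

Under these assumptions the script would call `stage_pdf_inputs pdf` before the `docker run` line and `clean_pdf_inputs pdf` after it, so a new chapter file is added to the list once.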
