Commit cd44184

foreword py update
1 parent bc6b200 commit cd44184

2 files changed: 8 additions, 7 deletions

source/_toc.yml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ parts:
 - caption: Front Matter
 chapters:
 - file: preface-text.md
-#- file: foreword.md
+- file: foreword-text.md
 - file: acknowledgements.md
 - file: authors.md
 - caption: Chapters

source/foreword-text.md

File mode changed: 100755 to 100644
Lines changed: 7 additions & 6 deletions
@@ -13,13 +13,13 @@ kernelspec:
 name: python3
 ---

-# Foreword -- TBD
+# Foreword

 *Roger D. Peng*

 *Johns Hopkins Bloomberg School of Public Health*

-*2022-01-04*
+*2023-11-30*

 The field of data science has expanded and grown significantly in recent years,
 attracting excitement and interest from many different directions. The demand for introductory
@@ -44,9 +44,10 @@ is and what the implications are for the activities in which members of the fiel

 The first important concept addressed by this book is tidy data, which is a format for
 tabular data formally introduced to the statistical community in a 2014 paper by Hadley
-Wickham. The tidy data organization strategy has proven a powerful abstract concept for
-conducting data analysis, in large part because of the vast toolchain implemented in the
-Tidyverse collection of R packages. The second key concept is the development of workflows
+Wickham. Although originally popularized within the R programming language community
+via the Tidyverse package collection, the tidy data format is a language-independent concept
+that facilitates the application of powerful generalized data cleaning and wrangling tools.
+The second key concept is the development of workflows
 for reproducible and auditable data analyses. Modern data analyses have only grown in
 complexity due to the availability of data and the ease with which we can implement complex
 data analysis procedures. Furthermore, these data analyses are often part of
@@ -61,7 +62,7 @@ collaboration is a core element of data science.
 This book takes these core concepts and focuses on how one can apply them to *do* data
 science in a rigorous manner. Students who learn from this book will be well-versed in
 the techniques and principles behind producing reliable evidence from data. This book is
-centered around the use of the R programming language within the tidy data framework,
+centered around the implementation of the tidy data framework within the Python programming language,
 and as such employs the most recent advances in data analysis coding. The use of Jupyter
 notebooks for exercises immediately places the student in an environment that encourages
 auditability and reproducibility of analyses. The integration of git and GitHub into the