Commit ae5d370

closes Issue on page /data-visualise.html #25 to Add contributor #36
1 parent 1bdd6cd commit ae5d370

5 files changed: +63 -10 lines changed

data-transform.ipynb

Lines changed: 1 addition & 1 deletion
@@ -882,7 +882,7 @@
  "id": "dd91d87a",
  "metadata": {},
  "source": [
- "Of course, this is quite tedious if you have lots of columns! There are methods that can help make this easier depending on your context. Perhaps you'd just liked to sort the columns in order? This can be achieved by combining `sorted()` and the `reindex()` command (which works for rows or columns) with `axis=1`, which means the second axis ie columns."
+ "Of course, this is quite tedious if you have lots of columns! There are methods that can help make this easier depending on your context. Perhaps you'd just liked to sort the columns in order? This can be achieved by combining `sorted()` and the `reindex()` command (which works for rows or columns) with `axis=1`, which means the second axis (i.e. columns)."
  ]
  },
  {
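
Reviewer note: the sentence touched in this hunk describes combining `sorted()` with `reindex(..., axis=1)`. A minimal sketch of that pattern follows; the frame and its column names are hypothetical and are not taken from the notebook.

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately out of order.
df = pd.DataFrame({"col2": [5, 6], "col0": [1, 2], "col1": [3, 4]})

# sorted() puts the column labels in order; reindex(..., axis=1)
# applies that order along the second axis (i.e. columns).
df_sorted = df.reindex(sorted(df.columns), axis=1)

print(list(df_sorted.columns))
```

Only the column order changes here; the values in each column stay attached to their labels.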

data-visualise.ipynb

Lines changed: 5 additions & 4 deletions
@@ -50,7 +50,7 @@
  "source": [
  "We'll also need to have the **pandas** package installed—this package, which we'll be seeing a lot of, is for data. You can similarly install it by running `pip install pandas` on the command line.\n",
  "\n",
- "Finally, we'll also need some data (you can't science with data). We'll be using the Palmer penguins dataset. Unusually, this can also be installed as a package—normally you would load data from a file, but these data are so popular for tutorials they've found their way into an installable package. Run `pip install palmerpenguins` to get these data."
+ "Finally, we'll also need some data (you can't science without data). We'll be using the Palmer penguins dataset. Unusually, this can also be installed as a package—normally you would load data from a file, but these data are so popular for tutorials they've found their way into an installable package. Run `pip install palmerpenguins` to get these data."
  ]
  },
  {
@@ -182,7 +182,7 @@
  "id": "574fe39f",
  "metadata": {
  "tags": [
- "remove-cell"
+ "remove-input"
  ]
  },
  "outputs": [],
@@ -1069,9 +1069,10 @@
  "\n",
  "Start by carefully comparing the code that you're running to the code in the book: A misplaced character can make all the difference!\n",
  "Make sure that every `(` is matched with a `)` and every `\"` is paired with another `\"`. In Visual Studio Code, you can get extensions that colour match brackets so you can easily see if you closed them or not.\n",
+ "\n",
  "Sometimes you'll run the code and nothing happens.\n",
  "\n",
- "One common problem when creating **letsplot** graphics is to put the `+` in the wrong place: it has to come at the end of the line, not the start.\n",
+ "For those coming from the R statistical programming language, you may be concerned about getting your `+` in the wrong place. Have no fear, however, as in the syntax for **letsplot** the `+` can go at the start or the end of the line.\n",
  "\n",
  "\n",
  "If you're still stuck, try the help.\n",
@@ -1086,7 +1087,7 @@
  "id": "f33dc022",
  "metadata": {},
  "source": [
- "# Summary\n",
+ "## Summary\n",
  "\n",
  "In this chapter, you've learned the basics of data visualisation with ggplot2.\n",
  "We started with the basic idea that underpins **letsplot**: a visualisation is a mapping from variables in your data to aesthetic properties like position, colour, size and shape.\n",

welcome.md

Lines changed: 2 additions & 0 deletions
@@ -19,3 +19,5 @@ We thank the following contributors:
  - [durraniu](https://github.com/durraniu)
  - [zekiakyol](https://github.com/zekiakyol)
  - [yibenhuang](https://github.com/yibenhuang)
+ - [crossxwill](https://github.com/crossxwill)
+ - [udurraniAtPresage](https://github.com/udurraniAtPresage)

workflow-style.ipynb

Lines changed: 53 additions & 3 deletions
@@ -58,7 +58,7 @@
  "- use consistent verbs for function names, don't use `get_score()` and `grab_results()` (instead use `get` for both)\n",
  "- variable names should be snake_case and all lowercase, eg `first_name`\n",
  " - class names should be CamelCase, eg `MyClass`\n",
- "function names should be snake_case and all lowercase, eg `quick_sort()`\n",
+ " - function names should be snake_case and all lowercase, eg `quick_sort()`\n",
  " - constants should be snake_case and all uppercase, eg `PI = 3.14159`\n",
  " - modules should have short, snake_case names and all lowercase, eg `pandas`\n",
  " - single quotes and double quotes are equivalent so pick one and be consistent—most automatic formatters prefer `\"`"
@@ -142,6 +142,56 @@
  " return result\n",
  "```\n",
  "\n",
+ "When using *method chaining* (something you can see in action in [](data-transform)) it's necessary to put the chain inside parentheses and it's good practice to use a new line for every method. The code snippet below gives an example of what good looks like:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f0f5bb37",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "df = pd.DataFrame(\n",
+ " data={\n",
+ " \"col0\": [0, 0, 0, 0],\n",
+ " \"col1\": [0, 0, 0, 0],\n",
+ " \"col2\": [0, 0, 0, 0],\n",
+ " \"col3\": [\"a\", \"b\", \"b\", \"a\"],\n",
+ " \"col4\": [\"alpha\", \"gamma\", \"gamma\", \"gamma\"],\n",
+ " },\n",
+ " index=[\"row\" + str(i) for i in range(4)],\n",
+ ")\n",
+ "\n",
+ "\n",
+ "# Chaining inside parentheses works\n",
+ "\n",
+ "results = df.groupby([\"col3\", \"col4\"]).agg({\"col1\": \"count\", \"col2\": \"mean\"})\n",
+ "\n",
+ "results"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1d6f3bf8",
+ "metadata": {},
+ "source": [
+ "And this is what *not* to do:\n",
+ "\n",
+ "```python\n",
+ "results = df\n",
+ " .groupby([\"col3\", \"col4\"]).agg({\"col1\": \"count\", \"col2\": \"mean\"})\n",
+ "\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d016d530",
+ "metadata": {},
+ "source": [
  "## Principles of Clean Code\n",
  "\n",
  "While automation can help apply style, it can't help you write *clean code*. Clean code is a set of rules and principles that helps to keep your code readable, maintainable, and extendable. Writing code is easy; writing clean code is hard! However, if you follow these principles, you won't go far wong.\n",
@@ -171,7 +221,7 @@
  "\n",
  "Relatedly, do not have a single function that tries to do everything. Functions should have limits too; they should do approximately one thing. If you're naming a function and you have to use 'and' in the name then it's probably worth splitting it into two functions.\n",
  "\n",
- "Functions should have no 'side effects' either; that is, they shouldn't only take in value(s), and output value(s) via a return statement. They shouldn't modify global variables or make other changes.\n",
+ "Functions should have no 'side effects' either; that is, they should only take in value(s), and output value(s) via a return statement. They shouldn't modify global variables or make other changes.\n",
  "\n",
  "Another good rule of thumb is that each function shouldn't have lots of separate arguments.\n",
  "\n",
@@ -220,7 +270,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.10.12"
+ "version": "3.10.13"
  },
  "toc-showtags": true
  },
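
Reviewer note: in the code cell added to workflow-style.ipynb above, the comment reads "Chaining inside parentheses works", yet the chain shown sits on a single line with no wrapping parentheses. As a rough sketch of what the parenthesised, one-method-per-line form described in the new markdown text might look like, reusing the `df` from the added cell (the exact layout is an assumption, not the author's version):

```python
import pandas as pd

# Same example frame as in the added notebook cell.
df = pd.DataFrame(
    data={
        "col0": [0, 0, 0, 0],
        "col1": [0, 0, 0, 0],
        "col2": [0, 0, 0, 0],
        "col3": ["a", "b", "b", "a"],
        "col4": ["alpha", "gamma", "gamma", "gamma"],
    },
    index=["row" + str(i) for i in range(4)],
)

# The outer parentheses let the chain span several lines,
# with one method call per line.
results = (
    df
    .groupby(["col3", "col4"])
    .agg({"col1": "count", "col2": "mean"})
)

print(results)
```

Without the outer parentheses, a line that begins with `.groupby(...)` is a syntax error in Python, which is what the "what *not* to do" cell illustrates.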

workflow-writing-code.ipynb

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@
  "\n",
  "There are different ways to write (and run) code that suit different needs. For example, for creating a reproducible pipeline of tasks or writing production-grade software, you might opt for a script——a file that is mostly code. But for sending instructions to a colleague or exploring a narrative, you might choose to write your code in a notebook because it can present text and code together more naturally than a script can.\n",
  "\n",
- "We already met some ways to write and run code in the subsequent chapters. Here, we'll be a bit more systematic so that, by the end of the chapter, you'll be comfortable writing code in both scripts and notebooks. For advanced users, there's also information on how to write code with markdown, using markdown files that contain executable code chunks. Scripts and notebooks are by far the most popular ways to write code though.\n",
+ "We already met some ways to write and run code in the previous chapters. Here, we'll be a bit more systematic so that, by the end of the chapter, you'll be comfortable writing code in both scripts and notebooks. For advanced users, there's also information on how to write code with markdown, using markdown files that contain executable code chunks. Scripts and notebooks are by far the most popular ways to write code though.\n",
  "\n",
  "Let's start with some definitions.\n",
  "\n",
@@ -32,7 +32,7 @@
  "|------|----------------|---------------|---------------|---------------|\n",
  "| Script, eg `script.py` | 'Run in interactive window' in an integrated development environment (IDE) | Python installation + an IDE with Python support, eg Visual Studio Code. | Can be run all-in-one or step-by-step as needed. Very powerful tools available to aid coding in scripts. De facto standard for production-quality code. Can be imported by other scripts. Version control friendly. | Not very good if you want to have lots of text alongside code.\n",
  "| Jupyter Notebook, eg `notebook.ipynb` | Open the file with Visual Studio Code. | Use Visual Studio Code and the VS Code Jupyter extension. | Code and text can alternate in the same document. Rich outputs of code can be integrated into document. Can export to PDF, HTML, and more, with control over whether code inputs/outputs are shown, and either exported directly or via **Quarto**. Can be run all-in-one or step-by-step as needed. | Fussy to use with version control. Code and text cannot be mixed in same 'cell'. Not easy to import in other code files.\n",
- "| Markdown with executable code chunks using [**Quarto**](https://quarto.org/), eg `markdown_script.qmd` | To produce output, write in a mix of markdown and code blocks and then export with commands like `quarto render markdown_script.qmd --to html` on the command line or using the [Visual Studio Code extension](https://marketplace.visualstudio.com/items?itemName=quarto.quarto). Other output types available. | Installations of Python and Quarto, plus their dependencies. | Allows for true mixing of text and code. Can export to wide variety of other formats, such as PDF and HTML, with control over whether code inputs/outputs are shown. Version control friendly. | Must be run all-in-one so cannot see outputs of individual code-chunks as you go. Cannot be imported by other code files.\n",
+ "| Markdown with executable code chunks using [**Quarto**](https://quarto.org/), eg `markdown_script.qmd` | To produce output, write in a mix of markdown and code blocks and then export with commands like `quarto render markdown_script.qmd --to html` on the command line or using the [Visual Studio Code extension](https://marketplace.visualstudio.com/items?itemName=quarto.quarto). Other output types available. | Installations of Python and Quarto, plus their dependencies. | Allows for true mixing of text and code. Can export to wide variety of other formats, such as PDF and HTML, with control over whether code inputs/outputs are shown. Version control friendly. | Cannot be imported by other code files.\n",
  "\n",
  "Some of the options above make use of the command line, a way to issue text-based instructions to your computer. Remember, the command line (aka the terminal) can be accessed via the Terminal app on Mac, the Command Prompt app on Windows, or <kbd>ctrl</kbd> + <kbd>alt</kbd> + <kbd>t</kbd> on Linux. To open up the command line within Visual Studio Code, you can use the keyboard shortcut <kbd>⌃</kbd> + <kbd>\\`</kbd> (on Mac) or \n",
  "<kbd>ctrl</kbd> + <kbd>\\`</kbd> (Windows/Linux), or click \"View > Terminal\".\n",
