forked from hadley/r-pkgs
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathgit.rmd
More file actions
665 lines (438 loc) · 36.1 KB
/
git.rmd
File metadata and controls
665 lines (438 loc) · 36.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
---
title: Git and github
layout: default
output: bookdown::html_chapter
---
# Git and github {#git}
If you're serious about software development, the most important tool to learn is git. Git is a software tool revision management tool which allows you to track changes to your code and share those changes with others. Git is most useful when combined with [github](http://github.com), a website that allows you to share your code with the world, solicit code improvements (pull requests) and track issues.
You can do many of the same things with other tools (like svn or bazaar or mercurial) and other websites (like gitlab and bitbucket). But I think git and github is the friendliest system for new developers, not least because it's the most popular, which means every possible problem has already been asked and answered on StackOverflow.
Why use git and github?
* Do you have a directory full of files like `my-script.R`, `my-script-1.R`,
`myscript-2-I-really-hope-this-works.R`, `myscript-FINALLY.R` and so on?
Git takes care of managing multiple versions of your code so that you
can easily see what's change, and revert any mistakes that you've made.
(But note that git isn't a substitute for a backup sytem, you should be
use git with your backups, not instead of).
* It makes it easy for other people to install your package. Any R user can
get your package with at most two lines of code:
```{r, eval = FALSE}
install.packages("devtools")
devtools::install_github("username/packagename")
```
(The first line isn't necesary if they already have devtools installed.)
* Not only can other people use your code, they can also suggest improvements
with pull requests, patches to your code that fixb bugs or implement new
functionality. Once you've experienced your first pull request, you'll
never want to go back to developing code any other way.
* Github makes it possible to collaboratively code a package with other
people. As long as you're working on different parts of the file,
git can figure out how to combine your changes. If you are working on
the same part of the file, git provides tools to help you choose
between the conflicting changes.
* Track issues (bugs and feature requests) in one central location. You
can discuss the problem and propose solutions. When you resolve an
issue you can connect it to the exact change that solved the problem,
which is very useful when you come back to it in the future.
* The Github site for your repo makes a great minimal website for your
package. Readers can easily browse code and markdown files are rendered
as html. This makes the source code for your package much more accessible
than downloading the `tar.gz` from github.
* Github releases
At first, working in the open seems a little scary. Do you really want the whole world seeing your crappy code? Don't worry, nobody is judging you - everybody writes bad code at some point in their lives, and most people are more interested in helping you write better code than making fun of your existing efforts. Once you get used to it, you'll find that coding in the open is tremendously empowering as it helps other people help you.
This chapter describes git and github together, making no attempt to separate the features of each. The goal is to give you the absolute minimum you need to know to use git for an R package. After reading this chapter and working with git for a while, it's highly likely that you'll want to learn more. Some good resource are:
* If you'd like to practice using git from the command line, try
<https://try.github.io>. It steps you through a number of challenges in
a virtual shell.
* Github help, <https://help.github.com>, not only teaches you about
github, but also has good tutorials on many git features.
* If you'd like to learn more about the details of git, read
[Pro Git](http://git-scm.com/book/en/v2) by Scott Chacon and Ben Straub.
What you won't learn here:
* any history modifying changes like (`git rebase` or `git pull --rebase`).
StackOverflow is a vital part of git - when you have a problem that you don't know how to solve, SO should be your first resource. It's highly likely that some one has had exactly the same problem as you, and there will be a variety of approaches to choose from.
RStudio provides many tools to make your day-to-day use of git as easy as possible. However, there are a huge number of git commands, and they're not at all available in the IDE. That means you'll need to run a handful of commands from a shell (aka a console), especially when you're setting up, dealing with merge conflicts and getting out of jams. The easiest way to get to a shell is Tools > Shell. It's also important to be familiar with using git from the command line because if you're searching for problems, you'll need to know what the standard commands are.
## Initial set up {#git-init}
If you've never used git or github before, you'll need to do a little initial setup:
1. Install git:
* Windows: <http://msysgit.github.io/>.
* OS X: <http://code.google.com/p/git-osx-installer/>.
* Debian/Ubuntu: `sudo apt-get install git-core`.
1. Tell git your name and email address. These are used to label each commit
so that when you start collaborating with others, it's clear who made
each change. In a shell, run:
```bash
git config --global user.name "<YOUR NAME>"
git config --global user.email "<YOUR EMAIL ADDRESS>"
```
1. Create a free account on github: <https://github.com>. Use the same
name and email address as the previous step.
1. If needed, generate ssh keys. You can check if you have an ssh key
already by running:
```{r, eval = FALSE}
file.exists("~/.ssh/id_rsa.pub")
```
If it's `FALSE`, you'll need to create one. You can follow the [github
instructions](https://help.github.com/articles/generating-ssh-keys) or
use RStudio. Go to RStudio preferences, choose the Git/SVN panel, then
click "Create RSA key...":
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-config.png", dpi = 220)
```
1. Add your public key to <https://github.com/settings/ssh>. The
easiest way to get the key is to click "View public key" in the
Git/SVN preferences pane (as shown above).
## Create a local git repository {#git-init}
Now that you have git installed and configured, you need to initialise a local git repository for your package. This repository (repo for short) exists only on your computer. This is different to old systems like svn which were always connected to a central server. Git makes a clear distinction between local and remote repos; you mostly work locally then push your changes to a remote repo when you're ready to share your work with others.
There are two ways to create a new git repo:
* In RStudio, go to project options, then the Git/SVN panel. Change the
"Version control system" from "None" to "Git":
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-proj-config.png", dpi = 220)
```
You'll then be prompted to restart RStudio.
* In a shell, run `git init`. Restart RStudio and reopen your package.
Once git has been activated in a project, you'll see two new components in the IDE:
* The git pane, which is shown the top-right (by default). It shows you what
files have changed since you last committed, and exposes the most important
git commands as buttons.
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-pane.png", dpi = 220)
```
* The git dropdown in the toolbar. This exposes git and github commands
useful for working with the current file:
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-dropdown.png", dpi = 220)
```
With the repository set up, you can now:
* See what you've changed: [git status](#git-status).
* Add and commit files: [git commit](#git-commit).
* Undo mistakes: [undo](#git-undo)
To get more out of git, you'll need to connect your repo with github:
* [initialising github](#github-init)
* working with others, [`git pull`](#git-pull)
* review history, [`git log`](#git-log)
## See what's changed {#git-status}
The RStudio git pane shows, at a glance, what's changed. Each added, modified or deleted file is listed, along with an icon summarising the change:
* `r bookdown::embed_png("screenshots/git-modified.png", dpi = 220)`,
__Modified__. You've changed the contents of the file.
* `r bookdown::embed_png("screenshots/git-unknown.png", dpi = 220)`,
__Untracked__. You've added a new file that git hasn't seen before.
* `r bookdown::embed_png("screenshots/git-deleted.png", dpi = 220)`,
__Deleted__. You've deleted a file.
You can get the same information (in a slightly different format) in the shell by running `git status`.
For text files, you can get more details about modifications a "diff", `r bookdown::embed_png("screenshots/git-diff.png", dpi = 220)`. This opens a new window showing the detailed **diff**erences:
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-diff-window.png")
```
Red blocks show removed text and green blocks show added text. Context, lines surrounding the change, are shown in grey, and help you to place the change in the context of complete file. You can get a similar display in the shell by running `git diff`.
## Add and commit files {#git-commit}
The fundamental unit of work in git is a __commit__. A commit is a snapshot of the state of your code at a fixed point in time. You can think of a commit as a set of changes: what files did you add, edit and delete? Committing code is the most common git operation. You'll create many commits each day. A commit is made up of:
* A unique identifier called a sha.
* A human-readable commit message.
* A parent, the change that came before this one. (Commits can have multiple
parents, as you'll learn in XYZ)
* A changeset describing what files were added, modified and changed.
Commit can be a confusing word because it's used in two senses:
* As a noun: a commit is a snapshot of your code.
* As a verb: commit your changes to create a new snapshot.
I'll try to avoid confusing sentences like you commit code to create a commit.
Creating a commmit occurs in two stages:
1. You __stage__ files, telling git that you want to include them in the
next commit. In the shell you use `git add` for new and modified files,
and `git rm` for deleted files.
1. You __commit__ the staged files, describe the changes with a message.
In the shell you use `git commit`.
You perform both these steps in the same place in RStudio: the commit window. Open the commit window by clicking `r bookdown::embed_png("screenshots/git-commit.png", dpi = 220)`, or by pressing Ctrl + Alt + m. Here's a screenshot I took while writing this chapter:
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-commit-window.png", dpi = 220)
```
There are three panes:
* The top-left pane shows the current status, the same as the git pane in the
main RStudio window.
* The bottom pane shows the changes made (the "diff") in the file currently
selected in the status pane.
* The top-right pane is where you'll enter the commit message, a human
readable message summarising the changes made in the commit. More on
that shortly.
(Yes, this is exactly the same window you see when clicking `r bookdown::embed_png("screenshots/git-diff.png", dpi = 220)`!)
To make a new commit:
1. __Save your changes__.
1. __Open the commit window__ by clicking
`r bookdown::embed_png("screenshots/git-commit.png", dpi = 220)` or
pressing `Ctrl + Alt + M`.
1. __Stage files to be included in the next commit__. To stage a single
file, tick the corresponding check box. To stage all files, press Cmd + A,
then click `r bookdown::embed_png("screenshots/git-stage.png", dpi = 220)`.
As you stage each file, you'll notice that its status changes. There are two
columns: staged status on the left and unstaged status on the right. All
staged change will be included in the next commit.
There are two new new statuses that you'll see when staging files:
* Added: `r bookdown::embed_png("screenshots/git-added.png", dpi = 220)`:
after staging an untracked file, git now knows that you want to add it
to the repo.
* Renamed: `r bookdown::embed_png("screenshots/git-renamed.png", dpi = 220)`:
If you rename a file, git initially sees it as deleting one file and
adding a new file. Once you stage both changes, git will recognise
that it's a rename.
Sometimes you'll see a status in both columns, e.g.
`r bookdown::embed_png("screenshots/git-modified-staged.png", dpi = 220)`.
This means that you have both staged unstaged changes in the same file.
This means that you've made some changes, staged them, and then made some
more. Clicking the staged checkbox will stage your new changes, clicking
it again will unstage both sets of changes.
1. __Write a commit message__ (top-right panel). The first line of a commit
message is called the subject line and should be brief (50 characters or
less). For complicated commits, you can follow it with a blank line and
then a paragraph or bulleted list providing more detail. Write messages in
imperative, like you're telling someone what to do: "fix this bug", not
"fixed this bug".
1. __Click Commit__.
### Ignoring files
There are often files that you don't want to include in the repository. They might be transient artefacts (like stuff you get when building latex files, or compiling C code), or they might be too big, or they might be generated on demand. Instead of carefully not-staging them each time, you should add them to `.gitignore` to prevent them from every being added.
The easiest way to do this is to right-click on the file in the git pane and select `Ignore`. If you want to ignore multiple files, you can use a wildcard "glob" like `*.png`.
Some developers stick to the convention that no files that can be generated from other files should ever be committed into git. For R packages this means the `man/` directory that is autogenerated from R comments. Pragmatically, it's better not to do this because currently R packages have no way to generate `.Rd` files on install, so if you don't check them in and someone installs your package from github they won't have any documentation.
## Commit best practices {#commit-best-practices}
Ideally, each commit should:
* __Related__: All the changes in the commit should be related to the same
goal. This makes it easier to understand the commit at a glance, and to
describe it with a simple message.
* __Minimal__: A commit should contain nothing but the related changes. If
while while making one set of changes you discovered another problem,
do a separate commit.
* __Complete__: A commit should do solve the problem that it claims to solve.
If you think you've fixed a bug, the commit should contain a unit test
that confirms that you're right.
And each commit message should:
* __Be concise, yet evocative__. You want to be able to take in at a glance
what a commit does, but there should be enough detail so you can remember
(or understand) what happened.
* __Describe the why, not the what__. You can always retrieve the diff
associated with a commit, so the message doesn't need to say exactly what
changed. Instead it should provide a high-level summary, and focus on the
motivation for the change.
If you do this then:
* It's easier to work with others. If two people have changed the same file
in the same place, it's easier to resolve the conflict if the commits are
small and it's clear why each change was made.
* Project newcomers can easily understand history by reading the commit logs.
* You can load and run your package at any point in its life. This can
be tremendously useful in conjunction with tools like
[bisectr](https://github.com/wch/bisectr), which allow you to use binary
search to quickly narrow in on the commit that introduced a bug.
* When you discover exactly when you introduced a bug, you can easily
understand what you were doing (and why!).
The time you spend on your commits is inversely proportional to the amount of time others will spend understanding the history of your project. This is a less common operation, but it's important if you want to see what's changed or undo a mistake.
Commit messages are most important when you are collaborating with others. You might think that no one else will ever look at your repo, but there is one important collaborator for every project that you do: future-you! Future-you probably has forgotten a lot of things that present-you knows.
Remember that these directives are aspirational, and you shouldn't let aspirations get in your way. If you look at the commit history of my repositories, you'll notice a lot of them aren't that great, especially when I start to get frustrated that I __still__ haven't managed to fix a bug. I'm a frequent offender on the "complete" front, often missing important parts of a problem.
## Synchronising with github {#github-init}
Most of the time you work locally. This is really convenient because you don't need internet access to use git - you can keep on committing changes even when you're on a plane. However, git is most useful in conjunction with github, allowing you to share your code with the world.
To push your code to github:
1. Create a new repo on github: <https://github.com/new>.
1. In the shell, follow the instructions provided by github on the new
repo page. They'll look something like this:
```bash
git remote add origin git@github.com:rstudio/ggcomp.git
git push -u origin master
```
This tells git where the "origin" of your repo is, and to send it all the
changes you've made so far.
You only need to run these commands once per repo.
1. Add `URL` and `BugReports` fields to your `DESCRIPTION` to link to your
new github site. For example, dplyr has:
```yaml
URL: http://github.com/hadley/dplyr
BugReports: http://github.com/hadley/dplyr/issues
```
Now you can __push__ your changes to github by clicking `r bookdown::embed_png("screenshots/git-push.png", dpi = 220)`. (This is the same as running `git push` in the shell).
Typically, each push will include many commits. Pushing your code to github publishes it for the world to use. For that reason, once you've learned about `R CMD check` in [automated checking](#check), it's a good idea to run that before you push. Fix any problems so that you're always publishing good code.
Once you've connected your repo to github, the git pane will show you how many commits you have locally that are not on github: `r bookdown::embed_png("screenshots/git-local-commits.png", dpi = 220)`. This message indicates that I have 1 commit locally (my branch) that is not on github ("origin/master").
## Github
The first benefit to using git is that you get a decent website autogenerated from your github commits. The home page, e.g. <https://github.com/hadley/testthat> (the github repo for testthat), lists all the files and directories in the top-level. You can navigate throughout the project and look at the files. Any `.md` files well be automatically rendered as html. If you have a `README.md` file in the top-level directory, it will be displayed when on the homepage. You'll learn more about the benefits of creating this file in [README.md](#readme).
The second benefit is that people can now easily install your package with `devtools::install_github("<username>/<repo>")`.
## History {#git-log}
If you click on a file you'll see a nicely formatted version. This is useful for newcomers to your package - they can easily browse around and see what you package does before they download and try it out. There are three useful actions you can do to each file: `r bookdown::embed_png("screenshots/github-comment-line.png", dpi = 220)`:
* Raw: this shows you the raw contents of the file, and can be useful if you
want to copy-and-paste.
* History: this lists every commit that has affected this particular file.
This page is one reason its worth spending the time to write a good commit
message. If you have, you can skim this page to see recent changes to the
repo.
You can jump to this page directly from RStudio by clicking
`r bookdown::embed_png("screenshots/github-view.png", dpi = 220)` in
the git dropdown. If you have lines of code selected it will jump directly
to those lines.
* Blame: this is an alternative view of the history of the file. For each
line of code, it shows you the last commit to touch that line. This is
tremendously helpful when you've discovered a bug and want to understand
its history.
(It's called blame since it also shows you who made the commit so you
know who to blame ;)
You can jump to this page directly from RStudio by clicking
`r bookdown::embed_png("screenshots/github-blame.png", dpi = 220)` in
the git dropdown.
You can also see every commit to your package in the commit view, e.g. <https://github.com/hadley/testthat/commits/master>. When I'm working on a package with other people, I often keep this page open so I can see what they're working on. Individual commits show the same information that you see in the commit/diff window in RStudio, although github has some more advanced features that highlight individual words that have changed in sentences, and you can choose to show changes side-by-side instead of interleave in one file (a style called unified).
In this view, you can comment on the commit as a whole by using the comment box at the bottom of the page, or more usefully you can comment on individual lines by clicking `r bookdown::embed_png("screenshots/github-comment-line.png", dpi = 220)`. This is a great way to let your collaborators know if you see a mistake or have a question. It's better than email because it's public so anyone working on the repo (both present and future) can see the conversation.
## Undo a mistake {#git-undo}
The best thing about using commits is that you can undo mistakes. RStudio provides some tools for the most common types of mistake:
* If you want undo the changes you've made to a file, right click on it in
the git pane and select "revert". Proceed with caution: this is one of the
few operations that you can't undo.
* You undo changes to parts of a file in the diff window. Look for the
discard chunk button above the block of changes that you want to undo:
`r bookdown::embed_png("screenshots/git-chunk.png", dpi = 220)`
* If you made a mistake in a commit, you can modify the previous commit by
clicking `r bookdown::embed_png("screenshots/git-commit-amend.png", dpi = 220)`.
(But don't do this if you've pushed the previous commit to github - you're
effectively rewriting history, which should be done with care when you're
doing it in public.)
If one of these doesn't help, don't panic. The chances are that you can recover whatever was lost. Start by making a backup of the directory. Then even if you make a mistake while trying to fix your first mistake, you can start again. To undo more general problems you'll need to use the command line.
The first step in undoing a mistake is to go back in time and find a good commit. The easiest way to do that is to use the "History" view to find the commit where the mistake occured. Then the __parent__ of that commit will be good. Note the sha (the unique identifier for that commit) of the parent. Then you can:
* See what the file looked in the paste so you can copy-and paste the old
code:
```bash
git show <sha> <filename>
```
* Copy historical version into the present:
```bash
git checkout <sha> <filename>
```
In both cases you'll need finish by staging and commiting the files.
(It's also possible to use git as if you went back in time and prevented the mistake from happening in the first place. This is called __rebasing history__ and is an advanced technique. As you can imaging going back in time to change the past can have profound impacts on the present. It can be useful, but needs extreme care.)
If you're still stuck, try <http://sethrobertson.github.io/GitFixUm/fixup.html> or <http://justinhileman.info/article/git-pretty/>. They give step-by-step approach to fixing many common (and not so common!) problems.
## Working with others {#git-pull}
You use __push__ to send your changes to github. If you're working with other people, they're also pushing their changes to github, so you'll need to __pull__ their changes from github to see them locally. (You can only push to a repo when you have the most recent version of it, so you need to pull before you push your changes).
When you pull, git first downloads (__fetches__) all of the changes then __merges__ them with the changes that you've made. A merge is a commit with two parents. It takes two different lines of development and combines them into a single result. To do so, git needs to combine two sets of changes. In many cases it can do so automatically - if the changes are to different files, or even different parts of the same file. However, if the changes are to the same place in a file, you'll need to resolve the changes yourself. This is called a __merge conflict__.
In RStudio, you'll discover that you have merge conflict when:
* Pull fails with an error.
* In the git pane, you see a status like
`r bookdown::embed_png("screenshots/git-commit-conflict.png", dpi = 220)`
RStudio currently doesn't provide any tools to help with merge conflicts, so you'll need to use the command line. I recommend starting by setting your merge conflict "style" to diff3. The diff3 style shows three things when you get a merge conflict: your local changes, the original file and remote changes. The default style is diff2, which only shows your changes and the remote changes, which generally makes it harder to figure out what's happened.
* If you've encountered your first merge conflict:
```bash
# Abort this merge
git merge --abort
# Set the conflict style
git config --global merge.conflictstyle diff3
# Re-try the merge
git pull
```
* If you're not in the middle of a merge conflict, just run
```bash
git config --global merge.conflictstyle diff3
```
To resolve the merge conflict, you need to open every file with the status `r bookdown::embed_png("screenshots/git-commit-conflict.png", dpi = 220)`. In each file, you'll find a conflict marker that looks like this:
```
<<<<<<< HEAD
||||||| merged common ancestors
=======
>>>>>>> remote
```
This shows all three versions of the conflicting code:
* At top, the code that you code.
* In the middle, the code from the previous commit. (This is missing in the
default merge conflict style, so if you don't see it, follow the instructions
above).
* At bottom, the code that you're trying to merge in.
You need to work through each conflict and decide either which version is better, or how combine the changes from both parents. Make sure you delete the conflict markers to resolve the conflict, then stage the file. Once you've fixed all conficts, make a new commit and push to github.
A couple of pointers when fixing text generated by roxygen:
* Don't fix problems in `man/*.Rd` files. Instead, resolve the conflicts in
the underlying roxygen comments and re-document the package.
* Merge conflicts in the `NAMESPACE` file will prevent you from re-loading or
re-documenting the package. Resolve them enough so that the package
can be loaded, then re-document to generate a clean and correct `NAMESPACE`.
Handling merge conflicts is one of the trickiest parts of git, and you made need to read a few tutorials before you get the hang of it. Google and StackOverflow are great resources. If you get terribly confused, you can always abort the merge and try again by running `git merge --abort` then `git pull`.
## Issues {#github-issues}
Every github repo comes with a page for tracking issues. Use them! If you encounter a bug while working on another project, jot a note down in your issues. When you have a smaller project, don't worry too much about milestones, tags and assigning issues to specific people. Those are more useful once you get over one page of issues (~50). Once you get to that point, read the github guide on issues: <https://guides.github.com/features/issues/>.
One particularly useful tip is that you can close issues with commit messages. Just put `Closes #<issue number>` somewhere in your commit message and github will close the issue for you when you next push. The best thing about closing issues that way is that it makes a link from the issue to the commit. This is useful if you ever have to come back to the bug and see exactly what you did to fix it. If you want to make the link without closing the issue, you can just do `#<issue number>`.
Whenever you close an issue, it's a good idea to get in to the habit of add a bullet point to `NEWS.md`. See [NEWS.md](#news) for more details. The bullet point should describe the issue in terms that users will understand, where commit messages are designed to be read by developers.
## Branches {#git-branch}
Sometimes you need to make a big change to your code. You want to break it up into multiple simple commits so you can easily track what you're doing, but you don't want to merge them into the main stream of development. Or maybe you're not sure what you've done is the best approach and you want someone else to review your code. Or maybe you want to try something experimental - you only want to merge it back in if the experiment succeeds. Branches and pull requests provide powerful tools to handle these situations.
Although you haven't realised it, you're already using branches. The default branch is called __master__, and it's where you've been saving your commits. If you synchronise your code to github, you have another branch called __origin/master__, which is a local copy of all the commits on github (synchronised whenever you push or pull). When you run `git pull`, it does two things:
* `git fetch origin master`, to downloads all the commits on github into the
`origin/master`
* `git merge orgin/master`, to combine the remote changes with your changes.
It's useful to create your own new branches when you want to break off the main stream of development for a while. Create a new branch with `git checkout -b <branch-name>`. Names should use lower case letters, numbers, and `-` to separate words
Switch between branches with `git checkout <branch-name>`. For example, to return to the main line of development with `git branch master`. You can also use the branch switcher at the top right of the git pane:
```{r, echo = FALSE}
bookdown::embed_png("screenshots/git-branch.png", dpi = 220)
```
If you've forgotten the name of your branch in the shell, you can use `git branch` to list all existing branches.
If you try to synchronise this branch to github from inside RStudio, you'll notice that push and pull are disabled: `r bookdown::embed_png("screenshots/git-no-remote.png", dpi = 220)`. To enable them, you'll need to first tell git that your local branch should have a remote equivalent:
```bash
git push --set-upstream origin <branch-name>
```
Once you've done that the first time you can use the pull and push buttons like usual.
If you've been working in a branch for a while, other work might have been going on in master. To integrate that work into your branch, run `git merge master`. You will need to resolve any merge conflicts as described above. It's best to do this fairly frequently - the less your branch has diverged from the master, the easier it will be to merge.
Once you're done working in a branch, merge it back into master, then delete it:
```bash
git checkout master
git merge <branch-name>
git branch -d <branch-name>
```
(Git won't let you delete a branch unless you've merged it back into master. If you do want to abandon a branch without merging it, you'll need to force delete it by using `-D` instead of `-d`.)
## Making a pull request
Instead of merging your changes directly back into the main branch, you might want to first get someone else to review them. A __pull request__ allow to do just that. A pull request is a github tool that allows you to review a set of changes. A pull request has three components:
* A __conversation__,
`r bookdown::embed_png("screenshots/pr-conversation.png", dpi = 220)`,
where you can discuss the changes as a whole.
* The __commits__,
`r bookdown::embed_png("screenshots/pr-commits.png", dpi = 220)`,
where you can see each individual commit.
* The __file changes__,
`r bookdown::embed_png("screenshots/pr-changes.png", dpi = 220)`,
where you see the overall diff of the commits, and you can comment
on individual lines.
We'll talk about the most common usage of pull requests, contributing to other people's code, next. But it's easiest to start using pull requests with your own repo because you own all the pieces, and it's easy to experiment. Pull requests against your own code are particularly useful when you want to get feedback on a proposed change. We use them frequently at RStudio to get feedback before merging major changes.
To create a pull request, you need to work in a branch and synchronise that branch with github. When you next go to github you'll see a header that invites you to submit a pull request. You can also do it by:
* switching branches:
```{r, echo = FALSE}
bookdown::embed_png("screenshots/github-branches.png", dpi = 220)
```
* clicking `r bookdown::embed_png("screenshots/pr-create.png", dpi = 220)`
Learn more at <https://help.github.com/articles/using-pull-requests/>.
## Reviewing and accepting pull requests {#pr-accept}
Recieving a pull request is fantastic. Someone not only cares about your package enough to use, they're actually read the source code and make an improvement.
<http://sarah.thesharps.us/2014/09/01/the-gentle-art-of-patch-review/>
Always remind contributs to add bullet point to `NEWS.md`, thanking themselves, including their github username (which will eventually get turned into a nice clickable link in the release notes. You'll learn about that in [post release](#post-release)).
## Making a pull request on other repos {#pr-make}
Making a pull request to someone else's code is similar to make a pull request to your own repo. The main difference is that there's one more repository involved in the process. In total there are three repos you need to think about:
* The repo that you want to submit the pull request to. We'll call this the
__upstream__ repo. You can't modify this repo, which is why you're making a
pull request.
* Your remote repo, the __origin__. You create this on github by __forking__
the upstream repository on github. This is a copy of the upstream repo
that you have access to - it's somewhere for you to push your local changes
to so that github can see them.
* Your local repo. As usual this is where you work.
Follow these steps to set up the two repos that you'll need to submit a pull request:
1. __Fork__ the upstream repo. A fork is a copy of repo that you belongs
to you. See [fork a repo](https://help.github.com/articles/fork-a-repo)
for more details.
1. __Clone__ the forked repo to create a local copy of the remote repo.
It's possible to do this from RStudio (using "Create new project" from
"Version control") but I think it's easier to do it from the shell:
```bash
git clone git@github.com:<your-name>/<repo>.git
cd <repo>
```
1. __Set up__ your fork so that it knows about the upstream repo:
```bash
git remote add upstream git@github.com:<original-name>/extrafont.git
git fetch upstream
git branch -u upstream/master
```
While you're creating the pull request, the upstream repo might change. To update your branch:
1. Pull the upstream changes down into your local repo:
```bash
git checkout master
git pull
```
1. Merge the changes with your branch:
```bash
git checkout <my-branch>
git merge master
```
The reposities and the connections between them are summarised below:
```{r, echo = FALSE}
bookdown::embed_png("diagrams/pull-request-process.png", dpi = 220)
```