- Git is a tool for managing changes in files (i.e. history) of a project.
- Git allows "coordinating work among programmers collaboratively developing source code during software development".
- Git is a distributed version control system: the entire history of the project is stored on every computer and no computer (server) with a central role is needed.
When you use git you have:
- A working directory (a folder on your computer with the files/subfolders you are working on).
- A local repository (a folder on your computer where git stores the history of the project; this folder is usually called .git and is hidden in the working directory).
Typical starting points for a git project:
- A completely new project:
  - You create a new working directory and you initialize a new repository in it (git init).
  - You then add files to the working directory and commit them to the local repository.
  - You can then add a remote repository (e.g. on GitHub) and push your commits to it.
- An existing project without a repository:
  - You initialize a new repository in the working directory (git init).
  - Then proceed as above.
- An existing project with a repository:
  - You clone the repository to your computer (git clone).
  - The cloning process creates a working directory with the newest state of the project and with a full copy of the history in the local repository.
  - The location of the remote repository is stored in the local repository and is called origin.
Hint: Open separate terminals for each of the following exercises.
Create a new local directory named new_proj_test and initialize the git repository in it. Go to the newly created directory. Use git status to confirm that you indeed created the repository. Find the hidden .git directory in the working directory.
# SOLUTION 1:
# cd to the directory where you want to create the new directory
mkdir new_proj_test
cd new_proj_test
git init
git status

# SOLUTION 2:
# cd to the directory where you want to create the new directory
git init new_proj_test
cd new_proj_test
git status

Go to the GitHub top page of this course. Find the HTTPS clone URL of the repository. Clone the repository to your computer into a directory efds_test. Go to the newly created directory. Use git status to confirm that you indeed created the repository. Find the hidden .git directory in the working directory.
# cd to the directory where you want to create the new directory
git clone https://github.com/LUMC/EfDS efds_test
cd efds_test
git status

A commit object in the git repository is a snapshot of the state of the project at a certain point in time. It contains the following information:
- The contents of the project files at the time of the commit.
- A unique identifier (hash) for the commit.
- A pointer to the parent commit that came directly before this commit (a merge commit has two or more parents; the very first commit has none).
- The time, author and message of the commit.
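The list above can be made concrete with a toy model. The sketch below is not git's real storage format: the class ToyCommit, its field names and the sample author, time and messages are illustrative assumptions. It only shows how an identifier can be derived from everything a commit contains, so that changing any part of it gives a different hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation
class ToyCommit:
    """Hypothetical, simplified stand-in for a git commit object."""
    files: tuple      # ((path, content), ...): snapshot of the project files
    parents: tuple    # ids of parent commits: () for the first commit
    author: str
    time: str
    message: str

    @property
    def commit_id(self) -> str:
        # The id is a hash of everything the commit contains.
        payload = repr((self.files, self.parents, self.author, self.time, self.message))
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Made-up sample data, only for illustration.
first = ToyCommit(
    files=(("pattern.py", "print('*')\n"),),
    parents=(),
    author="A. Student",
    time="2024-01-01T10:00:00",
    message="first file",
)
second = ToyCommit(
    files=(("pattern.py", "print('**')\n"),),
    parents=(first.commit_id,),   # points back to the commit before it
    author="A. Student",
    time="2024-01-01T11:00:00",
    message="wider pattern",
)

print(first.commit_id[:7], first.message)
print(second.commit_id[:7], second.message, "parent:", second.parents[0][:7])
```

Note that the second commit stores the id of the first one, but the first commit knows nothing about the second.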
Here is an example graph showing a part of the history of a project:
- The newest commit is at the top. Stars denote commits.
- The chart was generated with the git log --graph --oneline --all command.
Notes:
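- Conceptually, a commit is a snapshot of the whole project, but files that did not change since the parent commit are not stored again: the commit simply refers to the objects that already exist in the repository. A changed file is stored as a new object with its full content (git may later compress similar objects against each other when packing them).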
- The commit object does not contain the name of the branch it is on.
- The commit object is immutable - it will stay the same forever.
- The commit object is stored in the .git/objects directory.
- A working directory might contain files that are not in the commit object (they need to be added and committed first).
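Objects in .git/objects are named after a hash of their content. As a minimal sketch, the function below (the name git_blob_id is our own) reproduces the identifier that git assigns to the content of a file in a repository using the default SHA-1 object format, the same value that git hash-object prints. Identical content always gives the same identifier, which is why an unchanged file does not need to be stored a second time.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git uses for a file with this content."""
    # git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


v1 = b"print('*')\n"
v2 = b"print('**')\n"

print(git_blob_id(v1) == git_blob_id(v1))  # True: same content, same id
print(git_blob_id(v1) == git_blob_id(v2))  # False: different content, different id
print(git_blob_id(b""))                    # id of an empty file
```

The file name is not part of the hashed data, so two files with the same content share one object.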
Go to the efds_test directory:
- Use git log to see the history of the project.
- Use git log --graph --oneline --all --decorate to see the history of the project in a graph.
- What is the hash of the newest commit?
- Find the hash of the commit corresponding to the message "Exam topics marked.".
- Read how to use git diff to show differences between two commits. Use it to show the differences between the commit with the "Exam topics marked." message and its parent.
- The same information can be found on the GitHub page of the repository at the NN commits link. Find the commits log and how to show the differences introduced by a commit.
git log
git log --graph --oneline --all --decorate
# parent first, then the commit: lines added by the commit are shown with "+"
git diff 25a4db7^ 25a4db7

Go to the efds_test directory:
- Find how to use git checkout to go back to the parent of the commit with the message "Exam topics marked.".
- Use git status to see the state of the working directory.
- Use git log --graph --oneline --all --decorate to see the history of the project in a graph.
- Look into the README.md file. Confirm that you see the version of the file from the parent commit.
- Use git checkout main to go back to the newest commit.
- Use git status to see the state of the working directory.
- Look into the README.md file. Confirm that you now see the newest version of the file.
git checkout 25a4db7^
git status
git log --graph --oneline --all --decorate
# check whether README.md shows the past version
git checkout main
git status
git log --graph --oneline --all --decorate
# check whether README.md shows the newest version

Study again the image:
- Starting from the commit 1ec8a10 there were two evolution paths (branches) of the project:
  - one leading to the commit cbcf948
  - another leading to the series of commits 70bee6b...7281d3b
- Both paths were merged in the commit ee71d8b.
Notes:
- Commits have no names, only hashes. Commits remember only their parents (past), not their children (future). Given such data, there is no simple way to find the newest commit.
- Therefore, another mechanism must exist to find the newest commits. This is done by branches. Branches are just user-defined names (e.g. main/master) which point to commits and move along when new commits are added.
- The commit denoted by HEAD is the one that is currently checked out in the working directory. This is where a new commit will be added when you use the git commit command.
- The commit denoted by master is the newest commit in its branch known to the local repository.
- The commit denoted by origin/master is the newest commit of that branch on the remote (origin) repository, as recorded the last time the local repository communicated with it (e.g. with git fetch, git pull or git push). A git push operation will update this pointer.
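The notes above can be sketched as a small model in which commits only know their parents, branches are names pointing at commits, and HEAD names the checked-out branch. The class ToyRepo, its method names and the ids c1, c2, c3 are hypothetical and stand in for real hashes; the default branch is assumed to be called main.

```python
class ToyRepo:
    """Hypothetical model: commits know their parents, branches are movable names."""

    def __init__(self):
        self.parents = {}                 # commit id -> list of parent ids
        self.branches = {"main": None}    # branch name -> id of its newest commit
        self.head = "main"                # HEAD names the checked-out branch
        self._counter = 0

    def commit(self):
        """Add a commit on top of HEAD and move the current branch to it."""
        self._counter += 1
        new_id = f"c{self._counter}"      # made-up ids instead of real hashes
        tip = self.branches[self.head]
        self.parents[new_id] = [] if tip is None else [tip]
        self.branches[self.head] = new_id
        return new_id

    def branch(self, name):
        """Create a new name pointing at the current commit; HEAD does not move."""
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        """Make HEAD follow another branch."""
        if name not in self.branches:
            raise KeyError(name)
        self.head = name

    def log(self, name):
        """Follow first parents from a branch tip back to the first commit."""
        history, current = [], self.branches[name]
        while current is not None:
            history.append(current)
            parents = self.parents[current]
            current = parents[0] if parents else None
        return history


repo = ToyRepo()
repo.commit()             # c1 on main
repo.branch("feature")    # feature -> c1, HEAD still on main
repo.checkout("feature")
repo.commit()             # c2 on feature; main stays at c1
repo.checkout("main")
repo.commit()             # c3 on main

print(repo.branches)        # {'main': 'c3', 'feature': 'c2'}
print(repo.log("feature"))  # ['c2', 'c1']
print(repo.log("main"))     # ['c3', 'c1']
```

Without the branches dictionary there would be no way to find c2 or c3, because c1 holds no reference to its children.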
- Create a file in the repository.
  - Go to the new_proj_test directory and create a file pattern.py with the contents given below.
  - Check that the code works and prints a square pattern.
  - Add the file to the repository and commit the changes.
  - Check the history of the project with git log --graph --oneline --all --decorate
def printSquare(n):
    for r in range(n):
        for c in range(n):
            if r == 0 or r == n - 1 or c == 0 or c == n - 1:
                print('*', end='')
            else:
                print(' ', end='')
        print()
printSquare(5)

- Start a new feature branch.
  Let's assume that a new feature is planned for the project: a function that prints an X shape. This feature will be developed in a separate branch feature. The main (master) branch will be used for the stable (always well working) version of the project.
  - Use git branch to see the list of available branches (* denotes the current branch).
  - Use git branch feature to create a new branch.
  - Use git branch to see the list of branches.
  - Study git log --graph --oneline --all --decorate to see how branch information is presented.
  - Note: the HEAD pointer is still on the main branch. If you add something to the project, it will be added to the main branch.
  - Use git checkout feature to switch to the new branch.
  - Use git branch to confirm that you are now on the feature branch.
  - Use git log --graph --oneline --all --decorate to confirm that you are now on the feature branch.
  - Now add the printX function to the pattern.py file. The code is given below.
  - Add the file to the repository and commit the changes.
  - Check the history of the project with git log --graph --oneline --all --decorate (ensure that you understand the locations of HEAD, feature and main/master).
def printX(n):
    for r in range(n):
        for c in range(n):
            if r == c or r == n - 1 - c:
                print('*', end='')
            else:
                print(' ', end='')
        print()
printX(7)

- Urgent change in the main/master branch.
  Let's assume that an urgent change needs to be made in the main/master branch. Let's also assume that the feature branch is not yet ready to be merged into the main/master branch.
  - Use git branch to see the list of branches.
  - Use also git status to see the current branch.
  - Use git checkout to switch to the main/master branch.
  - Verify that you are now on the main/master branch. Check the location of HEAD in the commit graph.
  - Introduce a change in the pattern.py file. Let's pretend that the urgent correction is to change the name of the argument of the printSquare function from n to size. The code is given below.
  - Use git diff to see the changes before adding them to the repository.
  - You are about to modify the main/master branch - be sure that the code works (let's pretend that other developers may be using the main/master code).
  - Add the file to the repository and commit the changes. Study the commit graph.
def printSquare(size):
    for r in range(size):
        for c in range(size):
            if r == 0 or r == size - 1 or c == 0 or c == size - 1:
                print('*', end='')
            else:
                print(' ', end='')
        print()
printSquare(5)

- Merge the feature branch into the main/master branch.
  Let's assume now that it has been decided that the feature branch is ready to be merged into the main/master branch.
  - Confirm that you are on the main/master branch.
  - Use git merge feature to merge the feature branch into the current branch.
  - The merge might be successful or it may lead to conflicts:
    - A conflict happens when the same lines of the same files were modified in different branches. You then need to resolve the conflicts manually (open each file and edit it so the code is correct). After that, you need to git add the files to the repository and git commit the changes.
    - When there are no conflicts, the merge is done automatically. You will be asked for a commit message. Use the default one.
  - Use git log --graph --oneline --all --decorate to see the history of the project in a graph. Ensure that you understand the locations of HEAD, feature and main/master. Your current branch is main/master. The feature branch is merged into the main/master branch but it is still there. It can be either deleted (git branch -d feature) or used for further development.
* 28277f9 (HEAD -> master) Merge branch 'feature'
|\
| * c63d595 (feature) added printing of X
* | 01a7ede Changed argument name
|/
* d689bf2 first file
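The conflict rule from the merge step above can be illustrated with a deliberately simplified three-way merge. The function merge_lines below is a hypothetical helper, not git's algorithm: it assumes that all three versions of a file have the same number of lines and compares them position by position, whereas git first aligns the changed regions with a diff. The sample lines are shortened stand-ins for pattern.py. The sketch still shows the core idea: a line changed on only one side is taken from that side, and a line changed differently on both sides is a conflict.

```python
def merge_lines(base, ours, theirs):
    """Merge two edited versions of a file against their common ancestor.

    Simplifying assumption: all three versions have the same number of lines.
    Returns (merged_lines, conflicts); conflicts lists 0-based line indexes.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this toy merge only handles equal-length versions")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:            # both sides agree (same change, or no change)
            merged.append(o)
        elif o == b:          # only 'theirs' changed this line
            merged.append(t)
        elif t == b:          # only 'ours' changed this line
            merged.append(o)
        else:                 # both changed it differently: needs a human
            merged.append(f"CONFLICT: ours={o!r} theirs={t!r}")
            conflicts.append(i)
    return merged, conflicts


base   = ["def printSquare(n):", "    pass", "printSquare(5)"]
ours   = ["def printSquare(size):", "    pass", "printSquare(5)"]   # e.g. main
theirs = ["def printSquare(n):", "    pass", "printSquare(7)"]      # e.g. feature

# Different lines were edited, so the merge succeeds without conflicts.
print(merge_lines(base, ours, theirs))

# Both sides edit the first line differently: conflict at line index 0.
theirs_conflicting = ["def printSquare(width):", "    pass", "printSquare(5)"]
print(merge_lines(base, ours, theirs_conflicting)[1])
```

The base version plays the role of the common ancestor commit, which in the graph above is d689bf2.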
- Examples of the commit graph in a large project:
  - Python pandas: https://github.com/pandas-dev/pandas/network
  - R ggplot2: https://github.com/tidyverse/ggplot2/network
- Common branching strategies:
