This is a tutorial going over some computational workflow basics that will be useful for students in GU4260 Modeling the Universe course in the Columbia Astronomy Department.
Ana Lam
Oct. 2, 2025
To clone a repository (repo), let's first fork my dummy repo found at here. Forking in GitHub simply means creating a personal copy of an existing repository in your own account. To fork simply click the fork icon.
Now, this repo should appear in your own GitHub account. We can now go ahead and clone this dummy repo. Git cloning is a fundamental git command used to create a local copy of an existing Git repo. Once you clone the repo onto your local computer, all the files and commit history of that repo will appear in your local copy of that repo.
Let's go ahead and clone the dummy repo in the command line and navigate to that repo. First, let's grab the url of our repo so we can easily clone it when at the command line.
Open a terminal session and do the following:
foo@bar:~$ cd Documents
foo@bar:Documents$ git clone https://github.com/[your-username]/modelingtheuniverse_tutorial.git
foo@bar:Documents$ cd modelingtheuniverse_tutorialLet's go ahead and check the repo status:
foo@bar:modelingtheuniverse_tutorial$ git statusYou should see something like this which will report a) what branch you are on, b) if your branch is up to date with the branch, c) if you have made changes to files in the repo that you haven't committed or added in order to be "tracked" by your git history.
On branch main
No commits yet
nothing to commit (create/copy files and use "git add" to track)As you can see here, we are on the main branch and we do not have any commits or created/changed file to commit.
Now let's learn some git basics. After creating your files or jupyter notebook solutions, you will need to git add them in order to track them and git commit -m "your message" to commit/save those changes. Let's try creating a dummy file that we will track and save.
foo@bar:modelingtheuniverse_tutorial$ echo "This is a dummy file." > dummy.txt
foo@bar:modelingtheuniverse_tutorial$ git add dummy.txtLet's take a look at our git status:
foo@bar:modelingtheuniverse_tutorial$ git status
On branch main
Your branch is up to date with 'origin/main'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: dummy.txtSo we are tracking the changes to that file because we have added it! Let's git commit the changes/creation of that file so it will appear in our git history. Our git history is essentially a tracker of all the tiny changes we make to files in a repo. Imagine it like "saving"/version history of a word document.
With commits we need to pass a message which we flag with -m:
foo@bar:modelingtheuniverse_tutorial$ git commit -m "create dummy.txt"
[main 100a41d] create dummy.txt
1 file changed, 1 insertion(+)
create mode 100644 dummy.txtAs you can see here, we've created and changed a file and that commit id is 100a41d.
Now, we want these changes in our local repo that we've saved to be reflected in our remote repo on github.com! In order to do this we need to push this branch with the changes which is just the simple git push operation:
foo@bar:modelingtheuniverse_tutorial$ git push
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 299 bytes | 299.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:ana-lam/modelingtheuniverse_tutorial.git
406de9f..100a41d main -> mainAnd voila! You should see the creation of dummy.txt and it's commit message in github.
Branches in Git let you work on new features, bug fixes, or problem set solutions without interfering with the main branch. Think of them as "parallel copies" of your repo where you can make changes safely without necessarily saving them to a "final" version of your repo.
Let's check our current branches:
foo@bar:modelingtheuniverse_tutorial$ git branch -a
* main
remotes/origin/mainThe * indicates the branch you're currently on (here, main). As you can see that is our only branch and it tracks/pushes to remotes/origin/main, the github.com/remote version of your main branch.
Let's create a new branch:
foo@bar:modelingtheuniverse_tutorial$ git checkout -b question1
Switched to a new branch 'question1'Since the question1 branch does not exist yet, when we git checkout this branch we will both create it and switch to it.
foo@bar:modelingtheuniverse_tutorial$ git branch
main
* question1We can now make changes in this branch and save them, for example:
foo@bar:modelingtheuniverse_tutorial$ echo "print('Hello from Problem Set 1')" > pset1.py
foo@bar:modelingtheuniverse_tutorial$ git add pset1.py
foo@bar:modelingtheuniverse_tutorial$ git commit -m "Add pset1.py"And push those changes to the remote version of this branch:
foo@bar:modelingtheuniverse_tutorial$ git push -u origin question1The -u sets this branch to track the remote one, so in the future you can just git push.
If you are happy with the changes in this branch and want to them in our main branch you can open a pull request in github.com.
You will be prompted to add a description of this pull request you create. The pull request is essentially a request to pull the changes in the question1 branch into the main branch.
Then simply click Merge pull request to merge the pull request.
Now an important part of developing code for scientific projects (especially in collaborative settings and for the sake of reproducibility) is the follow some best practices of environment management on top of the github workflow for version tracking.
Staying organized with environments is important because different projects/problem sets might need different packages/libraries or even different versions of said packages/libraries.
To avoid conflicts, it is a good practice to use a per-repo isolated environment instead of installing several packages globally on your computer. uv is a very helpful and powerful tool for this.
uv is a fast and efficient Python package and environment manager. It creates lightweight virtual environments, installs dependencies and keeps them reproducible, and keeps environments clean on a per-repo basis.
Inside your repo/problem set directory, let's create a virtual environment.
foo@bar:modelingtheuniverse_tutorial$ uv venvThis will create a .venv/ folder inside your repo. This is a hidden folder that contains its own Python and common Python libraries/packages + additional ones you add.
We can activate this virtual environment like so:
foo@bar:modelingtheuniverse_tutorial$ source .venv/bin/activate
(modelingtheuniverse_tutorial) foo@bar:modelingtheuniverse_tutorial$You can now add packages:
(modelingtheuniverse_tutorial) foo@bar:modelingtheuniverse_tutorial$ uv pip install astropyYou can save all the installed packages so collaborators can reproduce your code by doing the following which will create a requirements.txt file with the packages listed:
(modelingtheuniverse_tutorial) foo@bar:modelingtheuniverse_tutorial$ uv pip freeze > requirements.txtIn the future, you can transfer this requirements.txt file to other repos and install it in other uv virtual environments by doing the following:
(another_venv) foo@bar:[another_repo]$ uv pip install -r requirements.txt-
Always activate your environment before running code or Jupyter notebooks.
-
Keep a requirements.txt in every repo.
-
If you break your environment, you can safely delete .venv/ and recreate it with
uv venv+uv pip install -r requirements.txt.
Jupyter notebooks are very useful tools. They are essentially interactive documents where you can code and add markdown (text explanations). This is a Jupyter notebook!
To launch a notebook simply run:
(modelingtheuniverse_tutorial) foo@bar:modelingtheuniverse_tutorial$ jupyter notebook*** If this doesn't work first run uv pip install notebook to install the classic interface
This will open a browser window at http://localhost:8888 where you can create and run notebooks (.ipynb files).
I highly suggest using Jupyter Notebooks in VS code. VS Code is a very powerful code editor that comes with many extensions (including Jupyter + Github) that make things much easier.
To do this, install VS Code and in the extensions tab download the Python extension and the Jupyter extension. This will allow you to edit interactive .ipynb notebooks and you can easily select your environments (like .venv) as the kernel. Kernels are the engines that run your code. Kernels are tied to specific environments like the one we created. You can follow this YouTube video that goes through the steps of downloading the Jupyter extension in VS Code.






