| 1 | +--- |
| 2 | +title: "GitHub" |
| 3 | +date: "2025-05-18" |
| 4 | +# date-modified: "" |
| 5 | +--- |
| 6 | + |
| 7 | +This document describes why we decided to use, and recommend using, |
| 8 | +GitHub for research and communication purposes. |
| 9 | + |
| 10 | +## Needs |
| 11 | + |
| 12 | +Increasingly, researchers around the world are using Git to work |
| 13 | +collaboratively and using GitHub to share those Git projects. |
| 14 | +For articles discussing GitHub and research, see |
| 15 | +[here](https://pmc.ncbi.nlm.nih.gov/articles/PMC11828340/), |
| 16 | +[here](https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.14108), |
| 17 | +[here](https://www.uu.nl/en/research/research-data-management/tools/software-and-computing/github-and-git), |
| 18 | +[here](https://peer.asee.org/31594), |
| 19 | +[here](https://academic.oup.com/proteincell/article/14/10/713/7147618?login=false), |
| 20 | +and |
| 21 | +[here](https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giad113/7516267?login=false) |
| 22 | +to name a few. Tools like RStudio for R or Jupyter for Python support |
| 23 | +Git, built in or through extensions, and integrate well with GitHub. Many |
| 24 | +courses for PhD students also include or reference GitHub in some way, |
| 25 | +so students are increasingly exposed to GitHub and use it in their |
| 26 | +own work. One example is the [Reproducible Research in |
| 27 | +R](https://r-cubed.rostools.org) workshops taught with the [Danish |
| 28 | +Diabetes and Endocrine Academy](https://ddeacademy.dk/). |
| 29 | + |
| 30 | +As researchers at Steno Diabetes Center Aarhus, we need a platform where |
| 31 | +we can share procedures and guidelines, host project websites, and keep |
| 32 | +documents that we collaborate on. Preferably, we want a tool |
| 33 | +that researchers are already using or will likely use in the near |
| 34 | +future, and we want a tool that integrates easily into how we already |
| 35 | +work. We would rather not have to teach and learn a tool that is only |
| 36 | +used in one area (e.g. only to host websites or keep documents). We would |
| 37 | +rather use a tool that is increasingly part of a researcher's current or |
| 38 | +future toolbox. |
| 39 | + |
| 40 | +We also want a tool that is relatively easy to contribute to, even for |
| 41 | +those outside of Steno Aarhus (e.g. external collaborators). We want as |
| 42 | +little friction and delay as possible between a person wanting to add, |
| 43 | +change, or create something on a website and that change going |
| 44 | +online. That means the person wanting to contribute should not need |
| 45 | +to coordinate with someone else, wait for someone else to do it, or wait |
| 46 | +a long time for a change to be approved and put online. We want |
| 47 | +to empower researchers. |
| 48 | + |
| 49 | +Because of this, we consider GitHub the best tool for our needs. We |
| 50 | +already use GitHub at Steno Diabetes Center Aarhus for a wide variety |
| 51 | +of purposes, ranging from working on research projects like the [UK |
| 52 | +Biobank](https://steno-aarhus.github.io/ukbAid/), to developing tools like |
| 53 | +the [Open Source Diabetes |
| 54 | +Classifier](https://steno-aarhus.github.io/osdc/), to hosting websites |
| 55 | +for projects like [ON-LiMiT](https://steno-aarhus.github.io/on-limit/), |
| 56 | +to listing all Steno Aarhus GitHub webpages at the |
| 57 | +[main](https://steno-aarhus.github.io/) website. |
| 58 | + |
| 59 | +## So what are Git and GitHub? |
| 60 | + |
| 61 | +This could be a full document on its own. But to keep it short, |
| 62 | +[Git](https://git-scm.com/) is a version control system that allows you |
| 63 | +to track changes to your files and store them in a "history of changes" |
| 64 | +called a repository. It is a powerful tool to use when working alone, |
| 65 | +but especially when working with others. It allows you to work on the |
| 66 | +same files together, and keep track of who made which changes when. It |
| 67 | +is a tool used by millions of people around the world, especially in the |
| 68 | +software development community. Software like RStudio and Jupyter or |
| 69 | +services like Netflix or Airbnb are all developed with Git, while |
| 70 | +companies like Microsoft or Apple or governments like the UK government |
| 71 | +use Git to develop their software. You can, however, use Git for more |
| 72 | +than just software; you can use it to track changes to *any type of* file on |
| 73 | +your computer. |
| 74 | + |
| 75 | +[GitHub](https://github.com) is a web-based platform for hosting/storing |
| 76 | +Git repositories. It adds extra features on top of Git's file history |
| 77 | +tracking, like issue and task tracking, pull requests for collaboration, |
| 78 | +project management, and the ability to host files for a website. GitHub |
| 79 | +is the [most |
| 80 | +popular](https://hutte.io/trails/git-based-development-statistics/) Git |
| 81 | +hosting platform, is owned by Microsoft, and is used by millions of |
| 82 | +people around the world every day. It hosts software and documentation |
| 83 | +used in all areas of life, from healthcare to finance to security to the |
| 84 | +functioning of the internet itself. |
| 85 | + |
| 86 | +## GitHub fits our needs |
| 87 | + |
| 88 | +GitHub is not just a tool to host Git repositories; it also has a |
| 89 | +massive community of open science enthusiasts, researchers, and |
| 90 | +developers. Aside from this, it is: |
| 91 | + |
| 92 | +- Free to use for both private repositories (with limited features) |
| 93 | + and public ones (with nearly all features). |
| 94 | +- Incredibly reliable and stable, primarily because so many people and |
| 95 | + organizations depend on it, which places high pressure on GitHub to |
| 96 | + stay reliable and stable. |
| 97 | +- Very secure, with dedicated teams working to protect it from |
| 98 | + attacks. Individual Git repositories can be secured even further |
| 99 | + with additional settings. |
| 100 | +- User friendly, with many accessibility features such as |
| 101 | + support for colour-blindness. |
| 102 | +- Already used by many researchers, especially new researchers, or |
| 103 | + very likely to be used by them in the near future |
| 104 | + because of the increasing popularity of Git and GitHub in the |
| 105 | + research community. |
| 106 | +- Designed, like Git itself, to empower participation and |
| 107 | + collaboration. Anyone can contribute to a public repository |
| 108 | + through a feature called pull requests, |
| 109 | + which means that if someone sees a typo or mistake in a |
| 110 | + webpage or in analysis code, they can easily and directly make a fix |
| 111 | + and suggest it to be added. |
| 112 | + |
| 113 | +## Security |
| 114 | + |
| 115 | +Security is a major concern for most organizations, especially nowadays. |
| 116 | +This concern is not just for academic institutions, but also for |
| 117 | +companies and governments. Since so many organizations around the world |
| 118 | +depend on GitHub, security is a big priority for GitHub. Because so many |
| 119 | +people depend on it, GitHub experiences regular and repeated attacks. |
| 120 | +Even so, it rarely goes down, and when it does, it is usually only |
| 121 | +for an hour or two. That's because of the dedicated teams of people at |
| 122 | +GitHub working to defend against these attacks. For example, see some discussion |
| 123 | +of security practices |
| 124 | +[here](https://wardenshield.com/how-safe-is-github-a-deep-dive-into-understanding-how-github-claims-to-protect-without-spying-on-users) |
| 125 | +or |
| 126 | +[here](https://www.thousandeyes.com/blog/how-github-successfully-mitigated-ddos-attack). |
| 127 | +Alongside its internal security work, GitHub also has extensive |
| 128 | +[documentation](https://docs.github.com/en/get-started/learning-about-github/about-github-advanced-security) |
| 129 | +to guide users on what they can do to improve the security of |
| 130 | +their repositories. |
| 131 | + |
| 132 | +And if something does happen to GitHub, it is easy to move your content to |
| 133 | +another Git hosting platform, because every copy of a Git repository holds its full history. |
| 134 | + |
| 135 | +Some security practices we use for the Steno Aarhus GitHub organization: |
| 136 | + |
| 137 | +- We follow the principle of least privilege. By default, all members |
| 138 | + can only read other repositories unless they are assigned to a team |
| 139 | + that has write access to a repository or are added directly to a |
| 140 | + repository. |
| 141 | +- All main "branches" in our repositories are protected and can't be |
| 142 | + deleted by anyone aside from the organization owners (in case of an |
| 143 | + emergency). Only a few people are given owner status in the |
| 144 | + organization. |
| 145 | + |
| 146 | +Having said that, no system is truly secure. No organisation or service |
| 147 | +can protect against social engineering attacks like phishing or other |
| 148 | +attacks that target the user. Users are generally the weakest link in |
| 149 | +the security chain. We can't control that aside from limiting access and |
| 150 | +educating users, which means training them in basic |
| 151 | +security practices. |
| 152 | + |
| 153 | +## Hosting websites |
| 154 | + |
| 155 | +One of the biggest reasons we at Steno Aarhus use GitHub is to host |
| 156 | +websites. GitHub has a service called [GitHub |
| 157 | +Pages](https://pages.github.com/) that allows you to host a website for |
| 158 | +free. |
| 159 | + |
| 160 | +What do "host" and "website" mean? A website is simply a collection of |
| 161 | +plain HTML files that are linked together and that take their styling |
| 162 | +instructions from [CSS](https://www.w3schools.com/css/) |
| 163 | +files. If you open these files in a browser like Firefox or Chrome, they |
| 164 | +will be displayed as a regular website. But if you open those same files |
| 165 | +in a text editor, you will see HTML code and text. That's because |
| 166 | +browsers take the HTML text and visually render it into the nice form we |
| 167 | +see on websites. |
| 168 | + |
| 169 | +To host a website means to put those HTML and CSS files on a server so |
| 170 | +that a browser anywhere in the world can read them if given the correct |
| 171 | +URL. That's it. Hosting is simply copying the files to a |
| 172 | +server that is connected to the internet. |
| 173 | + |
| 174 | +In the case of GitHub Pages, you are giving GitHub a set of static HTML |
| 175 | +files (served as they are, not generated on the server for each visitor) that GitHub will put online (host) |
| 176 | +for you. It does not do anything else for you. How you generate the HTML |
| 177 | +files is up to you. We use a tool called [Quarto](https://quarto.org/) |
| 178 | +to generate the HTML files from the more human-friendly |
| 179 | +[Markdown](https://quarto.org/docs/authoring/markdown-basics.html) files |
| 180 | +to make the Steno Aarhus websites. Read about our reasons for using |
| 181 | +Quarto on our [Quarto](quarto.md) page. |
| 182 | + |
| 183 | +The advantage of using GitHub Pages for this is that because the HTML |
| 184 | +files are simply copied static files, our websites are quick to load and |
| 185 | +won't go down unless GitHub itself goes down. Because they are static |
| 186 | +files that aren't connected to any database or backend, there is very little |
| 187 | +for an attacker to access or exploit, making them secure by default. |
| 188 | +Even if a contributor makes a mistake in writing the Markdown files and |
| 189 | +the Quarto tool can't regenerate the updated website, the existing |
| 190 | +website stays online. It only gets updated when a completed |
| 191 | +re-generation of the HTML files happens. If it isn't completed, nothing |
| 192 | +changes on the website. |
| 193 | + |
| 194 | +All this makes GitHub Pages a very reliable, secure, fast, and easy way |
| 195 | +to host websites. |
| 196 | + |
| 197 | +## Potential consequences |
| 198 | + |
| 199 | +Every decision has consequences. For us, the biggest one is |
| 200 | +the time commitment needed to learn how to use Git and GitHub. Although |
| 201 | +the basics aren't too complicated to learn, they are still complex tools |
| 202 | +with a lot of functionality. That means it can take time to become |
| 203 | +familiar and comfortable with them. |
| 204 | + |
| 205 | +The advantage of GitHub compared to a dedicated website hosting service |
| 206 | +is that, if a researcher learns Git and GitHub, they can easily |
| 207 | +incorporate that knowledge into their own workflow and research. If they |
| 208 | +learn a website-specific tool, that knowledge and skill will only apply |
| 209 | +to that tool and not translate to improving their research workflow. So |
| 210 | +learning Git and GitHub in this case is an investment that has multiple |
| 211 | +sources of return, such as improving collaboration, improving |
| 212 | +reproducibility and transparency, and simplifying the process of sharing |
| 213 | +their work with others. |
| 214 | + |
| 215 | +Plus, many groups across Steno Aarhus are already using GitHub (e.g. the |
| 216 | +epidemiology group) and we have workshops and documentation to train |
| 217 | +people on using Git and GitHub, which means that support and help |
| 218 | +are available. |
| 219 | + |
| 220 | +## Conclusion |
| 221 | + |
| 222 | +For the reasons and explanations given above, we have chosen to use |
| 223 | +GitHub for various purposes at Steno Diabetes Center Aarhus. It is fast, |
| 224 | +powerful, reliable, and secure, and it is a tool that researchers are |
| 225 | +either already using or will very likely use at some point in the |
| 226 | +future. |