|
1 | 1 | # git-scm.com architecture |
2 | 2 |
|
3 | 3 | This document describes the general setup and architecture that runs the |
4 | | -git-scm.com site. The idea is to document all the moving parts that |
5 | | -_aren't_ checked in to this repository. That may help new people joining |
6 | | -the project to help out, as well provide some continuity in case the |
7 | | -maintainer is hit by a bus. |
| 4 | +git-scm.com site. |
8 | 5 |
|
9 | 6 | ## Content |
10 | 7 |
|
11 | | -Though the site is a rails app, it can _mostly_ be thought of as serving |
12 | | -static content. It's just that we suck in that static content and |
13 | | -pre-process it using nightly scheduled jobs. We never write anything to |
14 | | -the database on behalf of user requests. |
| 8 | +This site is served via GitHub Pages and is a [Hugo](https://gohugo.io/) site |
| 9 | +with the search implemented using [Pagefind](https://pagefind.app/). |
15 | 10 |
|
16 | 11 | The content is a mix of: |
17 | 12 |
|
18 | | - - actual static content in this repository |
| 13 | + - original content from this repository |
19 | 14 |
|
20 | 15 | - community book content brought in from https://github.com/progit; |
21 | | - see the `lib/tasks/book2.rake` file. |
| 16 | + see the `script/update-book2.rb` and `script/book.rb` files. |
22 | 17 |
|
23 | | - - manpages from releases of the git project, imported and formatted |
24 | | - via asciidoctor; see the `lib/tasks/index.rake` task. |
| 18 | + The content is pre-rendered and tracked in the `external/book/` directory |
| 19 | + tree. |
25 | 20 |
|
| 21 | + - manual pages from releases of the git project, imported and formatted via |
| 22 | + AsciiDoctor, and translated versions of the manual pages from |
| 23 | + https://github.com/jnavila/git-html-l10n/ (which itself contains |
| 24 | + pre-rendered pages from https://github.com/jnavila/git-manpages-l10n/); see |
| 25 | + the `script/update-docs.rb` file. |
26 | 26 |
|
27 | | -## Heroku |
| 27 | + The pre-rendered pages are tracked in the `external/docs/` directory tree. |
28 | 28 |
|
29 | | -The app itself is served by Heroku. The app name is `git-scm` (so you |
30 | | -can visit it directly as https://git-scm.herokuapp.com). The site is |
31 | | -owned by the git-scm.com team. If you want to be involved in managing |
32 | | -uptime/deploys/etc, you'll need a Heroku account and request to be added |
33 | | -to that team. |
| 29 | +To deploy to GitHub Pages, it is necessary to turn off the default setting to |
| 30 | +"publish from a branch" and instead change the setting to "publish with a |
| 31 | +custom GitHub Actions workflow": |
| 32 | +https://docs.github.com/en/pages/getting-started-with-github-pages/configuring-a-publishing-source-for-your-github-pages-site#publishing-with-a-custom-github-actions-workflow |
| 33 | +With this change, the site can be tested in the fork by pushing to the |
| 34 | +`gh-pages` branch (which will trigger the `deploy.yml` workflow) and then |
| 35 | +navigating to https://<user>.github.io/git-scm.com/. |
34 | 36 |
|
35 | | -We use a few Heroku add-ons: |
| 37 | +## Non-static parts |
36 | 38 |
|
37 | | - - Bonsai elasticsearch (see below) |
| 39 | +While the site consists mostly of static content, there are a couple of |
| 40 | +parts that are sort of dynamic. |
38 | 41 |
|
39 | | - - Heroku Postgres as the database |
| 42 | +The search is implemented client-side, via [Pagefind](https://pagefind.app/). |
40 | 43 |
|
41 | | - - Heroku Redis for rails caching |
| 44 | +A few scheduled GitHub workflows keep the content up to date: |
42 | 45 |
|
43 | | - - Heroku scheduler for cron jobs |
| 46 | + - `update-git-version-and-manual-pages` and `update-download-data` (pick |
| 47 | + up newly released git versions) |
44 | 48 |
|
45 | | -The nightly scheduled jobs are: |
| 49 | + - `update-translated-manual-pages` (fetch and format translated manual |
| 50 | + pages from the jnavila/git-html-l10n repository) |
46 | 51 |
|
47 | | - - `rake downloads` (pick up newly released git versions) |
48 | | - |
49 | | - - `rake preindex` (pull in and format manpages for released git |
50 | | - versions) |
51 | | - |
52 | | - - `rake remote_genbook2` (pull in and format progit2 book content, |
| 52 | + - `update-book` (fetch and format progit2 book content, |
53 | 53 | including translations) |
54 | 54 |
|
55 | | -It should be safe to run any of those jobs more frequently. E.g., if you |
56 | | -know there's a new Git release out, then: |
57 | | - |
58 | | - heroku run rake preindex |
59 | | - heroku run rake downloads |
60 | | - |
61 | | -will get it on the site without waiting for the nightly run. |
62 | | - |
63 | | -Merges to the `main` branch on GitHub auto-deploy to Heroku, so unless |
64 | | -you're doing something tricky you generally shouldn't need to manually |
65 | | -deploy. |
66 | | - |
67 | | -Note that some of the formatting of manpages and book content happens |
68 | | -when they are imported by the rake tasks. So after fixing some |
69 | | -formatting and deploying, the rake jobs may need to be re-run with a |
70 | | -special flag to re-import (see the individual tasks for details). |
71 | | - |
72 | | - |
73 | | -## Cloudflare |
74 | | - |
75 | | -We get enough requests that it's easy to overwhelm the single Heroku |
76 | | -dyno. So we have Cloudflare sitting in front of it, aggressively caching |
77 | | -everything. That also should make the site faster to serve to regions |
78 | | -far away from Heroku's servers. |
79 | | - |
80 | | -The Cloudflare setup is mostly pretty simple: |
| 55 | +These workflows are also marked as `workflow_dispatch`, i.e. they can be run |
| 56 | +manually (e.g. to update the download links just after Git for Windows |
| 57 | +published a new release). |
81 | 58 |
|
82 | | - - they serve DNS for the whole domain (that's where they insert the CDN |
83 | | - magic) |
84 | | - |
85 | | - - Cloudflare provides `https://` support to the user. Obviously the |
86 | | - site is totally open and doesn't have any sensitive data, so this is |
87 | | - really more about integrity. The certificate is generated by |
88 | | - Cloudflare (and requires SNI on the browser side). |
89 | | - |
90 | | - - the Cloudflare connection to Heroku is passed over TLS; they provide an |
91 | | - "internal" certificate that we ask Heroku to use, so the connection |
92 | | - is secured between the two (again, mostly for integrity) |
93 | | - |
94 | | - - the most exotic config is that we use "page rules" to mark the whole |
95 | | - site to be cached aggressively, regardless of any caching headers |
96 | | - sent from Heroku. This is a bit of a hack, but there's very little on |
97 | | - the site that can't be cached (which is perhaps a sign that the rails |
98 | | - setup needs to be tweaked to send more reasonable caching headers, |
99 | | - but this has been simple and effective so far). |
100 | | - |
101 | | - There are a few special page rules to lift this caching for cases |
102 | | - where we do server-side logic (e.g., |
103 | | - https://github.com/git/git-scm.com/issues/1129#issuecomment-363067019"), |
104 | | - but the long-term goal is to push that logic onto the client side as |
105 | | - much as possible. |
106 | | - |
107 | | -Both domains (c.f., the section on [DNS](#DNS) below) are owned by a |
108 | | -Cloudflare "Team", and membership of that team is required to |
109 | | -administrate the domains. Similar to the Heroku setup, you can ask to |
110 | | -join this team if you wish to help out. The information about the team |
111 | | -setup is in escrow with the Git PLC at Software Freedom Conservancy. |
112 | | -Cloudflare provides the project with enough credits that it doesn't cost |
113 | | -anything (though we're not using very many features, so it's possible |
114 | | -that a free account would be sufficient, too). |
115 | | - |
116 | | -## Bonsai Elasticsearch |
117 | | - |
118 | | -The search functionality on the site is served by an elasticsearch |
119 | | -cluster. The index can be populated by running `rake search_index` |
120 | | -(manpages) and `rake search_index_book` (book) on Heroku (we only index |
121 | | -the manpages and book). This perhaps should be run nightly, or at least |
122 | | -after pulling in new content, but it currently isn't done automatically. |
123 | | - |
124 | | -The elasticsearch cluster is provided by Bonsai via their Heroku plugin. |
125 | | -Our needs are larger than their free tier provides, but we receive |
126 | | -credits from them that provide the service for free. |
| 59 | +Merges to the `gh-pages` branch on GitHub auto-deploy to GitHub Pages via the |
| 60 | +`deploy` GitHub workflow. |
127 | 61 |
|
| 62 | +Note that some of the formatting of manual pages and book content happens |
| 63 | +when they are imported by the GitHub workflows. Therefore, whenever there are |
| 64 | +changes to the scripts/workflows/automation that affect formatting, these |
| 65 | +workflows may need to be triggered manually with the force-rebuild flag |
| 66 | +toggled (see the individual workflows for details). |
128 | 67 |
|
129 | 68 | ## DNS |
130 | 69 |
|
131 | | -The actual DNS service is provided by Cloudflare (see above). The domain |
132 | | -itself is registered with Gandi, and is owned by the project via |
133 | | -Software Freedom Conservancy. Funds for the registration are provided |
134 | | -from the Git project's Conservancy funds, and both the Git PLC and |
135 | | -Conservancy have credentials to modify the setup. |
| 70 | +The actual DNS service is provided by Cloudflare. The domain itself is |
| 71 | +registered with Gandi, and is owned by the project via Software Freedom |
| 72 | +Conservancy. Funds for the registration are provided from the Git project's |
| 73 | +Conservancy funds, and both the Git PLC and Conservancy have credentials to |
| 74 | +modify the setup. |
136 | 75 |
|
137 | 76 | Note that we own both git-scm.com and git-scm.org; the latter redirects |
138 | 77 | to the former. |
139 | 78 |
|
140 | | - |
141 | 79 | ## Manual Intervention |
142 | 80 |
|
143 | 81 | The site mostly just runs without intervention: |
144 | 82 |
|
145 | | - - code merged to `main` is auto-deployed |
| 83 | + - code merged to `gh-pages` is auto-deployed |
146 | 84 |
|
147 | | - - new git versions are detected daily and manpages and download links |
| 85 | + - new git versions are detected daily and manual pages and download links |
148 | 86 | updated |
149 | 87 |
|
150 | 88 | - book updates (including translations) are picked up daily |
151 | 89 |
|
152 | 90 | There are a few tasks that still need to be handled by a human: |
153 | 91 |
|
154 | | - - new images added to the book have to be copied manually from |
155 | | - progit/progit2 |
156 | | - |
157 | 92 | - new languages for book translations need to be added to |
158 | | - `lib/tasks/book2.rake` |
| 93 | + `script/book.rb` |
159 | 94 |
|
160 | | - - forced re-imports of content (e.g., a formatting fix to imported |
161 | | - manpages) must be triggered manually |
| 95 | + - forced re-imports of content (e.g., when fixing formatting in the |
| 96 | + imported manual pages) must be triggered manually with `force-rebuild` |
| 97 | + toggled |
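
The manual trigger described above can also be issued programmatically. The following is a minimal sketch, not the project's own tooling: it builds (without sending) a request for GitHub's REST endpoint that creates a `workflow_dispatch` event. The workflow file name `update-git-version-and-manual-pages.yml` is an assumption derived from the workflow name in the document, and passing `force-rebuild` as a string-valued input is likewise assumed; check the individual workflow files for the inputs they actually declare.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, inputs=None, token=None):
    """Build, but do not send, a request that triggers a workflow_dispatch event."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    payload = {"ref": ref}
    if inputs:
        payload["inputs"] = inputs
    headers = {
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    if token:
        # A token with permission to run workflows is needed to actually send this.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_dispatch_request(
    owner="git",
    repo="git-scm.com",
    # Assumption: the workflow file is named after the workflow.
    workflow_file="update-git-version-and-manual-pages.yml",
    ref="gh-pages",
    # Assumption: the workflow declares a `force-rebuild` input.
    inputs={"force-rebuild": "true"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request, pass a token to `build_dispatch_request` and hand the result to `urllib.request.urlopen`. Building the request separately from sending it makes it possible to inspect the payload before anything is triggered.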