
Commit 0c534ca

Merge pull request #380 from github/lildude/docs-update-reorg
Documentation update and re-organisation
2 parents b9a81ed + 4b9968c commit 0c534ca

9 files changed: 356 additions and 282 deletions

README.md

Lines changed: 18 additions & 264 deletions
@@ -1,24 +1,13 @@
-GitHub Enterprise Backup Utilities
-==================================
+# GitHub Enterprise Backup Utilities
 
 This repository includes backup and recovery utilities for [GitHub Enterprise][1].
 
-- **[Features](#features)**
-- **[Requirements](#requirements)**
-  - **[Backup host requirements](#backup-host-requirements)**
-  - **[Storage requirements](#storage-requirements)**
-  - **[GitHub Enterprise version requirements](#github-enterprise-version-requirements)**
-- **[Getting started](#getting-started)**
-- **[Migrating from GitHub Enterprise v11.10.34x to v2.0](#migrating-from-github-enterprise-v111034x-to-v20-or-v21)**
-- **[Using the backup and restore commands](#using-the-backup-and-restore-commands)**
-- **[Scheduling backups](#scheduling-backups)**
-- **[Backup snapshot file structure](#backup-snapshot-file-structure)**
-- **[How does backup utilities differ from a High Availability replica?](#how-does-backup-utilities-differ-from-a-high-availability-replica)**
-- **[Support](#support)**
+**Note**: the [GitHub Enterprise version requirements](docs/requirements.md#github-enterprise-version-requirements) have
+changed starting with Backup Utilities v2.13.0, released on 27 March 2018.
 
 ### Features
 
-The backup utilities implement a number of advanced capabilities for backup
+Backup Utilities implement a number of advanced capabilities for backup
 hosts, built on top of the backup and restore features already included in
 GitHub Enterprise.
 
@@ -37,260 +26,25 @@ GitHub Enterprise.
 - Runs under most Linux/Unix environments.
 - MIT licensed, open source software maintained by GitHub, Inc.
 
-### Requirements
-
-The backup utilities should be run on a host dedicated to long-term permanent
-storage and must have network connectivity with the GitHub Enterprise appliance.
-
-##### Backup host requirements
-
-Backup host software requirements are modest: Linux or other modern Unix
-operating system with [bash][13], [git][14], [OpenSSH][15] 5.6 or newer, and [rsync][4] v2.6.4 or newer.
-
-The backup host must be able to establish network connections outbound to the
-GitHub appliance over SSH. TCP port 122 is used to backup GitHub Enterprise 2.0
-or newer instances, and TCP port 22 is used for older versions (11.10.34X).
-
-##### Storage requirements
-
-Storage requirements vary based on current Git repository disk usage and growth
-patterns of the GitHub appliance. We recommend allocating at least 5x the amount
-of storage allocated to the primary GitHub appliance for historical snapshots
-and growth over time.
-
-The backup utilities use [hard links][12] to store data efficiently, so the backup
-snapshots must be written to a filesystem with support for hard links.
-
-Using a [case sensitive][16] file system is strongly recommended to avoid conflicts.
-
-##### GitHub Enterprise version requirements
-
-The backup utilities are fully supported under GitHub Enterprise 2.0 or
-greater.
-
-The previous release series (11.10.34x) is also supported but must meet minimum
-version requirements. For online and incremental backup support, the GitHub
-Enterprise instance must be running version 11.10.342 or above.
-
-Earlier versions are supported, but online and incremental backups are not
-supported. We strongly recommend upgrading to the latest release if you're
-running a version prior to 11.10.342. Visit [enterprise.github.com][5] to
-download the most recent GitHub Enterprise version.
-
-Note: You can restore a snapshot that's at most two feature releases behind the restore target's version of GitHub Enterprise. For example, to restore a snapshot of GitHub Enterprise 2.4, the target GitHub Enterprise appliance must be running GitHub Enterprise 2.5.x or 2.6.x. You can't restore a snapshot from 2.4 to 2.7, because that's three releases ahead.
-
-
-### Getting started
-
-1. [Download the latest release version][release] and extract the repository using `tar`:
-
-`tar -xzvf /path/to/github-backup-utils-vMAJOR.MINOR.PATCH.tar.gz`
-
-*or* clone the repository using Git:
-
-`git clone -b stable https://github.com/github/backup-utils.git`
-
-2. Copy the [`backup.config-example`][2] file to `backup.config` and modify as
-necessary. The `GHE_HOSTNAME` value must be set to the GitHub Enterprise
-host name. Additional options are available and documented in the
-configuration file but none are required for basic backup functionality.
-
-* backup-utils will attempt to load the backup configuration from the following locations, in this order:
-
-```
-$GHE_BACKUP_CONFIG (User configurable environment variable)
-$GHE_BACKUP_ROOT/backup.config (Root directory of backup-utils install)
-$HOME/.github-backup-utils/backup.config
-/etc/github-backup-utils/backup.config
-```
-* In a clustering environment, the `GHE_EXTRA_SSH_OPTS` key must be configured with the `-i <abs path to private key>` SSH option.
-
-3. Add the backup host's SSH key to the GitHub appliance as an *Authorized SSH
-key*. See [Adding an SSH key for shell access][3] for instructions.
-
-4. Run `bin/ghe-host-check` to verify SSH connectivity with the GitHub
-appliance.
-
-5. Run `bin/ghe-backup` to perform an initial full backup.
-
-[release]: https://github.com/github/backup-utils/releases
-
-### Migrating from GitHub Enterprise v11.10.34x to v2.0, or v2.1
-
-If you are migrating from GitHub Enterprise version 11.10.34x to 2.0 or 2.1
-(note, migrations to versions greater than 2.1 are not officially supported),
-please see the [Migrating from GitHub Enterprise v11.10.34x][10] documentation
-in the [GitHub Enterprise System Administrator's Guide][11]. It includes
-important information on using the backup utilities to migrate data from your
-v11.10.34x instance to v2.0 or v2.1.
-
-### Using the backup and restore commands
-
-After the initial backup, use the following commands:
-
-- The `ghe-backup` command creates incremental snapshots of repository data,
-along with full snapshots of all other pertinent data stores.
-- The `ghe-restore` command restores snapshots to the same or separate GitHub
-Enterprise appliance. You must add the backup host's SSH key to the target
-GitHub Enterprise appliance before using this command.
-
-##### Example backup and restore usage
-
-The following assumes that `GHE_HOSTNAME` is set to "github.example.com" in
-`backup.config`.
-
-Creating a backup snapshot:
-
-$ ghe-backup
-Starting backup of github.example.com in snapshot 20140727T224148
-Connect github.example.com OK (v11.10.343)
-Backing up GitHub settings ...
-Backing up SSH authorized keys ...
-Backing up SSH host keys ...
-Backing up MySQL database ...
-Backing up Redis database ...
-Backing up Git repositories ...
-Backing up GitHub Pages ...
-Backing up Elasticsearch indices ...
-Completed backup of github.example.com in snapshot 20140727T224148 at 23:01:58
-
-Restoring from last successful snapshot to a newly provisioned GitHub Enterprise
-appliance at IP "5.5.5.5":
-
-$ ghe-restore 5.5.5.5
-Starting rsync restore of 5.5.5.5 from snapshot 20140727T224148
-Connect 5.5.5.5 OK (v11.10.343)
-Enabling maintenance mode on 5.5.5.5 ...
-Restoring Git repositories ...
-Restoring GitHub Pages ...
-Restoring MySQL database ...
-Restoring Redis database ...
-Restoring SSH authorized keys ...
-Restoring Elasticsearch indices ...
-Restoring SSH host keys ...
-Completed restore of 5.5.5.5 from snapshot 20140817T174152
-Visit https://5.5.5.5/setup/settings to configure the recovered appliance.
-
-A different backup snapshot may be selected by passing the `-s` argument and the
-datestamp-named directory from the backup location.
-
-The `ghe-backup` and `ghe-restore` commands also have a verbose output mode
-(`-v`) that lists files as they're being transferred. It's often useful to
-enable when output is logged to a file.
-
-When restoring to an already configured GHE instance, settings, certificate, and license data
-are *not* restored to prevent overwriting manual configuration on the restore
-host. This behavior can be overridden by passing the `-c` argument to `ghe-restore`,
-forcing settings, certificate, and license data to be overwritten with the backup copy's data.
-
-### Scheduling backups
-
-Regular backups should be scheduled using `cron(8)` or similar command
-scheduling service on the backup host. The backup frequency will dictate the
-worst case recovery point objective (RPO) in your backup plan. We recommend the
-following:
-
-- **Hourly backups** for GitHub Enterprise versions 11.10.342 or greater (due to
-improved online and incremental backup support)
-- **Daily backups** for versions prior to 11.10.342.
-
-Note: the time required to do full offline backups of large datasets under
-GitHub Enterprise versions prior to 11.10.342 may prohibit the use of daily
-backups. We strongly recommend upgrading to 11.10.342 or greater in that case.
-
-##### Example scheduling usage
-
-The following examples assume the backup utilities are installed under
-`/opt/backup-utils`. The crontab entry should be made under the same user that
-manual backup/recovery commands will be issued under and must have write access
-to the configured `GHE_DATA_DIR` directory.
-
-Note that the `GHE_NUM_SNAPSHOTS` option in `backup.config` should be tuned
-based on the frequency of backups. The ten most recent snapshots are retained by
-default. The number should be adjusted based on backup frequency and available
-storage.
-
-To schedule hourly backup snapshots with verbose informational output written to
-a log file and errors generating an email:
-
-
-
-0 * * * * /opt/backup-utils/bin/ghe-backup -v 1>>/opt/backup-utils/backup.log 2>&1
-
-To schedule nightly backup snapshots instead, use:
-
-
-
-0 0 * * * /opt/backup-utils/bin/ghe-backup -v 1>>/opt/backup-utils/backup.log 2>&1
-
-### Backup snapshot file structure
-
-Backup snapshots are stored in rotating increment directories named after the
-date and time the snapshot was taken. Each snapshot directory contains a full
-backup snapshot of all relevant data stores. Repository, Search, and Pages data
-is stored efficiently via hard links.
-
-*Please note* Symlinks must be maintained when archiving backup snapshots.
-Dereferencing or excluding symlinks, or storing the snapshot contents on a
-filesystem which does not support symlinks will result in operational
-problems when the data is restored.
-
-The following example shows a snapshot file hierarchy for hourly frequency.
-There are five snapshot directories, with the `current` symlink pointing to the
-most recent successful snapshot:
-
-./data
-   |- 20140724T010000
-   |- 20140725T010000
-   |- 20140726T010000
-   |- 20140727T010000
-   |- 20140728T010000
-      |- authorized-keys.json
-      |- elasticsearch/
-      |- enterprise.ghl
-      |- mysql.sql.gz
-      |- pages/
-      |- redis.rdb
-      |- repositories/
-      |- settings.json
-      |- ssh-host-keys.tar
-      |- strategy
-      |- version
-   |- current -> 20140728T010000
-
-Note: the `GHE_DATA_DIR` variable set in `backup.config` can be used to change
-the disk location where snapshots are written.
-
-### How does backup utilities differ from a High Availability replica?
-It is recommended that both backup utilities and an [High Availability replica](https://help.github.com/enterprise/admin/guides/installation/high-availability-cluster-configuration/) are used as part of a GitHub Enterprise deployment but they serve different roles.
-
-##### The purpose of the High Availability replica
-The High Availability replica is a fully redundant secondary GitHub Enterprise instance, kept in sync with the primary instance via replication of all major datastores. This active/passive cluster configuration is designed to minimize service disruption in the event of hardware failure or major network outage affecting the primary instance. Because some forms of data corruption or loss may be replicated immediately from primary to replica, it is not a replacement for the backup utilities as part of your disaster recovery plan.
-
-##### The purpose of the backup utilities
-Backup utilities are a disaster recovery tool. This tool takes date-stamped snapshots of all major datastores. These snapshots are used to restore an instance to a prior state or set up a new instance without having another always-on GitHub Enterprise instance (like the High Availability replica).
+### Documentation
 
+- **[Requirements](docs/requirements.md)**
+  - **[Backup host requirements](docs/requirements.md#backup-host-requirements)**
+  - **[Storage requirements](docs/requirements.md#storage-requirements)**
+  - **[GitHub Enterprise version requirements](docs/requirements.md#github-enterprise-version-requirements)**
+- **[Getting started](docs/getting-started.md)**
+- **[Using the backup and restore commands](docs/usage.md)**
+- **[Scheduling backups](docs/scheduling-backups.md)**
+- **[Backup snapshot file structure](docs/backup-snapshot-file-structure.md)**
+- **[How does Backup Utilities differ from a High Availability replica?](docs/faq.md)**
+- **[Docker](docs/docker.md)**
 
 ### Support
 
-If you find a bug or would like to request a feature in backup-utils, please
+If you find a bug or would like to request a feature in Backup Utilities, please
 open an issue or pull request on this repository. If you have a question related
 to your specific GitHub Enterprise setup or would like assistance with backup
-site setup or recovery, please contact our [Enterprise support team][7] instead.
+site setup or recovery, please contact our [Enterprise support team][2] instead.
 
 [1]: https://enterprise.github.com
-[2]: https://github.com/github/enterprise-backup-site/blob/master/backup.config-example
-[3]: https://enterprise.github.com/help/articles/adding-an-ssh-key-for-shell-access
-[4]: http://rsync.samba.org/
-[5]: https://enterprise.github.com/download
-[6]: https://enterprise.github.com/help/articles/upgrading-to-a-newer-release
-[7]: https://enterprise.github.com/support/
-[8]: https://enterprise.github.com/help/articles/backing-up-enterprise-data
-[9]: https://enterprise.github.com/help/articles/restoring-enterprise-data
-[10]: https://help.github.com/enterprise/2.0/admin-guide/migrating-to-a-different-platform-or-from-github-enterprise-11-10-34x/
-[11]: https://help.github.com/enterprise/2.0/admin-guide/
-[12]: https://en.wikipedia.org/wiki/Hard_link
-[13]: https://www.gnu.org/software/bash/
-[14]: https://git-scm.com/
-[15]: https://www.openssh.com/
-[16]: https://en.wikipedia.org/wiki/Case_sensitivity
+[2]: https://enterprise.github.com/support/

docs/README.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# GitHub Enterprise Backup Utilities Documentation
+
+- **[Requirements](requirements.md)**
+  - **[Backup host requirements](requirements.md#backup-host-requirements)**
+  - **[Storage requirements](requirements.md#storage-requirements)**
+  - **[GitHub Enterprise version requirements](requirements.md#github-enterprise-version-requirements)**
+- **[Getting started](getting-started.md)**
+- **[Using the backup and restore commands](usage.md)**
+- **[Scheduling backups](scheduling-backups.md)**
+- **[Backup snapshot file structure](backup-snapshot-file-structure.md)**
+- **[How does Backup Utilities differ from a High Availability replica?](faq.md)**
+- **[Docker](docker.md)**
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+# Backup snapshot file structure
+
+Backup snapshots are stored in rotating increment directories named after the
+date and time the snapshot was taken. Each snapshot directory contains a full
+backup snapshot of all relevant data stores. Repository, Search, and Pages data
+is stored efficiently via hard links.
+
+*Please note* Symlinks must be maintained when archiving backup snapshots.
+Dereferencing or excluding symlinks, or storing the snapshot contents on a
+filesystem which does not support symlinks will result in operational
+problems when the data is restored.
+
+The following example shows a snapshot file hierarchy for hourly frequency.
+There are five snapshot directories, with the `current` symlink pointing to the
+most recent successful snapshot:
+
+./data
+   |- 20180124T010000
+   |- 20180125T010000
+   |- 20180126T010000
+   |- 20180127T010000
+   |- 20180128T010000
+      |- audit-log
+      |- benchmarks
+      |- elasticsearch
+      |- git-hooks
+      |- hookshot
+      |- pages
+      |- repositories
+      |- storage
+      |- authorized-keys.json
+      |- enterprise.ghl
+      |- es-scan-complete
+      |- manage-password
+      |- mysql.sql.gz
+      |- redis.rdb
+      |- settings.json
+      |- ssh-host-keys.tar
+      |- ssl-ca-certificates.tar
+      |- strategy
+      |- uuid
+      |- version
+   |- current -> 20180128T010000
+
+Note: the `GHE_DATA_DIR` variable set in `backup.config` can be used to change
+the disk location where snapshots are written.
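
Editorial sketch (not part of the commit): the note about keeping symlinks when archiving snapshots can be tried out on a throwaway copy of the layout. The script below is a minimal illustration under stated assumptions. The scratch directory, the two snapshot names, and the `snapshots.tar` archive name are hypothetical. It relies only on `tar` storing symlinks as symlinks by default, so the `-h` (dereference) option is deliberately not used.

```shell
#!/bin/sh
# Minimal sketch: build a tiny stand-in for the ./data layout, archive it
# with tar, and confirm the `current` symlink survives the round trip.
# All paths and names here are hypothetical, for illustration only.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Two stand-in snapshot directories, named like the datestamps in the example.
mkdir -p "$work/data/20180127T010000" "$work/data/20180128T010000"
echo "example" > "$work/data/20180127T010000/version"

# Share an unchanged file between snapshots via a hard link.
ln "$work/data/20180127T010000/version" "$work/data/20180128T010000/version"

# `current` is a relative symlink to the most recent snapshot.
ln -s 20180128T010000 "$work/data/current"

# Archive without -h so the symlink is stored as a symlink, not followed.
tar -cf "$work/snapshots.tar" -C "$work" data

# Extract elsewhere and inspect the result.
mkdir "$work/restore"
tar -xf "$work/snapshots.tar" -C "$work/restore"

current_target=$(readlink "$work/restore/data/current")
echo "current -> $current_target"
```

If the archive had been created with `-h`, `current` would come back as a full copy of the snapshot directory rather than a link, which is the situation the note warns against.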
