Commit c889c15

Merge pull request #1127 from NASA-IMPACT/1126-managepy-command-for-database-backups
1126 managepy command for database backups
2 parents 2b811b6 + a9e63bb commit c889c15

7 files changed: +826 -33 lines changed

README.md

Lines changed: 68 additions & 27 deletions
@@ -70,56 +70,97 @@ $ docker-compose -f local.yml run --rm django python manage.py createsuperuser
 
 Create additional users through the admin interface (/admin).
 
-### Loading Fixtures
+### Database Backup and Restore
 
-To load collections:
+COSMOS provides dedicated management commands for backing up and restoring your PostgreSQL database. These commands handle both compressed and uncompressed backups and automatically detect your server environment from your configuration.
 
-```bash
-$ docker-compose -f local.yml run --rm django python manage.py loaddata sde_collections/fixtures/collections.json
-```
+#### Creating a Database Backup
 
-### Manually Creating and Loading a ContentTypeless Backup
-Navigate to the server running prod, then to the project folder. Run the following command to create a backup:
+To create a backup of your database:
 
 ```bash
-docker-compose -f production.yml run --rm --user root django python manage.py dumpdata --natural-foreign --natural-primary --exclude=contenttypes --exclude=auth.Permission --indent 2 --output /app/backups/prod_backup-20241114.json
+# Create a compressed backup (recommended)
+docker-compose -f local.yml run --rm django python manage.py database_backup
+
+# Create an uncompressed backup
+docker-compose -f local.yml run --rm django python manage.py database_backup --no-compress
+
+# Specify custom output location
+docker-compose -f local.yml run --rm django python manage.py database_backup --output /path/to/output.sql
 ```
-This will have saved the backup in a folder outside of the docker container. Now you can copy it to your local machine.
+
+The backup command will automatically:
+- Detect your server environment (Production/Staging/Local)
+- Use database credentials from your environment settings
+- Generate a dated filename if no output path is specified
+- Compress the backup by default (can be disabled with --no-compress)
+
+#### Restoring from a Database Backup
+
+To restore your database from a backup:
 
 ```bash
-mv ~/prod_backup-20240812.json <project_path>/prod_backup-20240812.json
-scp sde:/home/ec2-user/sde_indexing_helper/backups/prod_backup-20240812.json prod_backup-20240812.json
+# Restore from a backup (handles both .sql and .sql.gz files)
+docker-compose -f local.yml run --rm django python manage.py database_restore path/to/backup.sql[.gz]
 ```
 
-Finally, load the backup into your local database:
+The restore command will:
+- Automatically detect if the backup is compressed (.gz)
+- Terminate existing database connections
+- Drop and recreate the database
+- Restore all data from the backup
+- Handle all database credentials from your environment settings
+
+#### Working with Remote Servers
+
+When working with production or staging servers:
 
+1. First, SSH into the appropriate server:
 ```bash
-docker-compose -f local.yml run --rm django python manage.py loaddata prod_backup-20240812.json
+# For production
+ssh user@production-server
+cd /path/to/project
+
+# For staging
+ssh user@staging-server
+cd /path/to/project
 ```
 
-### Loading the Database from an Arbitrary Backup
+2. Then run the backup command with the production configuration:
+```bash
+docker-compose -f production.yml run --rm django python manage.py database_backup
+```
 
-1. Build the project and run the necessary containers (as documented above).
-2. Clear out content types using the Django shell:
+3. Copy the backup to your local machine:
+```bash
+scp user@remote-server:/path/to/backup.sql.gz ./local-backup.sql.gz
+```
 
+4. Finally, restore locally:
 ```bash
-$ docker-compose -f local.yml run --rm django python manage.py shell
->>> from django.contrib.contenttypes.models import ContentType
->>> ContentType.objects.all().delete()
->>> exit()
+docker-compose -f local.yml run --rm django python manage.py database_restore local-backup.sql.gz
 ```
 
-3. Load your backup database:
+#### Alternative Methods
+
+While the database_backup and database_restore commands are the recommended approach, there are alternative methods available:
+
+##### Using JSON Fixtures (for smaller datasets)
+If you're working with a smaller dataset, you can use Django's built-in fixtures:
 
 ```bash
-$ docker cp /path/to/your/backup.json container_name:/path/inside/container/backup.json
-$ docker-compose -f local.yml run --rm django python manage.py loaddata /path/inside/the/container/backup.json
-$ docker-compose -f local.yml run --rm django python manage.py migrate
+# Create a backup excluding content types
+docker-compose -f production.yml run --rm --user root django python manage.py dumpdata \
+    --natural-foreign --natural-primary \
+    --exclude=contenttypes --exclude=auth.Permission \
+    --indent 2 \
+    --output /app/backups/prod_backup-$(date +%Y%m%d).json
+
+# Restore from a fixture
+docker-compose -f local.yml run --rm django python manage.py loaddata /path/to/backup.json
 ```
-### Restoring the Database from a SQL Dump
-If the JSON file is particularly large (>1.5GB), Docker might struggle with this method. In such cases, you can use SQL dump and restore commands as an alternative, as described [here](./SQLDumpRestoration.md).
-
 
+Note: For large databases (>1.5GB), the database_backup and database_restore commands are strongly recommended over JSON fixtures as they handle large datasets more efficiently.
 
 ## Additional Commands
 
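The README change above says the restore command detects compressed backups automatically, but the source of `database_restore` is not part of the files shown here. The following is only a minimal sketch of one way such detection could work, assuming a check on the `.gz` suffix with a fallback to the two gzip magic bytes. The helper names `is_gzip_backup` and `open_backup` are hypothetical and do not come from the commit.

```python
import gzip
from pathlib import Path

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip_backup(path: str) -> bool:
    """Hypothetical helper: True if the name ends in .gz or the content starts with the gzip magic bytes."""
    p = Path(path)
    if p.suffix == ".gz":
        return True
    with p.open("rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def open_backup(path: str):
    """Hypothetical helper: open a backup for binary reading, decompressing gzip files transparently."""
    if is_gzip_backup(path):
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Checking the magic bytes as well as the suffix means a compressed dump that was renamed during an `scp` copy would still be read correctly. Whether the actual command does this is not visible in this diff.
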
compose/local/django/Dockerfile

Lines changed: 7 additions & 0 deletions
@@ -38,13 +38,20 @@ WORKDIR ${APP_HOME}
 
 # Install required system dependencies
 RUN apt-get update && apt-get install --no-install-recommends -y \
+  wget \
+  gnupg \
   # psycopg2 dependencies
   libpq-dev \
   # Translations dependencies
   gettext \
   # pycurl dependencies
   libcurl4-openssl-dev \
   libssl-dev \
+  # PostgreSQL 15
+  && sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt bullseye-pgdg main" > /etc/apt/sources.list.d/pgdg.list' \
+  && wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
+  && apt-get update \
+  && apt-get install -y postgresql-15 postgresql-client-15 \
   # cleaning up unused files
   && apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
   && rm -rf /var/lib/apt/lists/*

compose/production/django/Dockerfile

Lines changed: 8 additions & 6 deletions
@@ -23,7 +23,6 @@ COPY ./requirements .
 RUN pip wheel --wheel-dir /usr/src/app/wheels \
   -r ${BUILD_ENVIRONMENT}.txt
 
-
 # Python 'run' stage
 FROM python AS python-run-stage
 
@@ -39,16 +38,22 @@ WORKDIR ${APP_HOME}
 RUN addgroup --system django \
     && adduser --system --ingroup django django
 
-
 # Install required system dependencies
 RUN apt-get update && apt-get install --no-install-recommends -y \
+  wget \
+  gnupg \
   # psycopg2 dependencies
   libpq-dev \
   # Translations dependencies
   gettext \
   # pycurl dependencies
   libcurl4-openssl-dev \
   libssl-dev \
+  # PostgreSQL 15
+  && sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt bullseye-pgdg main" > /etc/apt/sources.list.d/pgdg.list' \
+  && wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
+  && apt-get update \
+  && apt-get install -y postgresql-15 postgresql-client-15 \
   # cleaning up unused files
   && apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
   && rm -rf /var/lib/apt/lists/*
@@ -61,25 +66,22 @@ COPY --from=python-build-stage /usr/src/app/wheels /wheels/
 RUN pip install --no-cache-dir --no-index --find-links=/wheels/ /wheels/* \
   && rm -rf /wheels/
 
-
 COPY --chown=django:django ./compose/production/django/entrypoint /entrypoint
 RUN sed -i 's/\r$//g' /entrypoint
 RUN chmod +x /entrypoint
 
-
 COPY --chown=django:django ./compose/production/django/start /start
 RUN sed -i 's/\r$//g' /start
 RUN chmod +x /start
+
 COPY --chown=django:django ./compose/production/django/celery/worker/start /start-celeryworker
 RUN sed -i 's/\r$//g' /start-celeryworker
 RUN chmod +x /start-celeryworker
 
-
 COPY --chown=django:django ./compose/production/django/celery/beat/start /start-celerybeat
 RUN sed -i 's/\r$//g' /start-celerybeat
 RUN chmod +x /start-celerybeat
 
-
 COPY ./compose/production/django/celery/flower/start /start-flower
 RUN sed -i 's/\r$//g' /start-flower
 RUN chmod +x /start-flower

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
+"""
+Management command to backup PostgreSQL database.
+
+Usage:
+    docker-compose -f local.yml run --rm django python manage.py database_backup
+    docker-compose -f local.yml run --rm django python manage.py database_backup --no-compress
+    docker-compose -f local.yml run --rm django python manage.py database_backup --output /path/to/output.sql
+    docker-compose -f production.yml run --rm django python manage.py database_backup
+"""
+
+import enum
+import gzip
+import os
+import shutil
+import socket
+import subprocess
+from contextlib import contextmanager
+from datetime import datetime
+
+from django.conf import settings
+from django.core.management.base import BaseCommand
+
+
+class Server(enum.Enum):
+    PRODUCTION = "PRODUCTION"
+    STAGING = "STAGING"
+    UNKNOWN = "UNKNOWN"
+
+
+def detect_server() -> Server:
+    hostname = socket.gethostname().upper()
+    if "PRODUCTION" in hostname:
+        return Server.PRODUCTION
+    elif "STAGING" in hostname:
+        return Server.STAGING
+    return Server.UNKNOWN
+
+
+@contextmanager
+def temp_file_handler(filename: str):
+    """Context manager to handle temporary files, ensuring cleanup."""
+    try:
+        yield filename
+    finally:
+        if os.path.exists(filename):
+            os.remove(filename)
+
+
+class Command(BaseCommand):
+    help = "Creates a PostgreSQL backup using pg_dump"
+
+    def add_arguments(self, parser):
+        parser.add_argument(
+            "--no-compress",
+            action="store_true",
+            help="Disable backup file compression (enabled by default)",
+        )
+        parser.add_argument(
+            "--output",
+            type=str,
+            help="Output file path (default: auto-generated based on server name and date)",
+        )
+
+    def get_backup_filename(self, server: Server, compress: bool, custom_output: str = None) -> tuple[str, str]:
+        """Generate backup filename and actual dump path.
+
+        Args:
+            server: Server enum indicating the environment
+            compress: Whether the output should be compressed
+            custom_output: Optional custom output path
+
+        Returns:
+            tuple[str, str]: A tuple containing (final_filename, temp_filename)
+                - final_filename: The name of the final backup file (with .gz if compressed)
+                - temp_filename: The name of the temporary dump file (always without .gz)
+        """
+        if custom_output:
+            # Ensure the output directory exists
+            output_dir = os.path.dirname(custom_output)
+            if output_dir:
+                os.makedirs(output_dir, exist_ok=True)
+
+            if compress:
+                return custom_output + (".gz" if not custom_output.endswith(".gz") else ""), custom_output.removesuffix(
+                    ".gz"
+                )
+            return custom_output, custom_output
+        else:
+            date_str = datetime.now().strftime("%Y%m%d")
+            temp_filename = f"{server.value.lower()}_backup_{date_str}.sql"
+            final_filename = f"{temp_filename}.gz" if compress else temp_filename
+            return final_filename, temp_filename
+
+    def run_pg_dump(self, output_file: str, env: dict) -> None:
+        """Execute pg_dump with given parameters."""
+        db_settings = settings.DATABASES["default"]
+        cmd = [
+            "pg_dump",
+            "-h",
+            db_settings["HOST"],
+            "-U",
+            db_settings["USER"],
+            "-d",
+            db_settings["NAME"],
+            "--no-owner",
+            "--no-privileges",
+            "-f",
+            output_file,
+        ]
+        subprocess.run(cmd, env=env, check=True)
+
+    def compress_file(self, input_file: str, output_file: str) -> None:
+        """Compress input file to output file using gzip."""
+        with open(input_file, "rb") as f_in:
+            with gzip.open(output_file, "wb") as f_out:
+                shutil.copyfileobj(f_in, f_out)
+
+    def handle(self, *args, **options):
+        server = detect_server()
+        compress = not options["no_compress"]
+        backup_file, dump_file = self.get_backup_filename(server, compress, options.get("output"))
+
+        env = os.environ.copy()
+        env["PGPASSWORD"] = settings.DATABASES["default"]["PASSWORD"]
+
+        try:
+            if compress:
+                with temp_file_handler(dump_file):
+                    self.run_pg_dump(dump_file, env)
+                    self.compress_file(dump_file, backup_file)
+            else:
+                self.run_pg_dump(backup_file, env)
+
+            self.stdout.write(
+                self.style.SUCCESS(
+                    f"Successfully created {'compressed ' if compress else ''}backup for {server.value}: {backup_file}"
+                )
+            )
+        except subprocess.CalledProcessError as e:
+            self.stdout.write(self.style.ERROR(f"Backup failed on {server.value}: {str(e)}"))
+        except Exception as e:
+            self.stdout.write(self.style.ERROR(f"Error during backup process: {str(e)}"))
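
To make the naming rules in `detect_server()` and `get_backup_filename()` easier to follow, here is a standalone sketch that mirrors their logic without Django, directory creation, or the real clock. The function names `server_from_hostname` and `backup_filenames`, the `today` parameter, and the sample hostnames are illustrative assumptions, not part of the commit. It needs Python 3.9+ for `str.removesuffix`, as does the committed code.

```python
from datetime import date
from typing import Optional


def server_from_hostname(hostname: str) -> str:
    """Mirror detect_server(): substring match on the upper-cased hostname."""
    name = hostname.upper()
    if "PRODUCTION" in name:
        return "PRODUCTION"
    if "STAGING" in name:
        return "STAGING"
    return "UNKNOWN"


def backup_filenames(
    server: str,
    compress: bool,
    custom_output: Optional[str] = None,
    today: Optional[date] = None,  # injected for repeatable examples (assumption)
) -> tuple[str, str]:
    """Mirror get_backup_filename(): return (final_filename, temp_filename)."""
    if custom_output:
        if compress:
            final = custom_output if custom_output.endswith(".gz") else custom_output + ".gz"
            return final, custom_output.removesuffix(".gz")
        return custom_output, custom_output
    date_str = (today or date.today()).strftime("%Y%m%d")
    temp = f"{server.lower()}_backup_{date_str}.sql"
    return (f"{temp}.gz" if compress else temp), temp


if __name__ == "__main__":
    day = date(2024, 11, 14)
    # Hypothetical hostnames, used only for illustration.
    for host in ("my-laptop", "sde-staging-01"):
        server = server_from_hostname(host)
        print(host, "->", server, backup_filenames(server, True, today=day))
    print(backup_filenames("PRODUCTION", True, "/tmp/out.sql"))
    print(backup_filenames("PRODUCTION", False, "/tmp/out.sql.gz"))
```

Two consequences follow from the committed logic. A machine whose hostname contains neither `PRODUCTION` nor `STAGING` is reported as `UNKNOWN`, so its default backup is named `unknown_backup_<date>.sql.gz`. Combining `--no-compress` with an `--output` path that ends in `.gz` leaves the path unchanged, which produces an uncompressed SQL file with a `.gz` name.
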
