
Commit fd21340

operations: Add docs on compaction
1 parent 2f8ef59 commit fd21340

7 files changed: +29 -11 lines changed


docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ To develop using Skytable and maintain your deployment you will want to learn ab
 - [**Configuration**](system/configuration): Information to help you configure Skytable with custom settings such as custom ports, hosts, TLS, etc.
 - [**User management**](system/user-management): Information on access control, user and other administration features
 - [**Global management**](system/global-management): Global settings management
-- [**Data recovery**](system/recovery): Database recovery
+- [**Operations**](system/operations): Learn about administration operations
 - **Resources**:
 - [**Useful links**](resources/useful-links): Links to helpful resources
 - [**Migration**](resources/migration): For old or returning Skytable users who are coming from older versions

docs/system/1.configuration.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ To start the server with a configuration file, simply run `skyd --config <path t
 Here's an explanation of all the keys:
 - `system`:
 - `mode`: set to either `dev` / `prod` mode. `prod` mode will generally make some things stricter (such as background services)
-- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](recovery#understanding-data-loss)
+- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](operations#understanding-data-loss)
 - `auth`:
 - `plugin`: this is the authentication plugin. We currently only have `pwd`, which is a simple password-based authentication system where the password is stored as an [`rcrypt` hash](https://github.com/ohsayan/rcrypt) on disk. More `plugin` options are set to be implemented for more advanced authentication, especially in enterprise settings
 - `root_pass`: this is the root account password. **It must have at least 16 characters**

docs/system/3.global-management.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ The following query returns an `Empty` response or an error code depending on th
 SYSCTL REPORT STATUS
 ```

-If you receive an error code, we recommend you to connect to the host and check logs. If the server has crashed, you may need to [recover the database](recovery).
+If you receive an error code, we recommend you to connect to the host and check logs. If the server has crashed, you may need to [recover the database](operations#data-recovery).

 ## Inspecting all spaces

docs/system/index.md

Lines changed: 1 addition & 1 deletion
@@ -11,4 +11,4 @@ Here's an overview of the different administration guides:
 - [**Configuration**](configuration): Understand how Skytable can be configured using command-line arguments, environment variables or a configuration file and what all configuration options are available
 - [**User management**](user-management): Learn about account types, permissions and how you can manage multiple users
 - [**Global management**](global-management): Learn how to check system health and manage the global state of your database instances
-- [**Data recovery**](recovery): Understand what to do after a system crash and how to recover data if needed
+- [**Operations**](operations): Understand administrator operations tasks such as backups, recovery and more
Lines changed: 20 additions & 6 deletions
@@ -1,17 +1,30 @@
 ---
-id: recovery
-title: Recovery
+title: Operations
 ---

+## Managing disk usage
+
+Over time, as you continue to use your database, its files will grow in size, as you would expect. However, sometimes database files may grow beyond an efficient size, resulting in high memory usage or slowdowns. To counter this, Skytable uses internal heuristics to determine when a database file is "larger than needed" and automatically compacts such files at startup.
+
+However, in some cases you may wish to force a compaction to reduce the file size. To do this, run:
+
+```sh
+skyd compact
+```
+
+The server will then compact all files (even if a compaction wasn't triggered by internal heuristics) to their optimum size.
+
+## Data recovery
+
 In the unforeseen event that a power failure or other catastrophic system failure causes the database to crash, the Skytable server will fail to start normally. Usually it will exit with a nonzero code and an error message such as "journal-corrupted." In such cases, you will need to recover the journal(s) and/or any other corrupted file(s).

-## Understanding data loss
+### Understanding data loss

 All DDL and DCL queries are immediately written to disk when they're run and hence usually no data loss will occur due to a runtime crash (unless a crash occurs in the middle of a disk write). On the other hand, DML queries are written in optimized delayed-durability batches, i.e., when the engine determines that either there are too many pending changes or too much memory is being used (alongside other factors). This however means that in the case of a runtime crash with pending changes, some of these changes may be lost.

 This is why it is so important to tune the [`rs_window`] value or the "Reliability Service" window which ensures that irrespective of the number of changes, all changes will be flushed in that given duration. We're further working on supporting optimized immediate writes for DML queries (which however as expected would come with a significant performance penalty).

-## Recovering database files
+### Recovering database files

 To repair the database, simply run this on the command line **in the working directory of the database**:

@@ -20,12 +33,13 @@ skyd repair
 ```
 The recovery system will first create a full backup of the current data files in a subdirectory in the `backups/` directory. It will then go over each database file, try to detect any errors and make any appropriate repairs.

-## Important notes
+### Important notes

 - The recovery system is *very conservative* and will attempt to restore the database to the most recent working state. Any remaining data is deemed unreliable and not loaded
 - Please ensure that you have sufficient disk space before attempting a repair
 - The earlier in the file the corruption happens, the greater the amount of data lost

-## Post recovery
+### Post recovery

 After running a repair operation, if a significant amount of data loss has occurred (as reported by `skyd`) then we strongly recommend that you manually look through your datasets. The recovery process guarantees that the *restored data* is intact. If this failure resulted from power loss, in the future you may consider installing power backup systems if self-hosting or choosing a reliable cloud provider.
+
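The operations page in this diff asks for sufficient disk space before a repair and says that `skyd compact` shrinks database files. A minimal shell sketch of how an operator might check both is shown here. Only `skyd repair` and `skyd compact` come from the docs; the helper names are hypothetical, the "free space at least equal to current usage" rule is an assumed rough proxy for the full backup that repair creates, and running compaction from the database working directory is an assumption (the docs state that requirement only for repair).

```sh
#!/bin/sh
# Sketch only. `skyd repair` and `skyd compact` are the commands named in the
# docs; every helper name below is hypothetical.

# Print the size of a directory in kilobytes.
dir_size_kb() {
    du -sk "$1" | cut -f1
}

# Succeed if the filesystem holding <dir> has at least as much free space as
# <dir> currently uses (an assumed proxy for "room for one full backup copy").
has_room_for_backup() {
    used=$(dir_size_kb "$1")
    free=$(df -Pk "$1" | awk 'NR==2 { print $4 }')
    [ "$free" -ge "$used" ]
}

# Run a command from inside <dir> and report the directory size before and after.
# Usage: measure_around <dir> <command> [args...]
measure_around() {
    target=$1
    shift
    before=$(dir_size_kb "$target")
    (cd "$target" && "$@") || return 1
    after=$(dir_size_kb "$target")
    echo "before=${before}K after=${after}K"
}
```

With these helpers sourced, something like `has_room_for_backup /path/to/data && measure_around /path/to/data skyd repair` would refuse to start a repair on a nearly full disk, and `measure_around /path/to/data skyd compact` would show how much space a forced compaction reclaimed. The path is a placeholder for the database working directory.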
docusaurus.config.js

Lines changed: 4 additions & 0 deletions
@@ -161,6 +161,10 @@ module.exports = {
 {
   from: '/protocol/networking',
   to: '/protocol/specification'
+},
+{
+  from: '/system/recovery',
+  to: '/system/operations#data-recovery'
 }
 ]
 }]

sidebars.ts

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ module.exports = {
 "system/configuration",
 "system/user-management",
 "system/global-management",
-"system/recovery",
+"system/operations",
 ],
 link: {
 type: 'doc',
