|
| 1 | +This guide explains how to configure data and metadata storage in OpenObserve. The information applies to both the open-source and enterprise versions. |
| 2 | + |
| 3 | +## Overview |
| 4 | +OpenObserve stores two primary kinds of data: |
| 5 | + |
| 6 | +- Ingested stream data |
| 7 | +- Metadata for ingested stream data |
| 8 | + |
| 9 | +By default: |
| 10 | + |
| 11 | +- Metadata is stored on disk using **SQLite** in **Local mode**. |
| 12 | +- Metadata is stored in **PostgreSQL** in **Cluster mode**. |
| 13 | +- Stream data can be stored on disk or in S3-compatible object storage such as Amazon S3, MinIO, Google GCS, Alibaba OSS, or Tencent COS. |
| 14 | + |
| 15 | +## Storage Modes |
| 16 | + |
| 17 | +- OpenObserve runs in **Local mode** by default. |
| 18 | +- To enable **Cluster mode**, set the environment variable `ZO_LOCAL_MODE=false`. |
| 19 | +- In **Local mode**, stream data can be stored in S3 by setting `ZO_LOCAL_MODE_STORAGE=s3`. |
| 20 | +- GCS and OSS support the S3 SDK and can be treated as S3-compatible storage. Azure Blob storage is also supported. |
| 21 | + |
| 22 | +## Data Storage Format |
| 23 | + |
| 24 | +Stream data is stored in Parquet format. Parquet is a columnar storage format optimized for storage efficiency and query performance. |
| 25 | + |
| 26 | +## Stream Data Storage Options |
| 27 | + |
| 28 | +### Disk |
| 29 | + |
| 30 | +Disk is the default storage location for stream data. **Ensure that sufficient disk space is available for storing stream data.** |
| 31 | + |
| 32 | + |
| 33 | +### Amazon S3 |
| 34 | + |
| 35 | +To use Amazon S3 for storing stream data: |
| 36 | + |
| 37 | +1. Create the bucket in S3 first. |
| 38 | +2. Provide AWS credentials through one of the supported AWS SDK mechanisms: |
| 39 | + |
| 40 | + - Set environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. This is not recommended due to security concerns. |
| 41 | + - Use AWS CLI credentials in `~/.aws/credentials`. |
| 42 | + - Use EC2 instance metadata for instances with IAM roles, or assign IAM roles directly to ECS or Fargate tasks. These roles are accessed through the Instance Metadata Service (IMDS or IMDSv2). ECS is not recommended for stateful workloads. |
| 43 | + - Use IAM Roles for Service Accounts (IRSA) in Amazon EKS. |
| 44 | + |
| 45 | + |
| 46 | +### MinIO |
| 47 | +To use MinIO for storing stream data, first create the bucket in MinIO. |
| 48 | +Then set the following environment variables: |
| 49 | + |
| 50 | +| Environment Variable | Value | Description | |
| 51 | +| -------------------- | ----- | ----------------------------------------------- | |
| 52 | +| ZO_S3_SERVER_URL | - | MinIO server address | |
| 53 | +| ZO_S3_REGION_NAME | - | Region name, such as `us-west-1` | |
| 54 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 55 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 56 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 57 | +| ZO_S3_PROVIDER | minio | Used to specify settings like `force_style=true` | |
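As a rough illustration of the table above, the following Python sketch builds the MinIO-related environment entries and reports any that are unset. The helper names (`minio_env`, `missing_minio_vars`) and all example values are hypothetical and are not part of OpenObserve; only the variable names come from the table.

```python
import os

# Variable names taken from the MinIO table above.
REQUIRED_MINIO_VARS = (
    "ZO_S3_SERVER_URL",
    "ZO_S3_REGION_NAME",
    "ZO_S3_ACCESS_KEY",
    "ZO_S3_SECRET_KEY",
    "ZO_S3_BUCKET_NAME",
    "ZO_S3_PROVIDER",
)


def minio_env(server_url, region, access_key, secret_key, bucket):
    """Build the environment entries for a MinIO-backed deployment (hypothetical helper)."""
    return {
        "ZO_S3_SERVER_URL": server_url,
        "ZO_S3_REGION_NAME": region,
        "ZO_S3_ACCESS_KEY": access_key,
        "ZO_S3_SECRET_KEY": secret_key,
        "ZO_S3_BUCKET_NAME": bucket,
        "ZO_S3_PROVIDER": "minio",  # fixed value from the table
    }


def missing_minio_vars(env):
    """Return the names of MinIO variables that are unset or empty in env."""
    return [name for name in REQUIRED_MINIO_VARS if not env.get(name)]


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own.
    example = minio_env(
        "http://minio.example.internal:9000",
        "us-west-1",
        "example-access-key",
        "example-secret-key",
        "openobserve-data",
    )
    print("Missing in example:", missing_minio_vars(example))
    # Check the variables of the current process before starting OpenObserve.
    print("Missing in current environment:", missing_minio_vars(os.environ))
```

A check like this can run in a deployment script before OpenObserve starts, so that a missing variable is reported early rather than surfacing as a storage error later.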
| 58 | + |
| 59 | + |
| 60 | +### OpenStack Swift |
| 61 | +To use OpenStack Swift for storing stream data, first create the bucket in Swift. |
| 62 | +Then set the following environment variables: |
| 63 | + |
| 64 | +| Environment Variable | Value | Description | |
| 65 | +| ------------------------- | ----- | ----------------------------------------------- | |
| 66 | +| ZO_S3_SERVER_URL | - | Swift server address, such as `https://us-west-1.example.com` | |
| 67 | +| ZO_S3_REGION_NAME | - | Region name, such as `us-west-1` | |
| 68 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 69 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 70 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 71 | +| ZO_S3_FEATURE_HTTP1_ONLY | true | Enables compatibility with Swift | |
| 72 | +| ZO_S3_PROVIDER | s3 | Enables S3-compatible API | |
| 73 | +| AWS_EC2_METADATA_DISABLED | true | Disables EC2 metadata access, which is not supported by Swift | |
| 74 | + |
| 75 | + |
| 76 | +### Google GCS |
| 77 | +To use GCS for storing stream data, first create the bucket in GCS. |
| 78 | + |
| 79 | +**Using the S3-compatible API:** |
| 80 | + |
| 81 | +| Environment Variable | Value | Description | |
| 82 | +| ------------------------ | -------| --------------------------------------------------------------- | |
| 83 | +| ZO_S3_SERVER_URL | - | GCS server address. Should be set to `https://storage.googleapis.com` | |
| 84 | +| ZO_S3_REGION_NAME | - | GCS region name, or set to `auto` | |
| 85 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 86 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 87 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 88 | +| ZO_S3_FEATURE_HTTP1_ONLY | true | Required for compatibility | |
| 89 | +| ZO_S3_PROVIDER | s3 | Enables S3-compatible API | |
| 90 | + |
| 91 | +Refer to [GCS AWS migration documentation](https://cloud.google.com/storage/docs/aws-simple-migration) for more information. |
| 92 | + |
| 93 | +**Using GCS directly:** |
| 94 | + |
| 95 | +| Environment Variable | Value | Description | |
| 96 | +| ------------------------ | -------| ----------------------------------------------------------------------- | |
| 97 | +| ZO_S3_SERVER_URL | - | GCS server address. Should be set to `https://storage.googleapis.com` | |
| 98 | +| ZO_S3_REGION_NAME | - | GCS region name, or set to `auto` | |
| 99 | +| ZO_S3_ACCESS_KEY | - | Path to the GCP JSON private key, if credentials are not available through instance metadata | |
| 100 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 101 | +| ZO_S3_PROVIDER | gcs | Use GCS API | |
| 102 | + |
| 103 | +OpenObserve uses the [object_store crate](https://docs.rs/object_store/0.10.1/object_store/gcp/struct.GoogleCloudStorageBuilder.html) to initialize the storage configuration. It calls the `with_env()` function by default. If the `ZO_S3_ACCESS_KEY` variable is set, OpenObserve additionally uses the `with_service_account_path()` function to load the GCP service account key. |
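The decision described above can be mirrored in a few lines of Python. This is only an illustration of the documented behaviour, not OpenObserve's actual Rust code: the function `gcs_builder_steps` is hypothetical, and it simply lists which builder calls would apply for a given environment.

```python
def gcs_builder_steps(env):
    """List the GoogleCloudStorageBuilder calls implied by the environment.

    Hypothetical helper: it models the behaviour described in the text,
    it does not call the object_store crate.
    """
    # with_env() is applied by default, per the description above.
    steps = [("with_env", None)]
    key_path = env.get("ZO_S3_ACCESS_KEY")
    if key_path:
        # With provider `gcs`, ZO_S3_ACCESS_KEY holds a path to the JSON key file.
        steps.append(("with_service_account_path", key_path))
    return steps


if __name__ == "__main__":
    # Credentials expected from instance metadata: only with_env applies.
    print(gcs_builder_steps({"ZO_S3_PROVIDER": "gcs"}))
    # Explicit key file (placeholder path): both calls apply.
    print(gcs_builder_steps({"ZO_S3_ACCESS_KEY": "/etc/openobserve/gcp-key.json"}))
```

In other words, leaving `ZO_S3_ACCESS_KEY` unset relies on whatever credentials the environment provides, while setting it points OpenObserve at a specific service account key file.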
| 104 | + |
| 105 | +### Alibaba OSS (Aliyun) |
| 106 | +To use Alibaba OSS for storing stream data, first create the bucket in Alibaba Cloud. |
| 107 | +Then set the following environment variables: |
| 108 | + |
| 109 | +| Environment Variable | Value | Description | |
| 110 | +| ------------------------------ | ----- | --------------------------------------------------------------- | |
| 111 | +| ZO_S3_SERVER_URL | - | OSS endpoint, such as `https://bucketname.oss-ap-southeast-1.aliyuncs.com` | |
| 112 | +| ZO_S3_REGION_NAME | - | OSS region name, such as `oss-cn-beijing`. | |
| 113 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 114 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 115 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 116 | +| ZO_S3_FEATURE_FORCE_HOSTED_STYLE | true | Enables hosted-style addressing | |
| 117 | + |
| 118 | +Refer to [Alibaba OSS region and endpoint documentation](https://help.aliyun.com/zh/oss/user-guide/regions-and-endpoints). |
| 119 | + |
| 120 | +### Tencent COS |
| 121 | +To use Tencent COS for storing stream data, first create the bucket in Tencent Cloud. |
| 122 | +Then set the following environment variables: |
| 123 | + |
| 124 | +| Environment Variable | Value | Description | |
| 125 | +| -------------------- | ----- | ---------------------------- | |
| 126 | +| ZO_S3_SERVER_URL | - | COS endpoint address | |
| 127 | +| ZO_S3_REGION_NAME | - | COS region name | |
| 128 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 129 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 130 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 131 | + |
| 132 | +Refer to [Tencent COS documentation](https://cloud.tencent.com/document/product/436/37421). |
| 133 | + |
| 134 | +### UCloud US3 |
| 135 | +To use UCloud US3 for storing stream data, first create the bucket in UCloud. |
| 136 | +Then set the following environment variables: |
| 137 | + |
| 138 | +| Environment Variable | Value | Description | |
| 139 | +| -------------------- | ----- | ---------------------------------------------------- | |
| 140 | +| ZO_S3_SERVER_URL | - | US3 endpoint, such as `http://internal.s3-sg.ufileos.com` | |
| 141 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 142 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 143 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 144 | +| ZO_S3_FEATURE_HTTP1_ONLY | true | Required for HTTP1 compatibility | |
| 145 | + |
| 146 | +Refer to [UCloud S3 documentation](https://docs.ucloud.cn/ufile/s3/s3_introduction). |
| 147 | + |
| 148 | +### Baidu BOS |
| 149 | +To use Baidu BOS for storing stream data, first create the bucket in Baidu Cloud. |
| 150 | +Then set the following environment variables: |
| 151 | + |
| 152 | +| Environment Variable | Value | Description | |
| 153 | +| -------------------- | ----- | ---------------------------------------------------- | |
| 154 | +| ZO_S3_SERVER_URL | - | BOS endpoint, such as `https://s3.bj.bcebos.com` | |
| 155 | +| ZO_S3_REGION_NAME | - | BOS region name, such as `bj` | |
| 156 | +| ZO_S3_ACCESS_KEY | - | Access key | |
| 157 | +| ZO_S3_SECRET_KEY | - | Secret key | |
| 158 | +| ZO_S3_BUCKET_NAME | - | Bucket name | |
| 159 | + |
| 160 | +Refer to [Baidu BOS documentation](https://cloud.baidu.com/doc/BOS/s/xjwvyq9l4). |
| 161 | + |
| 162 | +### Azure Blob |
| 163 | + |
| 164 | +OpenObserve can use Azure Blob storage for storing stream data. Set the following environment variables: |
| 165 | + |
| 166 | +| Environment Variable | Value | Description | |
| 167 | +| -------------------------- | -------------------- | -------------------------------------------- | |
| 168 | +| ZO_S3_PROVIDER | azure | Enables Azure Blob storage support | |
| 169 | +| ZO_LOCAL_MODE_STORAGE | s3 | Required only when running in single-node mode | |
| 170 | +| AZURE_STORAGE_ACCOUNT_NAME | Storage account name | Required | |
| 171 | +| AZURE_STORAGE_ACCOUNT_KEY | Access key | Required | |
| 172 | +| ZO_S3_BUCKET_NAME | Blob container name | Required | |
| 173 | + |
| 174 | + |
| 175 | +## Metadata Storage |
| 176 | + |
| 177 | +OpenObserve supports multiple metadata store backends, configurable using the `ZO_META_STORE` environment variable. |
| 178 | + |
| 179 | +### SQLite |
| 180 | +- Set `ZO_META_STORE=sqlite`. |
| 181 | +- No additional configuration is required. |
| 182 | +- Suitable for single-node installations. |
| 183 | +- Generally not recommended, because losing the SQLite data makes OpenObserve inoperable. |
| 184 | + |
| 185 | +### PostgreSQL |
| 186 | +- Set `ZO_META_STORE=postgres`. |
| 187 | +- Recommended for production deployments due to reliability and scalability. |
| 188 | +- Helm charts released after 23 February 2024 use [cloudnative-pg](https://cloudnative-pg.io/) by default to create a PostgreSQL cluster (primary and replica) that serves as the metadata store. This cluster provides high availability and backup support. |
| 189 | + |
| 190 | +### etcd (Deprecated) |
| 191 | +- Set `ZO_META_STORE=etcd`. |
| 192 | +- While etcd is used as the cluster coordinator, it was also the default metadata store in Helm charts released before 23 February 2024. This configuration is now deprecated. Helm charts released after 23 February 2024 use PostgreSQL as the default metadata store. |
| 193 | + |
| 194 | +### MySQL (Deprecated) |
| 195 | +- Set `ZO_META_STORE=mysql`. |
| 196 | +- Deprecated. |
| 197 | +- Use PostgreSQL instead. |
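The backend choices above can be summarised as a small lookup. The sketch below is a hypothetical pre-flight check, not part of OpenObserve: it accepts a `ZO_META_STORE` value, rejects anything not listed in this section, and flags the deprecated backends. Exact, lowercase matching is an assumption made for the example.

```python
import os

# Backends listed in this section, mapped to whether they are deprecated.
META_STORES = {
    "sqlite": False,
    "postgres": False,
    "etcd": True,
    "mysql": True,
}


def check_meta_store(value):
    """Return (name, is_deprecated) for a ZO_META_STORE value.

    Raises ValueError for values that are not listed in this section.
    """
    if value not in META_STORES:
        allowed = ", ".join(sorted(META_STORES))
        raise ValueError(f"unknown ZO_META_STORE {value!r}; expected one of: {allowed}")
    return value, META_STORES[value]


if __name__ == "__main__":
    # Falling back to "sqlite" here is only an assumption for the example.
    name, deprecated = check_meta_store(os.environ.get("ZO_META_STORE", "sqlite"))
    if deprecated:
        print(f"{name} is deprecated as a metadata store; consider postgres instead")
    else:
        print(f"metadata store: {name}")
```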