Fix: Resolve CI test errors for pull request apache#149
This commit addresses various CI test failures identified in pull request apache#149.
The following issues have been resolved:
- Markdown linting errors: Corrected capitalization, list formatting, and blank line issues in several documentation files.
- Sidebar configuration errors: Added missing files with appropriate frontmatter (sidebar_label, sidebar_position, DocCardList import/tag) to ensure correct sidebar generation.
- YAML linting: Manually verified YAML files for correct syntax, as automated linting was unavailable.
These changes aim to bring the documentation in line with project standards and pass CI checks.
docs/03-core-concepts/01-architecture.md
+18 -15 (18 additions & 15 deletions)
@@ -82,7 +82,7 @@ The OM is the entry point for all namespace operations. It tracks which objects
 ### Storage Container Manager (SCM)

-The Storage Container Manager orchestrates the container lifecycle and coordinates datanodes:
+The Storage Container Manager orchestrates the container lifecycle and coordinates Datanodes:

 - Manages container creation and allocation
 - Tracks datanode status and health
@@ -103,7 +103,7 @@ Datanodes are the workhorses that store the actual data:
 - Participate in replication pipelines
 - Handle data integrity checks

-Each datanode manages a set of containers and serves read/write requests from clients.
+Each Datanode manages a set of containers and serves read/write requests from clients.

 ### Recon
@@ -145,9 +145,9 @@ The Ozone client is the software component that enables applications to interact
 - Provides Java libraries for programmatic access
 - Handles communication with OM for namespace operations
-- Manages direct data transfer with datanodes
+- Manages direct data transfer with Datanodes
 - Implements client-side caching for improved performance
-- Offers pluggable interfaces for different protocols (S3, OFS)
+- Offers pluggable interfaces for different protocols (S3, ofs)
 - Handles authentication and token management

 The client library abstracts away the complexity of the distributed system, providing applications with a simple, consistent interface to Ozone storage.
@@ -178,7 +178,9 @@ For reads, the process is simpler:
 #### Monitoring and Management


+
 The Recon service continuously:
+
 - Collects metrics from the Ozone Manager, Storage Container Manager, and Datanodes
 - Provides consolidated views of system health and performance
 - Facilitates troubleshooting and management
@@ -193,7 +195,7 @@ Containers are the fundamental storage units in Ozone:
 - Fixed-size (typically 5GB) units of storage
 - Managed by the Storage Container Manager (SCM)
-- Replicated or erasure-coded across datanodes
+- Replicated or erasure-coded across Datanodes
 - Contain multiple blocks
 - Include metadata and chunk files
@@ -209,7 +211,7 @@ Blocks are logical units of data within containers:
 - Allocated by the Ozone Manager
 - Secured with block tokens

-When a client writes data, the OM allocates blocks from SCM, and the client writes data to these blocks through datanodes.
+When a client writes data, the OM allocates blocks from SCM, and the client writes data to these blocks through Datanodes.

 ### Chunks
@@ -230,34 +232,34 @@ Ozone provides durability through container replication:
 - Default replication factor is 3
 - Uses Ratis (Raft) consensus protocol
-- Synchronously replicates data across datanodes
+- Synchronously replicates data across Datanodes
 - Provides strong consistency guarantees
 - Handles node failures transparently

-Replicated containers ensure data durability by storing multiple copies of each container across different datanodes.
+Replicated containers ensure data durability by storing multiple copies of each container across different Datanodes.
-
+
 [...]
-- Splits data across multiple datanodes with parity
+- Splits data across multiple Datanodes with parity
 - Supports various coding schemes (e.g., RS-3-2-1024k)
 - Reduces storage overhead compared to replication
 - Trades some performance for storage efficiency
 - Suitable for cold data storage

 Erasure coding allows for data durability with less storage overhead than full replication.

-
+

 ### Pipelines

-Pipelines are groups of datanodes that work together to store data:
+Pipelines are groups of Datanodes that work together to store data:

 - Managed by SCM
-- Consist of multiple datanodes
+- Consist of multiple Datanodes
 - Handle write operations as a unit
 - Support different replication strategies
@@ -267,7 +269,7 @@ For detailed information, see [Write Pipelines](./02-data-replication/02-write-p
 Ratis pipelines use the Raft consensus protocol:

-- Typically three datanodes per pipeline
+- Typically three Datanodes per pipeline
 - One leader and multiple followers
 - Synchronous replication
 - Strong consistency guarantees
@@ -292,7 +294,7 @@ Ozone supports multiple access protocols, making it versatile for different appl
 - Full feature access
 - Highest performance

-### Ozone File System (OFS)
+### Ozone File System (ofs)

 - Hadoop-compatible filesystem interface
 - Works with all Hadoop ecosystem applications
@@ -310,6 +312,7 @@ Ozone supports multiple access protocols, making it versatile for different appl
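The architecture page touched above states that erasure coding reduces storage overhead compared to replication. A rough way to check that claim is the arithmetic below. This is only an illustrative sketch: the helper names are hypothetical (not part of Ozone), and it assumes the `RS-3-2-1024k` scheme mentioned in the diff denotes 3 data blocks plus 2 parity blocks, with `1024k` being the chunk size.

```python
from fractions import Fraction


def replication_overhead(factor: int) -> Fraction:
    """Raw bytes stored per logical byte under N-way replication."""
    return Fraction(factor)


def erasure_coding_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Raw bytes stored per logical byte for a (data, parity) Reed-Solomon scheme."""
    return Fraction(data_blocks + parity_blocks, data_blocks)


# Assumed reading of RS-3-2-1024k: 3 data blocks + 2 parity blocks.
rep = replication_overhead(3)
ec = erasure_coding_overhead(3, 2)

print(f"3x replication: {float(rep):.2f} bytes stored per logical byte")
print(f"RS-3-2:         {float(ec):.2f} bytes stored per logical byte")
```

Under these assumptions, replication stores 3 raw bytes per logical byte while RS-3-2 stores 5/3, which is the trade-off the "Reduces storage overhead" bullet refers to.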
docs/03-core-concepts/01-namespace/02-buckets/02-owners.md
+6 -6 (6 additions & 6 deletions)
@@ -8,14 +8,14 @@ Every bucket in Apache Ozone has an **owner**. The owner is typically the user w
 ## Significance of Ownership

-* **Default Permissions:** The owner of a bucket implicitly has full control (ALL permissions) over the bucket itself, including the ability to modify its properties (like quotas, versioning) and delete it.
-* **ACL Management:** The owner (or an Ozone administrator) is usually responsible for managing the Access Control Lists (ACLs) for the bucket, granting permissions to other users or groups.
-* **Quota Accountability:** While quotas are set on the bucket itself, ownership can be relevant for tracking resource usage back to a specific user or tenant, especially in multi-tenant environments where volumes might represent tenants and buckets represent applications or projects within that tenant.
+- **Default Permissions:** The owner of a bucket implicitly has full control (ALL permissions) over the bucket itself, including the ability to modify its properties (like quotas, versioning) and delete it.
+- **ACL Management:** The owner (or an Ozone administrator) is usually responsible for managing the Access Control Lists (ACLs) for the bucket, granting permissions to other users or groups.
+- **Quota Accountability:** While quotas are set on the bucket itself, ownership can be relevant for tracking resource usage back to a specific user or tenant, especially in multi-tenant environments where volumes might represent tenants and buckets represent applications or projects within that tenant.

 ## Determining the Owner

-* **Creation Time:** When a bucket is created, the user principal making the creation request (authenticated via Kerberos, token, etc.) is typically set as the owner.
-* **Inheritance (S3 Interface):** When creating buckets via the S3 gateway, the owner might be determined differently based on the S3 multi-tenancy configuration. In some setups, the owner might be mapped from the S3 access key or assumed role.
+- **Creation Time:** When a bucket is created, the user principal making the creation request (authenticated via Kerberos, token, etc.) is typically set as the owner.
+- **Inheritance (S3 Interface):** When creating buckets via the S3 Gateway, the owner might be determined differently based on the S3 multi-tenancy configuration. In some setups, the owner might be mapped from the S3 access key or assumed role.

 ## Viewing the Owner
@@ -31,4 +31,4 @@ The output will include an `owner` field displaying the user principal who owns
 Changing the owner of a bucket after creation is generally **not** a standard user operation in Ozone and typically requires administrative privileges or specific tooling, depending on the deployment and security configuration. Ownership is primarily established at creation time.

-Understanding bucket ownership is important for managing permissions and accountability within the Ozone namespace.
+Understanding bucket ownership is important for managing permissions and accountability within the Ozone namespace.
docs/03-core-concepts/01-namespace/02-buckets/03-quotas.md
+14 -14 (14 additions & 14 deletions)
@@ -10,16 +10,16 @@ Apache Ozone allows administrators or bucket owners to set **quotas** on buckets
 Ozone supports two types of quotas at the bucket level:

-1. **Namespace Quota:**
-    * Limits the total number of **keys** (objects, files, directories) that can exist within the bucket.
-    * Helps control the size of the Ozone Manager's metadata.
-    * Expressed as a simple count (e.g., 1,000,000 keys).
+- **Namespace Quota:**
+    - Limits the total number of **keys** (objects, files, directories) that can exist within the bucket.
+    - Helps control the size of the Ozone Manager's metadata.
+    - Expressed as a simple count (e.g., 1,000,000 keys).

-2. **Storage Space Quota:**
-    * Limits the total **disk space** consumed by the data blocks of all objects within the bucket.
-    * This limit applies to the logical size of the data *before* replication. For example, a 1 GB object will count as 1 GB towards the quota, regardless of whether it's replicated 3 times (consuming 3 GB physically) or uses erasure coding.
-    * Helps control the overall physical storage usage.
-    * Expressed in bytes, often using units like KB, MB, GB, TB, PB (e.g., `100GB`, `1TB`).
+- **Storage Space Quota:**
+    - Limits the total **disk space** consumed by the data blocks of all objects within the bucket.
+    - This limit applies to the logical size of the data *before* replication. For example, a 1 GB object will count as 1 GB towards the quota, regardless of whether it's replicated 3 times (consuming 3 GB physically) or uses erasure coding.
+    - Helps control the overall physical storage usage.
+    - Expressed in bytes, often using units like KB, MB, GB, TB, PB (e.g., `100GB`, `1TB`).
docs/03-core-concepts/01-namespace/02-buckets/04-layouts/01-object-store.md
+12 -12 (12 additions & 12 deletions)
@@ -8,21 +8,21 @@ The **Object Store (OBS)** layout is one of the modern bucket layouts in Apache
 ## Key Characteristics

-* **Flat Namespace:** Unlike the [File System Optimized (FSO)](./02-file-system-optimized.md) layout, OBS does not simulate a hierarchical directory structure internally. Keys are stored directly within the bucket. While delimiters like `/` can be used in key names (e.g., `images/archive/photo.jpg`) for organizational purposes (as commonly interpreted by S3 clients), Ozone treats the entire string as the unique object key. There are no separate directory entries maintained by Ozone itself.
-* **Strict S3 Compatibility:** OBS is the recommended layout for workloads demanding the highest fidelity with the Amazon S3 API and its object storage semantics. Cloud-native applications built using AWS SDKs, Boto3, or other S3 client libraries can interact with OBS buckets seamlessly.
-* **OFS/HCFS Incompatibility:** Buckets created with the OBS layout **cannot** be accessed using Hadoop Compatible File System (HCFS) protocols like `ofs://` or `o3fs://`. Attempting filesystem operations (like creating directories or listing files using Hadoop FS APIs) on an OBS bucket will fail.
-* **Use Cases:** Ideal for:
-    * **Cloud-Native Applications:** Applications built using S3 SDKs (AWS SDK, Boto3, etc.) expecting standard S3 behavior.
-    * **Unstructured Data:** Storing large amounts of unstructured or semi-structured data like images, videos, audio files, sensor data, logs, backups, and archives where hierarchical filesystem access is not the primary requirement.
-    * **S3 Compatibility:** Workloads requiring the highest fidelity with the S3 API and object storage semantics.
-    * **Data Exploration:** Enabling exploration of unstructured data using S3-compatible tools.
-    * It's the preferred choice when filesystem semantics (like atomic directory renames) are not required.
-* **Performance:** Optimized for typical object storage operations.
+- **Flat Namespace:** Unlike the [File System Optimized (FSO)](./02-file-system-optimized.md) layout, OBS does not simulate a hierarchical directory structure internally. Keys are stored directly within the bucket. While delimiters like `/` can be used in key names (e.g., `images/archive/photo.jpg`) for organizational purposes (as commonly interpreted by S3 clients), Ozone treats the entire string as the unique object key. There are no separate directory entries maintained by Ozone itself.
+- **Strict S3 Compatibility:** OBS is the recommended layout for workloads demanding the highest fidelity with the Amazon S3 API and its object storage semantics. Cloud-native applications built using AWS SDKs, Boto3, or other S3 client libraries can interact with OBS buckets seamlessly.
+- **ofs/HCFS Incompatibility:** Buckets created with the OBS layout **cannot** be accessed using Hadoop Compatible File System (HCFS) protocols like `ofs://` or `o3fs://`. Attempting filesystem operations (like creating directories or listing files using Hadoop FS APIs) on an OBS bucket will fail.
+- **Use Cases:** Ideal for:
+    - **Cloud-Native Applications:** Applications built using S3 SDKs (AWS SDK, Boto3, etc.) expecting standard S3 behavior.
+    - **Unstructured Data:** Storing large amounts of unstructured or semi-structured data like images, videos, audio files, sensor data, logs, backups, and archives where hierarchical filesystem access is not the primary requirement.
+    - **S3 Compatibility:** Workloads requiring the highest fidelity with the S3 API and object storage semantics.
+    - **Data Exploration:** Enabling exploration of unstructured data using S3-compatible tools.
+    - It's the preferred choice when filesystem semantics (like atomic directory renames) are not required.
+- **Performance:** Optimized for typical object storage operations.

 ## Multi-Protocol Access Considerations

-* **S3 Access:** This is the primary and intended access method for OBS buckets.
-* **OFS/o3fs Access:** Not supported.
+- **S3 Access:** This is the primary and intended access method for OBS buckets.
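The "Flat Namespace" bullet in this file says that OBS stores each key as one opaque string, and that any directory-like view comes from how S3 clients interpret a delimiter such as `/`. A minimal sketch of that client-side interpretation follows. The function name `list_with_delimiter` and the sample keys other than `images/archive/photo.jpg` are hypothetical; the code models generic S3-style prefix/delimiter listing over a flat key set and is not Ozone's implementation.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Group flat keys into objects and common prefixes, S3-listing style."""
    objects = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll everything up to the next delimiter into a "folder" entry.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# The bucket itself only holds whole-string keys; no directory entries exist.
keys = {"images/archive/photo.jpg", "images/logo.png", "readme.txt"}

print(list_with_delimiter(keys))
print(list_with_delimiter(keys, prefix="images/"))
```

The "folders" here exist only in the listing result, which is why filesystem operations such as atomic directory renames have nothing to act on in an OBS bucket.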