Commit 0b6ae86

Fix: Resolve CI test errors for pull request apache#149
This commit addresses various CI test failures identified in pull request apache#149. The following issues have been resolved:

- Markdown linting errors: Corrected capitalization, list formatting, and blank line issues in several documentation files.
- Sidebar configuration errors: Added missing files with appropriate frontmatter (sidebar_label, sidebar_position, DocCardList import/tag) to ensure correct sidebar generation.
- YAML linting: Manually verified YAML files for correct syntax, as automated linting was unavailable.

These changes aim to bring the documentation in line with project standards and pass CI checks.
1 parent 27ce82b commit 0b6ae86

File tree: 10 files changed, +128 -125 lines changed


docs/03-core-concepts/01-architecture.md

Lines changed: 18 additions & 15 deletions
@@ -82,7 +82,7 @@ The OM is the entry point for all namespace operations. It tracks which objects
 
 ### Storage Container Manager (SCM)
 
-The Storage Container Manager orchestrates the container lifecycle and coordinates datanodes:
+The Storage Container Manager orchestrates the container lifecycle and coordinates Datanodes:
 
 - Manages container creation and allocation
 - Tracks datanode status and health
@@ -103,7 +103,7 @@ Datanodes are the workhorses that store the actual data:
 - Participate in replication pipelines
 - Handle data integrity checks
 
-Each datanode manages a set of containers and serves read/write requests from clients.
+Each Datanode manages a set of containers and serves read/write requests from clients.
 
 ### Recon
 
@@ -145,9 +145,9 @@ The Ozone client is the software component that enables applications to interact
 
 - Provides Java libraries for programmatic access
 - Handles communication with OM for namespace operations
-- Manages direct data transfer with datanodes
+- Manages direct data transfer with Datanodes
 - Implements client-side caching for improved performance
-- Offers pluggable interfaces for different protocols (S3, OFS)
+- Offers pluggable interfaces for different protocols (S3, ofs)
 - Handles authentication and token management
 
 The client library abstracts away the complexity of the distributed system, providing applications with a simple, consistent interface to Ozone storage.
@@ -178,7 +178,9 @@ For reads, the process is simpler:
 #### Monitoring and Management
 
 ![Diagram showing Recon collecting data from OM, SCM, and Datanodes for monitoring.](../../static/img/ozone/recon.svg)
+
 The Recon service continuously:
+
 - Collects metrics from the Ozone Manager, Storage Container Manager, and Datanodes
 - Provides consolidated views of system health and performance
 - Facilitates troubleshooting and management
@@ -193,7 +195,7 @@ Containers are the fundamental storage units in Ozone:
 
 - Fixed-size (typically 5GB) units of storage
 - Managed by the Storage Container Manager (SCM)
-- Replicated or erasure-coded across datanodes
+- Replicated or erasure-coded across Datanodes
 - Contain multiple blocks
 - Include metadata and chunk files
 
@@ -209,7 +211,7 @@ Blocks are logical units of data within containers:
 - Allocated by the Ozone Manager
 - Secured with block tokens
 
-When a client writes data, the OM allocates blocks from SCM, and the client writes data to these blocks through datanodes.
+When a client writes data, the OM allocates blocks from SCM, and the client writes data to these blocks through Datanodes.
 
 ### Chunks
 
@@ -230,34 +232,34 @@ Ozone provides durability through container replication:
 
 - Default replication factor is 3
 - Uses Ratis (Raft) consensus protocol
-- Synchronously replicates data across datanodes
+- Synchronously replicates data across Datanodes
 - Provides strong consistency guarantees
 - Handles node failures transparently
 
-Replicated containers ensure data durability by storing multiple copies of each container across different datanodes.
+Replicated containers ensure data durability by storing multiple copies of each container across different Datanodes.
 
-![Diagram illustrating how data blocks and chunks are stored across datanodes in a replicated container setup.](../../static/img/ozone/ozone-storage-hierarchy-replicated.svg)
+![Diagram illustrating how data blocks and chunks are stored across Datanodes in a replicated container setup.](../../static/img/ozone/ozone-storage-hierarchy-replicated.svg)
 
 ### Erasure Encoded Containers
 
 Erasure coding provides space-efficient durability:
 
-- Splits data across multiple datanodes with parity
+- Splits data across multiple Datanodes with parity
 - Supports various coding schemes (e.g., RS-3-2-1024k)
 - Reduces storage overhead compared to replication
 - Trades some performance for storage efficiency
 - Suitable for cold data storage
 
 Erasure coding allows for data durability with less storage overhead than full replication.
 
-![Diagram illustrating how data and parity blocks are stored across datanodes in an erasure coded container setup.](../../static/img/ozone/ozone-storage-hierarchy-ec.svg)
+![Diagram illustrating how data and parity blocks are stored across Datanodes in an erasure coded container setup.](../../static/img/ozone/ozone-storage-hierarchy-ec.svg)
 
 ### Pipelines
 
-Pipelines are groups of datanodes that work together to store data:
+Pipelines are groups of Datanodes that work together to store data:
 
 - Managed by SCM
-- Consist of multiple datanodes
+- Consist of multiple Datanodes
 - Handle write operations as a unit
 - Support different replication strategies
 
@@ -267,7 +269,7 @@ For detailed information, see [Write Pipelines](./02-data-replication/02-write-p
 
 Ratis pipelines use the Raft consensus protocol:
 
-- Typically three datanodes per pipeline
+- Typically three Datanodes per pipeline
 - One leader and multiple followers
 - Synchronous replication
 - Strong consistency guarantees
@@ -292,7 +294,7 @@ Ozone supports multiple access protocols, making it versatile for different appl
 - Full feature access
 - Highest performance
 
-### Ozone File System (OFS)
+### Ozone File System (ofs)
 
 - Hadoop-compatible filesystem interface
 - Works with all Hadoop ecosystem applications
@@ -310,6 +312,7 @@ Ozone supports multiple access protocols, making it versatile for different appl
 - Enables web applications to access Ozone
 - Path format: `http://httpfs-host/webhdfs/v1/volume/bucket/key`
 
+
 The multi-protocol architecture allows for flexible integration with existing applications and workflows.
 
 ## Summary

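The architecture doc above contrasts 3x replication with erasure coding schemes such as RS-3-2-1024k. As a back-of-the-envelope aside (not Ozone code; only the 3-data/2-parity split is taken from the doc's example scheme), the storage-overhead difference works out as:

```shell
# RS-3-2 erasure coding: 3 data blocks + 2 parity blocks per stripe.
data_blocks=3
parity_blocks=2

# 3x replication stores three full copies: 300% of the logical size.
replication_pct=300

# EC stores (data + parity) / data of the logical size (integer percent).
ec_pct=$(( (data_blocks + parity_blocks) * 100 / data_blocks ))

echo "3x replication: ${replication_pct}% of logical size on disk"
echo "RS-3-2 erasure coding: ${ec_pct}% of logical size on disk"
```

This is the "less storage overhead than full replication" trade-off the doc describes: roughly 1.67x instead of 3x.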
docs/03-core-concepts/01-namespace/01-volumes/02-owners.md

Lines changed: 5 additions & 5 deletions
@@ -14,10 +14,10 @@ Ownership is a fundamental aspect of Ozone volumes that determines who has admin
 - In secure clusters, ownership is tied to authentication mechanisms like Kerberos
 - Volume ownership enables delegation of administrative responsibilities
 - Only the volume owner and Ozone administrators can perform certain operations like:
-- Creating buckets within the volume
-- Setting volume properties
-- Modifying volume quotas
-- Deleting the volume
+  - Creating buckets within the volume
+  - Setting volume properties
+  - Modifying volume quotas
+  - Deleting the volume
 
 ### Ozone Administrators
 
@@ -60,4 +60,4 @@ ozone sh volume list /
 
 # Change the owner of an existing volume (requires current owner or admin privileges)
 ozone sh volume update --user=new-owner /marketing
-```
\ No newline at end of file
+```

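The owners doc above states that only the volume owner and Ozone administrators may perform operations like updating the volume. A toy sketch of that rule (user names are hypothetical; this is not how Ozone's authorizer is implemented):

```shell
# Hypothetical principals for illustration only.
owner="hr-admin"
admins="ozone-admin scm-admin"

# Allow an operation if the caller is the owner or a listed administrator.
can_modify() {
  local user="$1"
  if [ "$user" = "$owner" ]; then
    echo "allowed"
    return
  fi
  case " $admins " in
    *" $user "*) echo "allowed" ;;
    *)           echo "denied"  ;;
  esac
}

can_modify "hr-admin"     # the volume owner
can_modify "ozone-admin"  # an administrator
can_modify "intern"       # neither
```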
docs/03-core-concepts/01-namespace/01-volumes/03-quotas.md

Lines changed: 1 addition & 2 deletions
@@ -4,7 +4,6 @@ sidebar_label: Quotas
 
 # Volume Quotas
 
-
 ## Quota Management
 
 Quotas provide a mechanism to control resource utilization at the volume level:
@@ -29,4 +28,4 @@ ozone sh volume clrquota --space /marketing
 
 # Clear the namespace quota (set object count limit to unlimited)
 ozone sh volume clrquota --count /marketing
-```
\ No newline at end of file
+```

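The `clrquota` commands above set a limit back to unlimited; the bucket-quota doc in this same commit notes that `-1` (or `none`) removes a limit. A small display sketch of that convention (illustrative only, not Ozone's own formatting logic):

```shell
# Render a quota value for display: negative means no limit is set.
show_quota() {
  if [ "$1" -lt 0 ]; then
    echo "unlimited"
  else
    echo "$1 bytes"
  fi
}

show_quota -1            # a cleared quota
show_quota 107374182400  # an explicit 100 GiB space quota
```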
docs/03-core-concepts/01-namespace/02-buckets/02-owners.md

Lines changed: 6 additions & 6 deletions
@@ -8,14 +8,14 @@ Every bucket in Apache Ozone has an **owner**. The owner is typically the user w
 
 ## Significance of Ownership
 
-* **Default Permissions:** The owner of a bucket implicitly has full control (ALL permissions) over the bucket itself, including the ability to modify its properties (like quotas, versioning) and delete it.
-* **ACL Management:** The owner (or an Ozone administrator) is usually responsible for managing the Access Control Lists (ACLs) for the bucket, granting permissions to other users or groups.
-* **Quota Accountability:** While quotas are set on the bucket itself, ownership can be relevant for tracking resource usage back to a specific user or tenant, especially in multi-tenant environments where volumes might represent tenants and buckets represent applications or projects within that tenant.
+- **Default Permissions:** The owner of a bucket implicitly has full control (ALL permissions) over the bucket itself, including the ability to modify its properties (like quotas, versioning) and delete it.
+- **ACL Management:** The owner (or an Ozone administrator) is usually responsible for managing the Access Control Lists (ACLs) for the bucket, granting permissions to other users or groups.
+- **Quota Accountability:** While quotas are set on the bucket itself, ownership can be relevant for tracking resource usage back to a specific user or tenant, especially in multi-tenant environments where volumes might represent tenants and buckets represent applications or projects within that tenant.
 
 ## Determining the Owner
 
-* **Creation Time:** When a bucket is created, the user principal making the creation request (authenticated via Kerberos, token, etc.) is typically set as the owner.
-* **Inheritance (S3 Interface):** When creating buckets via the S3 gateway, the owner might be determined differently based on the S3 multi-tenancy configuration. In some setups, the owner might be mapped from the S3 access key or assumed role.
+- **Creation Time:** When a bucket is created, the user principal making the creation request (authenticated via Kerberos, token, etc.) is typically set as the owner.
+- **Inheritance (S3 Interface):** When creating buckets via the S3 Gateway, the owner might be determined differently based on the S3 multi-tenancy configuration. In some setups, the owner might be mapped from the S3 access key or assumed role.
 
 ## Viewing the Owner
 
@@ -31,4 +31,4 @@ The output will include an `owner` field displaying the user principal who owns
 
 Changing the owner of a bucket after creation is generally **not** a standard user operation in Ozone and typically requires administrative privileges or specific tooling, depending on the deployment and security configuration. Ownership is primarily established at creation time.
 
-Understanding bucket ownership is important for managing permissions and accountability within the Ozone namespace.
\ No newline at end of file
+Understanding bucket ownership is important for managing permissions and accountability within the Ozone namespace.

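The doc above says the `ozone sh bucket info` output includes an `owner` field. The JSON below is a hypothetical, abbreviated stand-in for that output (real output has more fields), used only to sketch pulling the field out in a script without a jq dependency:

```shell
# Hypothetical, abbreviated bucket-info JSON for illustration.
info='{"volumeName":"vol1","name":"bucket1","owner":"alice"}'

# Extract the value of the "owner" field with sed.
owner=$(printf '%s' "$info" | sed -n 's/.*"owner" *: *"\([^"]*\)".*/\1/p')
echo "bucket owner: $owner"
```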
docs/03-core-concepts/01-namespace/02-buckets/03-quotas.md

Lines changed: 14 additions & 14 deletions
@@ -10,16 +10,16 @@ Apache Ozone allows administrators or bucket owners to set **quotas** on buckets
 
 Ozone supports two types of quotas at the bucket level:
 
-1. **Namespace Quota:**
-* Limits the total number of **keys** (objects, files, directories) that can exist within the bucket.
-* Helps control the size of the Ozone Manager's metadata.
-* Expressed as a simple count (e.g., 1,000,000 keys).
+- **Namespace Quota:**
+  - Limits the total number of **keys** (objects, files, directories) that can exist within the bucket.
+  - Helps control the size of the Ozone Manager's metadata.
+  - Expressed as a simple count (e.g., 1,000,000 keys).
 
-2. **Storage Space Quota:**
-* Limits the total **disk space** consumed by the data blocks of all objects within the bucket.
-* This limit applies to the logical size of the data *before* replication. For example, a 1 GB object will count as 1 GB towards the quota, regardless of whether it's replicated 3 times (consuming 3 GB physically) or uses erasure coding.
-* Helps control the overall physical storage usage.
-* Expressed in bytes, often using units like KB, MB, GB, TB, PB (e.g., `100GB`, `1TB`).
+- **Storage Space Quota:**
+  - Limits the total **disk space** consumed by the data blocks of all objects within the bucket.
+  - This limit applies to the logical size of the data *before* replication. For example, a 1 GB object will count as 1 GB towards the quota, regardless of whether it's replicated 3 times (consuming 3 GB physically) or uses erasure coding.
+  - Helps control the overall physical storage usage.
+  - Expressed in bytes, often using units like KB, MB, GB, TB, PB (e.g., `100GB`, `1TB`).
 
 ## Setting Quotas
 
@@ -46,13 +46,13 @@ ozone sh bucket setquota --quota 500GB /vol1/limited-bucket
 ozone sh bucket setquota --quota -1 --namespace-quota 5000000 /vol1/limited-bucket
 ```
 
-* Use `-1` or `none` to remove a specific quota limit.
+- Use `-1` or `none` to remove a specific quota limit.
 
 ## Enforcement
 
-* When a write operation (creating an object, uploading data) would cause a bucket to exceed its quota, the operation will fail.
-* Namespace quota is checked when creating new keys.
-* Space quota is checked based on the size of the data being written.
+- When a write operation (creating an object, uploading data) would cause a bucket to exceed its quota, the operation will fail.
+- Namespace quota is checked when creating new keys.
+- Space quota is checked based on the size of the data being written.
 
 ## Viewing Quotas
 
@@ -64,4 +64,4 @@ ozone sh bucket info /volumeName/bucketName
 
 The output will show fields like `quotaInBytes`, `quotaInNamespace`, `usedBytes`, and `usedNamespace`.
 
-Bucket quotas are a fundamental tool for managing storage resources effectively in Ozone.
\ No newline at end of file
+Bucket quotas are a fundamental tool for managing storage resources effectively in Ozone.

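The space-quota rule described above (charged on logical size, before replication) comes down to trivial arithmetic; this is a sketch of the accounting using the doc's own 1 GB / 3x example, not Ozone code:

```shell
object_gb=1           # logical size of the object written
replication_factor=3  # default replication factor described in these docs

# Space quota is charged on the logical size only...
charged_gb=$object_gb
# ...while the physical footprint is multiplied by the replication factor.
physical_gb=$(( object_gb * replication_factor ))

echo "charged against quota: ${charged_gb} GB"
echo "physical disk usage:   ${physical_gb} GB"
```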
docs/03-core-concepts/01-namespace/02-buckets/04-layouts/01-object-store.md

Lines changed: 12 additions & 12 deletions
@@ -8,21 +8,21 @@ The **Object Store (OBS)** layout is one of the modern bucket layouts in Apache
 
 ## Key Characteristics
 
-* **Flat Namespace:** Unlike the [File System Optimized (FSO)](./02-file-system-optimized.md) layout, OBS does not simulate a hierarchical directory structure internally. Keys are stored directly within the bucket. While delimiters like `/` can be used in key names (e.g., `images/archive/photo.jpg`) for organizational purposes (as commonly interpreted by S3 clients), Ozone treats the entire string as the unique object key. There are no separate directory entries maintained by Ozone itself.
-* **Strict S3 Compatibility:** OBS is the recommended layout for workloads demanding the highest fidelity with the Amazon S3 API and its object storage semantics. Cloud-native applications built using AWS SDKs, Boto3, or other S3 client libraries can interact with OBS buckets seamlessly.
-* **OFS/HCFS Incompatibility:** Buckets created with the OBS layout **cannot** be accessed using Hadoop Compatible File System (HCFS) protocols like `ofs://` or `o3fs://`. Attempting filesystem operations (like creating directories or listing files using Hadoop FS APIs) on an OBS bucket will fail.
-* **Use Cases:** Ideal for:
-* **Cloud-Native Applications:** Applications built using S3 SDKs (AWS SDK, Boto3, etc.) expecting standard S3 behavior.
-* **Unstructured Data:** Storing large amounts of unstructured or semi-structured data like images, videos, audio files, sensor data, logs, backups, and archives where hierarchical filesystem access is not the primary requirement.
-* **S3 Compatibility:** Workloads requiring the highest fidelity with the S3 API and object storage semantics.
-* **Data Exploration:** Enabling exploration of unstructured data using S3-compatible tools.
-* It's the preferred choice when filesystem semantics (like atomic directory renames) are not required.
-* **Performance:** Optimized for typical object storage operations.
+- **Flat Namespace:** Unlike the [File System Optimized (FSO)](./02-file-system-optimized.md) layout, OBS does not simulate a hierarchical directory structure internally. Keys are stored directly within the bucket. While delimiters like `/` can be used in key names (e.g., `images/archive/photo.jpg`) for organizational purposes (as commonly interpreted by S3 clients), Ozone treats the entire string as the unique object key. There are no separate directory entries maintained by Ozone itself.
+- **Strict S3 Compatibility:** OBS is the recommended layout for workloads demanding the highest fidelity with the Amazon S3 API and its object storage semantics. Cloud-native applications built using AWS SDKs, Boto3, or other S3 client libraries can interact with OBS buckets seamlessly.
+- **ofs/HCFS Incompatibility:** Buckets created with the OBS layout **cannot** be accessed using Hadoop Compatible File System (HCFS) protocols like `ofs://` or `o3fs://`. Attempting filesystem operations (like creating directories or listing files using Hadoop FS APIs) on an OBS bucket will fail.
+- **Use Cases:** Ideal for:
+  - **Cloud-Native Applications:** Applications built using S3 SDKs (AWS SDK, Boto3, etc.) expecting standard S3 behavior.
+  - **Unstructured Data:** Storing large amounts of unstructured or semi-structured data like images, videos, audio files, sensor data, logs, backups, and archives where hierarchical filesystem access is not the primary requirement.
+  - **S3 Compatibility:** Workloads requiring the highest fidelity with the S3 API and object storage semantics.
+  - **Data Exploration:** Enabling exploration of unstructured data using S3-compatible tools.
+  - It's the preferred choice when filesystem semantics (like atomic directory renames) are not required.
+- **Performance:** Optimized for typical object storage operations.
 
 ## Multi-Protocol Access Considerations
 
-* **S3 Access:** This is the primary and intended access method for OBS buckets.
-* **OFS/o3fs Access:** Not supported.
+- **S3 Access:** This is the primary and intended access method for OBS buckets.
+- **ofs/o3fs Access:** Not supported.
 
 ## Creating OBS Buckets
 
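The flat-namespace point above means an S3-style "directory listing" on an OBS bucket is just prefix matching over key strings, with no real directories involved. A minimal illustration with hypothetical key names:

```shell
# Keys in an OBS bucket are plain strings; '/' has no structural meaning.
keys='images/archive/photo.jpg
images/logo.png
docs/readme.md'

# An S3 "list objects with prefix images/" is effectively string matching:
printf '%s\n' "$keys" | grep '^images/'

match_count=$(printf '%s\n' "$keys" | grep -c '^images/')
echo "keys under prefix images/: $match_count"
```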