Commit c47a178

checkpoint
1 parent da76cbb commit c47a178

28 files changed (+1890, -124 lines)

docs/03-core-concepts/01-architecture.md

Lines changed: 1 addition & 0 deletions
@@ -165,6 +165,7 @@ The typical write sequence follows these steps:
 1. **Namespace Operations**: The client contacts the Ozone Manager to create or locate the key in the namespace
 2. **Block Allocation**: The Ozone Manager requests blocks from the Storage Container Manager
 3. **Data Transfer**: The client directly writes data to the selected Datanodes according to the replication pipeline
+4. **Key Commit**: After successful data transfer, the client commits the key to the Ozone Manager

 #### Read Path Sequence
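The four-step write sequence in the hunk above can be illustrated with a small in-memory sketch. Every class and method name here (`Datanode`, `StorageContainerManager`, `OzoneManager`, `open_key`, `commit_key`, `write_key`) is a hypothetical stand-in and not the Ozone client API. The sketch also assumes one block per key, and that the client writes to each Datanode in turn, which simplifies how a real replication pipeline works.

```python
from dataclasses import dataclass, field


@dataclass
class Datanode:
    """Hypothetical stand-in for a Datanode: keeps block bytes in memory."""
    name: str
    blocks: dict = field(default_factory=dict)

    def write_block(self, block_id, data):
        self.blocks[block_id] = data


@dataclass
class StorageContainerManager:
    """Hypothetical stand-in for SCM: hands out block IDs and a pipeline."""
    datanodes: list
    next_block_id: int = 0

    def allocate_block(self):
        block_id = self.next_block_id
        self.next_block_id += 1
        return block_id, list(self.datanodes)


@dataclass
class OzoneManager:
    """Hypothetical stand-in for OM: tracks open and committed keys."""
    scm: StorageContainerManager
    open_keys: dict = field(default_factory=dict)
    committed_keys: dict = field(default_factory=dict)

    def open_key(self, key):
        # Step 1: create the key entry in the namespace.
        # Step 2: OM requests a block from SCM.
        block_id, pipeline = self.scm.allocate_block()
        self.open_keys[key] = block_id
        return block_id, pipeline

    def commit_key(self, key, size):
        # Step 4: record the key as committed.
        block_id = self.open_keys.pop(key)
        self.committed_keys[key] = (block_id, size)


def write_key(om, key, data):
    block_id, pipeline = om.open_key(key)   # steps 1 and 2
    for datanode in pipeline:               # step 3: data goes to the Datanodes
        datanode.write_block(block_id, data)
    om.commit_key(key, len(data))           # step 4


datanodes = [Datanode("dn1"), Datanode("dn2"), Datanode("dn3")]
om = OzoneManager(StorageContainerManager(datanodes))
write_key(om, "vol1/bucket1/key1", b"hello")
```

In this sketch the key only appears in `committed_keys` after step 4, which is the point of the newly documented commit step: data transfer alone does not finish the write.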

docs/03-core-concepts/01-namespace/01-volumes/01-overview.md

Lines changed: 24 additions & 1 deletion
@@ -14,6 +14,29 @@ Volumes are the top-level entity in the Apache Ozone namespace hierarchy. They s
 - **Resource Management**: Volumes support quota enforcement for storage space and object counts
 - **Command Line Access**: Volumes can be created and managed through the Ozone shell (`ozone sh`)

+## Volume Name Limitations
+
+When creating volumes in Ozone, the following limitations apply to volume names:
+
+- **Length**: Volume names must be between 3 and 63 characters long
+- **Allowed Characters**: Volume names can only contain:
+  - Lowercase letters (a-z)
+  - Numbers (0-9)
+  - Hyphens (-)
+  - Periods (.)
+  - Underscores (_), but only when S3-strict mode is disabled
+- **Formatting Rules**:
+  - Cannot start with a period or hyphen
+  - Cannot end with a period or hyphen
+  - Cannot contain two consecutive periods
+  - Cannot have a period following a hyphen
+  - Cannot have a hyphen following a period
+  - Cannot contain uppercase letters
+  - Cannot be formatted as an IPv4 address or consist only of digits
+  - Cannot contain other special characters
+
+These limitations ensure that volume names are compatible with DNS naming conventions and maintain consistency across the Ozone ecosystem.
+

 ## Use Cases for Volumes

@@ -31,5 +54,5 @@ Volumes can be accessed through multiple Ozone interfaces:

 - **Ozone Shell**: `ozone sh volume ...` commands
 - **Ozone File System (OFS)**: `ofs://om-service-id/volume/bucket/key`
-- **S3 Gateway**: Volume information is not directly exposed in the S3 protocol
+- **S3 Gateway**: Volume information is not directly exposed in the S3 protocol. A default volume `s3v` is automatically created for S3 operations.
 - **Programmatic Access**: Through the Java Client API
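The volume naming rules added in this file can be expressed as a small checker. This is a minimal sketch and not Ozone's own validation code: the function name `volume_name_problems` and the `s3_strict` parameter are hypothetical, and "IPv4 address format" is interpreted here as four dot-separated groups of digits, which is an assumption.

```python
import re


def volume_name_problems(name, s3_strict=True):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("length must be between 3 and 63 characters")
    # Underscore is only part of the allowed set when strict mode is off.
    allowed = r"a-z0-9.\-" + ("" if s3_strict else "_")
    if re.search(f"[^{allowed}]", name):
        problems.append("contains a character outside the allowed set")
    if name[:1] in {".", "-"} or name[-1:] in {".", "-"}:
        problems.append("must not start or end with a period or hyphen")
    if any(pair in name for pair in ("..", "-.", ".-")):
        problems.append("must not contain '..', '-.' or '.-'")
    # Assumption: "IPv4 format" means four dot-separated digit groups.
    if re.fullmatch(r"[0-9]+(\.[0-9]+){3}", name) or re.fullmatch(r"[0-9]+", name):
        problems.append("must not look like an IPv4 address or be all digits")
    return problems


for candidate in ("my-volume", "vol_1", "192.168.1.1", "a..b"):
    print(candidate, volume_name_problems(candidate))
```

Returning every violation, rather than stopping at the first, makes the helper useful for reporting all problems with a proposed name at once.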

docs/03-core-concepts/01-namespace/02-buckets/01-overview.md

Lines changed: 39 additions & 0 deletions
@@ -24,6 +24,45 @@ In Apache Ozone, a **Bucket** is the primary container for storing objects (keys

 If a Volume is like a top-level user directory or tenant space, a Bucket can be thought of as a specific project folder or data category within that space. Keys within the bucket are the actual files or objects belonging to that project or category.

+## Bucket Name Limitations
+
+Bucket names in Ozone must adhere to specific naming conventions:
+
+* Length: 3-63 characters
+* Must start and end with a lowercase letter or number
+* Can contain lowercase letters, numbers, hyphens (-), and periods (.)
+* Cannot start or end with a hyphen or period
+* Cannot have two consecutive periods
+* Cannot have a period adjacent to a hyphen
+* Cannot be an IPv4 address or all numeric
+
+By default, Ozone adheres strictly to S3 naming conventions, which is important for S3 API compatibility. In this "strict S3" mode:
+
+* Uppercase letters are not allowed
+* Underscores (_) are not allowed
+
+### Hadoop Compatibility Trade-offs
+
+When using Ozone with Hadoop-compatible file systems (HCFS/OFS), you may want to use underscores in bucket names, which are common in Hadoop environments but not allowed in S3.
+
+Ozone provides a configuration option to relax the S3 naming restrictions:
+
+```xml
+<property>
+  <name>ozone.om.namespace.s3.strict</name>
+  <value>false</value>
+</property>
+```
+
+When set to `false`, this configuration allows:
+* Underscores (_) in bucket names
+* More flexibility for working with Hadoop ecosystem tools
+
+**Trade-offs:**
+* Setting `ozone.om.namespace.s3.strict` to `false` enhances Hadoop compatibility
+* However, buckets with underscores will not be accessible through the S3 API
+* Choose based on your primary access pattern (S3 vs HCFS/OFS)
+
 ## Usage

 Buckets are created using Ozone tools (CLI, Java API) or compatible interfaces (S3 API, HCFS API).
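The trade-off described in this file can be turned into a short planning helper. This is a sketch under stated assumptions: `s3_strict_setting` and `render_property` are hypothetical names, and the only rule modelled is the one the text states (underscores need `ozone.om.namespace.s3.strict` set to `false`, and such buckets are not reachable through the S3 API).

```python
import xml.etree.ElementTree as ET

STRICT_KEY = "ozone.om.namespace.s3.strict"  # property name taken from the docs above


def s3_strict_setting(bucket_names):
    """Return (strict_value, buckets_not_reachable_via_s3) for planned names."""
    with_underscore = sorted(n for n in bucket_names if "_" in n)
    # Strict mode can stay enabled only if no planned name uses an underscore.
    return (not with_underscore), with_underscore


def render_property(name, value):
    """Render one Hadoop-style <property> element as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value).lower()
    return ET.tostring(prop, encoding="unicode")


strict, hadoop_only = s3_strict_setting(["logs", "raw_events", "etl_tmp"])
print(render_property(STRICT_KEY, strict))
print("not reachable via S3:", hadoop_only)
```

Listing the affected buckets alongside the setting keeps the cost of relaxing strict mode visible before the configuration is changed.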

docs/03-core-concepts/01-namespace/02-buckets/04-layouts/01-object-store.md

Lines changed: 1 addition & 4 deletions
@@ -32,6 +32,7 @@ You can specify the OBS layout during bucket creation using the Ozone CLI:

 ```bash
 ozone sh bucket create --layout OBJECT_STORE /volumeName/bucketName
+ozone sh bucket create --layout obs /volumeName/bucketName
 ```

 Alternatively, you can set OBS as the default layout for all new buckets in the Ozone Manager configuration (`ozone-site.xml`):
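The two commands in the hunk above differ only in how the layout is spelled. A script that builds these commands might normalize the short form first. The sketch below is illustrative: `canonical_layout` and `bucket_create_command` are hypothetical helpers, the `fso` pairing comes from the FSO page changed in this same commit, and exact-match lookup is an assumption (the source does not say whether the CLI is case-insensitive).

```python
# Short alias -> long layout name, as spelled in the commands in this commit.
LAYOUT_ALIASES = {
    "obs": "OBJECT_STORE",
    "fso": "FILE_SYSTEM_OPTIMIZED",
}


def canonical_layout(value):
    """Map a short alias to its long name; pass long names through unchanged."""
    if value in LAYOUT_ALIASES:
        return LAYOUT_ALIASES[value]
    if value in LAYOUT_ALIASES.values():
        return value
    raise ValueError(f"unknown bucket layout: {value!r}")


def bucket_create_command(layout, volume, bucket):
    """Build the argument list for `ozone sh bucket create`."""
    return ["ozone", "sh", "bucket", "create",
            "--layout", canonical_layout(layout), f"/{volume}/{bucket}"]


print(" ".join(bucket_create_command("obs", "volumeName", "bucketName")))
```

Building an argument list instead of a single string keeps the command safe to hand to `subprocess.run` without shell quoting.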
@@ -47,9 +48,5 @@ Alternatively, you can set OBS as the default layout for all new buckets in the
 </property>
 ```

-## Choosing OBS vs. FSO
-
-* Choose **OBS** if your primary need is **S3 compatibility** and a flat object namespace.
-* Choose **FSO** if your primary need is **HCFS/OFS compatibility**, filesystem semantics (atomic renames/deletes via OFS), and integration with traditional Hadoop ecosystem tools (Spark, Hive, Impala).

 See the [FSO documentation](../file-system-optimized) for more details on the alternative layout.

docs/03-core-concepts/01-namespace/02-buckets/04-layouts/02-file-system-optimized.md

Lines changed: 2 additions & 17 deletions
@@ -49,7 +49,8 @@ FSO is one of the recommended layouts for new deployments, alongside OBS.
 You can specify the FSO layout during bucket creation using the Ozone CLI:

 ```bash
-ozone sh bucket create --layout FILE_SYSTEM_OPTIMIZED /volumeName/bucketName
+ozone sh bucket create --layout fso /volumeName/bucketName
+ozone sh bucket create --layout <FILE_SYSTEM_OPTIMIZED|fso> /volumeName/bucketName
 ```

 Alternatively, you can set FSO as the default layout for all new buckets in the Ozone Manager configuration (`ozone-site.xml`):
@@ -65,20 +66,4 @@ Alternatively, you can set FSO as the default layout for all new buckets in the
 </property>
 ```

-## Choosing FSO vs. OBS
-
-* **Choose FSO (File System Optimized) when:**
-    * Your primary access pattern involves **HCFS/OFS compatible tools** like Spark, Hive, Impala, Presto, Flink, etc.
-    * You require **fast, atomic directory operations** (renames, deletes) for tasks like `INSERT OVERWRITE`, job commit phases, or large-scale directory restructuring.
-    * You need **strong consistency** for directory modifications, especially in concurrent environments.
-    * You are migrating workloads from HDFS and want similar directory semantics and performance characteristics.
-    * Workloads involve frequent metadata operations or operations on large directories (thousands/millions of files).
-
-* **Choose OBS (Object Store) when:**
-    * Your primary access pattern is through the **S3 API** using S3-native tools, SDKs, or applications.
-    * You prioritize **strict S3 compatibility** and behavior (e.g., for cloud-native applications).
-    * Your workload involves mostly **object-level GET/PUT/DELETE** operations rather than frequent directory manipulations.
-    * You are storing large amounts of unstructured data like media files, backups, or logs where filesystem hierarchy is less important than S3 API access.
-    * You need features specific to object storage paradigms, such as object versioning or complex bucket policies managed via S3 APIs.
-
 See the [OBS documentation](../object-store) for more details on the alternative layout.

docs/03-core-concepts/01-namespace/02-buckets/04-layouts/README.mdx

Lines changed: 21 additions & 19 deletions
@@ -6,25 +6,27 @@ Ozone's flexibility in bucket layouts allows a single cluster to serve as both a

 ## Available Layouts

-Ozone provides the following bucket layouts:
-
-1. **[Object Store (OBS)](object-store):**
-    * **Namespace:** Flat (like S3).
-    * **Compatibility:** Optimized for strict S3 compatibility.
-    * **Use Case:** S3-native applications, cloud-native workloads, unstructured data storage (media, backups).
-    * **HCFS Access (`ofs://`):** Not supported.
-
-2. **[File System Optimized (FSO)](file-system-optimized):**
-    * **Namespace:** Hierarchical (like HDFS).
-    * **Compatibility:** Optimized for HCFS compatibility (`ofs://`, `o3fs://`). Supports atomic directory operations via HCFS.
-    * **Use Case:** HDFS replacement, traditional analytics (Spark, Hive, Impala), workloads requiring filesystem semantics.
-    * **S3 Access:** Supported, but directory operations are not atomic via S3.
-
-3. **Legacy:**
-    * This was the original layout before OBS and FSO were introduced.
-    * It provides a flat namespace.
-    * Its behavior, especially regarding S3 access and filesystem path interpretation, can depend on the `ozone.om.enable.filesystem.paths` configuration setting.
-    * **Recommendation:** For all new buckets and use cases, prefer either **OBS** or **FSO** over the Legacy layout.
+Ozone provides three bucket layouts, each optimized for different use cases and access patterns:
+
+| Feature | **[Object Store (OBS)](object-store)** | **[File System Optimized (FSO)](file-system-optimized)** | **Legacy** |
+|-------------------------------------|--------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|
+| **Namespace Structure** | Flat (like S3) | Hierarchical (like HDFS) | Flat |
+| **Primary Compatibility** | S3 API | HCFS API (`ofs://`, `o3fs://`) | Mixed |
+| **S3 Access** | ✅ Native and fully optimized | ✅ Supported, but with limitations | ✅ Depends on config |
+| **OFS/HCFS Access** | ❌ Not supported | ✅ Native and fully optimized | ✅ Limited |
+| **Atomic Directory Operations** | ❌ Not applicable (flat namespace) | ✅ Via HCFS only (not via S3) | ❌ Limited |
+| **Creation Command** | `ozone sh bucket create --layout obs /volume/bucket` | `ozone sh bucket create --layout fso /volume/bucket` | `ozone sh bucket create --layout legacy /volume/bucket` |
+| **Best For** | • Cloud-native applications<br/>• S3-compatible workloads<br/>• Unstructured data storage<br/>• Media, backups, archives | • HDFS replacement<br/>• Analytics (Spark, Hive, Impala)<br/>• Directory-centric operations<br/>• File system semantics | • Legacy applications<br/>• Backwards compatibility |
+| **Performance Characteristics** | • Optimized for object operations<br/> | • Fast directory operations via HCFS<br/>• Atomic renames via HCFS<br/>• Enhanced metadata operations | Deprecated |
+| **Recommended For New Deployments** | ✅ Yes, for S3-centric workloads | ✅ Yes, for HDFS-centric workloads | ❌ Deprecated, prefer OBS or FSO |
+
+### Key Differences
+
+- **OBS (Object Store)** provides a true S3-compatible object store experience with a flat namespace. It cannot be accessed through Hadoop filesystem interfaces.
+
+- **FSO (File System Optimized)** offers filesystem-like semantics with directories and atomic operations when accessed via `ofs://` or `o3fs://`. While it supports S3 access, directory operations through S3 are not atomic and may be significantly slower.
+
+- **Legacy** was the original bucket layout before OBS and FSO were introduced. Its behavior depends on configuration settings. For all new deployments, we recommend using either OBS or FSO.

 ## Choosing the Right Layout
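The comparison table added in this hunk can be encoded as data and queried. The sketch below is one possible reading of the "S3 Access", "OFS/HCFS Access" and "Recommended For New Deployments" rows. The dictionary layout, the capability labels, and `candidate_layouts` are all hypothetical and not part of Ozone.

```python
# Capabilities transcribed from the table rows; None means "not supported".
LAYOUTS = {
    "obs": {"s3": "native", "hcfs": None, "recommended": True},
    "fso": {"s3": "limited", "hcfs": "native", "recommended": True},
    "legacy": {"s3": "config-dependent", "hcfs": "limited", "recommended": False},
}


def candidate_layouts(needs_s3, needs_hcfs):
    """Return recommended layouts that support the required interfaces,
    ordered so layouts with native support for those interfaces come first."""
    required = [iface for iface, needed in (("s3", needs_s3), ("hcfs", needs_hcfs)) if needed]
    candidates = [
        name for name, caps in LAYOUTS.items()
        if caps["recommended"] and all(caps[iface] is not None for iface in required)
    ]

    def native_count(name):
        return sum(LAYOUTS[name][iface] == "native" for iface in required)

    return sorted(candidates, key=native_count, reverse=True)


print(candidate_layouts(needs_s3=True, needs_hcfs=False))
print(candidate_layouts(needs_s3=True, needs_hcfs=True))
```

Under this reading, a workload that needs both interfaces is left with FSO only, because the table marks OFS/HCFS access as unsupported for OBS.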

docs/03-core-concepts/01-namespace/README.mdx

Lines changed: 4 additions & 19 deletions
@@ -11,24 +11,9 @@ Ozone organizes data within a hierarchical namespace. Understanding this structu

 <div className="container" style={{marginTop: '2rem'}}>
   <div className="row">
-    {/* Overview Card */}
-    <div className="col col--4" style={{textAlign: 'center', padding: '1rem'}}>
-      <Link to="./namespace/overview" style={{textDecoration: 'none', color: 'inherit'}}>
-        <img
-          src={useBaseUrl('/img/namespace/overview.svg')}
-          alt="Namespace Overview"
-          width="80"
-          height="80"
-          style={{marginBottom: '1rem'}}
-        />
-        <h3>Overview</h3>
-      </Link>
-      <p>General concepts and the overall structure of the Ozone namespace.</p>
-    </div>
-
     {/* Volumes Card */}
-    <div className="col col--4" style={{textAlign: 'center', padding: '1rem'}}>
-      <Link to="./namespace/volumes/" style={{textDecoration: 'none', color: 'inherit'}}>
+    <div className="col col--6" style={{textAlign: 'center', padding: '1rem'}}>
+      <Link to="./volumes/" style={{textDecoration: 'none', color: 'inherit'}}>
         <img
           src={useBaseUrl('/img/namespace/hierarchy.svg')}
           alt="Ozone Volumes"
@@ -42,8 +27,8 @@ Ozone organizes data within a hierarchical namespace. Understanding this structu
     </div>

     {/* Buckets Card */}
-    <div className="col col--4" style={{textAlign: 'center', padding: '1rem'}}>
-      <Link to="./namespace/buckets/" style={{textDecoration: 'none', color: 'inherit'}}>
+    <div className="col col--6" style={{textAlign: 'center', padding: '1rem'}}>
+      <Link to="./buckets/" style={{textDecoration: 'none', color: 'inherit'}}>
         <img
           src={useBaseUrl('/img/namespace/bucket.svg')}
           alt="Ozone Buckets"
