feat(io): add OSS storage implementation #1153

Merged
Xuanwo merged 1 commit into apache:main from divinerapier:main
Apr 10, 2025

@divinerapier
Contributor

@divinerapier divinerapier commented Apr 2, 2025

What changes are included in this PR?

  • Support the Aliyun OSS backend via OpenDAL
  • Update the docs to explain why the storage-oss feature is not included in storage-all.
  • Fix typo

Are these changes tested?

The support for AliyunOSS is based on OpenDAL, a production-verified project. Therefore, no additional testing has been added.

Add a new example.

Contributor

@liurenjie1024 liurenjie1024 left a comment

Thanks @divinerapier for this PR. But it seems that supporting the oss scheme in s3 would be enough? Also, there are no tests for it.

Oss {
/// oss storage could have `oss://`.
/// Storing the scheme string here to return the correct path.
scheme_str: String,
Contributor

Do we really need this? We need it for s3 to be compatible with both s3 and s3a. Is this also the case for oss?

Contributor Author

Because data written to Iceberg via Flink/Spark + OSS shows paths starting with oss:// in the REST catalog, which cannot be parsed or read using the S3 protocol.

Contributor

I mean, why do we need to keep the scheme_str field for oss?

Contributor Author

You're right, I will fix it.

Contributor Author

    #[cfg(feature = "storage-oss")]
    Oss {
        /// uses the same client for one FileIO Storage.
        ///
        /// TODO: allow users to configure this client.
        client: reqwest::Client,
        config: Arc<OssConfig>,
    },

Fixed, PTAL @liurenjie1024
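
For readers following this thread, here is a minimal sketch of the distinction being discussed. It assumes a simplified, hypothetical `Storage` enum and helper names (`from_scheme`, `scheme`, `relative_path`) that are illustrative only and are not the actual iceberg-rust types. The idea is that an S3 variant has to remember whether it was built from `s3` or `s3a` in order to strip the matching prefix later, while an OSS variant can use the fixed `oss` scheme and so needs no `scheme_str` field.

```rust
/// Illustrative stand-in for a storage enum; NOT the actual iceberg-rust type.
#[derive(Debug, PartialEq)]
enum Storage {
    /// S3 paths may arrive as `s3://` or `s3a://`, so the scheme seen at
    /// construction time is kept to rebuild the matching prefix later.
    S3 { scheme_str: String },
    /// OSS paths only use `oss://`, so there is nothing to remember.
    Oss,
}

impl Storage {
    /// Pick a variant from a URL scheme (hypothetical helper).
    fn from_scheme(scheme: &str) -> Option<Self> {
        match scheme {
            "s3" | "s3a" => Some(Storage::S3 {
                scheme_str: scheme.to_string(),
            }),
            "oss" => Some(Storage::Oss),
            _ => None,
        }
    }

    /// The scheme this variant expects at the start of a path.
    fn scheme(&self) -> &str {
        match self {
            Storage::S3 { scheme_str } => scheme_str.as_str(),
            Storage::Oss => "oss",
        }
    }

    /// Strip `<scheme>://<bucket>/` and return the object key,
    /// or `None` if the path does not start with that prefix.
    fn relative_path<'a>(&self, bucket: &str, path: &'a str) -> Option<&'a str> {
        let prefix = format!("{}://{}/", self.scheme(), bucket);
        path.strip_prefix(prefix.as_str())
    }
}

fn main() {
    let s3a = Storage::from_scheme("s3a").expect("known scheme");
    let oss = Storage::from_scheme("oss").expect("known scheme");

    // The S3 variant matches only the scheme it was created with.
    println!("{:?}", s3a.relative_path("bkt", "s3a://bkt/db/t/data.parquet"));
    // The OSS variant always matches `oss://`.
    println!("{:?}", oss.relative_path("bkt", "oss://bkt/db/t/data.parquet"));
    // An `oss://` path is not recognised by the S3 variant.
    println!("{:?}", s3a.relative_path("bkt", "oss://bkt/db/t/data.parquet"));
}
```

In this sketch the bucket name is passed in explicitly to keep the example self-contained; a real implementation would presumably take it from the configured operator.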

@divinerapier
Contributor Author

@Xuanwo PTAL 😆

I am in the process of implementing support for Aliyun OSS storage backend using OpenDAL, can you provide me with some advice on how to test this?

Thanks.

@Xuanwo
Member

Xuanwo commented Apr 7, 2025

I am in the process of implementing support for Aliyun OSS storage backend using OpenDAL, can you provide me with some advice on how to test this?

There are no good OSS-compatible services like MinIO for S3 that suit our testing needs. OpenDAL tests OSS using a user-sponsored, dedicated testing bucket. Perhaps we should consider taking a similar approach.

@divinerapier
Contributor Author

There are no good OSS-compatible services like MinIO for S3 that suit our testing needs. OpenDAL tests OSS using a user-sponsored, dedicated testing bucket. Perhaps we should consider taking a similar approach.

Thank you, @Xuanwo, for raising this critical issue about OSS-compatible testing infrastructure. I fully agree that a real OSS service would be ideal for ensuring authentic testing behavior.

I previously contacted Aliyun (OSS) pre-sales support to inquire about dedicated testing environments, but unfortunately, they currently do not offer such programs for open-source projects. Given OpenDAL's successful model of using community-sponsored resources, I want to explore two potential paths:

  • Community Connections: Does anyone in our project community work at Aliyun (for OSS) or have access to internal testing buckets that could be securely shared?

  • Sponsored Testing Bucket: If official support remains unavailable, would the Iceberg team consider co-maintaining a dedicated testing bucket?

cc @liurenjie1024

@Xuanwo
Member

Xuanwo commented Apr 8, 2025

I don't think it's a blocker.

@liurenjie1024, do you think it's a good idea to implement it first but leave it out of storage-all, so users who want to try it out have to manually enable storage-oss?

@divinerapier divinerapier reopened this Apr 8, 2025
@divinerapier
Contributor Author

crates/examples/src/oss_backend.rs provides an example of loading data from Aliyun OSS.

Run it with:

cargo run --package iceberg-examples --example oss-backend --features storage-oss

@liurenjie1024
Contributor

I don't think it's a blocker.

@liurenjie1024, do you think it's a good idea to implement it first but leave it out of storage-all, so users who want to try it out have to manually enable storage-oss?

Sounds reasonable to me, but we need to add documentation to explain why we don't include it in storage-all.

liurenjie1024 previously approved these changes Apr 8, 2025
Contributor

@liurenjie1024 liurenjie1024 left a comment

Thanks @divinerapier for this PR, LGTM!

@Xuanwo
Member

Xuanwo commented Apr 9, 2025

Hi, @divinerapier would you like to merge with main to fix the CI?

Signed-off-by: divinerapier <sihao.fang@outlook.com>
@Xuanwo Xuanwo changed the title from "feat(io): add OSS storage implementation and update dependencies" to "feat(io): add OSS storage implementation" Apr 10, 2025
Member

@Xuanwo Xuanwo left a comment

Thank you @divinerapier for working on this, really nice!

@Xuanwo Xuanwo merged commit a229d89 into apache:main Apr 10, 2025
17 of 18 checks passed
@Xuanwo
Member

Xuanwo commented Apr 10, 2025

I previously contacted Aliyun (OSS) pre-sales support to inquire about dedicated testing environments, but unfortunately, they currently do not offer such programs for open-source projects. Given OpenDAL's successful model of using community-sponsored resources, I want to explore two potential paths:

Would you like to submit a separate issue for this?

@divinerapier
Contributor Author

I have submitted an issue: #1188

Li0k pushed a commit to risingwavelabs/iceberg-rust that referenced this pull request Sep 23, 2025
- [x] Support AliyunOSS backend by  OpenDAL
- [x] Update doc explains why feature `storage-oss` is not included in
`storage-all`.
- [x] Fix typo

The support for AliyunOSS is based on OpenDAL, a production-verified
project. Therefore, no additional testing has been added.

Add a new example.

Signed-off-by: divinerapier <sihao.fang@outlook.com>
Li0k added a commit to risingwavelabs/iceberg-rust that referenced this pull request Sep 26, 2025
- [x] Support AliyunOSS backend by  OpenDAL
- [x] Update doc explains why feature `storage-oss` is not included in
`storage-all`.
- [x] Fix typo

The support for AliyunOSS is based on OpenDAL, a production-verified
project. Therefore, no additional testing has been added.

Add a new example.

Signed-off-by: divinerapier <sihao.fang@outlook.com>
Co-authored-by: divinerapier <sihao.fang@outlook.com>