Skip to content

chore(source-s3): update base image to 4.0.0 and use caret dependencies (do not merge)#55202

Merged
Aldo Gonzalez (aldogonzalez8) merged 9 commits intomasterfrom
devin/1741137980-update-source-s3
Mar 12, 2025
Merged

chore(source-s3): update base image to 4.0.0 and use caret dependencies (do not merge)#55202
Aldo Gonzalez (aldogonzalez8) merged 9 commits intomasterfrom
devin/1741137980-update-source-s3

Conversation

@devin-ai-integration
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot commented Mar 5, 2025

Update source-s3 to:

  • Use new base image (4.0.0)
  • Replace dependency declarations from specific versions to use carets
  • Bump dependencies by running poetry lock

Link to Devin run: https://app.devin.ai/sessions/38e801d31cf94b62ad7bc5f7577bfd2e
Requested by: User

Resolves: https://github.com/airbytehq/airbyte-internal-issues/issues/11890

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
@vercel
Copy link

vercel bot commented Mar 5, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 12, 2025 6:49pm

@devin-ai-integration
Copy link
Contributor Author

🤖 Devin AI Engineer

Original prompt from natik@airbyte.io:

Received message in Slack channel #dev-devin-ai:

@Devin please update source-s3 to:
• Use new base image (4.0.0, lookup correct SHA in other python sources)
• Replace dependency declarations from specific versions (`== XYZ`) to use carets (` ^XYZ`) instead. For example, smart-open ^5.0.3 or something
• bump deps (run poetry lock, then commit the result)

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add "(aside)" to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@natikgadzhi
Copy link
Contributor

Aldo Gonzalez (@aldogonzalez8) this one is interesting — it's failing with the following error in pypi smoke test:

TypeError: Can't instantiate abstract class SourceS3StreamReader with abstract methods file_permissions_schema, get_file_acl_permissions, identities_schema, load_identity_groups

And I think what that means is that file-transfer mode now requires those methods to be defined, but they are not really relevant to S3. What's the path forward?

@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (@aldogonzalez8) this one is interesting — it's failing with the following error in pypi smoke test:

TypeError: Can't instantiate abstract class SourceS3StreamReader with abstract methods file_permissions_schema, get_file_acl_permissions, identities_schema, load_identity_groups

And I think what that means is that file-transfer mode now requires those methods to be defined, but they are not really relevant to S3. What's the path forward?

I see, can we ask Devin to implement as noop?

…ns and identities

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (@aldogonzalez8) this one is interesting — it's failing with the following error in pypi smoke test:

TypeError: Can't instantiate abstract class SourceS3StreamReader with abstract methods file_permissions_schema, get_file_acl_permissions, identities_schema, load_identity_groups

And I think what that means is that file-transfer mode now requires those methods to be defined, but they are not really relevant to S3. What's the path forward?

I see, can we ask Devin to implement as noop?

That was fast.

@aldogonzalez8
Copy link
Contributor

We may need to ask Devin to create some unit tests for Noop methods. It seems awkward, but I think they are hitting coverage for stream_reader.

---------- coverage: platform linux, python 3.11.11-final-0 ----------
Name Stmts Miss Cover

source_s3/init.py 0 0 100%
source_s3/source.py 19 0 100%
source_s3/source_files_abstract/init.py 0 0 100%
source_s3/source_files_abstract/formats/init.py 0 0 100%
source_s3/source_files_abstract/formats/avro_spec.py 5 0 100%
source_s3/source_files_abstract/formats/csv_spec.py 16 0 100%
source_s3/source_files_abstract/formats/jsonl_spec.py 13 0 100%
source_s3/source_files_abstract/formats/parquet_spec.py 9 0 100%
source_s3/source_files_abstract/spec.py 55 1 98%
source_s3/v4/init.py 6 0 100%
source_s3/v4/config.py 35 1 97%
source_s3/v4/cursor.py 82 1 99%
source_s3/v4/legacy_config_transformer.py 86 3 97%
source_s3/v4/source.py 65 1 98%
source_s3/v4/stream_reader.py 184 37 80%
source_s3/v4/zip_reader.py 183 33 82%

TOTAL 758 77 90%

FAIL Required test coverage of 90% not reached. Total coverage: 89.84%
======================= 103 passed, 19 warnings in 6.83s =======================

…hods

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
@aldogonzalez8
Copy link
Contributor

/format-fix

@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (aldogonzalez8) commented Mar 5, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (167156b)

@aldogonzalez8
Copy link
Contributor

I will decouple these methods to a new class so we don't need to initialize noops, will circle back soon.

@aldogonzalez8
Copy link
Contributor

I have a fix, I will update this PR once the CDK PR is merged.

airbytehq/airbyte-python-cdk#402

@aldogonzalez8
Copy link
Contributor

Devin can you do the follwing:

  1. Change in airbyte-integrations/connectors/source-s3/pyproject.toml

airbyte-cdk = {extras = ["file-based"], version = "^6.18.2"}

to

airbyte-cdk = {extras = ["file-based"], version = "6.38.3.dev04101"}

  1. Do poetry lock to update airbyte-integrations/connectors/source-s3/poetry.lock

  2. Remove noop methods:

  • file_permissions_schema
  • get_file_acl_permissions
  • identities_schema
  • load_identity_groups
  1. Also remove the unit tests related to these noop methods (test_file_permissions_and_identity_methods), if CDK changes are ok, everything will pass.

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (aldogonzalez8) commented Mar 10, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (ee35ae4)

@aldogonzalez8
Copy link
Contributor

Devin can you do the follwing:

Change in airbyte-integrations/connectors/source-s3/pyproject.toml

airbyte-cdk = {extras = ["file-based"], version = "6.38.3.dev04101"}

to

airbyte-cdk = {extras = ["file-based"], version = "^6.38.5"}

Do poetry lock to update airbyte-integrations/connectors/source-s3/poetry.lock

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
@aldogonzalez8
Copy link
Contributor

Devin in airbyte-integrations/connectors/source-s3/source_s3/v4/stream_reader.py

We have "from typing import Any, Dict, Iterable, List, Optional, Set, cast" that includes "Any", but I think "Any" is not used. Can we remove that part of the import? Probably is a miss from when we removed the unit tests that were unnecessary.

Co-Authored-By: natik@airbyte.io <natik@airbyte.io>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good — merge when you're happy with it.

@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (aldogonzalez8) commented Mar 12, 2025

/approve-regression-tests

Check job output.

✅ Approving regression tests

@aldogonzalez8
Copy link
Contributor

Aldo Gonzalez (aldogonzalez8) commented Mar 12, 2025

/approve-regression-tests

Check job output.

✅ Approving regression tests

@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) merged commit 2da3ed7 into master Mar 12, 2025
27 checks passed
@aldogonzalez8 Aldo Gonzalez (aldogonzalez8) deleted the devin/1741137980-update-source-s3 branch March 12, 2025 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants