-
Notifications
You must be signed in to change notification settings - Fork 0
Rewrite repository as bash scripts #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| set_aws_env "$profile" "$region" | ||
|
|
||
| case "$cmd" in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honour command-level region when configuring AWS
The AWS environment is set at the top of main (line 343) before the per-command options are parsed, so a --region passed after clone/ensure (as shown in the README examples) never updates AWS_REGION/AWS_DEFAULT_REGION. Profiles without a default region will then fail with “You must specify a region” during the STS and S3 calls, and even with a default the AWS CLI may operate in the wrong region despite the bucket name using the requested suffix. The env should be set after parsing the command-specific --region so the AWS CLI runs in the user-specified region.
Useful? React with 👍 / 👎.
…r-in-daylily-script Ensure version bucket lookup works with older Bash versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR performs a complete rewrite of the repository, replacing the Python package with a single bash script for managing AWS S3 reference buckets. The change eliminates all Python dependencies and infrastructure in favor of direct AWS CLI usage.
Key changes:
- Replaces Python CLI and library with a ~415-line bash script (
scripts/daylily-omics-references.sh) - Removes all Python source files, test suite, and build configuration
- Updates documentation to reflect the bash-based workflow
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/daylily-omics-references.sh | New bash implementation providing clone, verify, and ensure commands for S3 bucket management |
| README.md | Updated documentation with bash script usage examples and removed Python-specific installation instructions |
| src/daylily_omics_references/*.py | Removed Python package implementation (manager, CLI, constants) |
| src/boto3/.py, src/botocore/.py | Removed test stub implementations of AWS SDK |
| tests/test_manager.py | Removed Python test suite |
| pyproject.toml | Removed Python package configuration |
| pytest.ini | Removed pytest configuration |
| environment.yml | Removed Conda environment specification |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
| fi | ||
| printf '%s' "$version" | aws s3 cp - "s3://$bucket/$VERSION_INFO_KEY" | ||
| } | ||
|
|
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When endpoint_args is empty (acceleration disabled), the array expansion "${endpoint_args[@]}" may cause issues in some shells or with set -u. Consider using "${endpoint_args[@]+"${endpoint_args[@]}"}" to safely handle the empty array case.
| local cmd=(aws s3 cp "s3://$source_bucket/$prefix" "s3://$dest_bucket/$prefix" --recursive --request-payer requester --metadata-directive REPLACE ${endpoint_args[@]+"${endpoint_args[@]}"}) |
| verify_bucket() { | ||
| local bucket="$1" version="$2" include_hg38="$3" include_b37="$4" include_giab="$5" | ||
|
|
||
| if ! bucket_exists "$bucket"; then |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using echo to pipe output may be problematic if the version string contains special characters or escape sequences. Consider using printf '%s' "$bucket_version" instead of echo "$bucket_version" for more reliable output handling.
| if ! bucket_exists "$bucket"; then | |
| bucket_version=$(printf '%s\n' "$bucket_version" | tr -d '\n' | tr -d '\r') |
| "${cmd[@]}" | ||
| fi | ||
| } | ||
|
|
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using echo to pipe JSON output may be unreliable. Consider piping $output directly: [[ $(jq '.Contents | length' <<<"$output") -gt 0 ]] to avoid potential issues with echo interpreting backslashes or special characters.
| [[ $(jq '.Contents | length' <<<"$output") -gt 0 ]] |
|
|
||
| local args=("--bucket" "$bucket") | ||
| if [[ "$region" != "us-east-1" ]]; then | ||
| args+=("--create-bucket-configuration" "LocationConstraint=$region") | ||
| fi | ||
|
|
||
| log INFO "Creating bucket $bucket in $region" | ||
| aws s3api create-bucket "${args[@]}" | ||
| log INFO "Waiting for bucket $bucket to become listable" | ||
| aws s3api wait bucket-exists --bucket "$bucket" | ||
| aws s3api put-bucket-accelerate-configuration --bucket "$bucket" --accelerate-configuration Status=Enabled | ||
|
|
||
| local account_id |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the aws s3api create-bucket command fails, the script will exit due to set -e, but subsequent commands (wait, accelerate, policy) will not execute. Consider adding explicit error checking and cleanup logic to handle partial bucket creation failures gracefully.
| dependencies are required. If you modify the script, ensure it remains POSIX | ||
| shell compatible and keep the usage examples up to date. |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement "ensure it remains POSIX shell compatible" is inconsistent with the script implementation. The bash script uses bash-specific features (e.g., associative arrays with declare -A, [[ test syntax, (( )) arithmetic). Consider either updating this to say "bash compatible" or clarifying that the script is bash-specific, not POSIX shell.
| dependencies are required. If you modify the script, ensure it remains POSIX | |
| shell compatible and keep the usage examples up to date. | |
| dependencies are required. If you modify the script, ensure it remains bash | |
| compatible and keep the usage examples up to date. |
| ``` | ||
| scripts/daylily-omics-references.sh \ | ||
| --profile daylily-service \ | ||
| clone \ | ||
| --bucket-prefix myorg \ | ||
| --region us-west-2 \ | ||
| --execute | ||
| ``` |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example shows using backslash line continuations within a code block, but uses indented continuation lines which may not execute correctly if copied directly. Consider aligning the continuation properly or using a single-line command for clarity.
| log INFO "Creating bucket $bucket in $region" | ||
| aws s3api create-bucket "${args[@]}" | ||
| log INFO "Waiting for bucket $bucket to become listable" | ||
| aws s3api wait bucket-exists --bucket "$bucket" |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bucket policy is constructed as a single-line JSON string which makes it difficult to read and maintain. Consider using proper formatting with line breaks or using jq to construct the policy for better maintainability.
| aws s3api wait bucket-exists --bucket "$bucket" | |
| { | |
| "Version": "2012-10-17", | |
| "Statement": [ | |
| { | |
| "Sid": "EnforceTLS", | |
| "Effect": "Deny", | |
| "Principal": "*", | |
| "Action": "s3:*", | |
| "Resource": [ | |
| "arn:aws:s3:::$bucket", | |
| "arn:aws:s3:::$bucket/*" | |
| ], | |
| "Condition": { | |
| "Bool": { | |
| "aws:SecureTransport": "false" | |
| } | |
| } | |
| }, | |
| { | |
| "Sid": "AllowAccountFullAccess", | |
| "Effect": "Allow", | |
| "Principal": { | |
| "AWS": "arn:aws:iam::$account_id:root" | |
| }, | |
| "Action": "s3:*", | |
| "Resource": [ | |
| "arn:aws:s3:::$bucket", | |
| "arn:aws:s3:::$bucket/*" | |
| ] | |
| } | |
| ] | |
| } |
| --include-giab Include GIAB data (default: enabled) | ||
| --exclude-giab Skip GIAB data | ||
| --execute Perform actions when creating buckets (default is dry-run) | ||
| --use-acceleration Copy using the S3 accelerate endpoint |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script uses jq in the prefix_exists() function (line 150) but only checks for the aws binary. Add require_binary jq here to ensure jq is available before the script runs.
| --use-acceleration Copy using the S3 accelerate endpoint | |
| require_binary aws | |
| require_binary jq |
| log INFO "Creating bucket $bucket in $region" | ||
| aws s3api create-bucket "${args[@]}" | ||
| log INFO "Waiting for bucket $bucket to become listable" | ||
| aws s3api wait bucket-exists --bucket "$bucket" |
Copilot
AI
Dec 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bucket policy grants full s3:* permissions to the entire AWS account via Principal {"AWS": "arn:aws:iam::$account_id:root"}. This effectively allows any IAM user/role in the account to read, write, delete objects, and even modify the bucket policy, enabling tampering with reference data. Restrict the Principal to specific IAM roles/users that require access and scope Action to least privilege (e.g., read-only where possible), or remove this broad allow and rely on IAM identity policies; also consider enabling S3 Block Public Access and bucket-owner-enforced object ownership.
…enied-errors Skip restricted cluster boot files when cloning
Summary
Testing
Codex Task