@amontanez24
Contributor

@amontanez24 amontanez24 commented Sep 3, 2025

resolves #1030

This PR adds a script and a GitHub workflow that:

  • On release, creates a Python package called rdt-download-tracker
  • Matches the version of this package to the version of the RDT release
  • Uploads the package to S3
  • Updates the index.html page in the S3 bucket to list the new files (pip needs this to be able to install the package)
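
For illustration, the index page in the last step could be produced roughly as in the sketch below. It assumes a flat "simple repository" style page with one anchor per file. The function name `render_index`, the bucket URL, and the file names are hypothetical and are not taken from this PR.

```python
import html


def render_index(project, base_url, file_names):
    """Return a minimal HTML index page with one link per distribution file."""
    links = []
    for name in sorted(file_names):
        # Escape both the URL and the visible text so odd file names stay valid HTML.
        href = html.escape(f'{base_url}{name}', quote=True)
        links.append(f'    <a href="{href}">{html.escape(name)}</a><br/>')
    body = '\n'.join(links)
    return (
        '<!DOCTYPE html>\n'
        '<html>\n'
        '  <body>\n'
        f'    <h1>Links for {html.escape(project)}</h1>\n'
        f'{body}\n'
        '  </body>\n'
        '</html>\n'
    )


if __name__ == '__main__':
    # Hypothetical bucket URL and version, used only for illustration.
    print(
        render_index(
            'rdt-download-tracker',
            'https://example-bucket.s3.us-east-1.amazonaws.com/simple/rdt-download-tracker/',
            ['rdt_download_tracker-1.0.0-py3-none-any.whl', 'rdt_download_tracker-1.0.0.tar.gz'],
        )
    )
```

Rendering the whole page from the full file list keeps the output deterministic. The script in this PR instead appends new links to the existing page, which avoids listing the bucket on every release.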

@codecov

codecov bot commented Sep 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (dbfd782) to head (9f9fb70).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1031   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           19        19           
  Lines         2642      2642           
=========================================
  Hits          2642      2642           
Flag          Coverage Δ
integration   83.49% <ø> (ø)
unit          100.00% <ø> (ø)

lint

Beginning to add code to update index

Adding workflow and updating script to update index.html

Commenting out branch for now and renaming workflow

Adding id-token

Only adding new files to index

Adding dryrun and ability to create index file for first time

Fixing format of index.html

adding new line

using actual s3 link
@amontanez24 amontanez24 force-pushed the issue-1030-tracking-package branch from cbbd203 to 3b2cfa3 September 4, 2025 19:22
Comment on lines 15 to 16
# with:
# ref: 'stable'
Contributor Author

I will uncomment this before merging because we want it to install the rdt release version

on:
  release:
    types: [published]
  pull_request:
Contributor Author

I will remove this before merging

@amontanez24 amontanez24 marked this pull request as ready for review September 4, 2025 23:23
@amontanez24 amontanez24 requested a review from a team as a code owner September 4, 2025 23:23
@amontanez24 amontanez24 requested review from gsheni and removed request for a team September 4, 2025 23:23
text_list = [current_text]
for file in files:
    download_link = f'https://{BUCKET}.s3.us-east-1.amazonaws.com/{S3_PACKAGE_PATH}{file}'
    new_link = f"<a href='{download_link}'>{file}</a>"
Contributor

Do you need to include the sha256 hash?

Each file URL SHOULD include a hash in the form of a URL fragment with the following syntax: #<hashname>=<hashvalue>, where <hashname> is the lowercase name of the hash function (such as sha256) and <hashvalue> is the hex encoded digest.
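
As a rough sketch of what that could look like for the link-building code above: hash the file with SHA-256 and append the digest as a URL fragment. The helper names `sha256_of_file` and `link_with_hash` are hypothetical and are not the functions in this PR.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex-encoded SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def link_with_hash(download_link, file_name, path):
    """Build an anchor whose href carries a '#sha256=<digest>' fragment."""
    file_hash = sha256_of_file(path)
    return f"<a href='{download_link}#sha256={file_hash}'>{file_name}</a>"
```

Reading in chunks keeps memory use flat even for large wheels.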

Key=index_file_path,
Body=new_index,
ContentType='text/html',
CacheControl='no-cache',
Contributor

@gsheni gsheni Sep 5, 2025

Can we specify the ChecksumAlgorithm to be SHA256? It defaults to CRC64NVME

Contributor Author

Does the index file need a checksum? Wouldn't it only be needed for the actual wheel and tar.gz files?

Contributor

Oh right, yeah, just specify the checksum for the wheel/tar.gz files

Contributor Author

ok, good catch. I'll add that

@amontanez24 amontanez24 requested a review from rwedge September 8, 2025 20:11
@amontanez24
Contributor Author

@gsheni @rwedge I added a refactor that moves the publishing logic into separate workflows that must always be called with inputs. The release workflow now just calls them with the desired inputs. Let me know what you think.

@@ -0,0 +1,87 @@
name: Publish
Contributor

Suggested change
- name: Publish
+ name: Publish RDT

type: boolean
default: false
jobs:
release:
Contributor

Suggested change
- release:
+ publish-rdt:

candidate: false
test_pypi: false
release-download-tracker:
uses: ./.github/workflows/publish_download_tracker.yml
Contributor

What happens if the download tracker version already exists on PyPI? Will the version of the download tracker match RDT's version?

Contributor Author

The script will use the version of RDT that it installs. If that version already exists, it will be overwritten. That shouldn't really happen, but if it does I think it's ok.

Contributor

Gotcha, as long as the hash of the files doesn't change we should be good.

Contributor Author

If the hash does change, will it work if I remove the old link in the index and replace it with a link with the new hash?

Contributor

@gsheni gsheni Sep 10, 2025

It should. pip verifies that the wheel/tar.gz file hash matches. That is, the hash given by index.html should match the hash generated on the user's local machine (pip downloads the file and hashes it itself). If there is a mismatch, pip will throw an error.
In addition, pip on the user's machine caches files. So if they have downloaded a wheel file before and the hash then changes, pip will complain. I believe the user would have to run pip cache purge in that case.
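
To make the check concrete, here is a simplified sketch of that kind of verification. This is not pip's actual code. The names `check_download` and `HashMismatchError` are hypothetical, and the sketch assumes the hash is carried as a `#<hashname>=<hashvalue>` fragment on the file URL.

```python
import hashlib
from urllib.parse import urldefrag


class HashMismatchError(ValueError):
    """Raised when downloaded bytes do not match the hash advertised by the index."""


def check_download(url, data):
    """Compare the digest of `data` with the hash fragment of `url`.

    Returns True when the hash matches and False when the URL carries no hash.
    Raises HashMismatchError when the digests differ.
    """
    _, fragment = urldefrag(url)
    if not fragment:
        return False  # nothing to verify against
    hash_name, _, expected = fragment.partition('=')
    actual = hashlib.new(hash_name, data).hexdigest()
    if actual != expected:
        raise HashMismatchError(f'expected {expected}, got {actual}')
    return True
```

If a file is re-uploaded with different contents but the index still advertises the old digest, a check like this fails, which is the error case described above.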

Contributor Author

So is it better to do that, or just prevent it from uploading a package that is already there?

Contributor

We should prevent it from uploading if the package is there (similar to public PyPI where you cannot re-upload an already published version).
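
One possible way to do that, sketched under the assumption that the index lists each published file as the text of an anchor: parse the existing index.html and upload only the files that are not already listed. The names `published_files` and `files_to_upload` are hypothetical.

```python
from html.parser import HTMLParser


class _LinkTextParser(HTMLParser):
    """Collect the visible text of every <a> element."""

    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == 'a':
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor and data.strip():
            self.names.add(data.strip())


def published_files(index_html):
    """Return the set of file names already linked from the index page."""
    parser = _LinkTextParser()
    parser.feed(index_html)
    parser.close()
    return parser.names


def files_to_upload(index_html, dist_files):
    """Keep only the files that are not already present in the index."""
    existing = published_files(index_html)
    return [name for name in dist_files if name not in existing]
```

An empty or missing index yields an empty set, so every file is treated as new on the first run.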

@amontanez24 amontanez24 requested a review from gsheni September 11, 2025 19:55
Contributor

@gsheni gsheni left a comment

Looks good. Just 1 suggestion.

if file_name not in links:
    filepath = os.path.join('dist', file_name)
    file_hash = _get_file_hash(filepath)
    s3_client.upload_file(filepath, BUCKET, dest)
Contributor

Suggested change
- s3_client.upload_file(filepath, BUCKET, dest)
+ s3_client.upload_file(filepath, BUCKET, dest, ExtraArgs={'ChecksumAlgorithm': 'SHA256'})

@@ -0,0 +1 @@
# TODO: fill this in
Contributor

Should we create a follow-up issue to document this?

@amontanez24
Contributor Author

Closing this PR because we will no longer be adding this package

Development

Successfully merging this pull request may close these issues.

Add workflow to create and upload package to track downloads

4 participants