-
Notifications
You must be signed in to change notification settings - Fork 27
Add workflow to create and upload package to track downloads #1031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1031 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 19 19
Lines 2642 2642
=========================================
Hits 2642 2642
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
lint Beginning to add code to update index Adding workflow and updating script to update index.html Commenting out branch for now and renaming workflow Adding id-token Only adding new files to index Adding dryrun and ability to create index file for first time Fixing format of index.html adding new line using actual s3 link
cbbd203 to
3b2cfa3
Compare
| # with: | ||
| # ref: 'stable' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will uncomment this before merging because we want it to install the rdt release version
| on: | ||
| release: | ||
| types: [published] | ||
| pull_request: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove this before merging
scripts/create_download_tracker.py
Outdated
| text_list = [current_text] | ||
| for file in files: | ||
| download_link = f'https://{BUCKET}.s3.us-east-1.amazonaws.com/{S3_PACKAGE_PATH}{file}' | ||
| new_link = f"<a href='{download_link}'>{file}</a>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to include the sha256 hash?
Each file URL SHOULD include a hash in the form of a URL fragment with the following syntax: #=, where is the lowercase name of the hash function (such as sha256) and is the hex encoded digest.
| Key=index_file_path, | ||
| Body=new_index, | ||
| ContentType='text/html', | ||
| CacheControl='no-cache', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we specify the ChecksumAlgorithm to be SHA256? It defaults to CRC64NVME
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the index file need a checksum? Wouldn't it just be needed for the actual wheels and tar.gz files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, yah just specify the Checksum for wheel/tar.gz files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, good catch. I'll add that
.github/workflows/publish.yml
Outdated
| @@ -0,0 +1,87 @@ | |||
| name: Publish | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| name: Publish | |
| name: Publish RDT |
.github/workflows/publish.yml
Outdated
| type: boolean | ||
| default: false | ||
| jobs: | ||
| release: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| release: | |
| publish-rdt:: |
| candidate: false | ||
| test_pypi: false | ||
| release-download-tracker: | ||
| uses: ./.github/workflows/publish_download_tracker.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the download tracker version already exists on PyPI? Will version of download tracker match RDT's version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script will use the version of RDT that it installs. If it already exists, it will be overridden. That shouldn't really happen but if it does I think it's ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, as long as the hash of the files doesn't change we should be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the hash does change, will it work if I remove the old link in the index and replace it with a link with the new hash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should. pip does verify the wheel/tar.gz file hash matches. That is, the hash given by index.html should match hash generated by the user's local machine (pip downloads the file and hashes it itself). If the mismatch occurs, pip will throw an error.
In addition,pip on user's machine will cache files. So if they have downloaded a wheel file before, the hash changes, then pip will complain. I believe the user would have to run pip cache purge in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is it better to do that, or just prevent it from uploading a package that is already there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should prevent it from uploading if the package is there (similar to public PyPI where you cannot re-upload an already published version).
gsheni
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just 1 suggestion.
scripts/create_download_tracker.py
Outdated
| if file_name not in links: | ||
| filepath = os.path.join('dist', file_name) | ||
| file_hash = _get_file_hash(filepath) | ||
| s3_client.upload_file(filepath, BUCKET, dest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| s3_client.upload_file(filepath, BUCKET, dest) | |
| s3_client.upload_file(filepath, BUCKET, dest, ExtraArgs={'ChecksumAlgorithm': 'SHA256'}) |
| @@ -0,0 +1 @@ | |||
| # TODO: fill this in No newline at end of file | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create a follow up issue to document this?
|
Closing this PR because we will no longer be adding this package |
resolves #1030
This PR adds a script and github workflow that:
rdt-download-tracker