[DISCARDED] Add OTL education CSV reader script #168
TimidRobot
left a comment
Fix PR description
Please correct the Checklist in the pull request (PR) description (use [x] where appropriate).
Relatedly, it isn't appropriate to modify the template. The following change isn't welcome:
- <!-- Replace the [ ] with [x] to check the boxes (there is no space between x and square brackets). -->
+ <!-- Replace the [ ] with [y/n] to check the boxes (there is no space between x and square brackets). -->
Query API
The goal of this project is to regularly query APIs. The pre-automation/education/datasets/OTL.csv file is not a valid fetch source.
Lowercase file names
Please use lowercase file names.
Thank you. I'd refocus on the task of fetching data from sources using APIs, since the sources in pre-automation/ are largely not valid data sources currently.
Also:
The scripts should be run from the root of the repository using:
pipenv run ./scripts/1-fetch/github_fetch.py -h
When run this way, the shared library (
Thank you for highlighting this issue. I've addressed the missing documentation by adding the Running the scripts section to the repository README.
Hello @TimidRobot, I apologise for the delay here. I've had to read through your reviews to pin down exactly what the issue was, and I have made the changes requested for this PR. As regards lowercase file names, I have adhered to that pattern in subsequent scripts submitted for your review, other than the otl_fetch.py script, because you mentioned that the OTL.csv data source is not a valid source. I'd like to kindly ask that you please reopen the closed PR so as to give me a good chance of contributing to the Quantifying Creative Commons Project. I acknowledge my errors in this PR and I'll pay attention to them going forward. Thank you 🙏🏼🙏🏼
Fixes
Description
The OTL_education_read_csv.py script reads the pre-automation OTL.csv file and copies it to the current quarter's data directory for processing.
The script:
• Reads from pre-automation/education/datasets/OTL.csv
• Copies the data to data/{quarter}/1-fetch/otl_raw_data.csv
• Includes proper error handling and logging
• Uses the --enable-save flag to control whether data is actually saved
• Follows the project's established patterns for fetch phase scripts
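The behaviour listed above could look roughly like the following. This is a minimal sketch, not the script submitted in the PR: the two paths and the --enable-save flag come from the description, while the function names (current_quarter, copy_otl, build_parser) and the "2025Q1" style quarter label are assumptions made for illustration.

```python
import argparse
import shutil
from datetime import date
from pathlib import Path

# Source path as given in the PR description.
SOURCE = Path("pre-automation/education/datasets/OTL.csv")


def current_quarter(today=None):
    """Return a label such as '2025Q1' (assumed format for {quarter})."""
    today = today or date.today()
    return f"{today.year}Q{(today.month - 1) // 3 + 1}"


def copy_otl(repo_root, enable_save, today=None):
    """Copy OTL.csv to data/{quarter}/1-fetch/otl_raw_data.csv.

    Hypothetical helper: returns the target path and only writes
    the file when enable_save is true.
    """
    source = Path(repo_root) / SOURCE
    if not source.is_file():
        raise FileNotFoundError(f"Missing source file: {source}")
    target = (
        Path(repo_root)
        / "data"
        / current_quarter(today)
        / "1-fetch"
        / "otl_raw_data.csv"
    )
    if enable_save:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return target


def build_parser():
    """Command-line parser exposing the --enable-save flag."""
    parser = argparse.ArgumentParser(description="Copy OTL.csv (sketch)")
    parser.add_argument(
        "--enable-save",
        action="store_true",
        help="Write the output file instead of doing a dry run",
    )
    return parser
```

A caller would parse the arguments with build_parser().parse_args() and pass args.enable_save to copy_otl together with the repository root. Without the flag, the function only reports where the file would be written.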
Technical Note: Script Execution Requirements
Issue: The OTL_education_read_csv.py script fails with ModuleNotFoundError: No module named 'shared' when run directly from the 1-fetch directory, which makes it necessary to adjust the import structure. If the import is adjusted, the pre-commit check fails.
Root Cause: The script imports the shared module from the parent scripts directory, but Python cannot locate it without proper path configuration.
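The root cause can be reproduced in isolation. The sketch below builds a throwaway scripts/ layout in a temporary directory and checks whether a sibling module is importable before and after the parent directory is placed on sys.path. The names add_parent_to_path, shared_demo, and example_fetch.py are hypothetical stand-ins, and this is not necessarily how the project itself configures its imports.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def add_parent_to_path(script_file):
    """Put the directory above the script's own directory on sys.path.

    Hypothetical helper for illustration only.
    """
    parent = str(Path(script_file).resolve().parent.parent)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


def demo():
    """Return (importable_before, importable_after) for 'shared_demo'."""
    with tempfile.TemporaryDirectory() as tmp:
        scripts = Path(tmp) / "scripts"
        (scripts / "1-fetch").mkdir(parents=True)
        # 'shared_demo' stands in for the project's shared module.
        (scripts / "shared_demo.py").write_text("VALUE = 1\n")
        script = scripts / "1-fetch" / "example_fetch.py"
        script.write_text("")

        # scripts/ is not on sys.path yet, so the module cannot be found.
        before = importlib.util.find_spec("shared_demo") is not None

        added = add_parent_to_path(script)
        importlib.invalidate_caches()
        after = importlib.util.find_spec("shared_demo") is not None

        # Undo the path change so the demo leaves no side effects.
        sys.path.remove(added)
        return before, after
```

Calling demo() shows the module is not found until the scripts directory is on the search path, which matches the failure described above: Python puts the running script's own directory on sys.path, not its parent.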
Checklist
Update index.md).
main or master).
Developer Certificate of Origin
For the purposes of this DCO, "license" is equivalent to "license or public domain dedication," and "open source license" is equivalent to "open content license or public domain dedication."