Install the dependencies listed in requirements.txt. This can be done inside a virtual environment.
Move all converted .csv files from the Charity Commission repo to the /extracts directory contained in this repo.
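A minimal sketch of this move step, assuming the converted files sit in a local folder. The helper name `move_extracts` and the source path are hypothetical; only the `extracts` directory name comes from this README.

```python
import shutil
from pathlib import Path


def move_extracts(source_dir, dest_dir="extracts"):
    """Move every .csv file in source_dir into dest_dir and return the moved names."""
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create extracts/ if it is missing

    moved = []
    for csv_path in sorted(source.glob("*.csv")):
        # str() keeps shutil.move happy on older Python versions
        shutil.move(str(csv_path), str(dest / csv_path.name))
        moved.append(csv_path.name)
    return moved
```

For example, `move_extracts("path/to/converted_csvs")` would move the files into `extracts/` relative to the current working directory. The path shown is a placeholder.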
Run python3 charity_data_processing.py to process the data.
The script generates .csv files containing each charity's name, contact info and areas of operation.
.csv files are saved in a new directory /outputs
A new file is created for each class of charity specified. These classes are
defined in the list control_group on line 8 of charity_data_processing.py. Add or remove classes as you wish.
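For illustration, here is a sketch of how a `control_group` list could drive one output file per class. The class names, the `class` column name and the `write_class_files` helper are assumptions made for this example; check charity_data_processing.py for the actual values.

```python
import csv
from pathlib import Path

# Hypothetical class names; the real list lives in charity_data_processing.py.
control_group = ["education", "health"]


def write_class_files(rows, classes, out_dir="outputs", class_column="class"):
    """Write one .csv per class holding the rows whose class_column matches it."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = {}
    for charity_class in classes:
        matching = [row for row in rows if row.get(class_column) == charity_class]
        if not matching:
            continue  # no charities in this class, so no file is written
        out_path = out / f"{charity_class}.csv"
        with out_path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(matching[0].keys()))
            writer.writeheader()
            writer.writerows(matching)
        written[charity_class] = len(matching)
    return written
```

Adding a class to the list then produces one more file in the output directory, as long as at least one row matches it.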
I have cleaned this data a lot (removed subsidiaries, only included current charities, cleaned text).
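The cleaning steps can be sketched roughly as follows. The column names `subno` (assumed to be 0 for a main charity) and `orgtype` (assumed to be `R` for a currently registered charity) are guesses at the extract layout, and `clean_rows` is a hypothetical helper rather than the script's own function.

```python
import re


def clean_text(value):
    """Collapse runs of whitespace and trim both ends of a text field."""
    text = "" if value is None else str(value)
    return re.sub(r"\s+", " ", text).strip()


def clean_rows(rows):
    """Keep current main charities only and tidy every text field."""
    cleaned = []
    for row in rows:
        if str(row.get("subno", "0")).strip() != "0":
            continue  # assumed: a non-zero subno marks a subsidiary
        if str(row.get("orgtype", "")).strip().upper() != "R":
            continue  # assumed: "R" marks a currently registered charity
        cleaned.append({key: clean_text(value) for key, value in row.items()})
    return cleaned
```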
The data contains mixed data types in certain columns, which makes the script
quite slow. I have provided a logging system to give the user as much information as possible. I am getting in touch with the Charity Commission (CC) to try and sort this out.
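One way to see which columns are affected is to scan an extract and log every column that holds both numeric and non-numeric values. This is a sketch using only the standard library, not the script's own logging system; `find_mixed_columns`, `value_kind` and the logger name are hypothetical.

```python
import csv
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("charity_data")


def value_kind(value):
    """Classify a raw CSV cell as 'empty', 'number' or 'text'."""
    value = value.strip()
    if not value:
        return "empty"
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"


def find_mixed_columns(csv_path):
    """Return the sorted names of columns containing both numbers and text."""
    kinds = {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            for column, value in row.items():
                if column is None:
                    continue  # surplus cells beyond the header row
                kinds.setdefault(column, set()).add(value_kind(value or ""))

    mixed = sorted(col for col, seen in kinds.items() if {"number", "text"} <= seen)
    for column in mixed:
        logger.warning("Column %r mixes numeric and text values", column)
    logger.info("Scanned %s: %d mixed column(s)", csv_path, len(mixed))
    return mixed
```

Once the affected columns are known, reading them as text avoids repeated type guessing. If the extracts are loaded with pandas, passing `dtype=str` for those columns to `read_csv` is one option.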