Albania fix func1 and mapping for admin2_new #55
yukinko-iwasaki wants to merge 7 commits into main
bhupatiraju left a comment
@yukinko-iwasaki This looks good to me!
If I am not mistaken, you are not replacing admin2 at the extraction stage, as I thought you would have to. Instead, you add admin2_new as a new column, run the calculations with the original admin2 as before, and finally expose admin2_new as the new admin2 in gold.
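The flow described above can be sketched roughly as follows. This is only an illustration under assumptions: the `silver` and `gold` dataframes, the `amount` column, and all values are hypothetical and not taken from the PR.

```python
import pandas as pd

# Hypothetical intermediate table: the original admin2 is kept for calculations,
# and admin2_new rides along as an extra column.
silver = pd.DataFrame(
    {
        "admin2": ["01", "02", "03"],
        "admin2_new": ["101", "101", "102"],
        "amount": [10.0, 5.0, 7.0],
    }
)

# Calculations still run against the table keyed by the original admin2.
silver["amount_share"] = silver["amount"] / silver["amount"].sum()

# Only at the final (gold) step is admin2_new exposed under the name admin2.
gold = silver.drop(columns="admin2").rename(columns={"admin2_new": "admin2"})
```

Deferring the rename to the last step means the intermediate logic does not need to know about the new codes.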
# admin2 to admin2_new mapping
mapping = pd.read_csv('./mapping.csv')
mapping = mapping[['admin2', 'admin2_new', 'county']].rename(columns={'admin2': 'admin2_tmp'}).astype({'admin2_new': 'str'})
Is this mapping going to change in the future? I assume this is a one-time adjustment to make the new admin regions conform to the old ones. If not, could we check whether admin2 is a 2-digit or a 3-digit code? In some cases I noticed that the codes are padded to the correct length, but in others the leading zeros are missing. This may not be relevant here, though.
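A quick way to run the length check asked about here might look like the sketch below. The sample codes and the target width of 3 are assumptions for illustration only; the real admin2 values and the expected width are not stated in this thread.

```python
import pandas as pd

# Hypothetical admin2 codes with inconsistent zero-padding.
codes = pd.Series(["01", "1", "012", "12"], name="admin2")

# Count how many codes exist for each string length.
length_counts = codes.str.len().value_counts().to_dict()

# If a fixed width is expected (3 is an assumption), left-pad with zeros.
padded = codes.str.zfill(3)
```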
> could we check if the admin2 is a 2 digit code or a 3 digit code?
When merging the mapping df with the main dataframe, I temporarily converted the admin2 code to an integer so that we can ignore the padding-length inconsistencies. So I think in our case we don't have to worry about the padding.
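A minimal sketch of the integer-key merge described in this reply, not the PR's actual code. The main dataframe `df`, the helper column `admin2_key`, and all values are hypothetical; the real mapping is read from mapping.csv.

```python
import pandas as pd

# Hypothetical main dataframe with inconsistently padded admin2 codes.
df = pd.DataFrame(
    {"admin2": ["01", "1", "012", "12"], "amount": [10.0, 20.0, 30.0, 40.0]}
)

# Hypothetical stand-in for the contents of mapping.csv.
mapping = pd.DataFrame(
    {"admin2": ["001", "012"], "admin2_new": ["101", "112"], "county": ["A", "B"]}
)
mapping = (
    mapping[["admin2", "admin2_new", "county"]]
    .rename(columns={"admin2": "admin2_tmp"})
    .astype({"admin2_new": "str"})
)

# Temporary integer keys make "01", "1" and "001" all compare equal.
df["admin2_key"] = df["admin2"].astype(int)
mapping["admin2_key"] = mapping["admin2_tmp"].astype(int)

# Left merge keeps every row of df; validate guards against duplicate mapping keys.
merged = df.merge(
    mapping.drop(columns="admin2_tmp"),
    on="admin2_key",
    how="left",
    validate="many_to_one",
).drop(columns="admin2_key")
```

The original admin2 strings are left untouched; only the throwaway key column is cast, and it is dropped after the merge.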
)
- tag_code_mapping = pd.read_csv(TAG_MAPPING_URL)
+ tag_code_mapping = pd.read_csv(TAG_MAPPING_PATH)
Thank you so much for taking a look!
Could you try running this script here to confirm that the files can now be loaded using relative paths within the project?
Could you also make sure that you are in the "Repo/{your email address}/{repo_name}" folder when testing?
This PR addresses the following issues:
NOTE:
All the auxiliary files for Albania are now tracked by git and are referenced by relative paths instead of being read from the volume. This only works under the Repo folder, so please test this PR there and not in your personal workspace.
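To see why the working directory matters, here is a rough sketch of loading an auxiliary CSV by relative path. The helper `load_aux_csv` and its `base_dir` parameter are hypothetical and not part of this PR; the demo writes a throwaway file to a temporary directory instead of touching the repo.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_aux_csv(relative_path, base_dir=None):
    """Read a CSV relative to base_dir (defaults to the current working directory)."""
    base = Path.cwd() if base_dir is None else Path(base_dir)
    path = (base / relative_path).resolve()
    if not path.is_file():
        # A missing file usually means the code is not running from the repo folder.
        raise FileNotFoundError(f"{relative_path!r} not found under {base}")
    return pd.read_csv(path)


# Demo with a temporary directory standing in for the repo folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "mapping.csv").write_text("admin2,admin2_new,county\n1,101,A\n")
    demo = load_aux_csv("mapping.csv", base_dir=tmp)
```

Failing early with the resolved base directory in the message makes it easier to spot a test run from the wrong folder.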