-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add map_variables and -99999 nan value to read_crn #1368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mikofski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
kandersolar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice that this isn't a breaking change. Still needs a whatsnew entry though
Co-authored-by: Kevin Anderson <[email protected]>
wholmgren
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @AdamRJensen! A couple of minor things below. Can you also update the PR title?
| try: | ||
| # to_datetime(utc=True) does not work in older versions of pandas | ||
| data = data.tz_localize('UTC') | ||
| except TypeError: | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too bad I neglected to indicate the versions that needed this. I think the tests are comprehensive enough that we can safely remove this if they still pass.
pvlib/iotools/crn.py
Outdated
| ) | ||
|
|
||
| VARIABLE_MAP = { | ||
| CRN_VARIABLE_MAP = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
surfrad.py has VARIABLE_MAP while midc.py has MIDC_VARIABLE_MAP, so either the original or the change is consistent with existing code. I prefer just VARIABLE_MAP since it's simple and already specific to the module. Open to other ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can get behind consistently using VARIABLE_MAP - though I have to admit that I might have done a similar change in previous PRs.
| # set dtypes here because dtype kwarg not supported in read_fwf until 0.20 | ||
| data = data.astype(dict(zip(HEADERS, DTYPES))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pvlib now requires pandas > 0.22 so perhaps we can remove this line in favor of the dtype kwarg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not believe this is possible as there are rows with all nans - and you cannot set the column dtype to int if the column contains nans. In the current version, the dtypes are set after the all nan rows are removed.
I get the following error when i set dtype=dict(zip(HEADERS, DTYPES)) with read_fwf:
ValueError: Unable to convert column WBANNO to type int64
|
Objections to merging this once the conflict is resolved? I think it's time for a 0.9.1. |
No objection. Definitely time for 0.9.1 |
|
Thanks @AdamRJensen! |
read_crnreturns -99999 instead ofNaN#1372[ ] Updates entries todocs/sphinx/source/api.rstfor API changes.docs/sphinx/source/whatsnewfor all changes. Includes link to the GitHub Issue with:issue:`num`or this Pull Request with:pull:`num`. Includes contributor name and/or GitHub username (link with:ghuser:`user`).I propose adding an argument
map_variablesto theread_crnfunction, which is the standard pattern for the iotools. Also, -99999 has been added to the list of nan values as per issue #1372.