420 add skill score for ensemble metrics#494
Conversation
… to secondary table
…kill-score-for-ensemble-metrics
src/teehr/evaluation/generate.py
Outdated
| output_dataframe : ps.DataFrame | ||
| The output spark DataFrame of a specified timestep and start | ||
| and end datetimes. | ||
| update_variable_table : bool |
There was a problem hiding this comment.
Should these boolean args have default values?
There was a problem hiding this comment.
I think these classes BenchmarkForecast and SignatureTimeseries can go away to be honest. I think I added them to define the type of class that gets returned, but it seems like the method.generate() call can happen directly in the Generator class to reduce complexity.
I have the defaults of the boolean set in the method arguments.
| The input timeseries model. The defines a unique timeseries | ||
| that will be queried from the Evaluation and used as the | ||
| input_dataframe. Defaults to None. | ||
| start_datetime : Union[str, datetime] |
There was a problem hiding this comment.
Should a note be added to the datetime and timedelta doc strings indicating what string formats are supported. Or, I think something like "Any string that can be parsed by XYZ library" would be fine if appropriate (I did not look to see how it is parsed).
| validated_df = tbl._validate(df=self.df) | ||
| tbl._write_spark_df(validated_df, write_mode=write_mode) | ||
|
|
||
| def to_pandas(self): |
There was a problem hiding this comment.
Shoulkd this have to_sdf() and maybe to_geopandas() methods?
There was a problem hiding this comment.
I guess can .df.show() but that is not consistent with other classes...IDK
There was a problem hiding this comment.
Yeah I can add to_sdf(). Maybe we can implement to_geopandas() for timeseries in a separate issue?
|
|
||
|
|
||
| class Normals(SignatureGeneratorBaseModel, GeneratorABC): | ||
| """Model for generating synthetic normals timeseries.""" |
There was a problem hiding this comment.
Does it make sense to put a doc string here so users know what fields are available to be set?
There was a problem hiding this comment.
I guess because the way this is inherits other classes the code editor is not aware of what the properties are...
There was a problem hiding this comment.
added. I can see the doc string when hovering in vs code but it seems really sensitive to my cursor location
src/teehr/models/generate/base.py
Outdated
| configuration_name: Optional[str] = Field(default=None) | ||
| variable_name: Optional[str] = Field(default=None) | ||
| unit_name: Optional[str] = Field(default=None) | ||
| table_name: TimeseriesTableNamesEnum = Field( |
There was a problem hiding this comment.
nit: I feel like the table should be at the top or bottom of the list.
|
@mgdenno I took another crack at the filter issue we talked about. I created a TableFilter(), which takes a table and filter object. This is essentially just consolidates the two arguments. Then I created a filter() method on the Evaluation class which accepts a TableFilter class or table_name, filter arguments. This is just another way to filter a table (ex., I think this is closer to what I was originally going for, but let me know what you think. Happy to discuss |
This could use a review and some additional discussion. Here are the main changes:
calculate_climatology()method to timeseries table for calculating long-term averages of a timeseries based on some time interval (ie, day-of-year) (resolves Investigate adding simple baselines #228 )create_reference_forecast()method to secondary timeseries tableHourOfYearas row-level calculated fieldforecast_lead_time#451)load_dataframe()as a loading method to load in-memory dataframe (resolves Add _write_pandas() method top base table #471)load_tablecalls