Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Pull Request Overview
Updates the SITCOM-2111 evaluation notebook to allow specifying a manual timezone offset, refactor forecast unpacking, and parameterize column count.
- Replace single
day_obswithday_obs_start/day_obs_endand remove fixednumber_of_days - Introduce
N_COLSand manual timezone offsetclt_tz_offsetfor plotting - Refactor static unpacking logic into a new
unpack_hourly_weather_forecast_matching_timestampsfunction
Comments suppressed due to low confidence (3)
notebooks/tel_and_site/subsys_req_ver/SITCOM-2111_Temperature_Forecast_Evaluation.ipynb:72
- [nitpick] Consider renaming this constant to CLT_TZ_OFFSET to follow the uppercase naming convention for constants and improve readability.
"clt_tz_offset = -4 # hours\n",
notebooks/tel_and_site/subsys_req_ver/SITCOM-2111_Temperature_Forecast_Evaluation.ipynb:498
- [nitpick] The code switches between
published_timestampandpublished_time; consider standardizing on one name for consistency.
"for pub_t in df_forecast[\"published_time\"].unique():\n",
notebooks/tel_and_site/subsys_req_ver/SITCOM-2111_Temperature_Forecast_Evaluation.ipynb:263
- The function
unpack_hourly_weather_forecastis not defined in this notebook; ensure it is imported or replaced with the correct unpack function.
" sub_df = unpack_hourly_weather_forecast(sub_df.loc[published_time])\n",
| " for published_time, row in _df[cols].iterrows():\n", | ||
| "\n", | ||
| " # Get the beginning of the data corresponding to this time range\n", | ||
| " index_start = (row - published_time).abs().argmin()\n", |
There was a problem hiding this comment.
Computing index_start this way returns a column label string instead of an integer offset, which will break the subsequent index_end calculation; consider extracting the numeric suffix or using positional indexing.
| " index_start = (row - published_time).abs().argmin()\n", | |
| " index_start = int((row - published_time).abs().argmin().lstrip('timestamp'))\n", |
| "\n", | ||
| "# Plot forecast\n", | ||
| "ax.plot(df_forecast[\"temperature\"], label=\"Meteoblue Latests Forecast\")\n", | ||
| "ax.plot(df_forecast.timestamp - tz_offset , df_forecast.temperature, label=\"Meteoblue Latests Forecast\")\n", |
There was a problem hiding this comment.
There's a typo in the label: "Latests" should be "Latest".
| "ax.plot(df_forecast.timestamp - tz_offset , df_forecast.temperature, label=\"Meteoblue Latests Forecast\")\n", | |
| "ax.plot(df_forecast.timestamp - tz_offset , df_forecast.temperature, label=\"Meteoblue Latest Forecast\")\n", |
| " if new_df is not None:\n", | ||
| " new_df = pd.concat([new_df, sub_df])\n", | ||
| " else:\n", | ||
| " new_df = sub_df\n", | ||
| "\n", | ||
| " return new_df" |
There was a problem hiding this comment.
Repeatedly concatenating DataFrames inside a loop can be inefficient; consider collecting each sub_df in a list and performing one concat after the loop.
| " if new_df is not None:\n", | |
| " new_df = pd.concat([new_df, sub_df])\n", | |
| " else:\n", | |
| " new_df = sub_df\n", | |
| "\n", | |
| " return new_df" | |
| " sub_dfs.append(sub_df)\n", | |
| "\n", | |
| " new_df = pd.concat(sub_dfs, ignore_index=True)\n", | |
| " return new_df\n", |
No description provided.