@alexhyunminlee (Collaborator) commented Feb 8, 2026

Summary

This PR adds a Python script and function that generate a gas tariff mapping for each bldg_id based on electrical_utility and gas_utility.

Closes #138

@alexhyunminlee linked an issue Feb 8, 2026 that may be closed by this pull request

@jpvelez (Contributor) left a comment

Mostly just the same fixes as in the electric tariff mapper

@@ -0,0 +1,11 @@
# Resolve project root as absolute path from this Justfile's location (works for any clone)
project_root := absolute_path(justfile_directory() / ".." / ".." / ".." / ".." / "..")
Contributor

Same fix as in the other PR

@@ -0,0 +1,11 @@
# Resolve project root as absolute path from this Justfile's location (works for any clone)
project_root := absolute_path(justfile_directory() / ".." / ".." / ".." / ".." / "..")
data_base := "s3://data.sb/nrel/resstock/res_2024_amy2018_2/metadata"
Contributor

Same fix as in the other PR

from utils.types import electric_utility

# Project root (rate-design-platform); independent of cwd or caller
_PROJECT_ROOT = Path(__file__).resolve().parent.parent
Contributor

Same fix as in the other PR



def map_gas_tariff(
    SB_metadata_path: S3Path,
Contributor

Same fix as in the other PR

    .filter(pl.col("sb.electric_utility") == electric_utility_name)
    .collect()
)
if utility_metadata_df.is_empty():
Contributor

Same fix as in the other PR

# For now, we will manually add the electric utility name column. Later on, the metadata parquet will be updated to include this column.
# Assign first ~1/3 to Coned, next ~1/3 to National Grid, last ~1/3 to NYSEG.
metadata_path = S3Path(args.metadata_path)
metadata_df = pl.read_parquet(io.BytesIO(metadata_path.read_bytes()))
Contributor

Same fix as in the other PR

n = len(metadata_df)
# Create electricity utility name column
metadata_df = (
    metadata_df.with_row_index("_row_idx")
Contributor

Same fix as in the other PR

    raise FileNotFoundError(f"SB metadata path {SB_metadata_path} does not exist")

utility_metadata_df = (
    pl.scan_parquet(io.BytesIO(SB_metadata_path.read_bytes()))
Contributor

Same fix as in the other PR

utility_metadata_df.select(pl.col("bldg_id", "sb.gas_utility"))
.with_columns(
    pl.when(pl.col("sb.gas_utility") == "National Grid")
    .then(pl.lit("National Grid"))
Contributor

Obviously, this will need to be generalized, but it's fine for now.

However, since the tariff_keys will get spliced into filenames, they need to be lowercase_with_underscores.
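
To illustrate the naming constraint above, here is a minimal sketch of one way to turn a utility display name into a filename-safe key. The helper name `to_tariff_key` and the exact normalization rule are assumptions for illustration, not code from this PR.

```python
import re


def to_tariff_key(utility_name: str) -> str:
    """Convert a display name such as "National Grid" into a filename-safe key.

    Hypothetical helper: the rule used here (lowercase, underscores for
    anything that is not a letter or digit) is an assumption.
    """
    # Lowercase, then collapse each run of non-alphanumeric characters into one underscore.
    key = re.sub(r"[^a-z0-9]+", "_", utility_name.strip().lower())
    # Drop underscores left over from leading or trailing punctuation.
    return key.strip("_")


print(to_tariff_key("National Grid"))  # national_grid
```

With a rule like this, the value placed in the `.then(pl.lit(...))` branch would be the normalized key rather than the display name.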

)
if not output_path.parent.exists():
    output_path.parent.mkdir(parents=True)
gas_tariff_mapping_df.write_csv(output_path)
Contributor

Same fix as in the other PR

--state NY \
--upgrade_id "{{upgrade_id}}" \
--electric_utility "{{electric_utility}}" \
{{ if output_dir != "" { "--output_dir \"" + output_dir + "\"" } else { "" } }}
Contributor

Why is this needed?

Collaborator Author

I was allowing output_dir to be an optional input. Changed in the latest commit to make it mandatory.


# Some useful enumerations:

map-coned-default:
Contributor

There's just one utility in Rhode Island: Rhode Island Energy (RIE).

Collaborator Author

These values correspond to the electrical utility, not the gas utility. We take electrical_utility_name as input here, keep only the bldg_ids that belong to the provided electrical utility, then assign gas tariff_keys based on the gas_utility column value in metadata-sb.parquet.
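
The two steps described above (select buildings by electric utility, then key the gas tariff off the gas utility) can be sketched in plain Python. This is a standard-library stand-in for the polars pipeline in the PR, not the PR's code: the sample rows, the `GAS_TARIFF_KEYS` lookup, and the function name `map_gas_tariff_rows` are all hypothetical.

```python
# Hypothetical sample rows; the real data lives in the metadata parquet.
ROWS = [
    {"bldg_id": 1, "sb.electric_utility": "Coned", "sb.gas_utility": "Coned"},
    {"bldg_id": 2, "sb.electric_utility": "Coned", "sb.gas_utility": "National Grid"},
    {"bldg_id": 3, "sb.electric_utility": "NYSEG", "sb.gas_utility": "NYSEG"},
]

# Hypothetical lookup from gas utility name to tariff_key.
GAS_TARIFF_KEYS = {
    "Coned": "coned",
    "National Grid": "national_grid",
    "NYSEG": "nyseg",
}


def map_gas_tariff_rows(rows, electric_utility_name):
    """Return bldg_id -> gas tariff_key rows for one electric utility."""
    # Step 1: keep only buildings served by the requested electric utility.
    selected = [r for r in rows if r["sb.electric_utility"] == electric_utility_name]
    # Step 2: assign the tariff_key from each building's gas utility.
    return [
        {"bldg_id": r["bldg_id"], "tariff_key": GAS_TARIFF_KEYS[r["sb.gas_utility"]]}
        for r in selected
    ]


mapping = map_gas_tariff_rows(ROWS, "Coned")
print(mapping)
```

Note that a building in the Coned electric territory can still receive a National Grid gas tariff_key, which is why the recipe argument names an electric utility rather than a gas utility.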

    just map-electric-tariff Coned class_specific_seasonal 1 00

map-national-grid-default:
    just map-gas-tariff National Grid 00
Contributor

This is going to cause a syntax error... should be "National Grid" if it has whitespace

Collaborator Author

Addressed in the latest commit
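
For context on the whitespace issue in this thread, the sketch below uses Python's `shlex.split`, which follows POSIX shell word-splitting rules, to show how the unquoted and quoted forms of the recipe line break into arguments. It assumes the recipe line is run by a POSIX-style shell, and it only models word splitting, not how `just` itself reports the resulting argument mismatch.

```python
import shlex

# Unquoted: the utility name is split into two separate arguments.
unquoted = shlex.split("just map-gas-tariff National Grid 00")
# Quoted: the utility name stays together as one argument.
quoted = shlex.split('just map-gas-tariff "National Grid" 00')

print(unquoted)  # ['just', 'map-gas-tariff', 'National', 'Grid', '00']
print(quoted)    # ['just', 'map-gas-tariff', 'National Grid', '00']
```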

_PROJECT_ROOT = Path(__file__).resolve().parent.parent
RATE_DESIGN_DIR = _PROJECT_ROOT / "rate_design"

AWS_REGION = "us-west-2"
Contributor

This really shouldn't be hardcoded... the aws tools should be able to pick this up from the environment.

Collaborator Author

Addressed in the latest commit
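
As a sketch of what "pick this up from the environment" can look like, the hypothetical helper below reads the region from `AWS_REGION` or `AWS_DEFAULT_REGION`, the environment variable names AWS tooling conventionally uses. The helper name and the error message are assumptions; AWS SDKs can also resolve the region from a shared config file, so in practice it may be enough to not pass a region at all.

```python
import os
from typing import Mapping, Optional


def resolve_aws_region(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the AWS region from the environment instead of a hardcoded constant.

    Hypothetical helper for illustration; `env` defaults to os.environ and is
    a parameter only so the lookup is easy to test.
    """
    env = os.environ if env is None else env
    region = env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    if not region:
        raise RuntimeError("Set AWS_REGION or AWS_DEFAULT_REGION instead of hardcoding a region")
    return region


print(resolve_aws_region({"AWS_DEFAULT_REGION": "us-west-2"}))  # us-west-2
```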

    raise FileNotFoundError(f"Metadata path {metadata_path} does not exist")
SB_metadata_lazy_df = pl.scan_parquet(str(metadata_path))

SB_metadata_lazy_df_with_utilities = SB_metadata_lazy_df.with_columns(
Contributor

no need for "_lazy_df" type stuff in the variable names... we have type annotations and type inference in IDEs.

Collaborator Author

Addressed in the latest commit

try:
    out_base = S3Path(args.output_dir)
    output_path = out_base / output_filename
    if not output_path.parent.exists():
Contributor

No need for this, and therefore for S3Path. It can just fail if the directory is missing and then the user knows to make it.

Collaborator Author

Addressed in the latest commit

    gas_tariff_mapping_df.sink_csv(str(output_path))
else:
    output_path = (
        RATE_DESIGN_DIR
Contributor

Get rid of this (and of RATE_DESIGN_DIR) and fail if output path isn't provided

Collaborator Author

Addressed in the latest commit
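
One way to get the "fail if output path isn't provided" behavior requested above is to mark the flag as required in `argparse`, so there is no fallback directory to maintain. This is a sketch, not the merged code: the flag names mirror those visible in the Justfile hunk, and the parser description and sample S3 path are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate a gas tariff mapping (sketch).")
    parser.add_argument("--electric_utility", required=True)
    # required=True makes argparse exit with an error when the flag is missing,
    # so no default output directory is needed.
    parser.add_argument("--output_dir", required=True)
    return parser


# Hypothetical invocation with both flags supplied.
args = build_parser().parse_args(
    ["--electric_utility", "Coned", "--output_dir", "s3://example-bucket/tariff_map"]
)
print(args.output_dir)
```

Omitting `--output_dir` makes `parse_args` print a usage error and exit with status 2.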

@jpvelez jpvelez merged commit 43dd520 into main Feb 9, 2026
2 of 4 checks passed
@jpvelez jpvelez deleted the 138-function-to-generate-gas-tariff-mapping branch February 9, 2026 21:14
Linked issue: #138, Function to generate gas tariff mapping