Getting ValueError when using ModelCheckpoint with auto_insert_metric_name=False #16385
-
Hi, I understand that `auto_insert_metric_name=False` is supposed to keep the metric name out of the checkpoint filename, but when I use it with `ModelCheckpoint` I run into a `ValueError`. Am I missing anything about this feature?
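For context, a minimal setup of the kind described here might look like the following (the filename template and monitored metric are assumed for illustration; the original snippet is not shown above):

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# Hypothetical configuration; the actual template and metric name from the
# original report are not preserved, so these values are only illustrative.
checkpoint_callback = ModelCheckpoint(
    monitor="val_loss",
    filename="epoch{epoch:02d}-val_loss{val_loss:.2f}",
    auto_insert_metric_name=False,  # keep the metric name out of the filename
)
```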
Replies: 1 comment
-
Hi @dongchirua, I found this issue by debugging `pytorch_lightning/callbacks/model_checkpoint.py#L524`:

```python
def _format_checkpoint_name(
    cls,
    filename: Optional[str],
    metrics: Dict[str, Tensor],
    prefix: str = "",
    auto_insert_metric_name: bool = True,
) -> str:
    if not filename:
        # filename is not set, use default name
        filename = "{epoch}" + cls.CHECKPOINT_JOIN_CHAR + "{step}"

    # check and parse user passed keys in the string
    groups = re.findall(r"(\{.*?)[:\}]", filename)
    if len(groups) >= 0:
        for group in groups:
            name = group[1:]

            if auto_insert_metric_name:
                filename = filename.replace(group, name + "={" + name)

            # support for dots: https://stackoverflow.com/a/7934969
            filename = filename.replace(group, f"{{0[{name}]")

            if name not in metrics:
                metrics[name] = torch.tensor(0)
        filename = filename.format(metrics)

    if prefix:
        filename = cls.CHECKPOINT_JOIN_CHAR.join([prefix, filename])

    return filename
```

The regex `(\{.*?)[:\}]` is what parses the user-supplied keys out of the filename template. For me, I just worked around it by patching the pytorch-lightning source code locally.