
Enable batch training#134

Open
Oculux314 wants to merge 36 commits into main from nwil508

Conversation

Contributor

@Oculux314 Oculux314 commented Dec 5, 2025

You can now run a batch of training runs with different configurations by specifying --batch. This works with parallel executions; regular execution is unchanged if --batch is not specified.

You can set up the configurations in gymnasium_envrionments/scripts/batch_coordinator.py. E.g.

batch_config: dict[str, list[Any | tuple[Any, str]]] = {
    "alg_config.gamma": [0.9, 0.95],
    "env_config.task": ["run", "swingup"],
}

This will set up 4 runs:

  • gamma-0.9 task-run
  • gamma-0.95 task-run
  • gamma-0.9 task-swingup
  • gamma-0.95 task-swingup
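
The expansion above is the Cartesian product of the configured value lists. A minimal sketch of how such an expansion could work (the function name expand_runs is illustrative, not the actual batch_coordinator.py internals):

```python
from itertools import product
from typing import Any

# Hypothetical batch config mirroring the example above
batch_config: dict[str, list[Any]] = {
    "alg_config.gamma": [0.9, 0.95],
    "env_config.task": ["run", "swingup"],
}


def expand_runs(config: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Return one override dict per combination of configured values."""
    keys = list(config)
    return [dict(zip(keys, values)) for values in product(*config.values())]


runs = expand_runs(batch_config)
# 2 gamma values x 2 tasks -> 4 runs
```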

You can edit the _skip() function for more fine-grained control.

--b_start and --b_end allow you to run only a specified index range of the batch.

Edited the README with this information.

@Oculux314 Oculux314 requested a review from beardyFace December 13, 2025 21:12
Comment on lines +14 to +95
from cares_reinforcement_learning.util.configurations import (
    FunctionLayer,
    MLPConfig,
    TrainableLayer,
)

# MARK: ACTIVATION LAYERS

# GoLU
golu_a: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="GoLU"),
    ]
)
golu_c: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="GoLU"),
        TrainableLayer(layer_type="Linear", in_features=256, out_features=1),
    ]
)

# GELU
gelu_a: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="GELU"),
    ]
)
gelu_c: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="GELU"),
        TrainableLayer(layer_type="Linear", in_features=256, out_features=1),
    ]
)

# ReLU
relu_a: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="ReLU"),
    ]
)
relu_c: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="ReLU"),
        TrainableLayer(layer_type="Linear", in_features=256, out_features=1),
    ]
)

# Leaky ReLU
leaky_a: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="LeakyReLU"),
    ]
)
leaky_c: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="LeakyReLU"),
        TrainableLayer(layer_type="Linear", in_features=256, out_features=1),
    ]
)

# PReLU
prelu_a: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="PReLU"),
    ]
)
prelu_c: MLPConfig = MLPConfig(
    layers=[
        TrainableLayer(layer_type="Linear", out_features=256),
        FunctionLayer(layer_type="PReLU"),
        TrainableLayer(layer_type="Linear", in_features=256, out_features=1),
    ]
)
Member
Why are these activation functions here?

Contributor Author

Ah, so that's how you configure the batches: you need to write them in code (I considered JSON, but I don't think it brings any benefit). I've been using this for Hoda's activation functions, which is why these are here, but thank you for the suggestion - I'll strip them down to a more generic example for this PR.

@Oculux314 Oculux314 requested a review from beardyFace December 19, 2025 00:31
@Oculux314
Copy link
Contributor Author

Done! Edited the README as well.

@Oculux314 Oculux314 mentioned this pull request Dec 22, 2025
