Open
Description
Currently, smartflow can run on both CPU and GPU clusters. However, on a single node we can only use the local launcher with mpirun. It would be very useful to also support Slurm with srun on a single node of a cluster, with the database and the CFD runs located on that same node.
In the code below, if the database is created, the echo experiment is not able to start. However, if the database is not created (by commenting out the database lines), the echo experiment starts. Do you have any ideas for solving the issue?
import smartsim
import os
from smartsim import Experiment
from smartsim.database.orchestrator import Orchestrator

exp = Experiment('envs', launcher='slurm')

db = exp.create_database(
    interface='lo',
    # Set the database to run on the current allocation
    db_nodes=1,
    batch=False,  # Important: This tells SmartSim to use the current allocation
)
exp.start(db)

models = []
for i in range(1):
    # Define run arguments
    run_args = {
        # 'cpus-per-task': 8,
        # 'gpus-per-task': 1,
    }
    run_settings = exp.create_run_settings(
        exe="echo",
        exe_args="Hello World",
        run_command="srun",
        run_args=run_args,
    )
    run_settings.set_tasks(1)
    model = exp.create_model(f"env_{i}", run_settings)
    exp.start(model, block=False, summary=False)
    models.append(model)
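One thing that might be worth trying (an untested guess, not a confirmed fix): the database is itself launched as an srun step, so it may be holding the node's CPUs, leaving the second srun step waiting for resources. Slurm 20.11 and later have an `--overlap` option for srun that lets steps share CPUs. The sketch below builds a `run_args` dict with that flag and prints the srun line it would correspond to. `single_node_run_args` and `render_srun` are hypothetical helpers written only for this illustration; they are not part of SmartSim, and the sketch assumes a `None` value in `run_args` is passed through as a bare flag.

```python
import shlex


def single_node_run_args(cpus_per_task=None, overlap=True):
    """Build a run_args dict for an srun step that shares one node.

    Hypothetical helper for illustration; not a SmartSim function.
    """
    run_args = {"nodes": 1}
    if overlap:
        # None marks a bare flag, i.e. `--overlap` with no value (assumption).
        run_args["overlap"] = None
    if cpus_per_task is not None:
        run_args["cpus-per-task"] = cpus_per_task
    return run_args


def render_srun(exe, exe_args, run_args, ntasks=1):
    """Render the srun command these settings would correspond to.

    Illustrative only; SmartSim builds its own command line internally.
    """
    cmd = ["srun", f"--ntasks={ntasks}"]
    for key, value in run_args.items():
        cmd.append(f"--{key}" if value is None else f"--{key}={value}")
    cmd.append(exe)
    cmd.extend(shlex.split(exe_args))
    return cmd


args = single_node_run_args(cpus_per_task=8)
print(shlex.join(render_srun("echo", "Hello World", args)))
# prints: srun --ntasks=1 --nodes=1 --overlap --cpus-per-task=8 echo Hello World
```

If this is the cause, the resulting dict could replace the empty `run_args` in the script above. Running the printed srun line by hand inside the allocation while the database is up would be a quick way to check whether the step still hangs.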