Skip to content

Conversation

@mranst
Copy link
Collaborator

@mranst mranst commented May 7, 2025

Description

This PR adds support for detecting AWS and Mac (#539). I also added a commit that will automatically set the platform for create and tier tests based on the current host, though this can be reverted if it causes problems. This has the convenience of not need to specify -p aws every time.

@ashiklom Is there a cylc installation on AWS? I can set it up like I did on Discover

@mranst mranst requested review from Dooruk and ashiklom May 7, 2025 15:16
@mranst mranst self-assigned this May 7, 2025
Copy link
Contributor

@ashiklom ashiklom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good.

But we should think through what exactly we mean by a "platform" and what it means for Swell to "support" said platform.

Like, do we have anyone successfully using the GENERIC or MAC platforms? You can't actually run Swell without slurm, and getting slurm running locally on a MAC seems really difficult (or flat-out impossible?).

A useful way to think about (and implement/document) this might in terms of tiers of support. Maybe something like:

  1. A platform with tier0 support can run code_tests, unit tests, and maybe individual tasks if they are precisely configured.
  2. A platform with tier1 support can run the tier1 tests without modification.
  3. A platform with tier2 support can successfully run tier2 tests.

# Try the lscpu shell command, which should be available across NCCS
try:
cpu_info = str(subprocess.run('lscpu', capture_output=True).stdout)
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure Swell won't actually work with the GENERIC platform, right? If that's the case, I suggest we don't actually include it as a "supported" platform here and just let any errors in this section get thrown (which would be a noteworthy error --- we're already limiting this section to Discover hostnames).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or would this break code_tests or other unit tests running on a generic linux machine or something? I.e., Does code_tests depend on the platform? If so, it shouldn't!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code tests don't depend on platform, but one quirk of the click interface is that it executes all calls in the driver group regardless of whether the particular command that needs it is called or not. So if we raise an error if the platform was not detected, it will raise that error any time swell is called for any purpose

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense. Can we work around that with something like this?

@swell_driver.command()
@click.argument('suite', type=click.Choice(AllSuites.config_names()))
@click.option('-m', '--input_method', 'input_method', default='defaults',
              type=click.Choice(['defaults', 'cli']), help=input_method_help)
@click.option('-p', '--platform', 'platform', default=None,     # <--- NOTE: Default to `None`
              type=click.Choice(platforms.get_all()), help=platform_help)
@click.option('-o', '--override', 'override', default=None, help=override_help)
@click.option('-a', '--advanced', 'advanced', default=False, help=advanced_help)
@click.option('-s', '--slurm', 'slurm', default=None, help=slurm_help)
def create(
    suite: str,
    input_method: str,
    platform: str | None,
    override: Union[dict, str, None],
    advanced: bool,
    slurm: str
) -> None:
    """
    Create a new experiment

    This command creates an experiment directory based on the provided suite name and options.

    Arguments: \n
        suite (str): Name of the suite you wish to run. \n

    """

    if platform is None:
        platform = platforms.detect_platform()

    # Create the experiment directory
    create_experiment_directory(suite, input_method, platform, override, advanced, slurm)

if all(key in model_name for key in ['Intel', 'Xeon']):
return cls.NCCS_DISCOVER_CASCADE
elif all(key in model_name for key in ['AMD', 'EPYC']):
return cls.NCCS_DISCOVER_SLES15
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If neither of these conditions is met, suggest raising a ValueError(f"Unknown CPU architecture: {model_name}").

if all(key in os_name for key in ['macOS', 'arm64']):
return cls.MAC

return cls.GENERIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my comment above. If we've gotten this far, I suggest we just throw an error.

raise ValueError(f"Unknown or unsupported platform: {os_name}")

@ashiklom
Copy link
Contributor

ashiklom commented May 8, 2025

Re: Cylc on AWS --- Yes, I installed cylc globally. It's in /usr/local/bin/cylc.

@Dooruk Dooruk linked an issue Jul 17, 2025 that may be closed by this pull request
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Identify more platforms automatically

2 participants