Removing start, stop with ec2.py, adding validations #1191
Merged
Changes from 50 commits
57 commits
3b11c13  Removing start, stop with ec2.py, adding validations  (abuabraham-ttd)
6034c5e  Removing start, stop with ec2.py, adding validations  (abuabraham-ttd)
d1f1756  Updates  (abuabraham-ttd)
f42f872  Updates  (abuabraham-ttd)
2542232  Add virtual env and start it in systemd  (abuabraham-ttd)
edf85f3  Add virtual env and start it in systemd  (abuabraham-ttd)
3e95e4c  Add virtual env and start it in systemd  (abuabraham-ttd)
cb70032  use venv like flask service  (abuabraham-ttd)
5fe844c  use versions  (abuabraham-ttd)
937e7a2  Add URL validation  (abuabraham-ttd)
2b23ff0  Move validations around  (abuabraham-ttd)
44aa71f  Move validations around  (abuabraham-ttd)
711d50b  Move validations around  (abuabraham-ttd)
4c694e7  Remove aws implemnttion from typedict  (abuabraham-ttd)
62cc490  Remove aws implemnttion from typedict  (abuabraham-ttd)
5de70be  Adding more logs  (abuabraham-ttd)
77f1f4a  Adding min capacity  (abuabraham-ttd)
a4241fc  Loop every sec for 10sec for confg server to be up  (abuabraham-ttd)
0bff456  Fix regex  (abuabraham-ttd)
e669887  validate after default  (abuabraham-ttd)
85fc3e7  Add tested min values for capacity  (abuabraham-ttd)
d7b24c7  [CI Pipeline] Released Snapshot version: 5.43.1-alpha-93-SNAPSHOT
4499dcf  Add to build eif stage  (abuabraham-ttd)
3dd967d  [CI Pipeline] Released Snapshot version: 5.43.2-alpha-94-SNAPSHOT
45b2908  Dont check for enclave, kill all  (abuabraham-ttd)
d890e5d  Change version on ami build  (abuabraham-ttd)
8eeaf9a  [CI Pipeline] Released Snapshot version: 5.43.3-alpha-100-SNAPSHOT
eb8955c  Use AuxilaryConfig to store and return URLs  (abuabraham-ttd)
502e9d7  fix log level  (abuabraham-ttd)
b210817  Remove optout URL references, we only need core URL and it infers opt…  (abuabraham-ttd)
4398ac6  use custom shared action  (abuabraham-ttd)
69968b2  [CI Pipeline] Released Snapshot version: 5.43.4-alpha-101-SNAPSHOT
47b841c  [CI Pipeline] Released Snapshot version: 5.43.5-alpha-102-SNAPSHOT
51e5c6b  FIx CFN templates  (abuabraham-ttd)
0aa4df8  [CI Pipeline] Released Snapshot version: 5.43.6-alpha-103-SNAPSHOT
251f812  [CI Pipeline] Released Snapshot version: 5.43.7-alpha-104-SNAPSHOT
e39c125  [CI Pipeline] Released Snapshot version: 5.43.8-alpha-105-SNAPSHOT
9c9d81b  Add FB in FB  (abuabraham-ttd)
616964c  Bypass validations  (abuabraham-ttd)
0219170  Add new param for skipping validations  (abuabraham-ttd)
061050d  SkipValidations  (abuabraham-ttd)
9ee0d14  SkipValidations  (abuabraham-ttd)
a6650e0  Force debug, better error handle  (abuabraham-ttd)
53495aa  Fix code  (abuabraham-ttd)
791b6ae  Adding more logs inside enclave  (abuabraham-ttd)
607fe20  Adding more logs inside enclave  (abuabraham-ttd)
73acc2f  Remove all e2e related  (abuabraham-ttd)
a595a55  Remove all e2e related  (abuabraham-ttd)
e597e53  remove space  (abuabraham-ttd)
e4a9fca  remove space  (abuabraham-ttd)
5d2e54a  Add values based on condition  (abuabraham-ttd)
30fcbbe  Add values based on condition  (abuabraham-ttd)
1449563  removing from cfn interface  (abuabraham-ttd)
30d4a8b  Remove skipvalidation as a param in CFN  (abuabraham-ttd)
8b249a1  Updates to CFN templates  (abuabraham-ttd)
108ab34  Add an env validation  (abuabraham-ttd)
f249e8f  Revert noop chnage  (abuabraham-ttd)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,3 @@ | ||
| Flask==2.3.2 | ||
| Werkzeug==3.0.3 | ||
| setuptools==70.0.0 | ||
| setuptools==70.0.0 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,259 @@ | ||
| #!/usr/bin/env python3 | ||
|
|
||
| import boto3 | ||
| import json | ||
| import os | ||
| import subprocess | ||
| import re | ||
| import multiprocessing | ||
| import requests | ||
| import signal | ||
| import argparse | ||
| from botocore.exceptions import ClientError | ||
| from typing import Dict | ||
| import sys | ||
| import time | ||
| import yaml | ||
|
|
||
| sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) | ||
| from confidential_compute import ConfidentialCompute, ConfidentialComputeConfig, SecretNotFoundException, ConfidentialComputeStartupException | ||
|
|
||
| class AWSConfidentialComputeConfig(ConfidentialComputeConfig): | ||
| enclave_memory_mb: int | ||
| enclave_cpu_count: int | ||
|
|
||
| class AuxiliaryConfig: | ||
| FLASK_PORT: str = "27015" | ||
| LOCALHOST: str = "127.0.0.1" | ||
| AWS_METADATA: str = "169.254.169.254" | ||
|
|
||
| @classmethod | ||
| def get_socks_url(cls) -> str: | ||
| return f"socks5://{cls.LOCALHOST}:3306" | ||
|
|
||
| @classmethod | ||
| def get_config_url(cls) -> str: | ||
| return f"http://{cls.LOCALHOST}:{cls.FLASK_PORT}/getConfig" | ||
|
|
||
| @classmethod | ||
| def get_user_data_url(cls) -> str: | ||
| return f"http://{cls.AWS_METADATA}/latest/user-data" | ||
|
|
||
| @classmethod | ||
| def get_token_url(cls) -> str: | ||
| return f"http://{cls.AWS_METADATA}/latest/api/token" | ||
|
|
||
| @classmethod | ||
| def get_meta_url(cls) -> str: | ||
| return f"http://{cls.AWS_METADATA}/latest/dynamic/instance-identity/document" | ||
|
|
||
|
|
||
| class EC2(ConfidentialCompute): | ||
|
|
||
| def __init__(self): | ||
| super().__init__() | ||
|
|
||
| def __get_aws_token(self) -> str: | ||
| """Fetches a temporary AWS EC2 metadata token.""" | ||
| try: | ||
| response = requests.put( | ||
| AuxiliaryConfig.get_token_url(), headers={"X-aws-ec2-metadata-token-ttl-seconds": "3600"}, timeout=2 | ||
| ) | ||
| response.raise_for_status() | ||
| return response.text | ||
| except requests.RequestException as e: | ||
| raise RuntimeError(f"Failed to fetch aws token: {e}") | ||
|
|
||
| def __get_current_region(self) -> str: | ||
| """Fetches the current AWS region from EC2 instance metadata.""" | ||
|
||
| token = self.__get_aws_token() | ||
| headers = {"X-aws-ec2-metadata-token": token} | ||
| try: | ||
| response = requests.get(AuxiliaryConfig.get_meta_url(), headers=headers, timeout=2) | ||
| response.raise_for_status() | ||
| return response.json()["region"] | ||
| except requests.RequestException as e: | ||
| raise RuntimeError(f"Failed to fetch region: {e}") | ||
|
|
||
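Review note: the two methods above follow the IMDSv2 session flow, where a PUT obtains a short-lived token and a GET presents that token to read the instance identity document. A minimal sketch of the two requests follows, using only the standard library and building the requests without sending them. The function names `build_token_request` and `build_identity_request` and the constant `METADATA_HOST` are hypothetical and not part of this PR; the URLs and header names are the ones used in the diff.

```python
import urllib.request

METADATA_HOST = "169.254.169.254"  # link-local IMDS address, as in AuxiliaryConfig


def build_token_request(ttl_seconds: int = 3600) -> urllib.request.Request:
    """Step 1: a PUT that asks IMDS for a session token valid for ttl_seconds."""
    return urllib.request.Request(
        f"http://{METADATA_HOST}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_identity_request(token: str) -> urllib.request.Request:
    """Step 2: a GET for the instance identity document, carrying the token."""
    return urllib.request.Request(
        f"http://{METADATA_HOST}/latest/dynamic/instance-identity/document",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

Sending either request (for example with `urllib.request.urlopen(req, timeout=2)`) only succeeds on an EC2 instance, which is why the sketch stops at constructing them.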
| def __validate_aws_specific_config(self, secret): | ||
| if "enclave_memory_mb" in secret or "enclave_cpu_count" in secret: | ||
| max_capacity = self.__get_max_capacity() | ||
| min_capacity = {"enclave_memory_mb": 11000, "enclave_cpu_count" : 2 } | ||
| for key in ["enclave_memory_mb", "enclave_cpu_count"]: | ||
| if int(secret.get(key, 0)) > max_capacity.get(key): | ||
| raise ValueError(f"{key} value ({secret.get(key, 0)}) exceeds the maximum allowed ({max_capacity.get(key)}).") | ||
|
||
| if min_capacity.get(key) > int(secret.get(key, 10**9)): | ||
| raise ValueError(f"{key} value ({secret.get(key, 0)}) needs to be higher than the minimum required ({min_capacity.get(key)}).") | ||
|
|
||
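Review note: the bounds check above may be easier to test as a pure function that takes the allocator limits as an argument instead of reading `/etc/nitro_enclaves/allocator.yaml`. A rough sketch, assuming defaults have already been filled in by the caller; `validate_capacity` and `MIN_CAPACITY` are hypothetical names, and the minimum values are copied from the `min_capacity` dict in the diff.

```python
from typing import Dict, Mapping

# Hypothetical constant; values mirror min_capacity in the diff.
MIN_CAPACITY: Dict[str, int] = {"enclave_memory_mb": 11000, "enclave_cpu_count": 2}


def validate_capacity(secret: Mapping[str, object], max_capacity: Mapping[str, int]) -> None:
    """Raise ValueError if a configured enclave size falls outside [min, max]."""
    for key, minimum in MIN_CAPACITY.items():
        if key not in secret:
            continue  # assumption: missing keys are defaulted elsewhere
        value = int(secret[key])
        if value > max_capacity[key]:
            raise ValueError(
                f"{key} value ({value}) exceeds the maximum allowed ({max_capacity[key]})."
            )
        if value < minimum:
            raise ValueError(
                f"{key} value ({value}) is below the minimum required ({minimum})."
            )
```

Parsing the value once with `int(...)` also avoids the separate `secret.get(key, 0)` and `secret.get(key, 10**9)` sentinels used for the two comparisons.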
| def _get_secret(self, secret_identifier: str) -> AWSConfidentialComputeConfig: | ||
| """Fetches a secret value from AWS Secrets Manager and adds defaults""" | ||
|
||
|
|
||
| def add_defaults(configs: Dict[str, any]) -> AWSConfidentialComputeConfig: | ||
| """Adds default values to configuration if missing.""" | ||
| default_capacity = self.__get_max_capacity() | ||
| configs.setdefault("enclave_memory_mb", default_capacity["enclave_memory_mb"]) | ||
| configs.setdefault("enclave_cpu_count", default_capacity["enclave_cpu_count"]) | ||
| configs.setdefault("debug_mode", False) | ||
| return configs | ||
|
|
||
| region = self.__get_current_region() | ||
| print(f"Running in {region}") | ||
| try: | ||
| client = boto3.client("secretsmanager", region_name=region) | ||
|
||
| except Exception as e: | ||
| raise RuntimeError("Please use IAM instance profile for your instance and make sure that has permission to access Secret Manager", e) | ||
| try: | ||
| secret = add_defaults(json.loads(client.get_secret_value(SecretId=secret_identifier)["SecretString"])) | ||
| self.__validate_aws_specific_config(secret) | ||
|
||
| return secret | ||
| except ClientError as _: | ||
| raise SecretNotFoundException(f"{secret_identifier} in {region}") | ||
|
|
||
| @staticmethod | ||
| def __get_max_capacity(): | ||
| try: | ||
| with open("/etc/nitro_enclaves/allocator.yaml", "r") as file: | ||
| nitro_config = yaml.safe_load(file) | ||
| return {"enclave_memory_mb": nitro_config['memory_mib'], "enclave_cpu_count": nitro_config['cpu_count']} | ||
| except Exception as e: | ||
| raise RuntimeError("/etc/nitro_enclaves/allocator.yaml does not have CPU, memory allocated") from e | ||
|
|
||
| def __setup_vsockproxy(self, log_level: int) -> None: | ||
| """ | ||
| Sets up the vsock proxy service. | ||
| """ | ||
| thread_count = (multiprocessing.cpu_count() + 1) // 2 | ||
| command = [ | ||
| "/usr/bin/vsockpx", "-c", "/etc/uid2operator/proxy.yaml", | ||
| "--workers", str(thread_count), "--log-level", str(log_level), "--daemon" | ||
| ] | ||
| self.run_command(command) | ||
|
|
||
| def __run_config_server(self) -> None: | ||
| """ | ||
| Starts the Flask configuration server. | ||
| """ | ||
| os.makedirs("/etc/secret/secret-value", exist_ok=True) | ||
| config_path = "/etc/secret/secret-value/config" | ||
| with open(config_path, 'w') as config_file: | ||
| json.dump(self.configs, config_file) | ||
| os.chdir("/opt/uid2operator/config-server") | ||
| command = ["./bin/flask", "run", "--host", AuxiliaryConfig.LOCALHOST, "--port", AuxiliaryConfig.FLASK_PORT] | ||
| self.run_command(command, seperate_process=True) | ||
|
|
||
| def __run_socks_proxy(self) -> None: | ||
| """ | ||
| Starts the SOCKS proxy service. | ||
| """ | ||
| command = ["sockd", "-D"] | ||
| self.run_command(command) | ||
|
|
||
| def __get_secret_name_from_userdata(self) -> str: | ||
| """Extracts the secret name from EC2 user data.""" | ||
| token = self.__get_aws_token() | ||
| response = requests.get(AuxiliaryConfig.get_user_data_url(), headers={"X-aws-ec2-metadata-token": token}) | ||
| user_data = response.text | ||
|
|
||
| with open("/opt/uid2operator/identity_scope.txt") as file: | ||
| identity_scope = file.read().strip() | ||
|
|
||
| default_name = f"{identity_scope.lower()}-operator-config-key" | ||
| hardcoded_value = f"{identity_scope.upper()}_CONFIG_SECRET_KEY" | ||
| match = re.search(rf'^export {hardcoded_value}="(.+?)"$', user_data, re.MULTILINE) | ||
| return match.group(1) if match else default_name | ||
|
|
||
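Review note: the user-data lookup above reduces to one multiline regex with a scope-derived fallback, which can be exercised without calling the metadata service. A small sketch under that assumption; `secret_name_from_user_data` is a hypothetical helper, and the `re.escape` call is an addition that the diff does not have.

```python
import re


def secret_name_from_user_data(user_data: str, identity_scope: str) -> str:
    """Return the exported secret name, or a scope-based default if absent."""
    default_name = f"{identity_scope.lower()}-operator-config-key"
    variable = f"{identity_scope.upper()}_CONFIG_SECRET_KEY"
    # re.escape guards against regex metacharacters in the scope (not in the diff).
    pattern = rf'^export {re.escape(variable)}="(.+?)"$'
    # re.MULTILINE lets ^ and $ match at each line of the user-data script.
    match = re.search(pattern, user_data, re.MULTILINE)
    return match.group(1) if match else default_name
```

For example, with a scope of `uid2` (a sample value), a user-data line `export UID2_CONFIG_SECRET_KEY="my-secret"` yields `my-secret`, and empty user data yields `uid2-operator-config-key`.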
| def _setup_auxiliaries(self) -> None: | ||
| """Sets up the vsock tunnel, socks proxy and flask server""" | ||
| log_level = 1 if self.configs["debug_mode"] else 3 | ||
| self.__setup_vsockproxy(log_level) | ||
| self.__run_config_server() | ||
| self.__run_socks_proxy() | ||
| print("Finished setting up all auxiliaries") | ||
|
|
||
| def _validate_auxiliaries(self) -> None: | ||
| """Validates connection to flask server direct and through socks proxy.""" | ||
| print("Validating auxiliaries") | ||
| try: | ||
| for attempt in range(10): | ||
| try: | ||
| response = requests.get(AuxiliaryConfig.get_config_url()) | ||
| print("Config server is reachable") | ||
| break | ||
| except requests.exceptions.ConnectionError as e: | ||
| print(f"Connecting to config server, attempt {attempt + 1} failed with ConnectionError: {e}") | ||
| time.sleep(1) | ||
| else: | ||
| raise RuntimeError("Config server unreachable") | ||
| response.raise_for_status() | ||
| except requests.RequestException as e: | ||
| raise RuntimeError(f"Failed to get config from config server: {e}") | ||
| proxies = {"http": AuxiliaryConfig.get_socks_url(), "https": AuxiliaryConfig.get_socks_url()} | ||
| try: | ||
| response = requests.get(AuxiliaryConfig.get_config_url(), proxies=proxies) | ||
| response.raise_for_status() | ||
| except requests.RequestException as e: | ||
| raise RuntimeError(f"Cannot connect to config server via SOCKS proxy: {e}") | ||
| print("Connectivity check to config server passes") | ||
|
|
||
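Review note: the readiness loop above relies on Python's `for ... else`, where the `else` branch runs only if the loop ends without `break`. A generic sketch of the same pattern follows. The helper name `retry` and its parameters are hypothetical, and it catches the built-in `ConnectionError` to stay dependency-free, whereas the diff catches `requests.exceptions.ConnectionError`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(call: Callable[[], T], attempts: int = 10, delay_s: float = 1.0,
          retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,)) -> T:
    """Invoke `call` until it succeeds or `attempts` tries have failed."""
    for attempt in range(1, attempts + 1):
        try:
            result = call()
            break  # success: the else clause below is skipped
        except retry_on as exc:
            print(f"attempt {attempt} failed: {exc}")
            time.sleep(delay_s)
    else:
        # Reached only when every attempt raised and no break happened.
        raise RuntimeError(f"gave up after {attempts} attempts")
    return result
```

Keeping the status check (`raise_for_status()` in the diff) outside the loop means only connection failures are retried, while an HTTP error from a reachable server fails immediately.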
| def __run_nitro_enclave(self): | ||
| command = [ | ||
|
||
| "nitro-cli", "run-enclave", | ||
| "--eif-path", "/opt/uid2operator/uid2operator.eif", | ||
|
||
| "--memory", str(self.configs["enclave_memory_mb"]), | ||
| "--cpu-count", str(self.configs["enclave_cpu_count"]), | ||
| "--enclave-cid", "42", | ||
| "--enclave-name", "uid2operator" | ||
| ] | ||
| if self.configs.get('debug_mode', False): | ||
| print("Running in debug_mode") | ||
| command += ["--debug-mode", "--attach-console"] | ||
| self.run_command(command, seperate_process=True) | ||
|
|
||
| def run_compute(self) -> None: | ||
| """Main execution flow for confidential compute.""" | ||
| secret_manager_key = self.__get_secret_name_from_userdata() | ||
| self.configs = self._get_secret(secret_manager_key) | ||
| print(f"Fetched configs from {secret_manager_key}") | ||
| if not self.configs.get("skip_validations"): | ||
| self.validate_configuration() | ||
| self._setup_auxiliaries() | ||
| self._validate_auxiliaries() | ||
| self.__run_nitro_enclave() | ||
|
|
||
| def cleanup(self) -> None: | ||
| """Terminates the Nitro Enclave and auxiliary processes.""" | ||
| try: | ||
| self.run_command(["nitro-cli", "terminate-enclave", "--all"]) | ||
| self.__kill_auxiliaries() | ||
| except subprocess.SubprocessError as e: | ||
| raise RuntimeError(f"Error during cleanup: {e}") from e | ||
|
|
||
| def __kill_auxiliaries(self) -> None: | ||
| """Kills all auxiliary processes spawned.""" | ||
| try: | ||
| for process_name in ["vsockpx", "sockd", "flask"]: | ||
| result = subprocess.run(["pgrep", "-f", process_name], stdout=subprocess.PIPE, text=True, check=False) | ||
| if result.stdout.strip(): | ||
| for pid in result.stdout.strip().split("\n"): | ||
| os.kill(int(pid), signal.SIGKILL) | ||
| print(f"Killed process '{process_name}'.") | ||
| else: | ||
| print(f"No process named '{process_name}' found.") | ||
| except Exception as e: | ||
| print(f"Error killing process '{process_name}': {e}") | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| parser = argparse.ArgumentParser(description="Manage EC2-based confidential compute workflows.") | ||
| parser.add_argument("-o", "--operation", choices=["stop", "start"], default="start", help="Operation to perform.") | ||
| args = parser.parse_args() | ||
| try: | ||
| ec2 = EC2() | ||
| if args.operation == "stop": | ||
| ec2.cleanup() | ||
| else: | ||
| ec2.run_compute() | ||
| except ConfidentialComputeStartupException as e: | ||
| print("Failed starting up Confidential Compute. Please check the logs for errors and retry \n", e) | ||
| except Exception as e: | ||
|
||
| print("Unknown failure while starting up Confidential Compute. Please contact UID support team with this log \n ", e) | ||
|
|
||