The first prototype that "works"
- server-1: an OpenPCC router server.
- server-2: an OpenPCC compute server with ollama + llama3.1 1B (8-bit), sized for c6a.4xlarge.
- client: a CLI with fake-attestation support.

Note: this release supports fake attestation and a TPM simulator only, so it is not secure. Real attestation and a real TPM are not supported yet.

With almost everything hardcoded, this sets up the first building block for hybrid AI research.
Be careful when deploying:
- fake-attestation (at build-pack) is required.
- server-2 must "know" the server-1 (router) address, which is injected during deployment. For that reason, the EIF should be created in deploy.yml, not in build-pack.yml.
- DON'T enable "Build Nitro Enclave EIF for server-2" at build-pack.yml
- DON'T add "S3 URI to upload compute EIF" at build-pack.yml
- DO add "optional compute_boot build tags" with include_fake_attestation at build-pack.yml
- CPU count < 14 (update /etc/nitro-enclave/allocate.yaml to increase this) at deploy.yml
- 10000 < Memory (MiB) < 24000 (update /etc/nitro-enclave/allocate.yaml to increase this) at deploy.yml
- CID should be kept at 16 in deploy.yml
- Image tag should be kept the same between deploy.yml and build-pack.yml
- Compute instance should be "c6a.4xlarge" or better.
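For reference, the CPU/memory bounds above correspond to the Nitro Enclaves allocator config. This is a hedged sketch: the stock AWS Nitro Enclaves allocator service reads /etc/nitro_enclaves/allocator.yaml with the keys shown below; the path referenced in the checklist may be project-specific, and the values are illustrative.

```yaml
# Sketch of a Nitro Enclaves allocator config (stock AWS path:
# /etc/nitro_enclaves/allocator.yaml). Values are examples that stay
# inside the bounds listed above; raise them here if the enclave
# needs more resources.
---
# must stay between 10000 and 24000 per the checklist above
memory_mib: 12000
# must stay below 14 per the checklist above
cpu_count: 8
```

After editing, restart the allocator service (systemctl restart nitro-enclaves-allocator.service) so the reservation takes effect; the enclave itself can then be started with the fixed CID, e.g. nitro-cli run-enclave --enclave-cid 16.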
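The build-tag requirement above could be exposed as a workflow input in build-pack.yml. The following is a sketch only: the input name compute_boot_build_tags and its wiring are assumptions, not the actual contents of build-pack.yml.

```yaml
# Hypothetical excerpt from build-pack.yml: expose an optional build-tags
# input and default it to the fake-attestation tag this release requires.
on:
  workflow_dispatch:
    inputs:
      compute_boot_build_tags:
        description: "optional compute_boot build tags"
        required: false
        default: "include_fake_attestation"
```

Downstream, the build step would pass this value to the compiler (e.g. go build -tags "$TAGS", assuming the compute_boot binary is built from Go sources).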
Full Changelog: https://github.com/nnstreamer/hybrid/commits/v0.001