Commit 3dd20af

Add quick intro
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
1 parent a954416 commit 3dd20af

File tree

1 file changed: +16 −8 lines


_posts/2025-04-16-local-llm-openfaas-edge.md

Lines changed: 16 additions & 8 deletions
@@ -15,6 +15,14 @@ hide_header_image: true
 
 The rise of hosted LLMs has been meteoric, but many Non-Disclosure Agreements (NDAs) would prevent you from using them. We explore how a self-hosted solution protects your data.
 
+This post at a glance:
+
+* Pros and cons of hosted vs. self-hosted LLMs
+* Bill of materials for a PC with cost-effective Nvidia GPUs
+* Configuration for OpenFaaS Edge with Ollama
+* Sample function and test data for categorizing cold outreach emails
+* Past posts on AI and LLMs from OpenFaaS and sister brands
+
 ## Why Self-Hosted LLMs?
 
 Self-hosted models are great for experimentation and exploring what is possible, without having to worry about how much your API calls are costing you ($$$). Practically speaking, they are the only option if you are dealing with Confidential Information covered by an NDA.
@@ -45,7 +53,7 @@ Downsides for hosted models:
 Pros for self-hosted models:
 
 * Tools such as [Ollama](https://ollama.com), [llama.cpp](https://github.com/ggml-org/llama.cpp), [LM Studio](https://lmstudio.ai) and [vLLM](https://github.com/vllm-project/vllm) make it trivial to run LLMs locally
-* A modest investment in 1 or 2 NVIDIA GPUs such as 3060 or 3090 can give you access to a wide range of models
+* A modest investment in 1 or 2 Nvidia GPUs such as a 3060 or 3090 can give you access to a wide range of models
 * Running on your own hardware means there are no API costs - all you can eat
 * You have full control over the model, and can choose to use open source models, or your own fine-tuned models
 * You have full control over the data, and can choose to keep it on-premises or in a private cloud
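
To make that concrete, a first session with Ollama can be as short as the following. This is a minimal sketch assuming a Linux host and a default installation; the model name is just an example:

```bash
# Install Ollama on Linux (see https://ollama.com for other platforms)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a small open-weights model and run a one-off prompt
ollama pull llama3.2
ollama run llama3.2 "Summarise the benefits of self-hosted LLMs in one sentence."

# Ollama also serves an HTTP API on port 11434, which is what
# a function would call rather than shelling out to the CLI
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Hello"}'
```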
@@ -59,7 +67,7 @@ Cons for self-hosted models:
 
 ## Bill of materials for a PC
 
-For our sister company [actuated.com](https://actuated.com), we built a custom PC to show [how to leverage GPUs and LLMs during CI/CD with GitHub Actions and GitLab CI](https://actuated.com/blog/ollama-in-github-actions).
+For our sister brand [actuated.com](https://actuated.com), we built a custom PC to show [how to leverage GPUs and LLMs during CI/CD with GitHub Actions and GitLab CI](https://actuated.com/blog/ollama-in-github-actions).
 
 The build uses an AMD Ryzen 9 5950X 16-Core CPU with 2x 3060 GPUs, 128GB of RAM, 1TB of NVMe storage, and a 1000W power supply.

@@ -71,7 +79,7 @@ Around 9 months later, we swapped the 2x 3060 GPUs for 2x 3090s taking the VRAM
 
 For this post, we allocated one of the two 3090 cards to a microVM, then we installed OpenFaaS Edge.
 
-At the time of writing, a brand-new NVIDIA 3060 card with 12GB of VRAM is currently available for around [250 GBP as a one-off cost from Amazon.co.uk](https://amzn.to/42tE1Xp). If you use it heavily, will pay for itself in a short period of time compared to the cost of API credits.
+At the time of writing, a brand-new Nvidia 3060 card with 12GB of VRAM is available for around [250 GBP as a one-off cost from Amazon.co.uk](https://amzn.to/42tE1Xp). If you use it heavily, it will pay for itself in a short period of time compared to the cost of API credits.
 
 ## How to get started with OpenFaaS Edge
 

@@ -89,11 +97,11 @@ Use the [official instructions to install OpenFaaS Edge](https://docs.openfaas.c
 
 Activate your license using your license key or GitHub Sponsorship.
 
-### Install the NVIDIA Container Toolkit
+### Install the Nvidia Container Toolkit
 
-Follow the instructions for your platform to install the NVIDIA Container Toolkit. This will allow you to run GPU workloads in Docker containers.
+Follow the instructions for your platform to install the Nvidia Container Toolkit. This will allow you to run GPU workloads in Docker containers.
 
-[Installing the NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
+[Installing the Nvidia Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
 
 You should be able to run `nvidia-smi` and see your GPUs detected.
 
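As a rough guide, the toolkit setup on a Debian or Ubuntu host looks something like the sketch below. It assumes the Nvidia apt repository has already been added as per the linked guide, and that Docker is the runtime being configured; the CUDA image tag is only an example:

```bash
# Install the toolkit from the Nvidia apt repository
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker to use the Nvidia runtime, then restart it
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify that the GPUs are visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```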

@@ -365,7 +373,7 @@ You can deploy the [function we wrote previously on the blog](https://www.openfa
 
 ### Conclusion
 
-The latest release of [OpenFaaS Edge](https://docs.openfaas.com/deployment/edge/) adds support for NVIDIA GPUs for core services defined in the `docker-compose.yaml` file. This makes it easy to run local LLMs using a tool like Ollama, then to call them for a wide range of tasks and workflows, whilst retaining data privacy and complete confidentiality.
+The latest release of [OpenFaaS Edge](https://docs.openfaas.com/deployment/edge/) adds support for Nvidia GPUs for core services defined in the `docker-compose.yaml` file. This makes it easy to run local LLMs using a tool like Ollama, and then call them for a wide range of tasks and workflows, whilst retaining data privacy and complete confidentiality.
 
 The functions can be written in any language, both synchronously and asynchronously for durability and scaling out.
 
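For illustration, a GPU-backed Ollama core service might look something like the sketch below. The device-reservation syntax is the standard Compose form; treat the service name, image tag and volume path as assumptions, and check the OpenFaaS Edge docs for the exact fields it supports:

```yaml
# Hypothetical excerpt from docker-compose.yaml: an Ollama service
# reserving one Nvidia GPU via the standard Compose device syntax
services:
  ollama:
    image: docker.io/ollama/ollama:latest
    ports:
      - "127.0.0.1:11434:11434"   # Ollama's default API port
    volumes:
      - ./ollama:/root/.ollama    # persist downloaded models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: ["gpu"]
```

On the asynchronous side, OpenFaaS queues the request and returns immediately when a function is called via its `/async-function/` route instead of `/function/`, which is what makes long-running LLM invocations practical.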
@@ -380,7 +388,7 @@ We've covered various AI/LLM related topics across our blog in the past:
 * [How to check for price drops with Functions, Cron & LLMs](https://www.openfaas.com/blog/checking-stock-price-drops/)
 * [How to transcribe audio with OpenAI Whisper and OpenFaaS](https://www.openfaas.com/blog/transcribe-audio-with-openai-whisper/)
 
-From our sister companies:
+From our sister brands:
 
 * Inlets - [Access local Ollama models from a cloud Kubernetes Cluster](https://inlets.dev/blog/2024/08/09/local-ollama-tunnel-k3s.html)
 * Actuated - [Run AI models with ollama in CI with GitHub Actions](https://actuated.com/blog/ollama-in-github-actions)
