Commit bfb5719: Mention REST calls into Ollama

Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
1 parent 0c0e12a


_posts/2024-09-04-checking-stock-price-drops.md

Lines changed: 10 additions & 5 deletions
@@ -646,21 +646,20 @@ The simplest option would be a remotely hosted cloud database, or even an Object

## So did it work?
-It did indeed work. I felt the team at Classic Hand Tools were rather stingy with their discount, but I was probably the first person to find out about the price change.
+It did indeed work. I felt the team at Classic Hand Tools were rather stingy with their discount, but I was probably the first person to find out about the price change, so the function achieved its goal.

-Monitoring the function's logs in the OpenFaaS Pro UI:
+The next morning I clicked on the Logs tab for the function and selected "24h" in the OpenFaaS Pro dashboard, which showed me what happened leading up to the alert:

![Dropped prices](/images/2024-09-stockcheck/detected.png)

-Then the Discord alert I got when I checked my phone:
+Here's the Discord alert I got when I checked my phone:

![Discord alert](/images/2024-09-stockcheck/discord.png)

-Whilst it's overkill for this task, and standard HTML scraping and parsing techniques worked perfectly well, I decided to try out running a local Llama3.1 8B model on my Nvidia RTX 3090 GPU to see if it was up to the task.
+Whilst it's overkill for this task, because standard HTML scraping and parsing techniques worked perfectly well, I decided to try out running a local Llama3.1 8B model on my Nvidia RTX 3090 GPU to see if it was up to the task.

It wasn't until I ran `ollama run llama3.1:8b-instruct-q8_0` and pasted in my prompt that I realised just how long that HTML was. It was huge: over 681KB of text, which is generally considered a large context window for a Large Language Model.

{% raw %}
```
You are a function that parses HTML and returns the data requested as JSON. "available" is true when "in stock" or "InStock" was found in the HTML, anything else is false. You must give no context, no explanation and no other text than the following JSON, with the values replaced accordingly between the ` characters.
@@ -701,6 +700,12 @@ If a local LLM wasn't up to the task, then we could have also used a cloud-hosted

And in the case that the local LLM aced the task, we could also try scaling down to something that can run better on CPU, or that doesn't require so many resources. I tried out the phi3 model from Microsoft, which was designed with this in mind. After setting the system prompt, to my surprise it performed the task just as well and returned the same JSON for me.
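The system-prompt approach described above can be sketched against Ollama's `/api/chat` endpoint, which takes a list of role-tagged messages. This is a hedged illustration rather than the post's actual code: the helper names, the in-cluster URL, the prompt wording, and the JSON shape of the parsed result are all assumptions.

```python
import json
import urllib.request


def build_chat_payload(html, model="phi3"):
    # /api/chat takes role-tagged messages; putting the instruction in the
    # "system" message lets the "user" message carry only the raw HTML.
    # The prompt wording here is illustrative, not the one from the post.
    return {
        "model": model,
        "stream": False,   # return one JSON body instead of a token stream
        "format": "json",  # ask Ollama to constrain output to valid JSON
        "messages": [
            {"role": "system",
             "content": ('Parse the HTML and reply only with JSON like '
                         '{"available": true, "price": "..."}')},
            {"role": "user", "content": html},
        ],
    }


def ask_model(html, base_url="http://127.0.0.1:11434"):
    # base_url assumes a local Ollama; in-cluster you would use its Service name.
    req = urllib.request.Request(
        base_url + "/api/chat",
        data=json.dumps(build_chat_payload(html)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
    # /api/chat returns the assistant's text under message.content
    return json.loads(reply["message"]["content"])
```

With this shape, swapping between llama3.1 and phi3 is only a change to the `model` field.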
+To integrate a local LLM with your function, you can package [Ollama](https://ollama.com/) as a container image using the instructions on our sister site inlets.dev: [Access local Ollama models from a cloud Kubernetes Cluster](https://inlets.dev/blog/2024/08/09/local-ollama-tunnel-k3s.html). There are a few options here, including deploying the LLM as a function or as a regular Kubernetes Deployment. Either will work, but the Deployment allows for easier Pod spec customisation if you're using a more complex GPU sharing technology like [NVidia Time Slicing](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-sharing.html#time-slicing-gpus-in-kubernetes).
+
+Both Ollama and Llama.cpp are popular options for running a local model in Kubernetes, and Ollama provides a simple HTTP REST API that can be called from your function's handler. There are a few examples in the above linked article.
+
+In the conclusion, I'll also link to where we've packaged OpenAI Whisper as a function for CPU or GPU accelerated transcription of audio and video files.
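As a minimal sketch of such a REST call from a handler, Ollama's `/api/generate` endpoint accepts a model, a prompt, and a `format: json` hint. The URL, function names, and the shortened prompt below are assumptions for illustration, not code from the post.

```python
import json
import urllib.request

# Assumed address; in-cluster this would be the Ollama Service's DNS name.
OLLAMA_URL = "http://127.0.0.1:11434"


def build_payload(html, model="llama3.1:8b-instruct-q8_0"):
    # A shortened version of the post's prompt; the real one is longer.
    prompt = (
        'You are a function that parses HTML and returns the data requested '
        'as JSON. "available" is true when "in stock" or "InStock" was found '
        'in the HTML, anything else is false.\n\n' + html
    )
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,   # one JSON body rather than a token stream
        "format": "json",  # ask Ollama to emit valid JSON only
    }


def check_stock(html):
    req = urllib.request.Request(
        OLLAMA_URL + "/api/generate",
        data=json.dumps(build_payload(html)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # /api/generate returns the generated text in the "response" field
    return json.loads(body["response"])
```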

## Ok so what about me and my use-case?

So I hear you saying: "Alex I don't do woodwork, and I don't shop at Classic Hand Tools in the UK".
