## So did it work?
It did indeed work. I felt the team at Classic Hand Tools were rather stingy with their discount, but I was probably the first person to find out about the price change, so the function achieved its goal.
The next morning I clicked on the Logs tab for the function and selected "24h" in the OpenFaaS Pro dashboard, which showed me what happened leading up to the alert:
Whilst it's overkill, because standard HTML scraping and parsing techniques worked perfectly well, I decided to try out running a local Llama3.1 8B model on my Nvidia RTX 3090 GPU to see if it was up to the task.
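For reference, the standard technique can be sketched in a few lines. This is a hedged example rather than the post's actual code: the product URL is hypothetical, and the availability rule is the "instock"/"InStock" substring check described later in the LLM prompt.

```python
# A minimal sketch of the standard scraping approach: treat availability
# as a substring check for "instock" / "InStock" in the fetched HTML.
# The product URL below is hypothetical, not from the post.
import json
import urllib.request

def parse_availability(html: str) -> dict:
    # Same JSON shape as the LLM is asked to return later in the post
    return {"available": "instock" in html or "InStock" in html}

if __name__ == "__main__":
    url = "https://www.classichandtools.com/example-product"  # hypothetical
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    print(json.dumps(parse_availability(html)))
```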
It wasn't until I ran `ollama run llama3.1:8b-instruct-q8_0` and pasted in my prompt that I realised just how long that HTML was. It was huge, over 681KB of text, which is generally considered a very large input for a Large Language Model's context window.
{% raw %}
```
You are a function that parses HTML and returns the data requested as JSON. "available" is true when "instock" or "InStock" was found in the HTML, anything else is false. You must give no context, no explanation and no other text than the following JSON, with the values replaced accordingly between the ` characters.
```
{% endraw %}
And in the case that the local LLM aced the task, we could also try scaling down to something that runs better on a CPU, or that doesn't require so many resources. I tried out the phi3 model from Microsoft, which was designed with this in mind. After setting the system prompt, to my surprise it performed the task just as well and returned the same JSON for me.
To integrate a local LLM with your function, you can package [Ollama](https://ollama.com/) as a container image using the instructions on our sister site: inlets.dev - [Access local Ollama models from a cloud Kubernetes Cluster](https://inlets.dev/blog/2024/08/09/local-ollama-tunnel-k3s.html). There are a few options here including deploying the LLM as a function, or deploying it as a regular Kubernetes Deployment, either will work, but the Deployment allows for easier Pod spec customisation if you're using a more complex GPU sharing technology like [NVidia Time Slicing](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-sharing.html#time-slicing-gpus-in-kubernetes).
Both Ollama and Llama.cpp are popular options for running a local model in Kubernetes, and Ollama provides a simple HTTP REST API that can be used from your function's handler. There are a few examples in the article linked above.
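As a hedged sketch of what that handler code might look like (not taken from the article), a function could call Ollama's `/api/generate` endpoint like this. The in-cluster service hostname `ollama` is an assumption about your deployment:

```python
# Sketch of calling a local Ollama server from a function handler.
# The service hostname "ollama" is an assumption about your cluster.
import json
import urllib.request

OLLAMA_URL = "http://ollama:11434/api/generate"  # assumed in-cluster address

def build_payload(prompt: str, model: str = "llama3.1:8b-instruct-q8_0") -> dict:
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response, not a stream
    }

def ask_llm(prompt: str, model: str = "llama3.1:8b-instruct-q8_0") -> str:
    body = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns the completion in the "response" field
        return json.loads(resp.read())["response"]
```

Setting `stream` to false keeps the handler simple, at the cost of waiting for the whole completion before the response arrives.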
In the conclusion, I'll also link to where we've packaged OpenAI Whisper as a function for CPU or GPU accelerated transcription of audio and video files.
## Ok so what about me and my use-case?
So I hear you saying: "Alex, I don't do woodwork, and I don't shop at Classic Hand Tools in the UK".