
Commit ab7d0ed

Merge pull request #1678 from oracle-devrel/vandijm-patch-2
Update README.md
2 parents 7fa0a8d + ce5f9b2

File tree

1 file changed: +12 −13 lines

  • cloud-infrastructure/private-cloud-and-edge/compute-cloud-at-customer/local-llm

cloud-infrastructure/private-cloud-and-edge/compute-cloud-at-customer/local-llm/README.md

Lines changed: 12 additions & 13 deletions
@@ -1,4 +1,4 @@
-*Last Update: 27 November 2024*
+*Last Update: 4 April 2025*
 
 <br><h1 align="center">Local LLM Inferencing and Interaction<br>Using the Ollama Open Source Tool</h1>
 <p align="center"><img align="centre" src="./images/ollama-logo.png" width="10%" style="float:right"/></p>
@@ -21,12 +21,12 @@
 
 By running models locally, you maintain full data ownership and avoid the potential security risks associated with cloud storage. Offline AI tools like Ollama also help reduce latency and reliance on external facilities, making them faster and more reliable.
 
-This article is intended to demonstrate and provide directions to install and create an Ollama LLM processing facility. Despite the fact that Ollama can be run on both personal servers and laptops, this installation is aimed at the Oracle Compute Cloud@Customer (C3) and Private Cloud Appliance (PCA) to capitalize on more readily available resources to increase performance and processing efficiency, especially if large models are used.
+This article demonstrates how to install and create an Ollama LLM processing facility. Although Ollama can run on personal servers and laptops, this installation targets the Oracle Compute Cloud@Customer (C3) to capitalize on more readily available resources and to increase performance and processing efficiency, especially when large models are used.
 
 Considerations:
-* A firm grasp of C3/PCA/OCI concepts and administration is assumed.
+* A firm grasp of C3 and OCI concepts and administration is assumed.
 * The creation and integration of a development environment is outside the scope of this document.
-* Oracle Linux 8 and macOS Sonoma 14.7.1 clients were used for testing but Windows is however widely supported.
+* Oracle Linux 8, macOS Sonoma 14.7.1, and macOS Sequoia 15.3.2 clients were used for testing; Windows is also widely supported.
 
 [Back to top](#toc)<br>
 <br>
@@ -40,17 +40,16 @@ Considerations:
 |----------|----------|
 | Operating system | Oracle Linux 8 or later<br>Ubuntu 22.04 or later<br>Windows<br> |
 | RAM | 16 GB for running models up to 7B. The rule of thumb is to have at least 2x the memory of the LLM's size, also allowing for any LLMs that will be loaded in memory simultaneously. |
-| Disk space | 12 GB for installing Ollama and basic models. Additional space is required for storing model data depending on the used models. The LLM sizes can be obtained from the "trained models" link in the References section. For example the Llama 3.1 LLM with 405Bn parameters occupy 229GB of disk space |
+| Disk space | 12 GB for installing Ollama and basic models. Additional space is required for storing model data, depending on the models used. LLM sizes can be obtained from the "trained models" link in the References section. For example, the Llama 3.1 LLM with 405B parameters occupies 229 GB of disk space. |
 | Processor | Recommended to use a modern CPU with at least 4 cores. For running models of approximately 15B, 8 cores (OCPUs) are recommended. Allocate accordingly. |
-| Graphics Processing Unit<br>(optional) | A GPU is not required for running Ollama, but can improve performance, especially when working with large models. If you have a GPU, you can use it to accelerate training of custom models. |
+| Graphics Processing Unit<br>(optional) | A GPU is not required for running Ollama, but it can improve performance, especially when working with large models. If you have a GPU, you can use it to accelerate inferencing, training, fine-tuning, and RAG (Retrieval-Augmented Generation). |
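To make the table's 2x rule concrete, here is a minimal sizing sketch in shell; the helper function and the ~5 GB figure for a 7B model are illustrative assumptions, while the 229 GB figure is the one quoted in the table.

```
# Hedged helper for the 2x rule of thumb above.
# Usage: min_ram <model_size_gb> [models_loaded_at_once]
min_ram() { echo "$(( 2 * $1 * ${2:-1} )) GB RAM minimum"; }

min_ram 229   # Llama 3.1 405B (229 GB on disk) -> 458 GB RAM minimum
min_ram 5 2   # two ~7B models (~5 GB each, assumed) loaded together -> 20 GB
```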
 
 >[!NOTE]
->The GPU options in the Compute Cloud@Customer will be available soon.
+>The C3 now has an NVIDIA L40S GPU expansion option available. With a 4-GPU VM, performance acceleration is expected to improve dramatically for LLMs of up to approximately 70B parameters.
 
 ### Create a Virtual Machine Instance
 
-[C3: Creating an Instance](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/compute/creating-an-instance.htm#creating-an-instance)<br>
-[PCA 3.0: Working with Instances](https://docs.oracle.com/en/engineered-systems/private-cloud-appliance/3.0-latest/user/user-usr-instance-lifecycle.html)
+[Creating an Instance](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/compute/creating-an-instance.htm#creating-an-instance)<br>
 
 Create a VM in a public subnet following these guidelines:
 
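For readers who prefer the CLI to the console, a launch along these lines should match the sizing guidance above; the shape name `VM.PCAStandard1.Flex`, the 8 OCPU / 64 GB shape config, and every OCID below are placeholder assumptions, not values taken from this README.

```
# Sketch only: launch the VM with the OCI CLI. Replace all OCIDs, the AD name,
# the image, and the shape with values valid for your C3 rack.
oci compute instance launch \
  --availability-domain AD-1 \
  --compartment-id ocid1.compartment.oc1..example \
  --display-name ollama-vm \
  --shape VM.PCAStandard1.Flex \
  --shape-config '{"ocpus": 8, "memoryInGBs": 64}' \
  --image-id ocid1.image.oc1..example \
  --subnet-id ocid1.subnet.oc1..example \
  --assign-public-ip true \
  --ssh-authorized-keys-file ~/.ssh/id_rsa.pub
```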
@@ -73,8 +72,7 @@ sudo dnf update
 
 ### Create a Block Storage Device for LLMs
 
-[C3: Creating and Attaching Block Volumes](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/block/creating-and-attaching-block-volumes.htm)<br>
-[PCA 3.0: Creating and Attaching Block Volumes](https://docs.oracle.com/en/engineered-systems/private-cloud-appliance/3.0-latest/user/user-usr-blk-volume-create-attach.html)
+[Creating and Attaching Block Volumes](https://docs.oracle.com/en-us/iaas/compute-cloud-at-customer/topics/block/creating-and-attaching-block-volumes.htm)<br>
 
 1. Create and attach a block volume to the VM
 2. Volume name `llm-repo`
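A hedged CLI sketch of steps 1 and 2; only the `llm-repo` name comes from the list above, while the 500 GB size, the paravirtualized attachment type, and the OCIDs are placeholder assumptions.

```
# Sketch: create the llm-repo block volume, then attach it to the VM.
oci bv volume create \
  --availability-domain AD-1 \
  --compartment-id ocid1.compartment.oc1..example \
  --display-name llm-repo \
  --size-in-gbs 500

oci compute volume-attachment attach \
  --instance-id ocid1.instance.oc1..example \
  --volume-id ocid1.volume.oc1..example \
  --type paravirtualized
```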
@@ -106,7 +104,7 @@ export no_proxy
 ```
 
 >[!TIP]
->The `no_proxy` environment variable can be expanded to include your internal domains. It is not required to list IP addresses in internal subnets of the C3/PCA.
+>The `no_proxy` environment variable can be expanded to include your internal domains. It is not required to list IP addresses in internal subnets of the C3.
 
 Edit the `/etc/yum.conf` file to include the following line:
 ```
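For illustration, an expanded `no_proxy` along the lines of the tip might look as follows; the proxy host and the domain suffixes are invented placeholders, and the actual `yum.conf` line falls outside this hunk.

```
# Hypothetical proxy settings echoing the tip above; substitute your own values.
export http_proxy=http://proxy.example.com:3128
export https_proxy=http://proxy.example.com:3128
no_proxy=localhost,127.0.0.1,.example.internal,.oraclevcn.com
export no_proxy
```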
@@ -181,7 +179,7 @@ The installation comprises the following components:
 <sup><sub>2</sub></sup> See [Ollama documentation](https://github.com/ollama/ollama/tree/main/docs)
 
 >[!IMPORTANT]
->When GPU's become available the NVIDIA and CUDA drivers should be installed. This configuration will also be tested on the Roving Edge Device GPU model.
+>When GPUs are available for use on the C3, the NVIDIA and CUDA drivers should be installed. This configuration will also be tested on the Roving Edge Device GPU model.
 
 ### Installation
 
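The README's actual installation steps fall outside this hunk; for orientation only, Ollama's standard upstream install and a first model run look like this, with `llama3.1` as an illustrative model choice.

```
# Standard upstream Ollama install (see the Ollama documentation linked above);
# the elided Installation section of this README may differ.
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run a first model; llama3.1 is an assumed, illustrative choice.
ollama pull llama3.1
ollama run llama3.1 "Hello from the C3"
```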

@@ -324,3 +322,4 @@ Copyright (c) 2025 Oracle and/or its affiliates.
 Licensed under the Universal Permissive License (UPL), Version 1.0.
 
 See [LICENSE](LICENSE) for more details.
+