Conversation

@eddumelendez
Contributor

Currently, the test is disabled because every run would have to download the image and then pull the model. Now, taking advantage of Testcontainers and a GitHub Action, a new image is created on-the-fly with the model in it and then cached, so subsequent executions will reuse the cached image instead.

Fixes #121

Member


Thanks, I'd like to speed up the IT tests as well. I'm curious as to the rationale for going from a one-liner using Testcontainers

	@Container
	static GenericContainer<?> ollamaContainer = new GenericContainer<>("ollama/ollama:0.1.23").withExposedPorts(11434);

to now using two static classes and a static method. I'm not quite up on my Testcontainers knowledge, but I would have expected something more terse code-wise.

Contributor Author


The reason is to create a new Ollama image with the orca-mini model in it. We are doing it programmatically, but another approach could be to reuse an existing image from Docker Hub that already contains the model; then the one-liner could stay, just pointing to that image, and the rest of the code could be removed. However, I'm not sure if there is any preference for using an image that's already out there versus creating your own.
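
Roughly, the programmatic approach looks like this (a minimal sketch rather than the exact code in this PR; the image name, tag, and model below are placeholders):

import java.util.List;

import org.testcontainers.DockerClientFactory;
import org.testcontainers.containers.GenericContainer;

class OllamaImageWithModel {

	// Hypothetical name for the locally committed image that contains the model.
	static final String IMAGE_WITH_MODEL = "orca-mini-ollama/ollama:0.1.23";

	static GenericContainer<?> create() throws Exception {
		// Reuse the committed image if it is already present (e.g. restored from the GHA cache).
		List<?> existing = DockerClientFactory.lazyClient()
			.listImagesCmd()
			.withImageNameFilter(IMAGE_WITH_MODEL)
			.exec();
		if (existing.isEmpty()) {
			try (GenericContainer<?> base = new GenericContainer<>("ollama/ollama:0.1.23")
					.withExposedPorts(11434)) {
				base.start();
				// Pull the model inside the running container...
				base.execInContainer("ollama", "pull", "orca-mini");
				// ...and commit the result as a new local image with the model baked in.
				DockerClientFactory.lazyClient()
					.commitCmd(base.getContainerId())
					.withRepository("orca-mini-ollama/ollama")
					.withTag("0.1.23")
					.exec();
			}
		}
		return new GenericContainer<>(IMAGE_WITH_MODEL)
			.withExposedPorts(11434)
			// Never try to pull the committed image from a registry; it only exists locally.
			.withImagePullPolicy(image -> false);
	}

}

The committed image only exists in the local Docker daemon, so the GitHub Action still has to cache and restore it between runs, which is the other half of this change.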

Contributor


It's a tricky situation I've encountered in other projects as well. On the one hand, it'd be better to use official images. On the other hand, downloading the model every time is not really feasible and requires some mitigation strategy with additional configuration. I looked for existing images out there, but I ended up publishing my own Ollama-based images to make sure they are always up-to-date and to have multi-architecture support: https://github.com/ThomasVitale/llm-images.

Contributor Author


There is an issue in Ollama about providing model-specific images: ollama/ollama#2161

Member

markpollack commented Jul 24, 2024


That issue has been closed as 'won't do', so I'm closing this one. We will have to set up a separate integration test profile for Ollama along with other model providers that aren't currently running in CI.

@tzolov
Contributor

tzolov commented Feb 27, 2024

@eddumelendez, I'm trying to run this locally. The first attempt failed with a 404, if I'm not mistaken, and consecutive runs cause:

java.lang.ExceptionInInitializerError
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: org.testcontainers.containers.ContainerFetchException: Can't get Docker image: RemoteDockerImage(imageName=mistral-ollama/ollama:0.1.10, imagePullPolicy=org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT$OllamaContainer$$Lambda$497/0x000000013a226ac8@1e7f2e0f, imageNameSubstitutor=org.testcontainers.utility.ImageNameSubstitutor$LogWrappedImageNameSubstitutor@1da6ee17)
	at org.testcontainers.containers.GenericContainer.getDockerImageName(GenericContainer.java:1364)
	at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:359)
	at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:330)
	at org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT.<clinit>(OllamaChatAutoConfigurationIT.java:69)
	... 2 more
Caused by: com.github.dockerjava.api.exception.NotFoundException: Status 404: {"message":"pull access denied for mistral-ollama/ollama, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}

Do you know what is going on?

@tzolov
Contributor

tzolov commented Feb 27, 2024

I did a second attempt (after removing all Docker images) and the error it fails with is:

18:37:40.513 [main] INFO tc.ollama/ollama:0.1.10 -- Image ollama/ollama:0.1.10 pull took PT29.724702S
18:37:40.513 [docker-java-stream-1404029602] INFO tc.ollama/ollama:0.1.10 -- Pull complete. 3 layers, pulled in 28s (downloaded 471 MB at 16 MB/s)
18:37:40.518 [main] INFO tc.ollama/ollama:0.1.10 -- Creating container for image: ollama/ollama:0.1.10
18:37:44.108 [main] INFO tc.ollama/ollama:0.1.10 -- Container ollama/ollama:0.1.10 is starting: 8aff3a52c42c029a229503dc4e50f59e80ee5b1074734df00852b9bd01de8af5
18:37:44.452 [main] INFO tc.ollama/ollama:0.1.10 -- Container ollama/ollama:0.1.10 started in PT3.934304S
18:39:38.730 [main] INFO org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT -- Start pulling the 'mistral ' generative ... would take several minutes ...
18:43:47.751 [main] INFO org.springframework.ai.autoconfigure.ollama.OllamaChatAutoConfigurationIT -- mistral pulling competed!
18:43:49.397 [main] WARN org.springframework.ai.ollama.api.OllamaApi -- [404] Not Found - 404 page not found

@eddumelendez
Contributor Author

Hi @tzolov, I've reviewed and fixed the issue. PR is updated and everything should work as expected now.

@tzolov
Contributor

tzolov commented Feb 27, 2024

Thanks @eddumelendez, I just merged it, but it seems we don't have the right account for using Docker caching? https://github.com/spring-projects/spring-ai/actions/runs/8070608896
Or is there a different way to resolve this?

@eddumelendez
Contributor Author

Looks like we need to check the Actions permissions under Settings > Actions > General to allow external actions.

@tzolov
Contributor

tzolov commented Feb 27, 2024

Yeah, I got this as well. But I'm not sure how safe it is to whitelist an unverified action?

@tzolov
Contributor

tzolov commented Feb 27, 2024

@eddumelendez I've whitelisted it, but now it fails on this:
(screenshot of the failing workflow step)

@eddumelendez
Contributor Author

Let me try on my fork.

> But I'm not sure how safe it is to whitelist an unverified action?

Totally understand. Another alternative could be to have persistent runners.

@eddumelendez
Contributor Author

Couldn't reproduce it https://github.com/eddumelendez/spring-ai/actions/runs/8072468166/job/22054281356#step:8:22

Can you enable debug logs, please?

@izeye
Contributor

izeye commented Mar 16, 2024

This seems to have been merged in 246ba17.

@tzolov
Contributor

tzolov commented Mar 16, 2024

@izeye it is merged but disabled because of #322 (comment), which we still have to resolve.
So I've left this issue open until we fix it.

@markpollack
Member

Closing, as model-specific images are not going to be provided by Ollama.

We will pick this up as part of a larger effort to have more CI coverage outside the current GitHub Action that runs on each commit.

@markpollack reopened this Aug 9, 2024
@dsyer
Member

dsyer commented Aug 10, 2024

FWIW I just tried my demo in a GitHub Action and the Ollama models (x2) pulled in 30 seconds. It's almost not worth trying to cache. If only my network was that fast. It actually takes longer to pull the Ollama and ChromaDB Docker images than it does to pull the models.

I also tried using https://docs.docker.com/build/ci/github-actions/cache/ to see if it would help, but it fails to create the cache because (I think) Ollama is running as root, so the file permissions are fubar.

@dsyer
Member

dsyer commented Aug 10, 2024

Update: I got it working with this:

@Bean
@ServiceConnection
public ChromaDBContainer chroma() {
	return new ChromaDBContainer(DockerImageName.parse("ghcr.io/chroma-core/chroma:0.5.5"));
}

@Bean
@ServiceConnection
public OllamaContainer ollama() throws Exception {
	@SuppressWarnings("resource")
	OllamaContainer ollama = new OllamaContainer(DockerImageName.parse("ollama/ollama:0.3.2"))
			// Not recommended strategy from testcontainers, but the only practical way to
			// make it work locally
			.withFileSystemBind("ollama", "/root/.ollama", BindMode.READ_WRITE);
	return ollama;
}

@Bean
ApplicationRunner runner(OllamaContainer ollama) {
	return args -> {
		logger.info("Pulling models...");
		ollama.execInContainer("ollama", "pull", "albertogg/multi-qa-minilm-l6-cos-v1");
		ollama.execInContainer("ollama", "pull", "mistral");
		ollama.execInContainer("chmod", "go+r", "-R", "/root/.ollama");
		logger.info("...done");
	};
}

but creating the cache (on the first run) and unpacking it (subsequently) takes about 1 minute. So it's not efficient for this case. Maybe it would work better with different models or more models. Here's the workflow:

name: Java CI with Maven

on:
  push:
    branches: [ main ]

jobs:
  build:
    name: Build and Deploy On Push

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4
    - name: Set up JDK 17
      uses: actions/setup-java@v4
      with:
        java-version: '17'
        distribution: 'temurin'
        cache: maven
    - name: Cache LLM Data
      id: cache-models
      uses: actions/cache@v4
      with:
        path: ollama
        key: ${{ runner.os }}-models
    - name: Install with Maven
      run: |
          ./mvnw -B install

UPDATE: I also tried caching the Docker images with the build-push-action and it wasn't really any better than the default; my impression was that the cache was not being used for the Docker images in the tests. The quickest overall CI run was with only the Maven cache (provided by the setup-java action), but there wasn't much in it.

The file system bind and the exec-in-container steps in the test context make a big difference to local development though, so they're definitely worth including in the tests here (and in the docs).

@ThomasVitale
Contributor

We might be able to close this issue now. All integration tests for Spring AI Ollama are now enabled and running on Testcontainers, also in the GitHub Actions workflow. They are based on the newly introduced "model auto-pull" feature (#1554). It's used for the integration test setup, but it's also a feature that improves the developer experience, making it possible to pull models automatically at startup if they are not available yet.
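
For illustration, a test setup relying on that feature might look roughly like this (the spring.ai.ollama.init.* property names and values are my assumption of the auto-pull configuration, and the image tag is arbitrary):

import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
import org.springframework.context.annotation.Bean;
import org.testcontainers.ollama.OllamaContainer;
import org.testcontainers.utility.DockerImageName;

@SpringBootTest(properties = {
		// assumed property names: pull the chat model at startup only if it is not already present
		"spring.ai.ollama.init.pull-model-strategy=when_missing",
		"spring.ai.ollama.chat.options.model=mistral"
})
class OllamaAutoPullIT {

	@TestConfiguration(proxyBeanMethods = false)
	static class TestcontainersConfiguration {

		@Bean
		@ServiceConnection
		OllamaContainer ollama() {
			// plain upstream image, no model baked in; the model is pulled at startup
			return new OllamaContainer(DockerImageName.parse("ollama/ollama:0.3.9"));
		}

	}

}

With something like this in place, the tests only need the plain ollama/ollama image, and the model download is handled by the auto-pull feature rather than by a custom pre-built image.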

@markpollack
Member

I don't see the Ollama tests running in the CI log, so there's still something to sort out.

@dsyer thanks for taking the time to dig into it. We do pull several models across the IT tests, but the info you provided is very useful. Let's see how it goes.

I'd like to move this issue to spring-projects/spring-ai-integration-tests#5, as we now have a more dedicated environment to run all the tests, thus decreasing the already considerable time spent on our mainline build.
