anaconda-ai

Download, launch, and integrate AI models curated by Anaconda. This package provides a command line interface and an SDK to list the curated models, download them, and start servers.

Below you will find documentation for

How to install
Command line interface to list, download, run API servers for models
Anaconda AI SDK
Integration with LLM CLI
Langchain
LlamaIndex
LiteLLM
DSPy
Pydantic AI
Instructor
Panel ChatInterface

Install

conda install -c anaconda-cloud anaconda-ai

Backends

The anaconda-ai package is the CLI/SDK for a number of backends that provide API endpoints to list and download models and to manage running servers. All activities performed by the CLI, SDK, and integrations here are visible within the backend application or site.

The available backends are

Backend name	Configuration value	Supports	Default
Anaconda AI Navigator	`"ai-navigator"`	Models,Servers,Server Parameters,VectorDB	DEFAULT
Anaconda Desktop (beta)	`"anaconda-desktop"`	Models,Servers,Server Parameters,VectorDB
Anaconda AI Catalyst (beta)	`"ai-catalyst"`	Models,Servers

Configuration

Anaconda AI supports configuration management in the ~/.anaconda/config.toml file. The following parameters are supported under the table [plugin.ai] or by setting ANACONDA_AI_<parameter>=<value> environment variables.

Parameter	Environment variable	Description	Default value
`backend`	`ANACONDA_AI_BACKEND`	The backend API	`"ai-navigator"`
`stop_server_on_exit`	`ANACONDA_AI_STOP_SERVER_ON_EXIT`	For any server started during a Python interpreter session stop the server when the interpreter stops. Does not affect servers that were previously running	`true`
`server_operations_timeout`	`ANACONDA_AI_SERVER_OPERATIONS_TIMEOUT`	Timeout waiting for a server to start or stop	`30`
`show_blocked_models`	`ANACONDA_AI_SHOW_BLOCKED_MODELS`	Toggle display of blocked models if backend supports it	`false`
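
As a rough illustration of the two configuration routes described above, the following sketch renders a [plugin.ai] table and derives the matching environment variable name for a parameter. The helper names (env_var_name, toml_value, render_plugin_table) are hypothetical and are not part of anaconda-ai; only the parameter names and default values are taken from the table.

```python
# Documented parameters and their default values, copied from the table above.
DEFAULTS = {
    "backend": "ai-navigator",
    "stop_server_on_exit": True,
    "server_operations_timeout": 30,
    "show_blocked_models": False,
}


def env_var_name(parameter: str) -> str:
    """Return the ANACONDA_AI_<parameter> variable name for a known parameter."""
    if parameter not in DEFAULTS:
        raise KeyError(f"unknown parameter: {parameter}")
    return f"ANACONDA_AI_{parameter.upper()}"


def toml_value(value) -> str:
    """Format a Python value as a TOML literal (bool, int, or string only)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    return f'"{value}"'


def render_plugin_table(overrides: dict) -> str:
    """Render the given overrides as a [plugin.ai] table for config.toml."""
    lines = ["[plugin.ai]"]
    for key, value in overrides.items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown parameter: {key}")
        lines.append(f"{key} = {toml_value(value)}")
    return "\n".join(lines) + "\n"


print(render_plugin_table({"backend": "ai-catalyst", "server_operations_timeout": 60}))
print(env_var_name("stop_server_on_exit"))
```

The rendered text is what would go into ~/.anaconda/config.toml; the anaconda ai config command remains the supported way to apply such changes.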

Configuration CLI

Use the anaconda ai config command to apply changes to ~/.anaconda/config.toml. See anaconda ai config --help for details.

Declaring model quantization files

In the CLI, SDK, and integrations below, individual model quantizations are referenced according to the following scheme.

[<author>/]<model_name></ or _><quantization>[.<format>]

Fields surrounded by [] are optional. The essential elements are the model name and quantization method separated by either / or _. The supported quantization methods are

Q4_K_M
Q5_K_M
Q6_K
Q8_0
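
To make the naming scheme concrete, here is a minimal sketch of a parser for such references. It is not the parser used by anaconda-ai; the names QuantizationRef and parse_reference are hypothetical, and matching the quantization method case-insensitively is an assumption based on the lowercase references that appear in the integration examples later in this document.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Quantization methods listed above.
SUPPORTED_METHODS = ("Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0")

# [<author>/]<model_name></ or _><quantization>[.<format>]
_PATTERN = re.compile(
    r"^(?:(?P<author>[^/]+)/)?"      # optional author followed by "/"
    r"(?P<model>[^/]+?)"             # model name (may contain dots and dashes)
    r"[/_]"                          # separator: "/" or "_"
    r"(?P<quant>" + "|".join(SUPPORTED_METHODS) + r")"
    r"(?:\.(?P<format>\w+))?$",      # optional file format such as "gguf"
    re.IGNORECASE,                   # assumption: methods may be lowercase
)


@dataclass(frozen=True)
class QuantizationRef:
    author: Optional[str]
    model: str
    quantization: str
    format: Optional[str]


def parse_reference(reference: str) -> QuantizationRef:
    """Split a quantization reference into its documented parts."""
    match = _PATTERN.match(reference)
    if match is None:
        raise ValueError(f"not a valid quantization reference: {reference!r}")
    return QuantizationRef(
        author=match["author"],
        model=match["model"],
        quantization=match["quant"].upper(),
        format=match["format"],
    )


print(parse_reference("meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf"))
print(parse_reference("OpenHermes-2.5-Mistral-7B/Q4_K_M"))
```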

CLI

The CLI subcommands within anaconda ai provide full access to list and download model files and to start and stop servers through the backend.

Command	Description
models	Show all models or detailed information about a single model with downloaded model files indicated in bold
download	Download a model file using model name and quantization
launch	Launch a server for a model file
servers	Show all running servers or detailed information about a single server
stop	Stop a running server by id
launch-vectordb	Starts a pg vector db (not supported by all backends)

See the --help for each command for more details.

SDK

The SDK actions are initiated by creating a client connection to the backend.

from anaconda_ai import AnacondaAIClient

client = AnacondaAIClient()

The client provides two top-level accessors .models and .servers.

Models

The .models attribute provides actions to list available models and download specific quantization files.

Method	Return	Description
`.list()`	`List[ModelSummary]`	List all available and downloaded models
`.get('<model-name>')`	`Model`	Retrieve metadata about a model
`.download('<model>/<quantization>')`	None	Download a model quantization file
`.delete('<model>/<quantization>')`	None	Delete a downloaded model quantization file

The Model class holds metadata for each available model

Attribute/Method	Return	Description
`.name`	str	The name of the model
`.description`	str	Description of the model provided by the original author
`.num_parameters`	int	Number of parameters for the model
`.trained_for`	str	Either `'sentence-similarity'` or `'text-generation'`
`.context_window_size`	int	Length of the context window for the model
`.quantized_files`	`List[ModelQuantization]`	List of available quantization files
`.get_quantization('<method>')`	`ModelQuantization`	Retrieve metadata for a single quantization file
`.download('<method>')`	None	Direct call to download a quantization file
`.delete('<method>')`	None	Delete a downloaded quantization file

Each ModelQuantization object provides

Attribute/Method	Return	Description
`.identifier`	str	The file name as it will appear on disk
`.sha256`	str	The sha256 checksum of the model file
`.quant_method`	str	The quantization method
`.size_bytes`	int	Size of the model file in bytes
`.max_ram_usage`	int	The total amount of ram needed to load the model in bytes
`.is_downloaded`	bool	True if the model file has been downloaded
`.local_path`	str	Will be non-null if the model file has been downloaded
`.download()`	None	Direct call to download the quantization file
`.delete()`	None	Delete the downloaded quantization file
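
The attributes above are enough to choose a quantization file programmatically. The sketch below picks the largest file whose max_ram_usage fits a memory budget. The function name best_quantization_for_ram is hypothetical, treating a larger file as the preferable one is a simplifying assumption, and the sizes in the demo are made-up placeholders rather than real model metadata.

```python
from types import SimpleNamespace
from typing import Any, Optional


def best_quantization_for_ram(model: Any, ram_budget_bytes: int) -> Optional[Any]:
    """Return the largest quantization file that fits within the RAM budget.

    `model` is expected to expose `.quantized_files`, and each entry is
    expected to expose `.max_ram_usage` and `.size_bytes`, as documented above.
    Returns None when no file fits.
    """
    candidates = [
        quantization
        for quantization in model.quantized_files
        if quantization.max_ram_usage <= ram_budget_bytes
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda quantization: quantization.size_bytes)


# Stand-in objects that mimic the documented attributes (placeholder numbers).
demo_model = SimpleNamespace(
    name="demo-model",
    quantized_files=[
        SimpleNamespace(quant_method="Q4_K_M", size_bytes=4_000, max_ram_usage=6_000),
        SimpleNamespace(quant_method="Q8_0", size_bytes=7_500, max_ram_usage=10_000),
    ],
)

choice = best_quantization_for_ram(demo_model, ram_budget_bytes=8_000)
print(choice.quant_method)  # Q4_K_M
```

With a real client, the same function could be given the object returned by client.models.get('<model-name>').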

Downloading models

There are three methods to download a quantization file:

Calling .download() from a ModelQuantization object
- For example: client.models.get('<model>').get_quantization('<method>').download()
Calling .download('<method>') from a Model object
- For example: client.models.get('<model>').download('<method>')
client.models.download('quantized-file-name')
- the .models.download() method accepts two types of input: string name of the model with quantization or a ModelQuantization object

If the model file has already been downloaded, the download call returns immediately. Otherwise, a progress bar displays the download progress.
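
The first download route can be wrapped in a small helper that returns the local path of the file. This is a sketch only: ensure_downloaded is a hypothetical name, and re-fetching the metadata after the download is a cautious assumption in case the original object is not refreshed in place.

```python
from typing import Any


def ensure_downloaded(client: Any, model_name: str, method: str) -> str:
    """Download a quantization file if needed and return its local path.

    Uses only the documented calls: client.models.get(), Model.get_quantization(),
    ModelQuantization.is_downloaded, .download(), and .local_path.
    """
    quantization = client.models.get(model_name).get_quantization(method)
    if not quantization.is_downloaded:
        quantization.download()
        # Assumption: fetch fresh metadata so that local_path reflects the download.
        quantization = client.models.get(model_name).get_quantization(method)
    return quantization.local_path
```

For example, after creating a client as shown in the SDK section, ensure_downloaded(client, 'OpenHermes-2.5-Mistral-7B', 'Q4_K_M') would return the path of the file on disk.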

Servers

The .servers accessor provides methods to list running servers, start new servers, and stop servers.

Method	Return	Description
`.list`	`List[Server]`	List all running servers
`.get('<server-id>')`	`Server`	Lookup server object by identifier
`.match`	Server	Find a running server that matches supplied configuration
`.create`	Server	Create a new server configuration with supplied model file and API parameters
`.start('<server-id>')`	None	Start the API server
`.status('<server-id>')`	str	Return the status for a server id
`.stop('<server-id>')`	None	Stop a running server
`.delete('<server-id>')`	None	Completely remove record of server configuration
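
Because .status() returns a plain string, a caller can poll it until a server reaches a desired state. The following is a minimal sketch under stated assumptions: wait_for_status is a hypothetical helper, not part of the SDK, and 'running' is the only state name this document mentions, so any other state names are up to the backend.

```python
import time
from typing import Any


def wait_for_status(
    client: Any,
    server_id: str,
    target: str = "running",
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll client.servers.status(server_id) until it equals `target`.

    Returns True if the target state was seen before `timeout` seconds
    elapsed, otherwise False.
    """
    deadline = time.monotonic() + timeout
    while True:
        if client.servers.status(server_id) == target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```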

Creating servers

The .create method will create a new server configuration. If there is already a running server with the same model file and API parameters the matched server configuration is returned rather than creating and starting a new server.

The .create function has the following inputs

Argument	Type	Description
model	str or ModelQuantization	The string name for the quantized model or a ModelQuantization object
extra_options	dict	Control server configuration supported by the backend

By default creating a server configuration will

download the model file if required by the backend
run the server API

For example, to create a server for the OpenHermes model with default values:

from anaconda_ai import get_default_client

client = get_default_client()
server = client.servers.create(
  'OpenHermes-2.5-Mistral-7B/Q4_K_M',
)

Starting servers

When a server is created it is not automatically started. A server can be started and stopped in a number of ways

From the server object

server.start()
server.stop()

From the .servers accessor

client.servers.start(server)
client.servers.stop(server)

Alternatively you can use .create as a context manager, which will automatically stop the server on exit of the indented block.

with client.servers.create('OpenHermes-2.5-Mistral-7B/Q4_K_M') as server:
    openai_client = server.openai_client()
    # make requests to the server
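
Building on the context-manager form, a single question can be sent to a model and the server stopped afterwards. This is a sketch, not the library's recommended pattern: ask is a hypothetical helper, the request uses the standard OpenAI chat completions call, and passing the quantization reference as the model argument is an assumption about what the server accepts.

```python
from typing import Any


def ask(client: Any, model: str, prompt: str) -> str:
    """Send one user prompt to a server for `model` and return the reply text."""
    # The server is stopped automatically when the with-block exits.
    with client.servers.create(model) as server:
        openai_client = server.openai_client()
        response = openai_client.chat.completions.create(
            model=model,  # assumption: the server accepts this name
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```

With a real client this could be called as ask(client, 'OpenHermes-2.5-Mistral-7B/Q4_K_M', 'what is pi?').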

Server attributes

.status: Text status of the server
.is_running: Boolean status, True if the server is in the 'running' state
.start(): Start the server; can optionally be used as a context manager to stop the server automatically
.stop(): Stop the server
.url: The full URL of the running server
.openai_url: The OpenAI-compatible URL
.openai_client(): Creates a pre-configured OpenAI client for this URL
.async_openai_client(): Creates a pre-configured async OpenAI client for this URL

Both .openai_client() and .async_openai_client() accept extra keyword arguments, which are passed to the client initialization.

Server Configuration Options

Not all backends support extra_options= on server create.

The AI Navigator backend supports llama-server options passed as snake-case dictionary keys to client.servers.create() with the extra_options kwarg. To enable flags set the value to True.

Here are some notes on specific server parameter behavior

Dict key	Notes
`port`	Start server on specific port, 0 or missing means start on random port
`jinja`	Set to `True` to enable tool calling for models trained to do so

For example:

from anaconda_ai import AnacondaAIClient

client = AnacondaAIClient()
server = client.servers.create(
  'OpenHermes-2.5-Mistral-7B/Q4_K_M',
  extra_options={
    "ctx_size": 512,
    "jinja": True
  }
)
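
If you already have a llama-server command line, the snake-case convention described above can be applied mechanically. The sketch below converts a list of flags into a dictionary suitable for extra_options. The helper flags_to_extra_options is hypothetical, and converting numeric strings to int or float is an assumption about what the backend expects.

```python
from typing import Dict, List, Union

OptionValue = Union[bool, int, float, str]


def _coerce(text: str) -> OptionValue:
    """Convert numeric strings to int or float; leave other strings unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def flags_to_extra_options(argv: List[str]) -> Dict[str, OptionValue]:
    """Translate ['--ctx-size', '512', '--jinja'] into snake-case dict keys.

    A flag followed by a value keeps that value; a flag with no value is
    set to True, matching the rule for enabling flags.
    """
    options: Dict[str, OptionValue] = {}
    index = 0
    while index < len(argv):
        token = argv[index]
        if not token.startswith("--"):
            raise ValueError(f"expected a --flag, got {token!r}")
        key = token[2:].replace("-", "_")
        has_value = index + 1 < len(argv) and not argv[index + 1].startswith("--")
        if has_value:
            options[key] = _coerce(argv[index + 1])
            index += 2
        else:
            options[key] = True
            index += 1
    return options


print(flags_to_extra_options(["--ctx-size", "512", "--jinja"]))
# {'ctx_size': 512, 'jinja': True}
```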

Vector Db

Creates a postgres vector db and returns the connection information. VectorDB is not supported by all backends.

anaconda ai launch-vectordb

LLM

To use the llm integration you will also need to install the llm package

conda install -c conda-forge llm

then you can list downloaded model quantizations

llm models

or to show only the Anaconda AI models

llm models list -q anaconda

When you use a model, the integration first ensures that the model has been downloaded and then starts the server through the backend. Standard OpenAI parameters are supported.

llm -m 'anaconda:meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf' -o temperature 0.1 'what is pi?'

Additionally, server configuration parameters like ctx_size can be passed

llm -m 'anaconda:meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf' -o temperature 0.1 -o ctx_size 512 'what is pi?'

Langchain

The LangChain integration provides Chat and Embedding classes that automatically manage downloading and starting servers. You will need the langchain-openai package.

from langchain_core.prompts import ChatPromptTemplate
from anaconda_ai.integrations.langchain import AnacondaQuantizedModelChat, AnacondaQuantizedModelEmbeddings

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
model = AnacondaQuantizedModelChat(model_name='meta-llama/llama-2-7b-chat-hf_Q4_K_M.gguf')

chain = prompt | model

message = chain.invoke({'topic': 'python'})

The following keyword arguments are supported:

extra_options: Dict, see create servers above

LlamaIndex

You will need at least the llama-index-llms-openai package installed to use the integration.

from anaconda_ai.integrations.llama_index import AnacondaModel

llm = AnacondaModel(
    model='OpenHermes-2.5-Mistral-7B_q4_k_m'
)

The AnacondaModel class supports the following arguments

model: Name of the model using the pattern defined above
system_prompt: Optional system prompt to apply to completions and chats
temperature: Optional temperature to apply to all completions and chats (default is 0.1)
max_tokens: Optional Max tokens to predict (default is to let the model decide when to finish)
extra_options: Optional dict, see server creation above

LiteLLM

This integration provides a CustomLLM provider for use with litellm. Because litellm does not currently support entry points for registering providers, you must import the module before making requests.

import litellm
import anaconda_ai.integrations.litellm

response = litellm.completion(
    'anaconda/openhermes-2.5-mistral-7b/q4_k_m',
    messages=[{'role': 'user', 'content': 'what is pi?'}]
)

Supported usage:

completion (with and without stream=True)
acompletion (with and without stream=True)
Most OpenAI inference parameters
- n: number of completions is not supported
Server parameters can be passed as dictionaries to the optional_params keyword argument in the key "server"
- optional_params={"server": {"ctx_size": 512}}
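
The pieces listed above can be assembled into one set of keyword arguments. The following sketch builds them for litellm.completion or litellm.acompletion; anaconda_completion_kwargs is a hypothetical helper, and the only structure it relies on is the anaconda/<model>/<quantization> name and the "server" key shown above.

```python
from typing import Any, Dict, Optional


def anaconda_completion_kwargs(
    model: str,
    quantization: str,
    prompt: str,
    server_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a litellm completion call."""
    kwargs: Dict[str, Any] = {
        "model": f"anaconda/{model}/{quantization}",
        "messages": [{"role": "user", "content": prompt}],
    }
    if server_options:
        # Server parameters travel under the "server" key of optional_params.
        kwargs["optional_params"] = {"server": dict(server_options)}
    return kwargs


kwargs = anaconda_completion_kwargs(
    "openhermes-2.5-mistral-7b", "q4_k_m", "what is pi?", {"ctx_size": 512}
)
print(kwargs["model"])  # anaconda/openhermes-2.5-mistral-7b/q4_k_m
```

After importing litellm and anaconda_ai.integrations.litellm, the result can be passed as litellm.completion(**kwargs).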

DSPy

Since DSPy uses LiteLLM, Anaconda models can be used with dspy. Streaming and async are supported for raw LLM calls and for modules like Predict or ChainOfThought.

import dspy
import anaconda_ai.integrations.litellm

lm = dspy.LM('anaconda/openhermes-2.5-mistral-7b/q4_k_m')
dspy.configure(lm=lm)

chain = dspy.ChainOfThought("question -> answer")
chain(question="Who are you?")

dspy.LM supports optional_params= keyword argument as explained in the previous section.

Pydantic AI

The Pydantic AI integration provides ChatModel and EmbeddingModel support. Here's an example using a chat model in an agent.

from anaconda_ai.integrations.pydantic_ai import (
    AnacondaChatModel,
    AnacondaChatModelSettings,
)
settings = AnacondaChatModelSettings(temperature=0.1, extra_options={"ctx_size": 1024})

model = AnacondaChatModel(
    "OpenHermes-2.5-Mistral-7B/q4_k_m",
    settings=settings,
)

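And an embedding model:

# Assumption: AnacondaEmbeddingModel is importable from the same module as AnacondaChatModel
from anaconda_ai.integrations.pydantic_ai import AnacondaEmbeddingModel

embed = AnacondaEmbeddingModel(
    "bge-small-en-v1.5/q4_k_m"
)

result = await embed.embed("cat", input_type="document")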

Instructor

This integration monkeypatches the instructor.from_provider() method on import. This is needed until the provider can be added to the upstream Instructor package.

import instructor
from pydantic import BaseModel
import anaconda_ai.integrations.instructor  # noqa: F401

client = instructor.from_provider(
    "anaconda/OpenHermes-2.5-Mistral-7B/Q4_K_M", extra_options={"ctx_size": 512}
)

class UserInfo(BaseModel):
    name: str
    age: int


user_info = await client.create(
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

Panel

A callback is available to work with Panel's ChatInterface

To use it you will need to have panel, httpx, and numpy installed.

Here's an example application that can be written as a Python script or in a Jupyter Notebook

import panel as pn
from anaconda_ai.integrations.panel import AnacondaModelHandler

pn.extension('echarts', 'tabulator', 'terminal')

llm = AnacondaModelHandler('TinyLlama/TinyLlama-1.1B-Chat-v1.0_Q4_K_M.gguf', display_throughput=True)

chat = pn.chat.ChatInterface(
    callback=llm.callback,
    show_button_name=False)

chat.send(
    "I am your assistant. How can I help you?",
    user=llm.model_id, avatar=llm.avatar, respond=False
)
chat.servable()

The AnacondaModelHandler supports the following keyword arguments

display_throughput: Show a speed dial next to the response. Default is False
system_message: Default system message applied to all responses
client_options: Optional dict passed as kwargs to chat.completions.create
api_params: Optional dict or APIParams object
load_params: Optional dict or LoadParams object
infer_params: Optional dict or InferParams object

Setup for development

Ensure you have conda installed. Then run:

make setup

Run the unit tests

make test

Run the unit tests across isolated environments with tox

make tox
