
Add NVIDIA Catalog Connector #1

Open
chrisalexiuk-nvidia wants to merge 18 commits into main from add_feature_nvidia_api_playground_connector_llm

Conversation

@chrisalexiuk-nvidia
Owner

Added NVIDIA Catalog Connector

    "playground_mixtral_8x7b": 32_000,
    "playground_llama2_code_34b": 100_000,
    "playground_steerlm_llama_70b": 3072,
}
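A minimal sketch (not the PR's actual code) of how a context-window mapping like the one above could be consulted, with a conservative fallback for model names that aren't listed:

```python
# Illustrative only: the dict entries are from the diff above, but
# context_window() and its default are assumptions, not the connector's API.
API_CATALOG_MODELS = {
    "playground_mixtral_8x7b": 32_000,
    "playground_llama2_code_34b": 100_000,
    "playground_steerlm_llama_70b": 3072,
}

def context_window(model: str, default: int = 1024) -> int:
    # Fall back to a small, safe default rather than failing on unknown models.
    return API_CATALOG_MODELS.get(model, default)
```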

models to add -

  • mistralai/mistral-7b-instruct-v0.2
  • mistralai/mixtral-8x7b-instruct-v0.1
  • google/gemma-7b
  • meta/codellama-70b
  • meta/llama2-70b
  • cohere/aya-101
  • cohere/command-r

Owner Author


Added!


@chrisalexiuk-nvidia the Cohere ones aren't actually part of the API Catalog; let's remove them

Owner Author


Sounds good!

from openai.types.chat import ChatCompletionMessageParam
from openai.types.chat.chat_completion_message import ChatCompletionMessage

AI_PLAYGROUND_MODELS: Dict[str, int] = {

these will be API_CATALOG_MODELS

Owner Author


Changed.

@@ -0,0 +1,3 @@
from llama_index.llms.nvidia_ai_playground.base import NvidiaAIPlayground

__all__ = ["NvidiaAIPlayground"]

let's tentatively go with NVIDIA instead of NvidiaAIPlayground

Owner Author


Changed.

DEFAULT_PLAYGROUND_MAX_TOKENS = 512


class NvidiaAIPlayground(LLM):

chat completion models also support params -

  • frequency_penalty: float -2..2
  • presence_penalty: float -2..2
  • seed
  • stop: str | list[str]

include them as properties or require passing as kwargs

in either case, make sure there's test coverage
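One way to sketch the kwargs route: validate the extra sampling params against a whitelist before merging them into the request. `build_request` and the allowed set here are illustrative assumptions, not the connector's real API.

```python
from typing import Any, Dict

# Hypothetical whitelist matching the params listed in the review comment.
ALLOWED_EXTRA = {"frequency_penalty", "presence_penalty", "seed", "stop"}

def build_request(model: str, max_tokens: int, **kwargs: Any) -> Dict[str, Any]:
    # Reject typos early instead of letting the API surface a cryptic error.
    unknown = set(kwargs) - ALLOWED_EXTRA
    if unknown:
        raise ValueError(f"unsupported params: {sorted(unknown)}")
    return {"model": model, "max_tokens": max_tokens, **kwargs}
```

Whitelisting also gives the tests an obvious target: one case per accepted param, plus one asserting that an unknown kwarg raises.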

Owner Author


Passing them as kwargs is tested and working; added an example to the "test" notebook.


response = self._client.chat.completions.create(messages=message_dicts, stream=True, **all_kwargs)

def gen() -> ChatResponseGen:

why do you create gen() and return gen() instead of yielding from the loop?

Owner Author


This is just the pattern found in LlamaIndex - they use this ChatResponseGen later.

https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/base/llms/types.py#L99

(Sorry for double ping, was on my personal acct.)
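The point of the pattern can be shown with a self-contained sketch: the outer function does its eager work (validation, issuing the request) at call time, while the inner `gen()` closure defers only the iteration. If the whole function were a generator, errors would not surface until the first `next()`. Names here are illustrative, not the PR's code.

```python
from typing import Iterator

def stream_chat(deltas: list) -> Iterator[str]:
    # Eager part: runs immediately when stream_chat() is called.
    if not deltas:
        raise ValueError("empty stream")
    response = iter(deltas)  # stands in for the streaming API response

    def gen() -> Iterator[str]:
        content = ""
        for delta in response:
            content += delta
            yield content  # cumulative text, like ChatResponse deltas

    return gen()
```

Here `stream_chat([])` raises right away, whereas a plain generator function would only raise once iterated.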

…hub.com:chrisalexiuk-nvidia/llama_index into add_feature_nvidia_api_playground_connector_llm

@mattf mattf left a comment


how should this be extended to work with a downloaded NIM?

@chrisalexiuk-nvidia
Owner Author

@mattf Added pytest tests covering basic functionality

@chrisalexiuk-nvidia
Owner Author

how should this be extended to work with a downloaded NIM?

I'm still not sure `.mode()` works well with LlamaIndex, but if that's the best way to do it, in your view, I'll try to whip it up today.



def test_validates_api_key_is_present() -> None:
    with CachedNVIDIApiKeys(set_fake_key=True):

NVIDIA() should work with no api_key or NVIDIA_API_KEY env to support users doing NVIDIA().mode("nim", base_url=...) without a key. please add a test for that case.
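The requested contract could be exercised with a test along these lines. `FakeConnector` is a stand-in used to illustrate the key-optional behavior; the real `NVIDIA` class and its `mode()` signature may differ.

```python
import os
from typing import Optional

class FakeConnector:
    """Stand-in for the NVIDIA connector: an API key is optional when
    targeting a local NIM endpoint, required for the hosted catalog."""

    def __init__(self, api_key: Optional[str] = None) -> None:
        self.api_key = api_key or os.environ.get("NVIDIA_API_KEY")
        self.base_url: Optional[str] = None

    def mode(self, name: str, base_url: Optional[str] = None) -> "FakeConnector":
        if name == "nim":
            self.base_url = base_url  # local NIM needs no key
            return self
        if not self.api_key:
            raise ValueError("an API key is required for the hosted catalog")
        return self

def test_nim_mode_needs_no_key() -> None:
    # Simulate a user with no key set anywhere.
    os.environ.pop("NVIDIA_API_KEY", None)
    client = FakeConnector().mode("nim", base_url="http://localhost:8000/v1")
    assert client.api_key is None
    assert client.base_url == "http://localhost:8000/v1"
```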


@mattf mattf left a comment


need to add support and tests for mode switching w/ NVIDIA().mode("nim", base_url=...)


2 participants