Design proposal: Chat Completions API (rev. 0)

## Description

This issue proposes a design for a new **Chat Completions API**. This API will allow *consumer extensions* to provide completions for the user's current input from the UI. In this context, a consumer extension is any frontend + server extension that intends to provide completions for substrings in the chat input.

### Motivation

Suppose a user types `/` in the chat with Jupyter AI installed. Today, Jupyter Chat responds by showing a menu of **chat completions**:

<img width="358" alt="Screenshot 2025-01-02 at 5 08 27 PM" src="https://github.com/user-attachments/assets/833baa29-3a93-4943-aac7-1df8299496fd" />

The opening of this completions menu is triggered simply by typing `/`. However, because the current API only allows a single "trigger character", this doesn't work when `@` is typed, meaning that `@file` commands cannot be auto-completed.

This design aims to:

1. Extend the existing chat completions capability so that completions can be triggered by multiple patterns.
2. Allow triggering patterns to be more complex than the presence of a single character.

To help explain the proposed design, this document will start from the perspective of a consumer extension, then work backwards towards the necessary changes in Jupyter Chat.

## Step 1: Defining & providing a `ChatCompleter`

To register completions for partial inputs, a consumer extension must provide a set of **chat completers**. A chat completer is a Python class which provides:

- `id` (property): Defines a unique ID for this chat completer. We will see why this is useful later.

- `regex` (property): Defines a regex which matches any incomplete input.
  - Each regex should end with `$` to ensure this regex only matches partial inputs just typed by the user. Without `$`, the completer may generate completions for commands which were already typed.

- `get_completions(match: str) -> List[ChatCompletion]` (method): Accepts a substring matched by the completer's regex, and returns a list of potential completions for that input. This list may be empty.
  - The interface of `ChatCompletion` will be defined later; for now, we can think of this method as just returning a list of strings that are potential completions to the user's input.


Jupyter Chat will provide an `AbstractChatCompleter` class that defines the structure of the chat completer class, shown below.

```python
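from abc import ABC, abstractmethod
from typing import List


class AbstractChatCompleter(ABC):
    @property
    @abstractmethod
    def id(self) -> str:
        """Unique ID for this chat completer."""
        raise NotImplementedError()

    @property
    @abstractmethod
    def regex(self) -> str:
        """Regex that matches any incomplete input this completer handles."""
        raise NotImplementedError()

    # `ChatCompletion` is defined later, so the annotation is a forward reference.
    @abstractmethod
    def get_completions(self, match: str) -> "List[ChatCompletion]":
        """Return potential completions for the substring matched by `regex`."""
        raise NotImplementedError()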
```

To define a chat completer, a consumer extension should implement the `AbstractChatCompleter` class. Here is an example of how Jupyter AI may implement a chat completer to provide completions for its slash commands:

```py
class SlashCommandCompleter(AbstractChatCompleter):
    @property
    def id(self):
        return "jai-slash-commands"
    
    @property
    def regex(self):
        # matches when:
        # - any partial slash command appears at start of input
        # - the partial slash command is immediately followed by end of input
        #
        # Examples:
        # - "/" => matched
        # - "/le" => matched
        # - "/learn" => matched
        # - "/learn " (note the space) => not matched
        # - "what does /help do?" => not matched
        # plain pattern string, without JavaScript-style `/.../` delimiters
        return r"^/\w*$"
    
    def get_completions(self, match: str) -> List[ChatCompletion]:
        # should behave like:
        # "/" => ["/ask", "/help", "/learn", ...]
        # "/l" => ["/learn"]
        # "/h" => ["/help"]
        # "/zxcv" => []
        ...
```
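
The body of `get_completions()` is elided above. As a minimal sketch of the behavior described in its comments, the prefix filtering could look like the following. It is written as a standalone function so that it runs on its own. The command table and its descriptions are hypothetical placeholders, and the `ChatCompletion` dataclass is an assumed Python mirror of the `ChatCompletion` type defined later in this proposal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChatCompletion:
    # Assumed Python mirror of the `ChatCompletion` type in this proposal.
    value: str
    label: Optional[str] = None
    description: Optional[str] = None


# Hypothetical command table; the real list would come from Jupyter AI.
SLASH_COMMANDS = {
    "/ask": "Placeholder description for /ask.",
    "/help": "Placeholder description for /help.",
    "/learn": "Placeholder description for /learn.",
}


def get_slash_completions(match: str) -> List[ChatCompletion]:
    """Return every known command that starts with the partial input."""
    return [
        # The trailing space in `value` is inserted when the completion is accepted.
        ChatCompletion(value=f"{name} ", label=name, description=desc)
        for name, desc in sorted(SLASH_COMMANDS.items())
        if name.startswith(match)
    ]
```

A real `SlashCommandCompleter.get_completions()` could delegate to a function like this one.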

Finally, for a consumer extension to provide these chat completers to Jupyter Chat, the consumer extension must declare each class as an *entry point* in a fixed *entry point group*. When Jupyter Chat reads from this entry point group on init, Jupyter Chat can gather all chat completer classes from every consumer extension.

Entry points are defined in the [PyPA entry points specification](https://packaging.python.org/en/latest/specifications/entry-points/). Jupyter AI already uses entry points to allow other extensions to add extra chat commands. We will not go into detail here, as Jupyter AI already serves as an implementation reference.

## Step 2: Define the chat completion REST API

From the example `SlashCommandCompleter` implementation above, we can piece together what Jupyter Chat's frontend should do:

1. On init, fetch the chat completer IDs & regexes from the backend.
2. When any completer's regex is matched by the user's input:
    - Fetch from the backend the completions of every chat completer whose regex matches the input.
    - Show the list of all completions in the UI.
    - When a completion is accepted, replace the substring of the input matched by the completer's regex with the completion.
  
The frontend implementation should [debounce](https://developer.mozilla.org/en-US/docs/Glossary/Debounce) how frequently it checks the input for step 2, since it will be expensive to do on every typed character.
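
The matching and replacement logic in the steps above can be sketched as follows. The real frontend would be written in TypeScript; Python is used here only to keep the examples in one language, and the two regex dialects differ in places. The function names are hypothetical, and debouncing is left out.

```python
import re
from typing import Dict, List, Tuple


def find_matches(text: str, completers: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (completer_id, matched_substring) for each regex that matches."""
    matches = []
    for completer_id, pattern in completers.items():
        found = re.search(pattern, text)
        if found:
            matches.append((completer_id, found.group(0)))
    return matches


def accept_completion(text: str, pattern: str, completion: str) -> str:
    """Replace the substring matched by `pattern` with the accepted completion."""
    found = re.search(pattern, text)
    if found is None:
        return text
    # Slice instead of using re.sub so backslashes in `completion` stay literal.
    return text[: found.start()] + completion + text[found.end():]
```

Each pair returned by `find_matches()` corresponds to one entry that the frontend would send to the backend when requesting completions.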

To make this possible, we need to define a new REST API for chat completion.

### Completions REST API

- `GET /chat/completers`: Returns a `ChatCompletersResponse` object, which describes all of the chat completers provided to Jupyter Chat by consumer extensions. 

- `POST /chat/completion_matches`: Returns a `ChatCompletionsResponse` object given a `ChatCompletionsRequest` object in the request body. This fetches the list of completions from any input. This should be triggered when the user's chat input matches any of the regexes from `GET /chat/completers`.
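
As a rough sketch of the backend side of `POST /chat/completion_matches`, the handler logic could be a pure function over the request body. This assumes that `matches` is a list, as in the example request flow below, and that unknown completer IDs are silently skipped; both the registry shape and the function name are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: completer ID -> function returning completion dicts.
Completer = Callable[[str], List[dict]]


def handle_completions_request(body: dict, registry: Dict[str, Completer]) -> dict:
    """Build a `ChatCompletionsResponse` dict from a `ChatCompletionsRequest` dict."""
    completions: List[dict] = []
    for item in body.get("matches", []):
        completer = registry.get(item["completerId"])
        if completer is None:
            continue  # unknown IDs are skipped here; returning an error is another option
        completions.extend(completer(item["match"]))
    return {"completions": completions}
```

Merging the results into one flat list matches the `ChatCompletionsResponse` shape, at the cost of not recording which completer produced each completion.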


### Request & response types

```ts
type ChatCompletersResponse = {
    completers: ChatCompleter[];
}

type ChatCompleter = {
    id: string;
    regex: string;
}

type ChatCompletionsRequest = {
    matches: ChatCompleterMatch[];
}

type ChatCompleterMatch = {
    completerId: string;
    match: string;
}

type ChatCompletionsResponse = {
    completions: ChatCompletion[];
}

type ChatCompletion = {
    // e.g. "/ask"
    value: string;
    
    // if set, use this as the label. otherwise use `value`.
    label?: string;
    
    // if set, show this as a subtitle.
    description?: string;
    
    // identifies which icon should be used.
    // not described here, so consider this field reserved for now.
    iconType?: string;
}
```


### Example request flow

In this section, we will explore the REST API calls made in an example setting. This assumes that this design has been implemented exactly as stated, and that `SlashCommandCompleter` has been provided by another consumer extension.

When a user opens JupyterLab, the frontend immediately calls `GET /chat/completers` to fetch the list of completers & their regexes. With just one completer provided, the `ChatCompletersResponse` object is:

```json
{
    "completers": [
        { "id": "jai-slash-commands", "regex": "^/\\w*$" }
    ]
}
```

Then, the user types `/h`. This matches the regex of `SlashCommandCompleter`, so the frontend calls `POST /chat/completion_matches` with a `ChatCompletionsRequest` object:

```json
{
    "matches": [
        { "completerId": "jai-slash-commands", "match": "/h" }
    ]
}
```

The backend receives this request and responds with a `ChatCompletionsResponse` object. Here, we assume that `/help` is the only valid completion. Note the trailing space in `value`, which adds a space after the completion is accepted.

```json
{
    "completions": [
        {
            "value": "/help ",
            "label": "/help",
            "description": "Display a help message (Jupyter AI).",
            "iconType": "book"
        }
    ]
}
```

The user's menu now has a single completion for `/h`, which replaces `/h` with `/help ` when accepted.

## Conclusion 

Together, the entry points API (step 1) and the REST API (step 2) form the proposed **Chat Completions API**.

### Benefits & applications

- Completers are uniquely identified by their `id`, so two completers can use the same regex but yield two different sets of completions.
    - Application: Another extension could use the same `/` command regex to provide completions for its own custom `/` commands.
    - Application: Typing `@` can trigger completions from multiple completers; one may provide usernames of other users in the chat, and another may provide the context commands available in Jupyter AI (e.g. `@file`).
- A completion doesn't need to share a prefix with the substring that triggered completions.
    - Application: Define a completer that matches `$` and returns the completion `\\$`. Pressing "Enter" to accept the completion allows a user to easily type a literal dollar sign instead of opening math mode. If typing math was the user's intention, typing any character other than "Enter" hides the `\\$` completion and allows math to be written. 
- Regex allows the triggering of completions to be strictly controlled. This means that "complete-able" suffixes don't need some unique identifier like `/` or `@`.
    - Application: Define a completer that matches `./` following whitespace and returns filenames for the current directory. For example, this could trigger the completions `./README.md`, `./pyproject.toml`, etc. 
    
    - Application: Define a completer that matches `:` following whitespace and returns a list of emojis.

### Risks considered

- This design proposes that the completer classes are defined in the backend. This may be a concern as some data & state is more easily accessed from the frontend.
  - I can change this such that completer classes are defined in the frontend. The `get_completions()` method can be made async such that *some* completers can make a network call to use backend APIs, but others can use frontend APIs directly.
  - One issue with defining completers in the frontend is that I'm not sure if it will allow *multiple* (>1) extensions to provide completers. As far as I know, at most one extension can provide a Lumino token.

- I'm not sure if the current design will be sufficient for the `@`-mentioning of kernel local variables. [This is a proposal for Jupyter AI v3.](https://github.com/jupyterlab/jupyter-ai/pull/1157)

If a major revision of this design is needed, I will close this issue, revise the design, and open a new issue with a bumped revision number.
