This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 08fbb8a (1 parent: 6300732)

chore: add document

File tree: 1 file changed (+95, −58 lines)

---
title: Adding a Third-Party Engine to Cortex
description: Cortex supports Engine Extensions to integrate both local inference engines and remote APIs.
---

:::warning
🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
:::

# Guide to Adding a Third-Party Engine to Cortex

## Introduction

This guide outlines the steps to integrate a custom engine with Cortex. We hope this helps developers understand the integration process.

## Implementation Steps

### 1. Implement the Engine Interface

First, create an engine that implements the `EngineI.h` interface. Here's the interface definition:

```cpp
class EngineI {
 public:
  struct EngineLoadOption {};
  struct EngineUnloadOption {};

  virtual ~EngineI() {}

  virtual void Load(EngineLoadOption opts) = 0;
  virtual void Unload(EngineUnloadOption opts) = 0;

  // Cortex.llamacpp interface methods
  virtual void HandleChatCompletion(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void HandleEmbedding(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void LoadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void UnloadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void GetModelStatus(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Compatibility and model management
  virtual bool IsSupported(const std::string& f) = 0;

  virtual void GetModels(
      std::shared_ptr<Json::Value> jsonBody,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Logging configuration
  virtual bool SetFileLogger(int max_log_lines,
                             const std::string& log_path) = 0;
  virtual void SetLogLevel(trantor::Logger::LogLevel logLevel) = 0;
};
```

Note that Cortex will call `Load` before loading any models and `Unload` when stopping the engine.
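To make this concrete, here is a sketch of what an implementation can look like. It is written against a trimmed-down stand-in for the interface (`std::string` replacing `Json::Value`, and the model-management and logging hooks omitted) so the example stays self-contained; the names `MiniEngineI`, `EchoEngine`, and `create_engine` are all illustrative, and a real engine must override every pure-virtual method of the actual interface above.

```cpp
#include <functional>
#include <memory>
#include <string>

// Trimmed stand-in for EngineI: std::string replaces Json::Value and the
// remaining methods are omitted, purely so this sketch compiles on its own.
class MiniEngineI {
 public:
  struct EngineLoadOption {};
  struct EngineUnloadOption {};

  virtual ~MiniEngineI() {}
  virtual void Load(EngineLoadOption opts) = 0;
  virtual void Unload(EngineUnloadOption opts) = 0;
  virtual void HandleChatCompletion(
      std::shared_ptr<std::string> json_body,
      std::function<void(std::string&&, std::string&&)>&& callback) = 0;
};

// Hypothetical engine that simply echoes the request body back.
class EchoEngine : public MiniEngineI {
 public:
  void Load(EngineLoadOption) override { loaded_ = true; }
  void Unload(EngineUnloadOption) override { loaded_ = false; }

  void HandleChatCompletion(
      std::shared_ptr<std::string> json_body,
      std::function<void(std::string&&, std::string&&)>&& callback) override {
    // We assume here that the first callback argument carries status
    // metadata and the second carries the response payload.
    callback(loaded_ ? std::string("{\"status\":200}")
                     : std::string("{\"status\":409}"),
             std::string(*json_body));
  }

 private:
  bool loaded_ = false;
};

// A C-style factory symbol (the name is illustrative, not prescribed) that
// the host process can resolve from the dynamic library to obtain the engine.
extern "C" MiniEngineI* create_engine() {
  return new EchoEngine();
}
```

The factory-function pattern shown at the end is the usual way a host discovers a C++ object inside a dynamic library: `extern "C"` keeps the symbol name unmangled so it can be looked up portably at runtime.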

### 2. Create a Dynamic Library

We recommend using the [dylib library](https://github.com/martin-olivier/dylib) when working with your dynamic library. It provides helpful, cross-platform tools for loading shared libraries and resolving symbols at runtime.

### 3. Package Dependencies

Please ensure all dependencies are included with your dynamic library, so that the engine ships as a single, self-contained package for distribution.

### 4. Publication and Integration

#### 4.1 Publishing Your Engine (Optional)

If you wish to make your engine publicly available, you can publish it through GitHub. For reference, examine the [cortex.llamacpp releases](https://github.com/janhq/cortex.llamacpp/releases) structure:

- Each release tag should represent your version
- Include all variants within the same release
- Cortex will automatically select the most suitable variant or allow users to specify their preferred variant

#### 4.2 Integration with Cortex

Once your engine is ready, we encourage you to:

1. Notify the Cortex team about your engine for potential inclusion in our default supported engines list
2. Allow us to help test and validate your implementation

### 5. Local Testing Guide

To test your engine locally:

1. Create a directory structure following this hierarchy:

```
engines/
└── cortex.llamacpp/
    └── mac-arm64/
        └── v0.1.40/
            ├── libengine.dylib
            └── version.txt
```

2. Configure your engine:

   - Edit the `~/.cortexrc` file to register your engine name
   - Add your model with the appropriate `engine` field in `model.yaml`

3. Testing:

   - Start the engine
   - Load your model
   - Verify functionality
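The directory layout from step 1 can be created with a few shell commands. The paths below are copied from the example tree (substitute your own engine name, platform, and version), and the `version.txt` contents shown are an assumption.

```shell
# Paths mirror the example tree: engines/<engine>/<platform>/<version>/
mkdir -p engines/cortex.llamacpp/mac-arm64/v0.1.40

# Place your built library here (an empty placeholder for illustration).
touch engines/cortex.llamacpp/mac-arm64/v0.1.40/libengine.dylib

# version.txt is assumed to hold the bare version string.
echo "0.1.40" > engines/cortex.llamacpp/mac-arm64/v0.1.40/version.txt
```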

## Future Development

We're currently working on expanding support for additional release sources to make distribution more flexible.

## Contributing

We welcome suggestions and contributions to improve this integration process. Please feel free to submit issues or pull requests through our repository.
