---
title: Adding a Third-Party Engine to Cortex
description: Cortex supports Engine Extensions to integrate both local inference engines and remote APIs.
---

:::warning
🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
:::

# Guide to Adding a Third-Party Engine to Cortex

## Introduction

This guide outlines the steps to integrate a custom engine with Cortex: implementing the engine interface, packaging it as a dynamic library, and testing it locally.

## Implementation Steps

### 1. Implement the Engine Interface

First, create an engine that implements the interface defined in `EngineI.h`. Here is the interface definition:

```cpp
class EngineI {
 public:
  struct EngineLoadOption {};
  struct EngineUnloadOption {};

  virtual ~EngineI() {}

  virtual void Load(EngineLoadOption opts) = 0;
  virtual void Unload(EngineUnloadOption opts) = 0;

  // Cortex.llamacpp interface methods
  virtual void HandleChatCompletion(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void HandleEmbedding(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void LoadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void UnloadModel(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  virtual void GetModelStatus(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Compatibility and model management
  virtual bool IsSupported(const std::string& f) = 0;

  virtual void GetModels(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;

  // Logging configuration
  virtual bool SetFileLogger(int max_log_lines,
                             const std::string& log_path) = 0;
  virtual void SetLogLevel(trantor::Logger::LogLevel logLevel) = 0;
};
```

Note that Cortex will call `Load` before loading any models and `Unload` when stopping the engine.

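To make the contract concrete, here is a minimal compilable sketch of an engine implementing an abbreviated copy of this interface. The stub `Json::Value`, the `EchoEngine` class, the `create_engine` factory symbol, and the callback argument order (status first, then the response body) are all illustrative assumptions, not part of Cortex's actual headers; a real engine would include `EngineI.h`, jsoncpp, and trantor instead.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Stand-in stub so this sketch compiles on its own; a real engine
// would use jsoncpp's Json::Value.
namespace Json { struct Value { std::string text; }; }

// Abbreviated copy of the EngineI interface (remaining methods elided).
class EngineI {
 public:
  struct EngineLoadOption {};
  struct EngineUnloadOption {};
  virtual ~EngineI() {}
  virtual void Load(EngineLoadOption opts) = 0;
  virtual void Unload(EngineUnloadOption opts) = 0;
  virtual void HandleChatCompletion(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) = 0;
};

// Hypothetical engine that simply echoes the request back.
class EchoEngine : public EngineI {
 public:
  void Load(EngineLoadOption) override { loaded_ = true; }
  void Unload(EngineUnloadOption) override { loaded_ = false; }
  void HandleChatCompletion(
      std::shared_ptr<Json::Value> json_body,
      std::function<void(Json::Value&&, Json::Value&&)>&& callback) override {
    // Assumed convention: status first, then the response body.
    Json::Value status{loaded_ ? "200" : "409"};
    Json::Value result{"echo: " + json_body->text};
    callback(std::move(status), std::move(result));
  }

 private:
  bool loaded_ = false;
};

// Factory symbol a host process could resolve from the dynamic library;
// the name "create_engine" is illustrative, not a Cortex requirement.
extern "C" EngineI* create_engine() { return new EchoEngine(); }

// Exercises the engine end to end: load, one completion, unload.
std::pair<std::string, std::string> run_echo(const std::string& prompt) {
  std::unique_ptr<EngineI> engine(create_engine());
  engine->Load({});
  auto body = std::make_shared<Json::Value>(Json::Value{prompt});
  std::pair<std::string, std::string> out;
  engine->HandleChatCompletion(body, [&out](Json::Value&& s, Json::Value&& r) {
    out = {s.text, r.text};
  });
  engine->Unload({});
  return out;
}
```

The callback-based style matters: `HandleChatCompletion` may be invoked multiple times for streamed responses, so the engine should never assume a single synchronous reply.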
### 2. Create a Dynamic Library

We recommend using the [dylib library](https://github.com/martin-olivier/dylib) to build your engine as a dynamic library; it provides helpful tools for creating cross-platform dynamic libraries.

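For intuition about what the host side does with such a library, here is a small sketch using raw POSIX `dlopen`/`dlsym` (which dylib wraps, along with `LoadLibrary` on Windows). The library path and the `create_engine` symbol name are hypothetical examples, not Cortex's actual loader code.

```cpp
#include <dlfcn.h>  // POSIX dynamic loading
#include <iostream>

// Try to open a dynamic library and resolve a factory symbol from it.
// Returns nullptr if either step fails; both names are illustrative.
void* resolve_engine_factory(const char* lib_path, const char* symbol) {
  void* handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) {
    std::cerr << "dlopen failed: " << dlerror() << "\n";
    return nullptr;
  }
  void* factory = dlsym(handle, symbol);
  if (factory == nullptr) {
    std::cerr << "dlsym failed: " << dlerror() << "\n";
    dlclose(handle);
    return nullptr;
  }
  return factory;
}
```

This is why packaging matters: if a transitive dependency of your library is missing on the target machine, `dlopen` fails at load time with an opaque error rather than at build time.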
### 3. Package Dependencies

Ship all dependencies alongside your dynamic library so that the engine can be distributed as a single, self-contained package.

### 4. Publication and Integration

#### 4.1 Publishing Your Engine (Optional)

If you wish to make your engine publicly available, you can publish it through GitHub. For reference, examine the structure of the [cortex.llamacpp releases](https://github.com/janhq/cortex.llamacpp/releases):

- Each release tag should represent your engine version
- Include all build variants within the same release
- Cortex will automatically select the most suitable variant, or allow users to specify their preferred variant

#### 4.2 Integration with Cortex

Once your engine is ready, we encourage you to:

1. Notify the Cortex team about your engine for potential inclusion in our default supported engines list
2. Allow us to help test and validate your implementation

### 5. Local Testing Guide

To test your engine locally:

1. Create a directory structure following this hierarchy:

   ```
   engines/
   └── cortex.llamacpp/
       └── mac-arm64/
           └── v0.1.40/
               ├── libengine.dylib
               └── version.txt
   ```

2. Configure your engine:

   - Edit the `~/.cortexrc` file to register your engine name
   - Add your model with the appropriate `engine` field in `model.yaml`

3. Testing:

   - Start the engine
   - Load your model
   - Verify functionality

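The staging step above can be sketched as a few shell commands. This assumes you run it from the directory that contains `engines/`; the platform folder and version are the examples from the layout above, and the copy step is commented out because the path to your built library depends on your build setup.

```shell
# Stage a locally built engine under the expected layout.
ENGINE_DIR=engines/cortex.llamacpp/mac-arm64/v0.1.40
mkdir -p "$ENGINE_DIR"
printf 'v0.1.40\n' > "$ENGINE_DIR/version.txt"
# cp build/libengine.dylib "$ENGINE_DIR/"   # copy your built library here
ls "$ENGINE_DIR"
```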
## Future Development

We are working on supporting additional release sources beyond GitHub to make engine distribution more flexible.

## Contributing

We welcome suggestions and contributions to improve this integration process. Please feel free to submit issues or pull requests through our repository.