
Commit 1723881

[openvino-langchain] Add ChatOpenVINO (#983)
Signed-off-by: Kirill Suvorov <[email protected]>
Co-authored-by: Alicja Miloszewska <[email protected]>
1 parent 2ede306 commit 1723881

8 files changed (+506 −14 lines)

modules/openvino-langchain/README.md

Lines changed: 36 additions & 4 deletions
@@ -64,9 +64,9 @@ optimum-cli export openvino --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --weigh
 
 ## LLM
 
-This package contains the `GenAI` class, which is the recommended way to interact with models optimized for the OpenVINO toolkit.
+This package contains the `OpenVINO` class, which is the recommended way to interact with models optimized for the OpenVINO toolkit.
 
-**GenAI Parameters**
+**OpenVINO Parameters**
 
 | Name | Type | Required | Description |
 | ----- | ---- |--------- | ----------- |
@@ -75,9 +75,9 @@ This package contains the `GenAI` class, which is the recommended way to interac
 | generationConfig | [GenerationConfig](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/js/lib/utils.ts#L107-L110) || Structure to keep generation config parameters. |
 
 ```typescript
-import { GenAI } from "openvino-langchain";
+import { OpenVINO } from "openvino-langchain";
 
-const model = new GenAI({
+const model = new OpenVINO({
   modelPath: "path-to-model",
   device: "CPU",
   generationConfig: {
@@ -87,6 +87,38 @@ const model = new GenAI({
 const response = await model.invoke("Hello, world!");
 ```
 
+## ChatModel
+
+This package contains the `ChatOpenVINO` class, which allows using OpenVINO in chat pipelines.
+
+**ChatOpenVINO Parameters**
+
+| Name | Type | Required | Description |
+| ----- | ---- |--------- | ----------- |
+| modelPath | string || Path to the directory containing model xml/bin files and tokenizer |
+| device | string || Device to run the model on (e.g., CPU, GPU). |
+| generationConfig | [GenerationConfig](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/js/lib/utils.ts#L107-L110) || Structure to keep generation config parameters. |
+
+```js
+import { ChatOpenVINO } from "openvino-langchain";
+import { HumanMessage, SystemMessage } from '@langchain/core/messages';
+
+const model = new ChatOpenVINO({
+  modelPath: "path-to-model",
+  device: "CPU",
+  generationConfig: {
+    "max_new_tokens": 100,
+  },
+});
+
+const messages = [
+  new SystemMessage('Translate the following from English into German'),
+  new HumanMessage('Thank you!'),
+];
+const response = await model.invoke(messages);
+console.log(response.content);
+```
+
 ## Text Embedding Model
 
 This package also adds support for OpenVINO's embeddings model.
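The embeddings model is only mentioned in passing here; `OpenVINOEmbeddings` is imported from the package in `rag_sample.js` further down. Below is a minimal sketch of how it could be used, assuming its constructor takes the same `modelPath`/`device` options as the other classes (that shape is an assumption, not shown in this diff) and that it implements LangChain's standard `embedQuery`/`embedDocuments` interface:

```typescript
import { OpenVINOEmbeddings } from "openvino-langchain";

// Assumption: the constructor mirrors OpenVINO/ChatOpenVINO (modelPath + device).
const embeddings = new OpenVINOEmbeddings({
  modelPath: "path-to-embeddings-model", // e.g. an exported BAAI/bge-small-en-v1.5
  device: "CPU",
});

// embedQuery/embedDocuments are the standard LangChain.js Embeddings methods.
const queryVector = await embeddings.embedQuery("What is OpenVINO?");
const docVectors = await embeddings.embedDocuments(["first document", "second document"]);
console.log(queryVector.length, docVectors.length);
```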
Lines changed: 43 additions & 9 deletions
@@ -1,21 +1,55 @@
-## How to run sample?
+# OpenVINO™ LangChain.js adapter samples
 
-First download a sample model. You can use Optimum Intel [tool](https://github.com/huggingface/optimum-intel):
+## Download and convert the model and tokenizers
+
+You need to convert and compress the text generation model into the [OpenVINO IR format](https://docs.openvino.ai/2025/documentation/openvino-ir-format.html).
+Refer to the [Supported Models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#large-language-models-llms) for more details.
+
+### Option 1. Convert a model with Optimum Intel
+
+First install [Optimum Intel](https://github.com/huggingface/optimum-intel) and then run the export with Optimum CLI:
 
 ```bash
-optimum-cli export openvino --trust-remote-code --model microsoft/Phi-3.5-mini-instruct Phi-3.5-mini-instruct
+optimum-cli export openvino --model <model> <output_folder>
 ```
 
-Alternatively, you can clone the repository:
+### Option 2. Download a converted model
 
-```bash
-git clone https://huggingface.co/OpenVINO/Phi-3.5-mini-instruct-fp16-ov
+If a converted model in OpenVINO IR format is already available in the collection of [OpenVINO optimized LLMs](https://huggingface.co/collections/OpenVINO/llm-6687aaa2abca3bbcec71a9bd) on Hugging Face, it can be downloaded directly via [huggingface-cli](https://huggingface.co/docs/huggingface_hub/en/guides/cli).
+
+```sh
+huggingface-cli download <model> --local-dir <output_folder>
 ```
 
-Then navigate to the `openvino-langchain/sample` directory and run the sample:
+## Install NPM dependencies
+
+Run the following command from the current directory:
 
 ```bash
-cd sample/
 npm install
-node index.js *path_to_llm_model_dir* *path_to_embeddings_model_dir*
 ```
+
+## Sample Descriptions
+
+### 1. Chat Sample (`chat_sample`)
+- **Description:** Interactive chat interface powered by OpenVINO.
+- **Recommended Models:**
+  - `meta-llama/Llama-2-7b-chat-hf`
+  - `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
+- **Main Feature:** Real-time chat-like text generation.
+- **Run Command:**
+  ```bash
+  node chat_sample.js <model_dir>
+  ```
+
+### 2. RAG Sample (`rag_sample`)
+- **Description:** This sample retrieves relevant documents from a simple [knowledge base](./data/document_sample.txt) using a retriever model
+and generates a response using a generative model, conditioned on both the user query and the retrieved documents.
+- **Recommended Models:**
+  - **LLM:** `meta-llama/Llama-2-7b-chat-hf`
+  - **Embedding:** `BAAI/bge-small-en-v1.5`
+- **Main Feature:** RAG pipeline implementation.
+- **Run Command:**
+  ```bash
+  node rag_sample.js <llm_dir> <embedding_model_dir>
+  ```
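To make the RAG sample's flow concrete, here is a hedged sketch of the wiring it describes, built from the imports visible in the `rag_sample.js` diff below (`TextLoader`, `OpenVINO`, `OpenVINOEmbeddings`). The in-memory vector store, the embeddings constructor options, and the prompt assembly are illustrative assumptions rather than the sample's exact code:

```typescript
import { TextLoader } from "langchain/document_loaders/fs/text";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenVINO, OpenVINOEmbeddings } from "openvino-langchain";

// Paths mirror the sample's CLI arguments: node rag_sample.js <llm_dir> <embedding_model_dir>
const LLM_MODEL_PATH = process.argv[2];
const EMBEDDINGS_MODEL_PATH = process.argv[3];

// 1. Load the knowledge base (the sample ships ./data/document_sample.txt).
const docs = await new TextLoader("./data/document_sample.txt").load();

// 2. Embed the documents with the OpenVINO embeddings model and index them in memory.
//    Assumption: OpenVINOEmbeddings takes modelPath/device like the other classes.
const vectorStore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenVINOEmbeddings({ modelPath: EMBEDDINGS_MODEL_PATH, device: "CPU" }),
);

// 3. Retrieve the documents most similar to the user query.
const query = "What does the document say about OpenVINO?";
const retrieved = await vectorStore.similaritySearch(query, 2);
const context = retrieved.map((d) => d.pageContent).join("\n\n");

// 4. Condition the OpenVINO LLM on both the query and the retrieved context.
const llm = new OpenVINO({
  modelPath: LLM_MODEL_PATH,
  device: "CPU",
  generationConfig: { "max_new_tokens": 100 },
});
const answer = await llm.invoke(`Context:\n${context}\n\nQuestion: ${query}`);
console.log(answer);
```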
Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
import { HumanMessage, SystemMessage } from '@langchain/core/messages';
import { ChatOpenVINO } from 'openvino-langchain';
import { basename } from 'node:path';
import readline from 'readline';

const LLM_MODEL_PATH = process.argv[2];

if (!LLM_MODEL_PATH) {
  console.error('Please specify path to the LLM model directory\n'
    + 'Run command must be:\n'
    + `'node ${basename(process.argv[1])} *path_to_llm_model_dir*'`);
  process.exit(1);
}
if (process.argv.length > 3) {
  console.error(
    `Run command must be:
    'node ${basename(process.argv[1])} *path_to_llm_model_dir*'`,
  );
  process.exit(1);
}

async function main() {
  const MODEL_PATH = process.argv[2];

  if (process.argv.length > 3) {
    console.error(
      `Run command must be:
      'node ${basename(process.argv[1])} *path_to_model_dir*'`,
    );
    process.exit(1);
  }
  if (!MODEL_PATH) {
    console.error('Please specify path to model directory\n'
      + `Run command must be:
      'node ${basename(process.argv[1])} *path_to_model_dir*'`);
    process.exit(1);
  }

  const device = 'CPU'; // GPU can be used as well
  const config = { 'max_new_tokens': 100 };
  const chat = new ChatOpenVINO({
    modelPath: LLM_MODEL_PATH,
    device,
    generationConfig: config,
  });

  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  const messages = [
    new SystemMessage('You are chatbot.'),
  ];

  promptUser();

  // Function to prompt the user for input
  function promptUser() {
    rl.question('question:\n', handleInput);
  }

  // Function to handle user input
  async function handleInput(input) {
    input = input.trim();

    // An empty input ends the chat session
    if (!input) {
      rl.close();
      process.exit(0);
    }

    messages.push(new HumanMessage(input));
    const aiResponse = await chat.invoke(messages);

    messages.push(aiResponse);
    console.log(aiResponse.text);
    console.log('\n----------');

    promptUser();
  }
}

main();

modules/openvino-langchain/sample/index.js renamed to modules/openvino-langchain/sample/rag_sample.js

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ import { TextLoader } from 'langchain/document_loaders/fs/text';
 import { OpenVINO, OpenVINOEmbeddings } from 'openvino-langchain';
 
 // Paths to document and models
-const TEXT_DOCUMENT_PATH = './document_sample.txt';
+const TEXT_DOCUMENT_PATH = './data/document_sample.txt';
 const LLM_MODEL_PATH = process.argv[2];
 const EMBEDDINGS_MODEL_PATH = process.argv[3];

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
import { CallbackManagerForLLMRun } from '@langchain/core/callbacks/manager';
import {
  BaseLanguageModelCallOptions,
} from '@langchain/core/language_models/base';
import {
  SimpleChatModel,
} from '@langchain/core/language_models/chat_models';
import { AIMessageChunk, BaseMessage } from '@langchain/core/messages';
import { ChatGenerationChunk } from '@langchain/core/outputs';
import {
  GenerationConfig,
  LLMPipeline,
  StreamingStatus,
} from 'openvino-genai-node';

export interface ChatOpenVINOParams extends BaseLanguageModelCallOptions {
  generationConfig?: GenerationConfig,
  modelPath: string,
  device?: string,
}

export class ChatOpenVINO extends SimpleChatModel {
  generateOptions: GenerationConfig;

  path: string;

  device: string;

  pipeline: Promise<any>;

  constructor(params: ChatOpenVINOParams) {
    super(params);
    this.path = params.modelPath;
    this.device = params.device || 'CPU';
    this.pipeline = LLMPipeline(this.path, this.device);
    this.generateOptions = params.generationConfig || {};
  }
  _llmType() {
    return 'OpenVINO';
  }
  private convertMessages(messages: BaseMessage[]): string {
    return messages
      .map((msg) => `${msg.getType().toUpperCase()}: "${msg.content}"`)
      .join('\n');
  }
  async _call(
    messages: BaseMessage[],
    options: this['ParsedCallOptions'],
    runManager?: CallbackManagerForLLMRun,
  ): Promise<string> {
    if (!messages.length) {
      throw new Error('No messages provided.');
    }
    if (typeof messages[0].content !== 'string') {
      throw new Error('Multimodal messages are not supported.');
    }
    const pipeline = await this.pipeline;

    // Signal setup
    const signals: AbortSignal[] = [];
    if (options.signal) {
      signals.push(options.signal);
    }
    if (options.timeout) {
      signals.push(AbortSignal.timeout(options.timeout));
    }
    const signal = AbortSignal.any(signals);

    // generation option setup
    const generateOptions: GenerationConfig = { ...this.generateOptions };
    if (options.stop) {
      const set = new Set(options.stop);
      generateOptions['stop_strings'] = set;
      generateOptions['include_stop_str_in_output'] = true;
    }

    // callback setup
    const callback = (chunk: string) => {
      runManager?.handleLLMNewToken(chunk).catch(console.error);

      return signal.aborted ? StreamingStatus.CANCEL : StreamingStatus.RUNNING;
    };

    const prompt = this.convertMessages(messages);

    const result = await pipeline.generate(
      prompt,
      generateOptions,
      callback,
    );
    // We need to throw an exception if the generation was canceled by a signal
    signal.throwIfAborted();

    return result;
  }

  async *_streamResponseChunks(
    messages: BaseMessage[],
    _options: this['ParsedCallOptions'],
    runManager?: CallbackManagerForLLMRun,
  ): AsyncGenerator<ChatGenerationChunk> {
    const pipeline = await this.pipeline;
    const prompt = this.convertMessages(messages);
    const generator = pipeline.stream(
      prompt,
      this.generateOptions,
    );
    for await (const chunk of generator) {
      yield new ChatGenerationChunk({
        message: new AIMessageChunk({
          content: chunk,
        }),
        text: chunk,
      });
      await runManager?.handleLLMNewToken(chunk);
    }
  }
}
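Because `ChatOpenVINO` implements both `_call` and `_streamResponseChunks`, the usual LangChain.js chat-model surface applies on top of it. Below is a small hedged sketch of how streaming and the `stop`/`timeout` call options handled in `_call` might be exercised; the model path is a placeholder:

```typescript
import { HumanMessage } from "@langchain/core/messages";
import { ChatOpenVINO } from "openvino-langchain";

const chat = new ChatOpenVINO({
  modelPath: "path-to-model", // placeholder
  device: "CPU",
  generationConfig: { "max_new_tokens": 100 },
});

// _streamResponseChunks backs the standard .stream() API:
// each chunk carries the incremental text produced by the pipeline.
const stream = await chat.stream([new HumanMessage("Tell me about OpenVINO.")]);
for await (const chunk of stream) {
  process.stdout.write(String(chunk.content));
}

// _call maps `stop` to stop_strings and `timeout` to an AbortSignal,
// so both can be passed as per-call options.
const reply = await chat.invoke([new HumanMessage("List three devices.")], {
  stop: ["\n\n"],
  timeout: 30_000,
});
console.log(reply.content);
```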
Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 export * from './embeddings.js';
 export * from './llms.js';
+export * from './chat_models.js';
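With this line in place, the chat model resolves from the package root alongside the existing exports, for example (the model path is a placeholder):

```typescript
// All three classes are available from the package root after this change.
import { ChatOpenVINO, OpenVINO, OpenVINOEmbeddings } from "openvino-langchain";

const chat = new ChatOpenVINO({ modelPath: "path-to-model", device: "CPU" });
```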
