Commit f36aa71

bump version to v0.10.2 (#4062)

* bump version to v0.10.2
* fix side effect

1 parent 1931670 · commit f36aa71

6 files changed: +15 -17 lines changed

README.md (1 addition, 1 deletion)

````diff
@@ -212,7 +212,7 @@ The default prebuilt package is compiled on **CUDA 12** since v0.3.0.
 For the GeForce RTX 50 series, please install the LMDeploy prebuilt package complied with **CUDA 12.8**
 
 ```shell
-export LMDEPLOY_VERSION=0.10.1
+export LMDEPLOY_VERSION=0.10.2
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````

README_zh-CN.md (1 addition, 1 deletion)

````diff
@@ -213,7 +213,7 @@ pip install lmdeploy
 若使用 GeForce RTX 50 系列显卡,请安装基于 **CUDA 12.8** 编译的 LMDeploy 预编译包。
 
 ```shell
-export LMDEPLOY_VERSION=0.10.1
+export LMDEPLOY_VERSION=0.10.2
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````

benchmark/profile_throughput.py (8 additions, 10 deletions)

```diff
@@ -163,7 +163,7 @@ async def _inference(self, req_queue: Queue, session_id: int, temperature: float
 
         state = DetokenizeState(len(input_ids))
 
-        prev_len = 0
+        n_token = 0
         token_ids = input_ids.copy()
 
         generator = model_inst.async_stream_infer(session_id,
@@ -178,15 +178,13 @@ async def _inference(self, req_queue: Queue, session_id: int, temperature: float
                                                   stream_output=stream_output)
         try:
             async for outputs in generator:
-                n_token = outputs.num_token
-                if n_token > prev_len:
-                    token_ids += outputs.token_ids[prev_len - n_token:]
-                    if not skip_detokenize:
-                        _, state = self.tokenizer.detokenize_incrementally(token_ids, state)
-                    sess.tick(n_token)
-                prev_len = n_token
-                if n_token > cancel_after:
-                    break
+                n_token += outputs.num_token
+                token_ids += outputs.token_ids
+                if not skip_detokenize:
+                    _, state = self.tokenizer.detokenize_incrementally(token_ids, state)
+                sess.tick(n_token)
+                if n_token > cancel_after:
+                    break
             sess.finish(Session.SUCCESS)
         finally:
             await generator.aclose()
```
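The "fix side effect" part of this commit changes how the benchmark accounts for streamed tokens: the old loop treated `outputs.num_token` as a cumulative total and sliced out only the new tail of `outputs.token_ids`, while the new loop assumes each chunk reports just its own increment. A minimal sketch of the new incremental accounting style, outside any LMDeploy API (the `Chunk` class and sample data here are hypothetical illustrations, not LMDeploy types):

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    num_token: int        # tokens produced in this chunk only (incremental)
    token_ids: List[int]  # the new token ids carried by this chunk


def consume_incremental(chunks: List[Chunk],
                        cancel_after: int = 10**9) -> Tuple[int, List[int]]:
    """Sum per-chunk counts into a running total and concatenate ids,
    stopping once the total exceeds the cancellation threshold."""
    n_token: int = 0
    token_ids: List[int] = []
    for c in chunks:
        n_token += c.num_token
        token_ids += c.token_ids
        if n_token > cancel_after:
            break
    return n_token, token_ids


chunks = [Chunk(2, [5, 7]), Chunk(1, [9]), Chunk(3, [2, 4, 6])]
total, ids = consume_incremental(chunks)
# total == 6, ids == [5, 7, 9, 2, 4, 6]
```

With `cancel_after=2` the loop stops after the second chunk, mirroring the benchmark's early-cancel path: `consume_incremental(chunks, cancel_after=2)` yields `(3, [5, 7, 9])`.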

docs/en/get_started/installation.md (2 additions, 2 deletions)

````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy by:
 
 ```shell
-export LMDEPLOY_VERSION=0.10.1
+export LMDEPLOY_VERSION=0.10.2
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
@@ -51,7 +51,7 @@ DISABLE_TURBOMIND=1 pip install git+https://github.com/InternLM/lmdeploy.git
 If you prefer a specific version instead of the `main` branch of LMDeploy, you can specify it in your command:
 
 ```shell
-pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.1.zip
+pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.2.zip
 ```
 
 If you want to build LMDeploy with support for Ascend, Cambricon, or MACA, install LMDeploy with the corresponding `LMDEPLOY_TARGET_DEVICE` environment variable.
````

docs/zh_cn/get_started/installation.md (2 additions, 2 deletions)

````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 默认的预构建包是在 **CUDA 12** 上编译的。如果需要 CUDA 11+ (>=11.3),你可以使用以下命令安装 lmdeploy:
 
 ```shell
-export LMDEPLOY_VERSION=0.10.1
+export LMDEPLOY_VERSION=0.10.2
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
@@ -51,7 +51,7 @@ DISABLE_TURBOMIND=1 pip install git+https://github.com/InternLM/lmdeploy.git
 如果您希望使用特定版本,而不是 LMDeploy 的 `main` 分支,可以在命令行中指定:
 
 ```shell
-pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.1.zip
+pip install https://github.com/InternLM/lmdeploy/archive/refs/tags/v0.10.2.zip
 ```
 
 如果您希望构建支持昇腾、寒武纪或沐熙的 LMDeploy,请使用相应的 `LMDEPLOY_TARGET_DEVICE` 环境变量进行安装。
````

lmdeploy/version.py (1 addition, 1 deletion)

```diff
@@ -1,7 +1,7 @@
 # Copyright (c) OpenMMLab. All rights reserved.
 from typing import Tuple
 
-__version__ = '0.10.1'
+__version__ = '0.10.2'
 short_version = __version__
 
 
```
