Commit df8de7a

feat: cover generation (#261)
* feat: cover generation
* docs: update docs
* docs: update icons
1 parent 5b92fa2 commit df8de7a

10 files changed: +166 -27 lines changed

README.md

Lines changed: 20 additions & 10 deletions
@@ -16,6 +16,8 @@
 <img src="assets/zhipu-color.svg" alt="Zhipu GLM-4V-PLUS" width="60" height="60" />
 <img src="assets/gemini-brand-color.svg" alt="Google Gemini 1.5 Pro" width="60" height="60" />
 <img src="assets/qwen-color.svg" alt="Qwen-2.5-72B-Instruct" width="60" height="60" />
+<img src="assets/minimax-color.svg" alt="Minimax" width="20" height="60" />
+<img src="assets/minimax-text.svg" alt="Minimax" width="60" height="60" />
 
 </div>
 
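@@ -41,6 +43,8 @@
 - `Qwen-2.5-72B-Instruct`
 - **( :tada: NEW) Persistent login / download / upload of videos (multi-part submissions supported)**: [bilitool](https://github.com/timerring/bilitool) is now open source. It provides persistent login, video and danmaku download (including multi-part videos), video upload (optionally as a multi-part submission), submission status queries, detailed information queries, and more. It installs with a single pip command and can be used from the command-line CLI or called as an API.
 - **( :tada: NEW) Automatic multi-platform looping live streaming**: [looplive](https://github.com/timerring/looplive), now open source, is a fully automated 24/7 tool for **looping a stream to multiple platforms at the same time**.
+- **( :tada: NEW) Automatic generation of style-transferred video covers**: uses an image-to-image multimodal model to grab a video screenshot automatically and upload the style-transferred cover.
+- `Minimax image-01`
 
 The project architecture and workflow are as follows:
 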
@@ -144,11 +148,11 @@ pip install -r requirements.txt
 
 ##### 3.1.1 Using the API
 
-Set the `ASR_METHOD` parameter in `src/config.py` to `api`, then set `WHISPER_API_KEY` to your [API Key](https://console.groq.com/keys). This project uses the `whisper-large-v3-turbo` model on groq's free tier, which limits uploads to 40 MB (about half an hour), so if you use API recognition, set the recording segment length to 30 minutes. The free tier is also rate-limited to 7200 seconds / 20 requests per hour and 28800 seconds / 2000 requests per day. If you need more, you can upgrade to the dev tier; see the [groq documentation](https://console.groq.com/docs/rate-limits) for details.
+Set the `ASR_METHOD` parameter in `settings.toml` to `api`, then set `WHISPER_API_KEY` to your [API Key](https://console.groq.com/keys). This project uses the `whisper-large-v3-turbo` model on groq's free tier, which limits uploads to 40 MB (about half an hour), so if you use API recognition, set the recording segment length to 30 minutes. The free tier is also rate-limited to 7200 seconds / 20 requests per hour and 28800 seconds / 2000 requests per day. If you need more, you can upgrade to the dev tier; see the [groq documentation](https://console.groq.com/docs/rate-limits) for details.
 
 ##### 3.1.2 Local deployment (requires an NVIDIA GPU)
 
-Set the `ASR_METHOD` parameter in `src/config.py` to `deploy`, then download the required model file and place it in the `src/subtitle/models` folder.
+Set the `ASR_METHOD` parameter in `settings.toml` to `deploy`, then download the required model file and place it in the `src/subtitle/models` folder.
 
 The project uses the [`small`](https://openaipublic.azureedge.net/main/whisper/models/9ecf779972d90ba49c06d968637d720dd632c55bbf19d441fb42bf17a411e794/small.pt) model by default; click the link to download the file and place it in the `src/subtitle/models` folder.
 
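@@ -160,7 +164,7 @@ pip install -r requirements.txt
 
 ##### 3.2 MLLM models
 
-The MLLM model is mainly used to generate titles for the slices produced by automatic slicing. This feature is off by default; to enable it, set the `AUTO_SLICE` parameter in `src/config.py` to `True`. The other options are:
+The MLLM model is mainly used to generate titles for the slices produced by automatic slicing. This feature is off by default; to enable it, set the `AUTO_SLICE` parameter in `settings.toml` to `True`. The other options are:
 - `SLICE_DURATION` sets the slice duration in seconds (more than 60 seconds is not recommended).
 - `SLICE_NUM` sets the number of slices.
 - `SLICE_OVERLAP` sets the overlap between slices. Slicing uses a sliding-window method; see [auto-slice-video](https://github.com/timerring/auto-slice-video) for details.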
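To illustrate how `SLICE_DURATION` and `SLICE_OVERLAP` interact in a sliding-window scheme, here is a minimal sketch. It is an assumption about the general technique, not the actual auto-slice-video algorithm: the function name `sliding_windows` is hypothetical, and the step between windows is assumed to be duration minus overlap.

```python
def sliding_windows(total_seconds: int, duration: int, overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) windows of `duration` seconds that overlap by `overlap` seconds."""
    if not 0 <= overlap < duration:
        raise ValueError("overlap must be non-negative and smaller than duration")
    step = duration - overlap  # how far each window advances
    windows = []
    start = 0
    # Only full-length windows are kept; a trailing partial window is dropped.
    while start + duration <= total_seconds:
        windows.append((start, start + duration))
        start += step
    return windows


windows = sliding_windows(total_seconds=150, duration=60, overlap=30)
print(windows)
```

A selection step (for example, keeping the `SLICE_NUM` best candidates by some score) would then run over these windows; how the project scores them is not shown in this commit.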
@@ -169,21 +173,27 @@ The MLLM model is mainly used to generate titles for the slices produced by automatic slicing; this feature is …
 
 ##### 3.2.1 GLM-4V-PLUS model
 
-> To use the GLM-4V-PLUS model, set the `MLLM_MODEL` parameter in `src/config.py` to `zhipu`
+> To use the GLM-4V-PLUS model, set the `MLLM_MODEL` parameter in `settings.toml` to `zhipu`
 
-The project's automatic slicing feature uses Zhipu's [`GLM-4V-PLUS`](https://bigmodel.cn/dev/api/normal-model/glm-4) model. Please [register an account](https://www.bigmodel.cn/invite?icode=shBtZUfNE6FfdMH1R6NybGczbXFgPRGIalpycrEwJ28%3D), apply for an API key, and fill it into the corresponding `ZHIPU_API_KEY` in `src/config.py`.
+The project's automatic slicing feature uses Zhipu's [`GLM-4V-PLUS`](https://bigmodel.cn/dev/api/normal-model/glm-4) model. Please [register an account](https://www.bigmodel.cn/invite?icode=shBtZUfNE6FfdMH1R6NybGczbXFgPRGIalpycrEwJ28%3D), apply for an API key, and fill it into the corresponding `ZHIPU_API_KEY` in `settings.toml`.
 
 ##### 3.2.2 Gemini model
 
-> To use the Gemini-2.0-flash model, set the `MLLM_MODEL` parameter in `src/config.py` to `gemini`
+> To use the Gemini-2.0-flash model, set the `MLLM_MODEL` parameter in `settings.toml` to `gemini`
 
-The project's automatic slicing feature uses the Gemini-2.0-flash model. Please [register an account](https://aistudio.google.com/app/apikey), apply for an API key, and fill it into the corresponding `GEMINI_API_KEY` in `src/config.py`.
+The project's automatic slicing feature uses the Gemini-2.0-flash model. Please [register an account](https://aistudio.google.com/app/apikey), apply for an API key, and fill it into the corresponding `GEMINI_API_KEY` in `settings.toml`.
 
 ##### 3.2.3 Qwen model
 
-> To use the Qwen-2.5-72B-Instruct model, set the `MLLM_MODEL` parameter in `src/config.py` to `qwen`
+> To use the Qwen-2.5-72B-Instruct model, set the `MLLM_MODEL` parameter in `settings.toml` to `qwen`
 
-The project's automatic slicing feature uses the Qwen-2.5-72B-Instruct model. Please [register an account](https://bailian.console.aliyun.com/?apiKey=1), apply for an API key, and fill it into the corresponding `QWEN_API_KEY` in `src/config.py`.
+The project's automatic slicing feature uses the Qwen-2.5-72B-Instruct model. Please [register an account](https://bailian.console.aliyun.com/?apiKey=1), apply for an API key, and fill it into the corresponding `QWEN_API_KEY` in `settings.toml`.
+
+##### 3.2.4 Minimax model
+
+> To use the Minimax model, set the `generate_cover` parameter in `settings.toml` to `true` and the `IMAGE_GEN_MODEL` parameter to `minimax`
+
+The project's cover generation feature uses the Minimax model. Please [register an account](https://www.minimax.chat/), apply for an API key, and fill it into the corresponding `MINIMAX_API_KEY` in `settings.toml`.
 
 #### 4. bilitool login
 
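@@ -248,7 +258,7 @@ logs # log folder
 #### 8. Configure upload parameters
 
 > [!TIP]
-> The default upload parameters are listed below; everything in [] is replaced automatically. You can customize them in `src/config.py`. The mapped keywords are `{artist}`, `{date}`, `{title}`, and `{source_link}`, which you can combine or drop to build your own template:
+> The default upload parameters are listed below; everything in [] is replaced automatically. You can customize them in `settings.toml`. The mapped keywords are `{artist}`, `{date}`, `{title}`, and `{source_link}`, which you can combine or drop to build your own template:
 > + The title template is `{artist}直播回放-{date}-{title}`, which renders as "[Danmaku + subtitles] [XXX] live replay - [date] - [live room title]"; you can change it as needed.
 > + The description template is `{artist}直播,直播间地址:{source_link} 内容仅供娱乐,直播中主播的言论、观点和行为均由主播本人负责,不代表录播员的观点或立场。`, which renders as "[Danmaku + subtitles] [XXX] live stream, room URL: [https://live.bilibili.com/XXX]. This content is for entertainment only; the streamer's statements, opinions, and actions during the stream are the streamer's own responsibility and do not represent the views or position of the recorder."; you can change it as needed.
 > + The default tags are trending search terms scraped automatically from Bilibili's search suggestions for the streamer's name.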
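The keyword substitution described in the tip can be sketched with plain `str.format`. This is only an illustration: the field values below are made up, and how the project actually fills the template (and where the bracketed prefix in the rendered title comes from) is not shown in this commit.

```python
# Title template as quoted in the README; the placeholders are the mapped keywords.
TITLE_TEMPLATE = "{artist}直播回放-{date}-{title}"

# Hypothetical values for illustration only.
fields = {
    "artist": "XXX",
    "date": "2025-01-01",
    "title": "room title",
    "source_link": "https://live.bilibili.com/XXX",
}

# str.format ignores keyword arguments the template does not use,
# so one dict can serve both the title and the description template.
title = TITLE_TEMPLATE.format(**fields)
print(title)
```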

assets/minimax-color.svg

Lines changed: 1 addition & 0 deletions

assets/minimax-text.svg

Lines changed: 1 addition & 0 deletions

settings.toml

Lines changed: 5 additions & 0 deletions
@@ -38,6 +38,11 @@ zhipu_api_key = "" # Apply for your own GLM-4v-Plus API key at https://www.bigmo
 gemini_api_key = "" # Apply for your own Gemini API key at https://aistudio.google.com/app/apikey
 qwen_api_key = "" # Apply for your own Qwen API key at https://bailian.console.aliyun.com/?apiKey=1
 
+[cover]
+generate_cover = false # whether to generate cover
+image_gen_model = "minimax" # the image generation model, can be "minimax"
+minimax_api_key = "" # Apply for your own Minimax API key at https://platform.minimaxi.com/user-center/basic-information/interface-key
+
 # blrec Settings
 [[tasks]]
 room_id = 173551

src/config.py

Lines changed: 4 additions & 0 deletions
@@ -71,3 +71,7 @@ def get_interface_config():
 ZHIPU_API_KEY = config.get('slice', {}).get('zhipu_api_key')
 GEMINI_API_KEY = config.get('slice', {}).get('gemini_api_key')
 QWEN_API_KEY = config.get('slice', {}).get('qwen_api_key')
+
+GENERATE_COVER = config.get('cover', {}).get('generate_cover')
+IMAGE_GEN_MODEL = config.get('cover', {}).get('image_gen_model')
+MINIMAX_API_KEY = config.get('cover', {}).get('minimax_api_key')

src/cover/__init__.py

Whitespace-only changes.

src/cover/cover_generator.py

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+from functools import wraps
+from src.log.logger import upload_log
+from src.config import IMAGE_GEN_MODEL
+import subprocess
+
+def cut_cover_use_ffmpeg(video_path):
+    """Extract a cover frame from the video with ffmpeg
+    Args:
+        video_path: str, path to the video file
+    Returns:
+        str: path to the extracted cover image, or None on failure
+    """
+    upload_log.info("begin to generate cover")
+    cover_path = video_path[:-4] + ".jpg"
+    ffmpeg_command = [
+        'ffmpeg', '-y', '-i', video_path, '-t', '1', '-r', '1', cover_path
+    ]
+    try:
+        result = subprocess.run(ffmpeg_command, check=True, capture_output=True, text=True)
+        upload_log.debug(f"FFmpeg output: {result.stdout}")
+        if result.stderr:
+            upload_log.debug(f"FFmpeg debug: {result.stderr}")
+        return cover_path
+    except subprocess.CalledProcessError as e:
+        upload_log.error(f"Error: {e.stderr}")
+        return None
+
+
+def cover_generator(model_type):
+    """Decorator to select cover generation function based on model type
+    Args:
+        model_type: str, type of model to use
+    Returns:
+        function: wrapped cover generation function
+    """
+    def decorator(func):
+        def wrapper(video_path):
+            cover_path = cut_cover_use_ffmpeg(video_path)
+            if cover_path is None:
+                upload_log.error("Failed to generate cover using ffmpeg")
+                return None
+            if model_type == "minimax":
+                from .image_model_sdk.minimax_sdk import minimax_generate_cover
+                return minimax_generate_cover(cover_path)
+            else:
+                upload_log.error(f"Unsupported model type: {model_type}")
+                return None
+        return wrapper
+    return decorator
+
+@cover_generator(IMAGE_GEN_MODEL)
+def generate_cover(video_path):
+    """Generate cover for video
+    Args:
+        video_path: str, path to the video file
+    Returns:
+        str: path to the generated cover image, or None on failure
+    """
+    pass # The actual implementation is handled by the decorator
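The decorator-based dispatch in `cover_generator.py` can be illustrated in isolation. The sketch below is not the commit's code: the `BACKENDS` registry, `register_backend`, and the `echo` backend are hypothetical stand-ins, and it applies `functools.wraps` (imported but unused in the commit) so the wrapped stub keeps its name and docstring.

```python
from functools import wraps
from typing import Callable, Optional

# Hypothetical registry mapping a model name to a backend function.
BACKENDS: dict[str, Callable[[str], Optional[str]]] = {}


def register_backend(name: str):
    """Register a cover backend under `name` (illustrative helper)."""
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


def cover_generator(model_type: str):
    """Replace the decorated stub with a call to the selected backend."""
    def decorator(func):
        @wraps(func)  # keep the stub's name and docstring
        def wrapper(frame_path: str) -> Optional[str]:
            backend = BACKENDS.get(model_type)
            if backend is None:
                return None  # unsupported model type
            return backend(frame_path)
        return wrapper
    return decorator


@register_backend("echo")
def echo_backend(frame_path: str) -> str:
    # Stand-in for a real image model: only derives a new file name.
    return frame_path + ".styled.png"


@cover_generator("echo")
def generate_cover(frame_path: str) -> Optional[str]:
    """Generate a cover (body supplied by the decorator)."""


@cover_generator("unknown")
def generate_cover_unsupported(frame_path: str) -> Optional[str]:
    """Same stub, wired to a model name with no backend."""
```

A registry keeps the wrapper free of per-model `if` branches, at the cost of requiring each backend module to be imported before use; the commit's lazy import inside the branch avoids that import cost instead.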

src/cover/image_model_sdk/minimax_sdk.py

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+import requests
+import json
+import base64
+import os
+import time
+from src.config import MINIMAX_API_KEY
+
+
+def minimax_generate_cover(your_file_path):
+    """Generate a cover image using the Minimax API
+    Args:
+        your_file_path: str, path to the image file
+    Returns:
+        str, local download path of the generated cover image file
+    """
+    cover_name = time.strftime("%Y%m%d%H%M%S") + ".png"
+    temp_cover_path = os.path.join(os.path.dirname(your_file_path), cover_name)
+
+    with open(your_file_path, "rb") as image_file:
+        data = base64.b64encode(image_file.read()).decode('utf-8')
+
+    payload = json.dumps({
+        "model": "image-01",
+        "prompt": "这是一个视频截图,请生成其对应的吉普力风格的图片",  # "This is a video screenshot; please generate a matching Ghibli-style image"
+        "subject_reference": [
+            {
+                "type": "character",
+                "image_file": f"data:image/jpeg;base64,{data}"
+            }
+        ],
+        "n": 2
+    })
+    headers = {
+        'Authorization': f'Bearer {MINIMAX_API_KEY}',
+        'Content-Type': 'application/json'
+    }
+
+    url = "https://api.minimax.chat/v1/image_generation"
+    response = requests.request("POST", url, headers=headers, data=payload).json()
+    if response['base_resp']['status_code'] == 0:
+        image_url = response['data']['image_urls'][0]
+        img_data = requests.get(image_url).content
+        with open(temp_cover_path, 'wb') as handler:
+            handler.write(img_data)
+        os.remove(your_file_path)
+        return temp_cover_path
+    else:
+        print(response['base_resp']['error_msg'])
+        return None
+
+if __name__ == "__main__":
+    your_file_path = ""
+    print(minimax_generate_cover(your_file_path))
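The request body built in the SDK file can be exercised without network access. The sketch below mirrors the field names used in the commit (`model`, `prompt`, `subject_reference`, `n`) but is not an authoritative description of the Minimax API; `build_payload` is a hypothetical helper, and the three-byte input merely stands in for a JPEG frame.

```python
import base64
import json


def build_payload(image_bytes: bytes, prompt: str, n: int = 1) -> str:
    """Build a JSON request body with the image embedded as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({
        "model": "image-01",
        "prompt": prompt,
        "subject_reference": [
            {"type": "character", "image_file": f"data:image/jpeg;base64,{encoded}"}
        ],
        "n": n,
    })


# Round-trip check: the embedded data URI decodes back to the original bytes.
body = json.loads(build_payload(b"\xff\xd8\xff", "example prompt"))
data_uri = body["subject_reference"][0]["image_file"]
decoded = base64.b64decode(data_uri.split(",", 1)[1])
```

The commit requests `n = 2` images but downloads only the first URL, so a default of 1 may be enough if the API accepts it; that is an observation about the diff, not a tested claim.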

src/log/retry.py

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ def run(self, func, *args, **kwargs) -> Tuple[bool, Any]:
                 status = (True,return_value)
                 break
             except Exception as e:
-                scan_log.error(f"Exceptions in trial {i+1}/{self.max_retry} : {e}")
+                scan_log.error(f"Exceptions in function {func.__name__} trial {i+1}/{self.max_retry} : {e}")
                 sleep(self.interval)
 
         return status
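For context, a self-contained sketch of a retry helper with the same shape as the lines visible in this hunk (a `run` method returning `(success, value)` and a `decorator` method used as `@Retry(...).decorator`). Everything outside the visible lines is an assumption, including the constructor defaults and the use of `print` in place of `scan_log`.

```python
from functools import wraps
from time import sleep
from typing import Any, Callable, Tuple


class Retry:
    def __init__(self, max_retry: int = 3, interval: float = 0.0):
        self.max_retry = max_retry
        self.interval = interval

    def run(self, func: Callable, *args, **kwargs) -> Tuple[bool, Any]:
        status: Tuple[bool, Any] = (False, None)
        for i in range(self.max_retry):
            try:
                status = (True, func(*args, **kwargs))
                break
            except Exception as e:
                # Including func.__name__ shows which wrapped function failed.
                print(f"Exceptions in function {func.__name__} trial {i+1}/{self.max_retry} : {e}")
                sleep(self.interval)
        return status

    def decorator(self, func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            return self.run(func, *args, **kwargs)
        return wrapper


attempts = []


@Retry(max_retry=3, interval=0).decorator
def flaky():
    """Fail twice, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = flaky()
```

Note that a helper of this shape retries only on exceptions: a function that catches its own errors and returns `False` is reported as a successful call.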

src/upload/upload.py

Lines changed: 22 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -3,26 +3,31 @@
33
import subprocess
44
import os
55
import sys
6-
from src.config import SRC_DIR, BILIVE_DIR, RESERVE_FOR_FIXING, UPLOAD_LINE
6+
from src.config import SRC_DIR, BILIVE_DIR, RESERVE_FOR_FIXING, UPLOAD_LINE, GENERATE_COVER
77
from datetime import datetime
88
from src.upload.generate_upload_data import generate_video_data, generate_slice_data
99
from src.upload.extract_video_info import generate_title
10-
from src.log.logger import upload_log
10+
from src.log.logger import upload_log, scan_log
1111
import time
1212
from concurrent.futures import ThreadPoolExecutor, as_completed
1313
from db.conn import get_single_upload_queue, delete_upload_queue, update_upload_queue_lock, get_single_lock_queue
1414
from .bilitool.bilitool import UploadController, FeedController, LoginController
1515
from src.log.retry import Retry
16+
from src.cover.cover_generator import generate_cover
1617

1718
@Retry(max_retry = 3, interval = 5).decorator
1819
def upload_video(upload_path):
1920
try:
2021
if upload_path.endswith('.flv'):
2122
copyright, title, tid, tag = generate_slice_data(upload_path)
22-
yaml, desc, source, cover, dynamic = ("",) * 5
23+
if GENERATE_COVER:
24+
cover = generate_cover(upload_path)
25+
else:
26+
cover = ""
27+
yaml, desc, source, dynamic = ("",) * 4
2328
if title is None:
24-
upload_log.error("Fail to upload slice video, the files will be reserved.")
25-
update_upload_queue_lock(upload_path, 0)
29+
upload_log.error("Fail to upload slice video, the files will be locked.")
30+
update_upload_queue_lock(upload_path, 1)
2631
return False
2732
else:
2833
copyright, title, desc, tid, tag, source, cover, dynamic = generate_video_data(upload_path)
@@ -31,16 +36,17 @@ def upload_video(upload_path):
         if result == True:
             upload_log.info("Upload successfully, then delete the video")
             os.remove(upload_path)
+            if cover:
+                os.remove(cover)
             delete_upload_queue(upload_path)
             return True
         else:
-            upload_log.error("Fail to upload, the files will be reserved.")
-            update_upload_queue_lock(upload_path, 0)
+            upload_log.error("Fail to upload, the files will be locked.")
+            update_upload_queue_lock(upload_path, 1)
             return False
-
-    except subprocess.CalledProcessError as e:
-        upload_log.error(f"The upload_video called failed, the files will be reserved. error: {e}")
-        update_upload_queue_lock(upload_path, 0)
+    except Exception as e:
+        upload_log.error(f"The upload_video called failed, the files will be converted to locked. error: {e}")
+        update_upload_queue_lock(upload_path, 1)
         return False
 
 @Retry(max_retry = 3, interval = 5).decorator
@@ -54,13 +60,13 @@ def append_upload(upload_path, bv_result):
             delete_upload_queue(upload_path)
             return True
         else:
-            upload_log.error("Fail to append, the files will be reserved.")
-            update_upload_queue_lock(upload_path, 0)
+            upload_log.error("Fail to append, the files will be locked.")
+            update_upload_queue_lock(upload_path, 1)
             return False
 
-    except subprocess.CalledProcessError as e:
-        upload_log.error(f"The append_upload called failed, the files will be reserved. error: {e}")
-        update_upload_queue_lock(upload_path, 0)
+    except Exception as e:
+        upload_log.error(f"The append_upload called failed, the files will be locked. error: {e}")
+        update_upload_queue_lock(upload_path, 1)
         return False
 
 def video_gate(video_path):
