Commit ee30819

Merge pull request #313 from Ljzd-PRO/devel
Bump to v0.19.0
2 parents 6462958 + 1212b27 commit ee30819

File tree

10 files changed

+479
-43
lines changed

CHANGELOG.md

Lines changed: 115 additions & 10 deletions
@@ -1,24 +1,129 @@
 ## Changes
 
-![Downloads](https://img.shields.io/github/downloads/Ljzd-PRO/KToolBox/v0.18.2/total)
+![Downloads](https://img.shields.io/github/downloads/Ljzd-PRO/KToolBox/v0.19.0/total)
+
+### 💡 Feature
+
+- Add **`--keywords-exclude`** parameter for post filtering - #309
+  ```shell
+  # Method 1: Include only posts whose titles match the given keywords
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords="release"
+
+  # Method 2: Exclude posts whose titles match any unwanted keyword (OR logic)
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords_exclude="announcement,vote,share"
+
+  # Method 3: Combined filtering (most flexible)
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords="ブルアカ" --keywords_exclude="全体公開,結果発表"
+  ```
+  - The `--keywords` and `--keywords-exclude` filters can now also be set in the configuration
+  - New configuration options:
+    - `job.keywords`: Keyword filtering (default is empty)
+    - `job.keywords_exclude`: Keyword exclusion (default is empty)
+  - You can edit these configurations by running `ktoolbox config-editor` (`Job -> ...`)
+  - Or manually edit them in the `.env` file or environment variables
+    ```dotenv
+    KTOOLBOX_JOB__KEYWORDS='["expression", "sound effect variation"]'
+    KTOOLBOX_JOB__KEYWORDS_EXCLUDE='["public", "result announcement"]'
+    ```
+  - 📖More information: [Configuration-Reference-JobConfiguration](https://ktoolbox.readthedocs.io/latest/configuration/reference/#ktoolbox.configuration.JobConfiguration)
+- Add **year/month** **grouping** functionality for post organization - #306
+  - You can group downloaded posts by year and month with customizable directory naming formats
+  - New configuration options:
+    - `job.group_by_year`: Enable grouping by year (Disabled by default)
+    - `job.group_by_month`: Enable grouping by month (Disabled by default)
+    - `job.year_dirname_format`: Customize the directory naming format for year grouping (Defaults to `{year}`)
+    - `job.month_dirname_format`: Customize the directory naming format for month grouping (Defaults to `{year}-{month:02d}`)
+  - Run `ktoolbox config-editor` to edit these configurations (`Job -> ...`)
+  - Or manually edit them in the `.env` file or environment variables
+    ```dotenv
+    # Enable grouping (defaults to False)
+    KTOOLBOX_JOB__GROUP_BY_YEAR=True
+    KTOOLBOX_JOB__GROUP_BY_MONTH=True
+
+    # Custom style naming
+    KTOOLBOX_JOB__YEAR_DIRNAME_FORMAT="Year {year}"
+    KTOOLBOX_JOB__MONTH_DIRNAME_FORMAT="Month {month:02d}"
+    ```
+    Resulting directory structure:
+    ```
+    creator/
+    ├── Year 2020/
+    │   ├── Month 01/
+    │   │   └── post_title/
+    │   └── Month 12/
+    │       └── another_post/
+    └── Year 2021/
+        └── Month 03/
+            └── latest_post/
+    ```
+  - 📖More information: [Configuration-Reference-JobConfiguration](https://ktoolbox.readthedocs.io/latest/configuration/reference/#ktoolbox.configuration.JobConfiguration)
 
-[//]: # (### 💡 Feature)
 
 ### 🪲 Fix
 
-- Fixed the issue where **warning messages** were displayed regardless of whether **`job.include_revisions`** was enabled or not.
-- Fixed the issue where extracted external links contained extra characters (v0.18.0)
-  - Related configuration options: `job.extract_external_links`, `job.external_link_patterns`
+- Fixed the issue where the `--keywords` parameter could not be parsed correctly in the `sync-creator` command
 
 - - -
 
-[//]: # (### 💡 新特性)
+### 💡 新特性
+
+- 新增 **`--keywords-exclude`** 参数用于帖子筛选 - #309
+  ```shell
+  # 方法1:仅包含特定关键词的帖子
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords="发布"
+
+  # 方法2:排除不需要的关键词帖子(或逻辑)
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords_exclude="公告,投票,分享"
+
+  # 方法3:组合筛选(最灵活)
+  ktoolbox sync_creator --url="https://kemono.cr/fanbox/user/32165989" --keywords="ブルアカ" --keywords_exclude="全体公開,結果発表"
+  ```
+  - 关键词筛选和关键词排除的 `--keywords` 和 `--keywords-exclude` 功能现在也可以在配置中设置
+  - 新配置项:
+    - `job.keywords`:关键词筛选(默认为空)
+    - `job.keywords_exclude`:关键词排除(默认为空)
+  - 可通过运行 `ktoolbox config-editor` 编辑这些配置(`Job -> ...`)
+  - 或手动在 `.env` 文件或环境变量中编辑
+    ```dotenv
+    KTOOLBOX_JOB__KEYWORDS='["表情", "効果音差分"]'
+    KTOOLBOX_JOB__KEYWORDS_EXCLUDE='["全体公開", "結果発表"]'
+    ```
+  - 📖更多信息:[配置参考-JobConfiguration](https://ktoolbox.readthedocs.io/latest/configuration/reference/#ktoolbox.configuration.JobConfiguration)
+- 新增按**年份/月**分组功能用于帖子整理 - #306
+  - 可按年份和月份分组下载的帖子,支持自定义目录命名格式
+  - 新配置项:
+    - `job.group_by_year`:启用按年份分组(默认关闭)
+    - `job.group_by_month`:启用按月份分组(默认关闭)
+    - `job.year_dirname_format`:自定义年份分组目录命名格式(默认为 `{year}`)
+    - `job.month_dirname_format`:自定义月份分组目录命名格式(默认为 `{year}-{month:02d}`)
+  - 可通过运行 `ktoolbox config-editor` 编辑这些配置(`Job -> ...`)
+  - 或手动在 `.env` 文件或环境变量中编辑
+    ```dotenv
+    # 是否启用(默认 False)
+    KTOOLBOX_JOB__GROUP_BY_YEAR=True
+    KTOOLBOX_JOB__GROUP_BY_MONTH=True
+
+    # 自定义目录命名
+    KTOOLBOX_JOB__YEAR_DIRNAME_FORMAT="{year}年"
+    KTOOLBOX_JOB__MONTH_DIRNAME_FORMAT="{month:02d}月"
+    ```
+    目录结构示例:
+    ```
+    creator/
+    ├── 2020年/
+    │   ├── 01月/
+    │   │   └── post_title/
+    │   └── 12月/
+    │       └── another_post/
+    └── 2021年/
+        └── 03月/
+            └── latest_post/
+    ```
+  - 📖更多信息:[配置参考-JobConfiguration](https://ktoolbox.readthedocs.io/latest/configuration/reference/#ktoolbox.configuration.JobConfiguration)
 
 ### 🪲 修复
 
-- 修复了无论是否开启 **`job.include_revisions`** 都会提示**警告信息**的问题
-- 修复了程序提取的外部链接(external links)包含多余字符的问题 (v0.18.0)
-  - 相关配置选项:`job.extract_external_links`, `job.external_link_patterns`
+- 修复 `--keywords` 参数在 `sync-creator` 命令中无法正确解析的问题
 
 ## Upgrade
 
@@ -27,4 +132,4 @@ Use this command to upgrade if you are using **pipx**:
 pipx upgrade ktoolbox
 ```
 
-**Full Changelog**: https://github.com/Ljzd-PRO/KToolBox/compare/v0.18.1...v0.18.2
+**Full Changelog**: https://github.com/Ljzd-PRO/KToolBox/compare/v0.18.2...v0.19.0
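
Note on the include/exclude semantics described in the changelog: the sketch below shows one way the two filters could combine. It is not the KToolBox implementation. `filter_titles` is a hypothetical helper, and case-insensitive substring matching on the post title is an assumption drawn from the parameter descriptions in this commit.

```python
from typing import Iterable, List, Optional, Set


def filter_titles(
    titles: Iterable[str],
    keywords: Optional[Set[str]] = None,
    keywords_exclude: Optional[Set[str]] = None,
) -> List[str]:
    """Hypothetical helper: keep titles that match any include keyword
    (when given) and match none of the exclude keywords."""
    include = {k.lower() for k in (keywords or set())}
    exclude = {k.lower() for k in (keywords_exclude or set())}
    kept = []
    for title in titles:
        lowered = title.lower()
        if include and not any(k in lowered for k in include):
            continue  # no include keyword matched
        if any(k in lowered for k in exclude):
            continue  # one matching exclude keyword is enough (OR logic)
        kept.append(title)
    return kept


titles = ["Release v2", "Vote for next release", "Announcement", "Sketch dump"]
print(filter_titles(titles, {"release"}, {"announcement", "vote", "share"}))
```

Under these assumptions the combined call keeps only `Release v2`: the second title matches the include keyword but is dropped because it also matches `vote`.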

ktoolbox/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 __title__ = "KToolBox"
 # noinspection SpellCheckingInspection
 __description__ = "A useful CLI tool for downloading posts in Kemono.cr / .su / .party"
-__version__ = "v0.18.2"
+__version__ = "v0.19.0"

ktoolbox/_cli_zh.py

Lines changed: 11 additions & 4 deletions
@@ -173,7 +173,8 @@ async def sync_creator(
             mix_posts: bool = None,
             start_time: str = None,
             end_time: str = None,
-            keywords: str = None
+            keywords: str = None,
+            keywords_exclude: str = None
     ):
         """
         同步创作者所有帖子(通过 URL)
@@ -185,6 +186,7 @@ async def sync_creator(
         :param start_time: 帖子发布时间范围起始
         :param end_time: 帖子发布时间范围结束
         :param keywords: 按标题过滤帖子,逗号分隔关键词
+        :param keywords_exclude: 按标题排除帖子,逗号分隔关键词
         """
         ...
 
@@ -199,7 +201,8 @@ async def sync_creator(
             mix_posts: bool = None,
             start_time: str = None,
             end_time: str = None,
-            keywords: str = None
+            keywords: str = None,
+            keywords_exclude: str = None
     ):
         """
         同步创作者所有帖子(通过参数)
@@ -212,6 +215,7 @@ async def sync_creator(
         :param start_time: 帖子发布时间范围起始
         :param end_time: 帖子发布时间范围结束
         :param keywords: 按标题过滤帖子,逗号分隔关键词
+        :param keywords_exclude: 按标题排除帖子,逗号分隔关键词
         """
         ...
 
@@ -228,7 +232,8 @@ async def sync_creator(
             end_time: str = None,
             offset: int = 0,
             length: int = None,
-            keywords: str = None
+            keywords: str = None,
+            keywords_exclude: str = None
     ):
         """
         同步创作者所有帖子
@@ -248,6 +253,7 @@ async def sync_creator(
         :param offset: 结果偏移量
         :param length: 获取帖子数量,默认为全部
         :param keywords: 按标题过滤帖子,逗号分隔关键词
+        :param keywords_exclude: 按标题排除帖子,逗号分隔关键词
         """
         return await super().sync_creator(
             url=url,
@@ -260,5 +266,6 @@ async def sync_creator(
             end_time=end_time,
             offset=offset,
             length=length,
-            keywords=keywords
+            keywords=keywords,
+            keywords_exclude=keywords_exclude
         )
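
Note on the comma-separated `keywords` / `keywords_exclude` strings accepted by the CLI wrapper: they have to become a set before they reach `create_job_from_creator`, which takes `Optional[Set[str]]`. A minimal sketch of that conversion follows. `parse_keywords` is a hypothetical name, not a function from this commit, and accepting an already split sequence is an assumption about what a CLI layer might hand over.

```python
from typing import Iterable, Optional, Set, Union


def parse_keywords(raw: Union[str, Iterable[str], None]) -> Optional[Set[str]]:
    """Hypothetical helper: normalise a CLI value into a set of keywords.

    Accepts a comma-separated string or an already split sequence and
    returns None when nothing usable remains, so callers can skip filtering.
    """
    if raw is None:
        return None
    parts = raw.split(",") if isinstance(raw, str) else [str(item) for item in raw]
    cleaned = {part.strip() for part in parts if part.strip()}
    return cleaned or None


print(parse_keywords("announcement, vote,,share"))
print(parse_keywords(("全体公開", "結果発表")))
```

Returning `None` for empty input keeps the `if keywords_exclude:` style of guard used in `ktoolbox/action/job.py` meaningful.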

ktoolbox/_configuration_zh.py

Lines changed: 11 additions & 0 deletions
@@ -101,6 +101,13 @@ class JobConfiguration(ktoolbox.configuration.JobConfiguration):
     | ``published``| 日期 |
     | ``edited`` | 日期 |
 
+    - ``year_dirname_format`` 和 ``month_dirname_format`` 可用属性
+
+    | 属性 | 类型 |
+    |--------------|--------|
+    | ``year`` | 字符串 |
+    | ``month`` | 字符串 |
+
     :ivar count: 并发下载的协程数量
     :ivar include_revisions: 下载时包含修订帖子
     :ivar post_dirname_format: 自定义帖子目录名格式,可使用 [属性][ktoolbox.configuration.JobConfiguration]。例如:``[{published}]{id}`` > ``[2024-1-1]123123``,``{user}_{published}_{title}`` > ``234234_2024-1-1_TheTitle``
@@ -113,6 +120,10 @@ class JobConfiguration(ktoolbox.configuration.JobConfiguration):
     :ivar block_list: 不下载匹配这些模式(Unix shell 风格)的文件,如 ``["*.psd","*.zip"]``
     :ivar extract_external_links: 从帖子内容中提取外部文件分享链接并保存到单独文件
     :ivar external_link_patterns: 用于提取外部链接的正则表达式模式
+    :ivar group_by_year: 根据发布日期按年分组到不同目录
+    :ivar group_by_month: 根据发布日期按月分组到不同目录(需要启用 group_by_year)
+    :ivar year_dirname_format: 自定义年份目录名格式。可用属性:``year``。例如:``{year}`` > ``2024``,``Year_{year}`` > ``Year_2024``
+    :ivar month_dirname_format: 自定义月份目录名格式。可用属性:``year``、``month``。例如:``{year}-{month}`` > ``2024-01``,``{year}_{month}`` > ``2024_01``
     """
     ...
 
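
Note on `year_dirname_format` and `month_dirname_format`: the templates read like ordinary Python format strings. The sketch below renders them with `str.format`, assuming `year` and `month` are passed as integers. `render_dirnames` is a hypothetical helper, and the defaults are the ones stated in the changelog.

```python
from typing import Tuple


def render_dirnames(
    year: int,
    month: int,
    year_dirname_format: str = "{year}",
    month_dirname_format: str = "{year}-{month:02d}",
) -> Tuple[str, str]:
    """Hypothetical helper: render the year and month directory names."""
    year_dir = year_dirname_format.format(year=year)
    # Unused fields are ignored by str.format, so "Month {month:02d}" also works.
    month_dir = month_dirname_format.format(year=year, month=month)
    return year_dir, month_dir


print(render_dirnames(2024, 1))
print(render_dirnames(2020, 12, "Year {year}", "Month {month:02d}"))
```

In plain `str.format`, a spec such as `{month:02d}` raises `ValueError` when the value is a string, so this sketch assumes integer values even though the attribute table above lists both as strings.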

ktoolbox/action/job.py

Lines changed: 21 additions & 4 deletions
@@ -11,7 +11,7 @@
 
 from ktoolbox._enum import PostFileTypeEnum, DataStorageNameEnum
 from ktoolbox.action import ActionRet, fetch_creator_posts, FetchInterruptError
-from ktoolbox.action.utils import generate_post_path_name, filter_posts_by_date, generate_filename, filter_posts_by_keywords
+from ktoolbox.action.utils import generate_post_path_name, filter_posts_by_date, generate_filename, filter_posts_by_keywords, filter_posts_by_keywords_exclude, generate_grouped_post_path
 from ktoolbox.api.model import Post, Attachment
 from ktoolbox.api.posts import get_post_revisions as get_post_revisions_api
 from ktoolbox.configuration import config, PostStructureConfiguration
@@ -155,7 +155,8 @@ async def create_job_from_creator(
         mix_posts: bool = None,
         start_time: Optional[datetime],
         end_time: Optional[datetime],
-        keywords: Optional[Set[str]] = None
+        keywords: Optional[Set[str]] = None,
+        keywords_exclude: Optional[Set[str]] = None
 ) -> ActionRet[List[Job]]:
     """
     Create a list of download job from a creator
@@ -172,6 +173,7 @@ async def create_job_from_creator(
     :param start_time: Start time of the time range
     :param end_time: End time of the time range
     :param keywords: Set of keywords to filter posts by title (case-insensitive)
+    :param keywords_exclude: Set of keywords to exclude posts by title (case-insensitive)
     """
     mix_posts = config.job.mix_posts if mix_posts is None else mix_posts
 
@@ -207,16 +209,26 @@ async def create_job_from_creator(
     if keywords:
         post_list = list(filter_posts_by_keywords(post_list, keywords))
 
+    # Filter out posts by exclude keywords
+    if keywords_exclude:
+        post_list = list(filter_posts_by_keywords_exclude(post_list, keywords_exclude))
+
     logger.info(f"Get {len(post_list)} posts after filtering, start creating jobs")
 
     # Filter posts and generate ``CreatorIndices``
     if not mix_posts:
         if save_creator_indices:
+            # Generate posts_path with year/month grouping if enabled
+            posts_path = {}
+            for post in post_list:
+                grouped_base_path = generate_grouped_post_path(post, path)
+                posts_path[post.id] = grouped_base_path / sanitize_filename(post.title)
+
             indices = CreatorIndices(
                 creator_id=creator_id,
                 service=service,
                 posts={post.id: post for post in post_list},
-                posts_path={post.id: path / sanitize_filename(post.title) for post in post_list}
+                posts_path=posts_path
             )
             async with aiofiles.open(
                 path / DataStorageNameEnum.CreatorIndicesData.value,
@@ -232,7 +244,12 @@ async def create_job_from_creator(
     job_list: List[Job] = []
     for post in post_list:
         # Get post path
-        post_path = path if mix_posts else path / generate_post_path_name(post)
+        if mix_posts:
+            post_path = path
+        else:
+            # Apply year/month grouping if enabled
+            grouped_base_path = generate_grouped_post_path(post, path)
+            post_path = grouped_base_path / generate_post_path_name(post)
 
         # Generate jobs for the main post
         job_list += await create_job_from_post(
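
Note on `generate_grouped_post_path`: its body is not part of this diff, so the sketch below only illustrates what such a helper might do given the options documented in this commit. `grouped_post_path` and its parameters are hypothetical stand-ins. It assumes month grouping applies only when year grouping is enabled, as the `group_by_month` docstring states, and that posts without a publish date stay in the base directory.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def grouped_post_path(
    base: Path,
    published: Optional[datetime],
    group_by_year: bool = False,
    group_by_month: bool = False,
    year_dirname_format: str = "{year}",
    month_dirname_format: str = "{year}-{month:02d}",
) -> Path:
    """Hypothetical stand-in: return the directory a post folder is placed in."""
    if not group_by_year or published is None:
        return base  # grouping disabled, or no date to group by
    result = base / year_dirname_format.format(year=published.year)
    if group_by_month:
        result = result / month_dirname_format.format(
            year=published.year, month=published.month
        )
    return result


target = grouped_post_path(
    Path("creator"),
    datetime(2020, 1, 15),
    group_by_year=True,
    group_by_month=True,
    year_dirname_format="Year {year}",
    month_dirname_format="Month {month:02d}",
)
print((target / "post_title").as_posix())
```

With the custom formats from the changelog this yields `creator/Year 2020/Month 01/post_title`, matching the directory tree shown there.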
