Commit b905b20

Enhance skills (#17801)
* Upgrade skills readme
* Enhance skill
* Update description
1 parent fed730d commit b905b20

4 files changed: 39 additions and 50 deletions


skills/README.md

Lines changed: 5 additions & 7 deletions
@@ -88,25 +88,23 @@ Below are configuration methods for some AI apps:
     "entries": {
       "paddleocr-text-recognition": {
         "enabled": true,
-        "apiKey": "<ACCESS_TOKEN>",
         "env": {
-          "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>"
+          "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>",
+          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
         }
       },
       "paddleocr-doc-parsing": {
         "enabled": true,
-        "apiKey": "<ACCESS_TOKEN>",
         "env": {
-          "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>"
+          "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>",
+          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
         }
       }
     }
   }
 }
 ```
 
-Please note that this approach may store the access token in plain text in the configuration file. A more secure way is to configure it through the OpenClaw onboarding wizard or the dashboard.
-
 ### Usage Examples
 
 After configuration, describe the OCR or document parsing task in natural language and provide a file URL or local path so the AI app can invoke the corresponding skill.
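
The hunk above moves the access token from a top-level `apiKey` field into each skill's `env` block as `PADDLEOCR_ACCESS_TOKEN`. For a configuration file still written in the old layout, the same change could be applied programmatically. The following is a minimal sketch and not part of the commit: `migrate_entry` is a hypothetical helper, and it assumes each skill entry is a plain JSON object shaped like the ones shown in the diff.

```python
import copy
import json


def migrate_entry(entry: dict) -> dict:
    """Return a copy of a skill entry with a top-level "apiKey" moved into "env".

    Hypothetical helper for illustration; the field names mirror the diff above.
    """
    migrated = copy.deepcopy(entry)
    token = migrated.pop("apiKey", None)
    env = migrated.setdefault("env", {})
    if token is not None:
        # Keep an existing PADDLEOCR_ACCESS_TOKEN if one is already set.
        env.setdefault("PADDLEOCR_ACCESS_TOKEN", token)
    return migrated


old_entry = {
    "enabled": True,
    "apiKey": "<ACCESS_TOKEN>",
    "env": {"PADDLEOCR_OCR_API_URL": "<OCR_API_URL>"},
}
new_entry = migrate_entry(old_entry)
print(json.dumps(new_entry, indent=2))
```

The helper copies the entry rather than editing it in place, so the original configuration stays available for comparison until the migrated version has been reviewed.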
@@ -158,7 +156,7 @@ Make sure your working directory is the directory containing this file.
 
 2. Configure environment variables (see [Configure Environment Variables](#configure-environment-variables) for the list of variables). Choose one of the following methods:
 
-**Option A**: run the interactive configuration script.
+**Option A**: run the configuration script.
 
 ```shell
 python paddleocr-text-recognition/scripts/configure.py

skills/README_cn.md

Lines changed: 17 additions & 19 deletions
@@ -72,10 +72,10 @@ git clone https://github.com/PaddlePaddle/PaddleOCR.git
 
 ```json
 {
-  env: {
-    PADDLEOCR_ACCESS_TOKEN”: “<ACCESS_TOKEN>,
-    PADDLEOCR_OCR_API_URL”: “<OCR_API_URL>,
-    PADDLEOCR_DOC_PARSING_API_URL”: “<DOC_PARSING_API_URL>
+  "env": {
+    "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>",
+    "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>",
+    "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>"
   }
 }
 ```
@@ -84,29 +84,27 @@ git clone https://github.com/PaddlePaddle/PaddleOCR.git
 
 ```json
 {
-  skills: {
-    entries: {
-      paddleocr-text-recognition: {
-        enabled: true,
-        “apiKey”: “<ACCESS_TOKEN>”,
-        “env”: {
-          “PADDLEOCR_OCR_API_URL”: “<OCR_API_URL>”
+  "skills": {
+    "entries": {
+      "paddleocr-text-recognition": {
+        "enabled": true,
+        "env": {
+          "PADDLEOCR_OCR_API_URL": "<OCR_API_URL>",
+          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
         }
       },
-      paddleocr-doc-parsing: {
-        enabled: true,
-        “apiKey”: “<ACCESS_TOKEN>”,
-        “env”: {
-          “PADDLEOCR_DOC_PARSING_API_URL”: “<DOC_PARSING_API_URL>”
+      "paddleocr-doc-parsing": {
+        "enabled": true,
+        "env": {
+          "PADDLEOCR_DOC_PARSING_API_URL": "<DOC_PARSING_API_URL>",
+          "PADDLEOCR_ACCESS_TOKEN": "<ACCESS_TOKEN>"
         }
       }
     }
   }
 }
 ```
 
-Please note that this approach may store the access token in plain text in the configuration file. A more secure way is to configure it through the OpenClaw onboarding wizard or the dashboard.
-
 ### Usage Examples
 
 After configuration, you can describe your OCR or document parsing needs directly in natural language and attach a file URL or local path so that the AI app invokes the corresponding skill. Here are some example prompts:
@@ -158,7 +156,7 @@ git clone https://github.com/PaddlePaddle/PaddleOCR.git
 
 2. Configure environment variables (see the [Configure Environment Variables](#配置环境变量) section for the variables to set). Choose one of the following methods:
 
-**Option A**: run the interactive configuration script
+**Option A**: run the configuration script
 
 ```shell
 python paddleocr-text-recognition/scripts/configure.py

skills/paddleocr-doc-parsing/SKILL.md

Lines changed: 9 additions & 14 deletions
@@ -1,9 +1,6 @@
 ---
 name: paddleocr-doc-parsing
-description: >
-  Advanced document parsing with PaddleOCR. Returns complete document
-  structure including text, tables, formulas, charts, and layout information. The AI agent extracts
-  relevant content based on user needs.
+description: Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original structure.
 metadata:
   openclaw:
     requires:
@@ -198,11 +195,13 @@ Then return:
 
 ### First-Time Configuration
 
+You can generally assume that the required environment variables have already been configured. Only when a parsing task fails should you analyze the error message to determine whether it is caused by a configuration issue. If it is indeed a configuration problem, you should notify the user to fix it.
+
 **When API is not configured**:
 
 The error will show:
 ```
-PADDLEOCR_DOC_PARSING_API_URL not configured. Get your API at: https://paddleocr.com
+CONFIG_ERROR: PADDLEOCR_DOC_PARSING_API_URL not configured. Get your API at: https://paddleocr.com
 ```
 
 **Configuration workflow**:
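
With this change the missing-configuration message carries a `CONFIG_ERROR:` prefix, which gives an agent or wrapper something concrete to check before treating a failure as a configuration problem. The following is a minimal sketch, assuming the script's output is available as text and that the message appears on a line of its own. `find_config_error` is a hypothetical helper and is not part of the skill's scripts.

```python
from typing import Optional

CONFIG_ERROR_PREFIX = "CONFIG_ERROR:"  # prefix shown in the updated error message


def find_config_error(output: str) -> Optional[str]:
    """Return the message after the CONFIG_ERROR prefix, or None if no line has it."""
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith(CONFIG_ERROR_PREFIX):
            return stripped[len(CONFIG_ERROR_PREFIX):].strip()
    return None


failed_output = (
    "CONFIG_ERROR: PADDLEOCR_DOC_PARSING_API_URL not configured. "
    "Get your API at: https://paddleocr.com"
)
detail = find_config_error(failed_output)
if detail is not None:
    print("Configuration problem:", detail)
```

Any failure whose output lacks the prefix returns `None`, so it falls through to ordinary error handling rather than prompting the user to reconfigure.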
@@ -217,29 +216,25 @@ PADDLEOCR_DOC_PARSING_API_URL not configured. Get your API at: https://paddleocr
 - PADDLEOCR_ACCESS_TOKEN
 - Optional: PADDLEOCR_DOC_PARSING_TIMEOUT
 ```
+- For security reasons, do not run `configure.py` or create a local `.env` file by default if the skill is installed under a host application directory (for example, `~/.claude/skills`). You should also advise the user not to do this.
 
-3. **If the user provides credentials in chat anyway** (accept any reasonable format):
+3. **If the user provides credentials in chat anyway** (accept any reasonable format), for example:
 - `PADDLEOCR_DOC_PARSING_API_URL=https://xxx.paddleocr.com/layout-parsing, PADDLEOCR_ACCESS_TOKEN=abc123...`
 - `Here's my API: https://xxx and token: abc123`
 - Copy-pasted code format
 - Any other reasonable format
 - **Security note**: Warn the user that credentials shared in chat may be stored in conversation history. Recommend setting them through the host application's configuration instead when possible.
 
-4. **Parse and validate the values**:
+Then parse and validate the values:
 - Extract `PADDLEOCR_DOC_PARSING_API_URL` (look for URLs with `paddleocr.com` or similar)
 - Confirm `PADDLEOCR_DOC_PARSING_API_URL` is a full endpoint ending with `/layout-parsing`
 - Extract `PADDLEOCR_ACCESS_TOKEN` (long alphanumeric string, usually 40+ chars)
-- Tell the user exactly which environment variables to set
 
-5. **Ask the user to confirm the environment is configured**:
-- Wait for the user to confirm these values have been set in their host application, runtime environment, or appropriate config file
-- For security reasons, do not run `configure.py` or create a local `.env` file by default if the skill is installed under a host application directory (for example, `~/.claude/skills`)
+4. **Ask the user to confirm the environment is configured**.
 
-6. **Retry only after confirmation**:
+5. **Retry only after confirmation**:
 - Once the user confirms the environment variables are available, retry the original parsing task
 
-**IMPORTANT**: The error message format is STRICT and must be shown exactly as provided by the script. Do not modify or paraphrase it.
-
 ### Handling Large Files
 
 There is no file size limit for the API. For PDFs, the maximum is 100 pages per request.
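
The trailing context line states a limit of 100 pages per request for PDFs. One way a caller might plan requests for a longer document is to compute page ranges up front. This is a sketch under assumptions: `page_ranges` is a hypothetical helper, pages are numbered from 1, and how the PDF is actually split or submitted is left out because the diff does not show it.

```python
from typing import Iterator, Tuple

MAX_PAGES_PER_REQUEST = 100  # limit stated in the skill documentation


def page_ranges(
    total_pages: int, limit: int = MAX_PAGES_PER_REQUEST
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, 1-based (first_page, last_page) ranges of at most `limit` pages."""
    if total_pages < 0 or limit < 1:
        raise ValueError("total_pages must be >= 0 and limit must be >= 1")
    for first in range(1, total_pages + 1, limit):
        yield first, min(first + limit - 1, total_pages)


ranges = list(page_ranges(250))
print(ranges)  # [(1, 100), (101, 200), (201, 250)]
```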

skills/paddleocr-text-recognition/SKILL.md

Lines changed: 8 additions & 10 deletions
@@ -1,8 +1,6 @@
 ---
 name: paddleocr-text-recognition
-description: >
-  Use this skill when users need to extract text from images, PDFs, or documents. Supports URLs and local files.
-  Returns structured JSON containing recognized text.
+description: Extracts text (with locations) from images and PDF documents using PaddleOCR.
 metadata:
   openclaw:
     requires:
@@ -156,6 +154,8 @@ The output JSON structure is as follows:
 
 ### First-Time Configuration
 
+You can generally assume that the required environment variables have already been configured. Only when an OCR task fails should you analyze the error message to determine whether it is caused by a configuration issue. If it is indeed a configuration problem, you should notify the user to fix it.
+
 **When API is not configured**:
 
 The error will show:
@@ -175,25 +175,23 @@ CONFIG_ERROR: PADDLEOCR_OCR_API_URL not configured. Get your API at: https://pad
 - PADDLEOCR_ACCESS_TOKEN
 - Optional: PADDLEOCR_OCR_TIMEOUT
 ```
+- For security reasons, do not run `configure.py` or create a local `.env` file by default if the skill is installed under a host application directory (for example, `~/.claude/skills`). You should also advise the user not to do this.
 
-3. **If the user provides credentials in chat anyway** (accept any reasonable format):
+3. **If the user provides credentials in chat anyway** (accept any reasonable format), for example:
 - `PADDLEOCR_OCR_API_URL=https://xxx.paddleocr.com/ocr, PADDLEOCR_ACCESS_TOKEN=abc123...`
 - `Here's my API: https://xxx and token: abc123`
 - Copy-pasted code format
 - Any other reasonable format
 - **Security note**: Warn the user that credentials shared in chat may be stored in conversation history. Recommend setting them through the host application's configuration instead when possible.
 
-4. **Parse and validate the values**:
+Then parse and validate the values:
 - Extract `PADDLEOCR_OCR_API_URL` (look for URLs with `paddleocr.com` or similar)
 - Confirm `PADDLEOCR_OCR_API_URL` is a full endpoint ending with `/ocr`
 - Extract `PADDLEOCR_ACCESS_TOKEN` (long alphanumeric string, usually 40+ chars)
-- Tell the user exactly which environment variables to set
 
-5. **Ask the user to confirm the environment is configured**:
-- Wait for the user to confirm these values have been set in their host application, runtime environment, or appropriate config file
-- For security reasons, do not run `configure.py` or create a local `.env` file by default if the skill is installed under a host application directory (for example, `~/.claude/skills`)
+4. **Ask the user to confirm the environment is configured**.
 
-6. **Retry only after confirmation**:
+5. **Retry only after confirmation**:
 - Once the user confirms the environment variables are available, retry the original OCR task
 
 ### Error Handling
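
The parse-and-validate step in the hunk above (extract the API URL, confirm it ends with `/ocr`, extract a long alphanumeric token) can be sketched in a few lines. This is an illustration under assumptions, not the skill's implementation: `parse_credentials`, `URL_RE`, and `TOKEN_RE` are hypothetical names, and the patterns only cover the two message formats quoted in the diff.

```python
import re
from typing import List, Optional, Tuple

# Illustrative patterns only; real chat messages may need looser matching.
URL_RE = re.compile(r'https?://[^\s,;"`]+')
TOKEN_RE = re.compile(r"token\s*[:=]\s*([A-Za-z0-9]+)", re.IGNORECASE)


def parse_credentials(
    message: str, endpoint_suffix: str = "/ocr"
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Return (api_url, access_token, problems) extracted from a chat message."""
    problems: List[str] = []

    url_match = URL_RE.search(message)
    api_url = url_match.group(0).rstrip(".") if url_match else None
    if api_url is None:
        problems.append("no API URL found")
    elif not api_url.endswith(endpoint_suffix):
        problems.append(f"API URL should be a full endpoint ending with {endpoint_suffix}")

    token_match = TOKEN_RE.search(message)
    token = token_match.group(1) if token_match else None
    if token is None:
        problems.append("no access token found")
    elif len(token) < 40:
        # The skill text says tokens are "usually 40+ chars": a hint, not a hard rule.
        problems.append("access token is shorter than the usual 40+ characters")

    return api_url, token, problems


sample = (
    "PADDLEOCR_OCR_API_URL=https://xxx.paddleocr.com/ocr, "
    "PADDLEOCR_ACCESS_TOKEN=" + "a1" * 20
)
api_url, token, problems = parse_credentials(sample)
# Print the token length rather than the token itself to keep it out of logs.
print(api_url, len(token), problems)
```

Passing `endpoint_suffix="/layout-parsing"` would apply the same check to the document parsing endpoint. Problems are returned as a list so the agent can report all of them to the user at once.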
