Commit 86472bd

knowledge: fix knowledge reader not auto registered (#760)
1 parent 16307d7 commit 86472bd

3 files changed: +41 -0 lines changed

docs/mkdocs/en/knowledge.md

Lines changed: 18 additions & 0 deletions
@@ -76,6 +76,9 @@ import (
 	"trpc.group/trpc-go/trpc-agent-go/model/openai"
 	"trpc.group/trpc-go/trpc-agent-go/runner"
 	"trpc.group/trpc-go/trpc-agent-go/session/inmemory"
+
+	// Import PDF reader to register it (optional - has separate go.mod to avoid unnecessary dependencies).
+	// _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
 )
 
 func main() {
@@ -1206,6 +1209,9 @@ import (
 	vectorinmemory "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/inmemory"
 	vectorpgvector "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/pgvector"
 	vectortcvector "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/tcvector"
+
+	// Import PDF reader to register it (optional - has separate go.mod to avoid unnecessary dependencies).
+	// _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
 )
 
 func main() {
@@ -1542,3 +1548,15 @@ go run main.go -embedder openai -vectorstore elasticsearch -es-version v9
    - Confirm files exist and extensions are supported (.md/.txt/.pdf/.csv/.json/.docx, etc.);
    - Whether directory source needs `WithRecursive(true)`;
    - Use `WithFileExtensions` for whitelist filtering.
+
+7. **PDF file reading support**
+
+   - Note: The PDF reader depends on third-party libraries. To avoid introducing unnecessary dependencies into the main module, the PDF reader uses a separate `go.mod`.
+   - Usage: To support PDF file reading, manually import the PDF reader package for registration:
+     ```go
+     import (
+         // Import PDF reader to support .pdf file parsing.
+         _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
+     )
+     ```
+   - Note: Readers for other formats (.txt/.md/.csv/.json, etc.) are automatically registered and do not require manual import.
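
The blank-import registration described in the note above can be illustrated with a minimal, self-contained sketch. Everything here (`Reader`, `Register`, `Lookup`, `registerPDF`) is a hypothetical stand-in and not the actual trpc-agent-go API; it only assumes the common Go pattern where a package's `init` function adds its reader to a shared registry, so the reader exists only if the package is imported.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Reader is a hypothetical stand-in for a document reader interface.
type Reader interface {
	Name() string
}

// registry maps a lowercase file extension to a reader constructor.
var registry = map[string]func() Reader{}

// Register associates a constructor with one or more file extensions.
func Register(ctor func() Reader, exts ...string) {
	for _, ext := range exts {
		registry[strings.ToLower(ext)] = ctor
	}
}

// Lookup returns a reader for path, or an error when no reader
// has been registered for the file's extension.
func Lookup(path string) (Reader, error) {
	ext := strings.ToLower(filepath.Ext(path))
	ctor, ok := registry[ext]
	if !ok {
		return nil, fmt.Errorf("no reader registered for %q", ext)
	}
	return ctor(), nil
}

type textReader struct{}

func (textReader) Name() string { return "text" }

// In a multi-package layout this init would live in the text reader's own
// package and run only when that package is imported (for example via `_`).
func init() {
	Register(func() Reader { return textReader{} }, ".txt")
}

type pdfReader struct{}

func (pdfReader) Name() string { return "pdf" }

// registerPDF simulates the side effect that a blank import of a
// separately packaged PDF reader would trigger through its init function.
func registerPDF() {
	Register(func() Reader { return pdfReader{} }, ".pdf")
}

func main() {
	_, err := Lookup("report.pdf")
	fmt.Println("before:", err)

	registerPDF()
	r, _ := Lookup("report.pdf")
	fmt.Println("after:", r.Name())
}
```

In this sketch the `.txt` reader is always available because its `init` runs unconditionally, while `.pdf` lookups fail until the registration step runs, which mirrors why an opt-in reader needs an explicit import.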

docs/mkdocs/zh/knowledge.md

Lines changed: 18 additions & 0 deletions
@@ -76,6 +76,9 @@ import (
 	"trpc.group/trpc-go/trpc-agent-go/model/openai"
 	"trpc.group/trpc-go/trpc-agent-go/runner"
 	"trpc.group/trpc-go/trpc-agent-go/session/inmemory"
+
+	// To support PDF files, manually import the PDF reader (separate go.mod, to avoid pulling in unnecessary third-party dependencies).
+	// _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
 )
 
 func main() {
@@ -1217,6 +1220,9 @@ import (
 	vectorinmemory "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/inmemory"
 	vectorpgvector "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/pgvector"
 	vectortcvector "trpc.group/trpc-go/trpc-agent-go/knowledge/vectorstore/tcvector"
+
+	// To support PDF files, manually import the PDF reader (separate go.mod, to avoid pulling in unnecessary third-party dependencies).
+	// _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
 )
 
 func main() {
@@ -1553,3 +1559,15 @@ go run main.go -embedder openai -vectorstore elasticsearch -es-version v9
    - Confirm that the files exist and their extensions are supported (.md/.txt/.pdf/.csv/.json/.docx, etc.);
    - Check whether the directory source needs `WithRecursive(true)`
    - Use `WithFileExtensions` for whitelist filtering.
+
+7. **PDF file reading support**
+
+   - Explanation: Because the PDF reader depends on third-party libraries, it is managed with a separate `go.mod` to keep unnecessary dependencies out of the main module.
+   - Usage: To support reading PDF files, manually import the PDF reader package in your code to register it:
+     ```go
+     import (
+         // Import the PDF reader to support .pdf file parsing
+         _ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/pdf"
+     )
+     ```
+   - Note: Readers for other formats (.txt/.md/.csv/.json, etc.) are registered automatically and do not need a manual import.

knowledge/source/internal/source/source.go

Lines changed: 5 additions & 0 deletions
@@ -17,6 +17,11 @@ import (
 	"trpc.group/trpc-go/trpc-agent-go/knowledge/chunking"
 	"trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader"
 	"trpc.group/trpc-go/trpc-agent-go/knowledge/ocr"
+
+	_ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/csv"
+	_ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/json"
+	_ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/markdown"
+	_ "trpc.group/trpc-go/trpc-agent-go/knowledge/document/reader/text"
 )
 
 // ReaderConfig holds configuration for creating readers.
