Commit c458ae6

Merge pull request #13 from achetronic/feat/add-message-splitting
Feat/add message splitting
2 parents 4a064f7 + c84056b commit c458ae6

File tree

12 files changed

+997
-67
lines changed

.agents/CLIENT_DESIGN.md

Lines changed: 73 additions & 1 deletion
@@ -181,6 +181,9 @@ server/clients/
181181
├── provider.go — Provider interface, Schema type alias
182182
├── registry.go — Global registry: Register(), ValidateConfig() with oneOf
183183
├── executor.go — Shared execution logic (webhook + cron)
184+
├── msgutil/
185+
│ ├── msgutil.go — Shared message validation and splitting utilities
186+
│ └── msgutil_test.go — Tests for validation and splitting
184187
├── direct/spec.go — Direct provider (empty schema)
185188
├── telegram/
186189
│ ├── spec.go — Telegram provider (JSON Schema with x-format, enum)
@@ -444,8 +447,77 @@ If the message has a caption, it goes as the text part. If no caption, the text
444447

445448
### File size validation
446449

447-
20MB limit enforced before downloading. Telegram bot API limits files to 20MB anyway. For Slack, check `file.Size` before calling `downloadSlackFile()`.
450+
5MB per file, 10MB total per message, max 10 files. Enforced client-side before downloading. Telegram bot API limits files to 20MB anyway. For Slack, check `file.Size` before calling `downloadSlackFile()`.
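
The three limits above can be checked from the sizes the platform reports, before any bytes are downloaded. The following is a minimal sketch, not the project's code: `checkAttachmentSizes` and the constant names are hypothetical, and reading "MB" as MiB (2^20 bytes) is an assumption.

```go
package main

import "fmt"

// Limits quoted in the design note. Interpreting "MB" as MiB is an assumption.
const (
	maxFileBytes  int64 = 5 << 20
	maxTotalBytes int64 = 10 << 20
	maxFiles            = 10
)

// checkAttachmentSizes is a hypothetical helper. It takes the sizes reported
// by the platform (before any download) and returns the first violated limit.
func checkAttachmentSizes(sizes []int64) error {
	if len(sizes) > maxFiles {
		return fmt.Errorf("too many files: %d (max %d)", len(sizes), maxFiles)
	}
	var total int64
	for i, size := range sizes {
		if size > maxFileBytes {
			return fmt.Errorf("file %d is %d bytes (max %d per file)", i, size, maxFileBytes)
		}
		total += size
		if total > maxTotalBytes {
			return fmt.Errorf("attachments total %d bytes (max %d per message)", total, maxTotalBytes)
		}
	}
	return nil
}

func main() {
	// Three 4 MiB files each pass the per-file limit but exceed the total.
	fmt.Println(checkAttachmentSizes([]int64{4 << 20, 4 << 20, 4 << 20}))
}
```

Returning an error rather than a boolean lets the client turn the specific violated limit into a user-friendly rejection message.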
448451

449452
### A2A (future)
450453

451454
A2A agent cards (`server/a2a/handler.go`) currently declare `DefaultInputModes: []string{"text/plain"}`. When A2A file support is added, include additional MIME types (`image/*`, `application/pdf`, etc.) and convert A2A `FilePart` → `genai.Part{InlineData}` in the executor.
455+
456+
## Message Size Handling (Implemented)
457+
458+
### Package: `server/clients/msgutil/`
459+
460+
Shared utility package for message validation and splitting. Both Telegram and Slack clients import it. Clients stay decoupled — each calls the utility with its own platform limits.
461+
462+
### Constants
463+
464+
| Constant | Value | Usage |
465+
|----------|-------|-------|
466+
| `TelegramMaxMessageLength` | 4096 | Telegram API limit per message |
467+
| `SlackMaxMessageLength` | 39000 | Slack API limit per message block |
468+
| `DefaultMaxInputLength` | 16000 | Max inbound user message length |
469+
470+
### Functions
471+
472+
**`ValidateInputLength(text string, maxLen int) (string, bool)`**
473+
- Truncates at `maxLen` runes (unicode-safe), appends `\n\n[message truncated]`
474+
- Returns the (possibly truncated) text and whether truncation occurred
475+
- Applied at client entry points before calling the agent
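
A minimal sketch of the behaviour described for `ValidateInputLength`, written from the bullets above rather than copied from `msgutil.go`. The `truncationNotice` constant name is hypothetical, and treating a non-positive `maxLen` as "no limit" is an assumption.

```go
package main

import "fmt"

// truncationNotice mirrors the suffix described in the design note.
const truncationNotice = "\n\n[message truncated]"

// ValidateInputLength returns text unchanged when it fits within maxLen runes.
// Otherwise it keeps the first maxLen runes, appends the notice, and reports
// true. A non-positive maxLen disables the check (an assumption of this sketch).
func ValidateInputLength(text string, maxLen int) (string, bool) {
	runes := []rune(text) // count runes, not bytes, so multi-byte characters stay intact
	if maxLen <= 0 || len(runes) <= maxLen {
		return text, false
	}
	return string(runes[:maxLen]) + truncationNotice, true
}

func main() {
	out, truncated := ValidateInputLength("héllo wörld", 5)
	fmt.Printf("%q %v\n", out, truncated)
}
```

Slicing the `[]rune` conversion instead of the raw string is what keeps the cut from landing in the middle of a UTF-8 sequence.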
476+
477+
**`SplitMessage(text string, maxLen int) []string`**
478+
- Splits into chunks respecting `maxLen` runes per chunk
479+
- Split priority: paragraph (`\n\n`) > line (`\n`) > word (space) > hard cut
480+
- Returns `[]string` — all chunks non-empty, within limit
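
The split priority can be sketched as follows. This is an illustrative reimplementation based only on the bullets above, not the code in `msgutil.go`; trimming whitespace around each chunk and returning `nil` for a non-positive `maxLen` are assumptions of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitMessage breaks text into chunks of at most maxLen runes, preferring
// paragraph, then line, then word boundaries, and hard-cutting as a last resort.
func SplitMessage(text string, maxLen int) []string {
	if maxLen <= 0 {
		return nil
	}
	var chunks []string
	rest := []rune(strings.TrimSpace(text))
	for len(rest) > maxLen {
		window := string(rest[:maxLen])
		cut := maxLen // fallback: hard cut, counted in runes
		for _, sep := range []string{"\n\n", "\n", " "} {
			// i > 0 guarantees the chunk is non-empty and the loop makes progress.
			if i := strings.LastIndex(window, sep); i > 0 {
				cut = len([]rune(window[:i])) // convert byte offset to rune count
				break
			}
		}
		if chunk := strings.TrimSpace(string(rest[:cut])); chunk != "" {
			chunks = append(chunks, chunk)
		}
		rest = []rune(strings.TrimSpace(string(rest[cut:])))
	}
	if len(rest) > 0 {
		chunks = append(chunks, string(rest))
	}
	return chunks
}

func main() {
	for i, c := range SplitMessage("first paragraph\n\nsecond paragraph here", 20) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

Searching only within the first `maxLen` runes and taking the last separator found keeps each chunk as large as the limit allows while still ending on the most natural boundary available.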
481+
482+
### Where it's applied
483+
484+
| Client | Inbound | Outbound |
485+
|--------|---------|----------|
486+
| **Telegram** | `handleMessage()` validates `msg.Text`; `handleVoice()` validates transcribed text | `sendResponse()` splits via `SplitMessage(text, 4096)`, sends chunks sequentially |
487+
| **Slack** | `processMessage()` validates text (covers DMs + audio clips) | `postMessage()` splits via `SplitMessage(text, 39000)`, posts chunks sequentially |
488+
| **Voice UI** | No validation (browser input is bounded) | No splitting (browser has no render limit) |
489+
| **Executor** | No validation (prompts are from commands/webhooks, admin-controlled) | No splitting (returns string to HTTP caller) |
490+
491+
---
492+
493+
## Artifact Delivery
494+
495+
When an LLM uses `save_artifact` during a `/run` call, clients automatically deliver the new artifact as a file attachment. The flow:
496+
497+
1. **Before `/run`**: Client calls `GET /apps/{agent}/users/{user}/sessions/{session}/artifacts` to snapshot existing artifact names
498+
2. **After `/run`**: Client calls the same endpoint again and diffs the two lists
499+
3. **New artifacts**: Each new name is downloaded via `GET .../artifacts/{name}` and sent as a file
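
The diff in step 2 amounts to a set difference over artifact names. A minimal sketch, assuming both listings are plain slices of names; `newArtifacts` is a hypothetical helper, not the function used in `bot.go`.

```go
package main

import "fmt"

// newArtifacts returns the names present in after but not in before,
// preserving the order of after and skipping duplicates.
func newArtifacts(before, after []string) []string {
	seen := make(map[string]struct{}, len(before))
	for _, name := range before {
		seen[name] = struct{}{}
	}
	var fresh []string
	for _, name := range after {
		if _, ok := seen[name]; !ok {
			fresh = append(fresh, name)
			seen[name] = struct{}{} // avoid delivering the same name twice
		}
	}
	return fresh
}

func main() {
	before := []string{"notes.md"}
	after := []string{"notes.md", "report.md", "data.csv"}
	fmt.Println(newArtifacts(before, after))
}
```

One consequence of comparing names only: a new version saved under an existing name would not appear in the result of this sketch.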
500+
501+
### Delivery per client
502+
503+
| Client | Method | Details |
504+
|--------|--------|---------|
505+
| **Telegram** | `ctx.Bot().SendDocument()` with `tu.FileFromReader()` | Artifact name used as filename |
506+
| **Slack** | `c.api.UploadFileV2()` | Artifact name as filename + title, respects thread |
507+
| **Voice UI** | Not yet implemented | Would need download button in UI |
508+
509+
### Artifact REST response format
510+
511+
The ADK artifact endpoint returns a `genai.Part` JSON:
512+
- **Text artifacts**: `{"text": "content..."}`
513+
- **Binary artifacts**: `{"inlineData": {"mimeType": "...", "data": "<base64>"}}`
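
A client can turn either response shape into raw bytes before sending the file. The sketch below assumes standard (padded) base64 in `data` and defaults text artifacts to `text/plain`; `artifactPart` and `decodeArtifact` are hypothetical names, not the methods in `bot.go`.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// artifactPart models only the two fields shown in the response format above.
type artifactPart struct {
	Text       string `json:"text"`
	InlineData *struct {
		MIMEType string `json:"mimeType"`
		Data     string `json:"data"`
	} `json:"inlineData"`
}

// decodeArtifact returns the artifact bytes and a MIME type.
// The "text/plain" default for text artifacts is an assumption of this sketch.
func decodeArtifact(body []byte) ([]byte, string, error) {
	var part artifactPart
	if err := json.Unmarshal(body, &part); err != nil {
		return nil, "", fmt.Errorf("decode artifact: %w", err)
	}
	if part.InlineData != nil {
		raw, err := base64.StdEncoding.DecodeString(part.InlineData.Data)
		if err != nil {
			return nil, "", fmt.Errorf("decode artifact data: %w", err)
		}
		return raw, part.InlineData.MIMEType, nil
	}
	if part.Text != "" {
		return []byte(part.Text), "text/plain", nil
	}
	return nil, "", errors.New("artifact has neither text nor inlineData")
}

func main() {
	data, mimeType, err := decodeArtifact([]byte(`{"text": "hello"}`))
	fmt.Println(string(data), mimeType, err)
}
```

Note that this sketch treats an empty text artifact as an error; a real client may prefer to deliver an empty file instead.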
514+
515+
### Key files
516+
517+
| File | Role |
518+
|------|------|
519+
| `server/agent/tools/artifacts/toolset.go` | Toolset with save/load/list tools |
520+
| `server/agent/base_toolset.go` | Wires artifact toolset into all agents |
521+
| `server/agent/agent.go` | Creates `artifactfs.NewFilesystemService()`, sets `launcherCfg.ArtifactService` |
522+
| `server/clients/telegram/bot.go` | `listArtifacts()`, `downloadArtifact()`, `sendNewArtifacts()` |
523+
| `server/clients/slack/bot.go` | Same three methods, adapted for Slack API |

.agents/DECISIONS.md

Lines changed: 66 additions & 0 deletions
@@ -361,3 +361,69 @@ the existing value.
361361
and recreate. Non-secret entities remain intact.
362362

363363
**Do not**: Return secret values in GET responses. Do not store secrets in config.yaml.
364+
365+
---
366+
367+
## Message size validation and splitting via shared msgutil package
368+
369+
**Date**: 2026-02-23
370+
**Status**: Implemented
371+
372+
Large inbound messages and oversized outbound responses are handled by a shared utility package `server/clients/msgutil/`. Both Telegram and Slack clients import it — the logic is DRY and testable, while clients remain decoupled from each other and from the ADK API.
373+
374+
**Inbound validation**: `ValidateInputLength(text, maxLen)` truncates messages exceeding 16K runes (unicode-safe) and appends `[message truncated]`. Applied in both clients before calling the agent.
375+
376+
**Outbound splitting**: `SplitMessage(text, maxLen)` breaks responses into platform-safe chunks. Split priority: paragraph boundaries (`\n\n`) > line boundaries (`\n`) > word boundaries (space) > hard cut. Telegram uses 4096, Slack uses 39000.
377+
378+
**Platform constants**:
379+
- `TelegramMaxMessageLength = 4096`
380+
- `SlackMaxMessageLength = 39000`
381+
- `DefaultMaxInputLength = 16000`
382+
383+
**Where validation happens**:
384+
- Telegram: `handleMessage()` validates `msg.Text`, `handleVoice()` validates transcribed text — both before `callAgent()`
385+
- Slack: `processMessage()` validates `text` before building the request — covers both DMs and audio clips (which flow through `processMessage`)
386+
- Voice UI: no splitting needed (browser has no render limit)
387+
- Executor: no splitting needed (returns string to HTTP caller)
388+
389+
**Do not**: Validate or split inside `callAgent()` / the ADK request path. Keep it at the client entry/exit points so each client controls its own limits. Do not add platform-specific logic to the shared package — it only provides generic split/validate functions with configurable limits.
390+
391+
---
392+
393+
## 17. Artifact Toolset — Universal via Base Toolset, No Delete
394+
395+
**Date**: 2025-02-23
396+
397+
All agents get the artifact toolset (save/load/list) unconditionally via `base_toolset.go`, not opt-in per agent. This avoids config complexity and ensures every agent can produce files for users.
398+
399+
**No delete tool**: ADK's `agent.Artifacts` interface (exposed via `tool.Context`) has Save, Load, List, and LoadVersion — but no Delete. Delete exists only on `artifact.Service` directly. Rather than breaking the abstraction by passing the raw service into tools, we omit delete. Artifacts are versioned and session-scoped, so stale artifacts are naturally cleaned up when sessions expire.
400+
401+
**Storage**: `adk-utils-go/artifact/filesystem` — filesystem-backed `artifact.Service` implementation. Stores artifacts as JSON at `data/artifacts/{appName}/{userID}/{sessionID}/{fileName}/{version}.json`. Supports versioning and user-scoped artifacts. Data persists across restarts.
402+
403+
**Client delivery**: Telegram and Slack clients list artifacts before and after each `/run` call, diff the lists, and deliver new artifacts as file attachments (Telegram: `SendDocument`, Slack: `UploadFileV2`). Artifacts are always files, never inlined in chat text.
404+
405+
**Files**: `server/agent/tools/artifacts/toolset.go` (toolset), `server/agent/base_toolset.go` (wiring), `server/agent/agent.go` (FilesystemService + launcher config), `server/clients/telegram/bot.go` and `server/clients/slack/bot.go` (delivery).
406+
407+
---
408+
409+
## 18. Multimodal Adapter Parity — Error on Unsupported Types
410+
411+
**Date**: 2026-02-23
412+
413+
When an adapter receives `genai.Part{InlineData}` with a MIME type it can't translate, it returns an error — **not** `nil` (silent drop). This matches Gemini's native behavior where unsupported types cause the API request to fail.
414+
415+
**Rationale**: Silent drops are a bug — the user sends a file, the LLM never sees it, and nobody gets feedback. With errors, either the client validates beforehand (preferred) or the user sees an explicit failure. All three providers behave identically: unsupported = fail.
416+
417+
**Supported types per adapter (adk-utils-go v0.3.1)**:
418+
419+
| Type | Gemini | OpenAI | Anthropic |
420+
|---|---|---|---|
421+
| Images (JPEG, PNG, GIF, WebP) | ✅ (native) | ✅ (data URI) | ✅ (Base64ImageSource) |
422+
| PDF | ✅ (native) | ✅ (FileParam) | ✅ (Base64PDFSource) |
423+
| Text (text/*) | ✅ (native) | ✅ (FileParam) | ✅ (PlainTextSource) |
424+
| Audio (WAV, MP3, WebM) | ✅ (native) | ✅ (InputAudio) | ❌ error |
425+
| Video, other | ✅ (native) | ❌ error | ❌ error |
426+
427+
**Do not**: Silently drop unsupported `InlineData` parts. Do not convert them to text descriptions. Return `fmt.Errorf("unsupported inline data MIME type for %s: %s")`.
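
A client-side pre-check following the same rule might look like the sketch below, using the Anthropic column of the table as the example. `checkAnthropicMIME` and the lookup map are hypothetical and are not part of `adk-utils-go`; the MIME strings are the standard ones for the formats listed.

```go
package main

import (
	"fmt"
	"strings"
)

// anthropicExact lists the exact MIME types from the Anthropic column above.
var anthropicExact = map[string]bool{
	"image/jpeg":      true,
	"image/png":       true,
	"image/gif":       true,
	"image/webp":      true,
	"application/pdf": true,
}

// checkAnthropicMIME returns an error for any type outside the table,
// instead of silently dropping the part. All text/* types are accepted.
func checkAnthropicMIME(mimeType string) error {
	mt := strings.ToLower(strings.TrimSpace(mimeType))
	if anthropicExact[mt] || strings.HasPrefix(mt, "text/") {
		return nil
	}
	return fmt.Errorf("unsupported inline data MIME type for %s: %s", "anthropic", mimeType)
}

func main() {
	fmt.Println(checkAnthropicMIME("image/png"))
	fmt.Println(checkAnthropicMIME("audio/wav"))
}
```

Running the check in the client before building the `/run` request gives the user an explicit failure without spending a round trip to the provider.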
428+
429+
**Files**: `adk-utils-go/genai/openai/openai.go` (`convertInlineDataToPart`), `adk-utils-go/genai/anthropic/anthropic.go` (`convertInlineDataToBlock`).

.agents/TODO.md

Lines changed: 17 additions & 33 deletions
@@ -1,26 +1,29 @@
11
# Magec - TODO
22

3-
## High Priority
4-
5-
### Large Message Handling in Telegram and Slack
3+
## ~~Large Message Handling in Telegram and Slack~~
64

7-
**Problem**: No validation on inbound message size from Telegram/Slack, and outbound responses to Telegram may exceed the 4096-character message limit. Large inputs could cause excessive memory usage or unexpected behavior, and oversized responses will fail silently or get truncated by the API.
8-
9-
**Solution**:
10-
- **Inbound**: Add a max input length check in both clients. Reject or truncate messages that exceed a reasonable threshold (e.g. 16K chars) with a user-friendly error.
11-
- **Outbound (Telegram)**: Split responses exceeding 4096 chars into multiple sequential messages. Preserve markdown formatting across splits where possible.
12-
- **Outbound (Slack)**: Slack's limit is ~40K per message block — less urgent but should still have a safety check.
13-
14-
**Modify**: `server/clients/telegram/bot.go`, `server/clients/slack/bot.go`
5+
Implemented. See `server/clients/msgutil/` package.
156

167
---
178

9+
## High Priority
10+
1811
### Multimodal File/Image Support in Clients
1912

2013
**Problem**: Telegram and Slack clients only handle text and voice messages. Users sending images, documents, PDFs, or other files get silently ignored.
2114

2215
**Solution**: Download files from Telegram/Slack, encode as base64, and send as `inlineData` parts alongside text in the ADK `/run` request. The ADK already supports `genai.Part{InlineData: &Blob{Data, MIMEType}}` — zero backend changes needed.
2316

17+
**Adapter support (adk-utils-go v0.3.1)**:
18+
- **Gemini**: passes all `InlineData` transparently to the API. Unsupported types are rejected by Google's API.
19+
- **OpenAI**: translates images (JPEG, PNG, GIF, WebP), audio (WAV, MP3, MPEG, WebM), and files (PDF, text/*). Unsupported types return an error.
20+
- **Anthropic**: translates images (JPEG, PNG, GIF, WebP), PDFs, and text documents (text/*). Unsupported types return an error.
21+
- All three adapters behave the same: if a MIME type can't be translated, the request fails. No silent drops.
22+
23+
**File size limits**: 5MB per file, 10MB total per message, max 10 files per message. Validated client-side before download.
24+
25+
**Supported types (common denominator)**: JPEG, PNG, GIF, WebP. PDF and text/* work on Gemini + Anthropic. Audio works on Gemini + OpenAI.
26+
2427
**Telegram** (`server/clients/telegram/bot.go`):
2528
- Current state: only `Voice` (dedicated handler) and `Text` (predicate at ~line 171 requires `Text != ""` and `Voice == nil`). Everything else is silently dropped.
2629
- Add handler for `Document`, `Photo`, `Video`, `Audio`, `Animation`, `VideoNote`, `Sticker`. All have `FileID` → `bot.GetFile()` → download bytes.
@@ -46,7 +49,7 @@
4649
}
4750
```
4851

49-
**File size validation**: Add 20MB limit (common denominator: Gemini 20MB, OpenAI 20MB, Anthropic 5MB for images). Telegram API limits bots to 20MB anyway. Reject oversized files with user-friendly message.
52+
**File size validation**: 5MB per file, 10MB total per message, max 10 files. Reject oversized files with user-friendly message.
5053

5154
**LLM limitations**: GPT-4o/Claude/Gemini handle images and PDFs natively. For Word/Excel/CSV, the model may not support them — the user gets a natural "I can't process this format" response from the LLM itself.
5255

@@ -127,28 +130,9 @@ See `.agents/ADK_TOOLS.md` for protocol details.
127130

128131
---
129132

130-
### Artifact Management Toolset
131-
132-
**Problem**: ADK has artifact storage (versioned, session-scoped) and REST endpoints for clients to download them, but the LLM has no way to create, read, list, or delete artifacts. Without tools that call `ctx.SaveArtifact()` / `ctx.LoadArtifact()`, the artifact system is dead weight.
133-
134-
**Solution**: Build a base toolset with four Go-native tools using `functiontool`:
135-
- `save_artifact(name, content, mimeType)` — saves content as a versioned artifact in the session
136-
- `load_artifact(name)` — reads an artifact (latest version) back into context
137-
- `list_artifacts()` — lists all artifacts in the current session
138-
- `delete_artifact(name)` — removes an artifact
139-
140-
**Use cases**:
141-
- LLM generates a report/export → `save_artifact()` → user downloads via Voice UI / Telegram / Slack using existing ADK GET endpoints
142-
- Flow pipelines: step 1 produces data → `save_artifact()`, step 2 reads → `load_artifact()` and transforms
143-
- Combined with a filesystem MCP: `load_artifact()` → process → `write_file()` to persist externally, or `read_file()` → `save_artifact()` to make available for download
144-
145-
**Design**:
146-
- Configurable per agent (not all agents need it) — toggle in agent config, similar to memory tools
147-
- Sandboxed by session — no security risk, no file system access
148-
- ADK handles versioning and storage automatically
149-
- Replaces `loadartifactstool` from ADK (read-only) with a complete CRUD toolset
133+
### ~~Artifact Management Toolset~~
150134

151-
**Modify**: `server/agent/agent.go` (register toolset in `buildToolsets`), new file `server/agent/tools/artifacts.go`, `server/store/types.go` (agent config toggle), `frontend/admin-ui/` (agent form toggle)
135+
Implemented. See `server/agent/tools/artifacts/toolset.go` — provides `save_artifact`, `load_artifact`, and `list_artifacts` tools via `functiontool.New`. Supports text and base64 binary content. Wired into `base_toolset.go` so all agents get it. Filesystem-backed via `adk-utils-go/artifact/filesystem` (persists across restarts). Clients (Telegram and Slack) auto-deliver new artifacts as file attachments after each `/run` response using before/after diff of the artifact list REST endpoint.
152136

153137
---
154138

server/agent/agent.go

Lines changed: 21 additions & 2 deletions
@@ -46,6 +46,7 @@ import (
4646
memorypostgres "github.com/achetronic/adk-utils-go/memory/postgres"
4747
sessionredis "github.com/achetronic/adk-utils-go/session/redis"
4848
toolsmemory "github.com/achetronic/adk-utils-go/tools/memory"
49+
artifactfs "github.com/achetronic/adk-utils-go/artifact/filesystem"
4950

5051
"github.com/achetronic/magec/server/config"
5152
"github.com/achetronic/magec/server/contextwindow"
@@ -71,6 +72,14 @@ When a user asks you to remember something or asks about past information:
7172
7273
When a user shares preferences or important information, proactively save it to memory for future reference.`
7374

75+
const artifactInstruction = `
76+
You have access to artifact tools for creating and managing files:
77+
- Use 'save_artifact' to save code, documents, data files, or any content that should be delivered as a downloadable file. Provide a filename (e.g. "report.md", "main.py", "data.csv"), the content, and optionally a mime_type. For binary content, set is_base64=true and provide base64-encoded data.
78+
- Use 'load_artifact' to retrieve a previously saved artifact by name.
79+
- Use 'list_artifacts' to see all artifacts in the current session.
80+
81+
IMPORTANT: When generating code files, long documents, configuration files, scripts, or any substantial structured content, ALWAYS use save_artifact instead of pasting it in the chat. The artifact will be delivered to the user as a downloadable file automatically.`
82+
7483
// Service wraps the ADK REST handler that serves all configured agents.
7584
// Incoming requests are routed to the correct agent by the appName field.
7685
type Service struct {
@@ -128,6 +137,13 @@ func New(ctx context.Context, agents []store.AgentDefinition, backends []store.B
128137
// Rebuilt from scratch on every hot-reload (store change).
129138
llmMap := make(map[string]model.LLM, len(agents))
130139

140+
artifactSvc, err := artifactfs.NewFilesystemService(artifactfs.FilesystemServiceConfig{
141+
BasePath: filepath.Join("data", "artifacts"),
142+
})
143+
if err != nil {
144+
return nil, fmt.Errorf("artifact service: %w", err)
145+
}
146+
131147
baseTset, err := newBaseToolset()
132148
if err != nil {
133149
return nil, fmt.Errorf("failed to create base toolset: %w", err)
@@ -195,8 +211,9 @@ func New(ctx context.Context, agents []store.AgentDefinition, backends []store.B
195211
}
196212

197213
launcherCfg := &launcher.Config{
198-
SessionService: sessionSvc,
199-
AgentLoader: loader,
214+
SessionService: sessionSvc,
215+
AgentLoader: loader,
216+
ArtifactService: artifactSvc,
200217
}
201218
if memorySvc != nil {
202219
launcherCfg.MemoryService = memorySvc
@@ -519,6 +536,8 @@ func buildInstruction(agentDef store.AgentDefinition, mcpServerMap map[string]st
519536
instruction += memoryInstruction
520537
}
521538

539+
instruction += artifactInstruction
540+
522541
for _, mcpName := range agentDef.MCPServers {
523542
if srv, ok := mcpServerMap[mcpName]; ok && srv.SystemPrompt != "" {
524543
instruction += "\n\n" + srv.SystemPrompt

server/agent/base_toolset.go

Lines changed: 24 additions & 10 deletions
@@ -1,28 +1,42 @@
11
package agent
22

33
import (
4+
"fmt"
5+
46
"google.golang.org/adk/agent"
57
"google.golang.org/adk/tool"
8+
9+
toolsartifacts "github.com/achetronic/magec/server/agent/tools/artifacts"
610
)
711

8-
// baseToolset provides tools that are available to every agent regardless of
9-
// configuration.
10-
//
11-
// TODO: Explore injecting exit_loop only to agents inside a loopagent (option 3).
12-
// This would require cloning agents when building flow steps so the same agent
13-
// definition can participate in a loop (with exit_loop) and outside one (without).
1412
type baseToolset struct {
15-
tools []tool.Tool
13+
tools []tool.Tool
14+
artifactTools *toolsartifacts.Toolset
1615
}
1716

1817
func newBaseToolset() (*baseToolset, error) {
19-
return &baseToolset{tools: []tool.Tool{}}, nil
18+
artifactTs, err := toolsartifacts.NewToolset()
19+
if err != nil {
20+
return nil, fmt.Errorf("failed to create artifact toolset: %w", err)
21+
}
22+
23+
return &baseToolset{
24+
tools: []tool.Tool{},
25+
artifactTools: artifactTs,
26+
}, nil
2027
}
2128

2229
func (b *baseToolset) Name() string {
2330
return "base_toolset"
2431
}
2532

26-
func (b *baseToolset) Tools(_ agent.ReadonlyContext) ([]tool.Tool, error) {
27-
return b.tools, nil
33+
func (b *baseToolset) Tools(ctx agent.ReadonlyContext) ([]tool.Tool, error) {
34+
artTools, err := b.artifactTools.Tools(ctx)
35+
if err != nil {
36+
return b.tools, nil
37+
}
38+
all := make([]tool.Tool, 0, len(b.tools)+len(artTools))
39+
all = append(all, b.tools...)
40+
all = append(all, artTools...)
41+
return all, nil
2842
}
