Commit 3ca2387

Bentlybro, Otto-AGPT, majiayu000, aryancodes1, and ntindle authored
feat(blocks): Implement Text Encode block (#11857)
## Summary

Implements a `TextEncoderBlock` that encodes plain text into escape sequences (the reverse of `TextDecoderBlock`).

## Changes

### Block Implementation

- Added `encoder_block.py` with `TextEncoderBlock` in `autogpt_platform/backend/backend/blocks/`
- Uses `codecs.encode(text, "unicode_escape").decode("utf-8")` for encoding
- Mirrors the structure and patterns of the existing `TextDecoderBlock`
- Categorised as `BlockCategory.TEXT`

### Documentation

- Added Text Encoder section to `docs/integrations/block-integrations/text.md` (the auto-generated docs file for TEXT category blocks)
- Expanded "How it works" with technical details on the encoding method, validation, and edge cases
- Added 3 structured use cases per docs guidelines: JSON payload preparation, Config/ENV generation, Snapshot fixtures
- Added Text Encoder to the overview table in `docs/integrations/README.md`
- Removed standalone `encoder_block.md` (TEXT category blocks belong in `text.md` per `CATEGORY_FILE_MAP` in `generate_block_docs.py`)

### Documentation Formatting (CodeRabbit feedback)

- Added blank lines around markdown tables (MD058)
- Added `text` language tags to fenced code blocks (MD040)
- Restructured use case section with bold headings per coding guidelines

## How Docs Were Synced

The `check-docs-sync` CI job runs `poetry run python scripts/generate_block_docs.py --check`, which expects blocks to be documented in category-grouped files. Since `TextEncoderBlock` uses `BlockCategory.TEXT`, the `CATEGORY_FILE_MAP` maps it to `text.md`, not a standalone file. The block entry was added to `text.md` following the exact format used by the generator (with `<!-- MANUAL -->` markers for hand-written sections).

## Related Issue

Fixes #11111

---------

Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: lif <19658300+majiayu000@users.noreply.github.com>
Co-authored-by: Aryan Kaul <134673289+aryancodes1@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Nick Tindle <nick@ntindle.com>
1 parent ed07f02 commit 3ca2387
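
As a minimal sketch of the encoding call named in the commit message, the snippet below wraps `codecs.encode(text, "unicode_escape").decode("utf-8")` in a helper and round-trips the result. The helper names `encode_escapes` and `decode_escapes` are illustrative assumptions, and the decoding step is only one plausible inverse; this commit does not show how `TextDecoderBlock` actually decodes.

```python
import codecs


def encode_escapes(text: str) -> str:
    """Encode text with the call quoted in the commit message."""
    # codecs.encode returns ASCII-only bytes, so decoding them is safe.
    return codecs.encode(text, "unicode_escape").decode("utf-8")


def decode_escapes(text: str) -> str:
    """Hypothetical inverse, assumed here only to show the round trip."""
    return text.encode("ascii").decode("unicode_escape")


sample = 'Hello\nWorld!\t"quoted" \\ done'
encoded = encode_escapes(sample)

# Newline, tab, and backslash are escaped; double quotes are left alone.
print(encoded)

# Decoding the escaped form recovers the original text.
print(decode_escapes(encoded) == sample)
```

The round trip holds because the escaped form contains only ASCII characters, so it can be re-encoded to bytes without loss before decoding.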

4 files changed, with 191 additions and 0 deletions
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+"""Text encoding block for converting special characters to escape sequences."""
+
+import codecs
+
+from backend.data.block import (
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+)
+from backend.data.model import SchemaField
+
+
+class TextEncoderBlock(Block):
+    """
+    Encodes a string by converting special characters into escape sequences.
+
+    This block is the inverse of TextDecoderBlock. It takes text containing
+    special characters (like newlines, tabs, etc.) and converts them into
+    their escape sequence representations (e.g., newline becomes \\n).
+    """
+
+    class Input(BlockSchemaInput):
+        """Input schema for TextEncoderBlock."""
+
+        text: str = SchemaField(
+            description="A string containing special characters to be encoded",
+            placeholder="Your text with newlines and quotes to encode",
+        )
+
+    class Output(BlockSchemaOutput):
+        """Output schema for TextEncoderBlock."""
+
+        encoded_text: str = SchemaField(
+            description="The encoded text with special characters converted to escape sequences"
+        )
+        error: str = SchemaField(description="Error message if encoding fails")
+
+    def __init__(self):
+        super().__init__(
+            id="5185f32e-4b65-4ecf-8fbb-873f003f09d6",
+            description="Encodes a string by converting special characters into escape sequences",
+            categories={BlockCategory.TEXT},
+            input_schema=TextEncoderBlock.Input,
+            output_schema=TextEncoderBlock.Output,
+            test_input={
+                "text": """Hello
+World!
+This is a "quoted" string."""
+            },
+            test_output=[
+                (
+                    "encoded_text",
+                    """Hello\\nWorld!\\nThis is a "quoted" string.""",
+                )
+            ],
+        )
+
+    async def run(self, input_data: Input, **kwargs) -> BlockOutput:
+        """
+        Encode the input text by converting special characters to escape sequences.
+
+        Args:
+            input_data: The input containing the text to encode.
+            **kwargs: Additional keyword arguments (unused).
+
+        Yields:
+            The encoded text with escape sequences, or an error message if encoding fails.
+        """
+        try:
+            encoded_text = codecs.encode(input_data.text, "unicode_escape").decode(
+                "utf-8"
+            )
+            yield "encoded_text", encoded_text
+        except Exception as e:
+            yield "error", f"Encoding error: {str(e)}"
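
The `run` method above is an async generator that yields `(output_name, value)` pairs. The sketch below reproduces that pattern without the `backend` package so it can be run on its own. `run_encoder` and `collect` are hypothetical stand-ins, not part of the platform, and the input string mirrors the block's `test_input`.

```python
import asyncio
import codecs
from typing import Any, AsyncGenerator, List, Tuple


async def run_encoder(text: str) -> AsyncGenerator[Tuple[str, Any], None]:
    """Hypothetical stand-in for TextEncoderBlock.run (no backend imports)."""
    try:
        encoded = codecs.encode(text, "unicode_escape").decode("utf-8")
        # Success path: one pair named after the output schema field.
        yield "encoded_text", encoded
    except Exception as e:
        # Failure path: the same shape, routed to the "error" output.
        yield "error", f"Encoding error: {e}"


async def collect(text: str) -> List[Tuple[str, Any]]:
    """Drain the async generator into a list, as the tests in this commit do."""
    return [item async for item in run_encoder(text)]


outputs = asyncio.run(collect('Hello\nWorld!\nThis is a "quoted" string.'))
print(outputs)
```

Yielding named pairs rather than returning a single value lets one generator feed either the `encoded_text` or the `error` output without changing its signature.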
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+import pytest
+
+from backend.blocks.encoder_block import TextEncoderBlock
+
+
+@pytest.mark.asyncio
+async def test_text_encoder_basic():
+    """Test basic encoding of newlines and special characters."""
+    block = TextEncoderBlock()
+    result = []
+    async for output in block.run(TextEncoderBlock.Input(text="Hello\nWorld")):
+        result.append(output)
+
+    assert len(result) == 1
+    assert result[0][0] == "encoded_text"
+    assert result[0][1] == "Hello\\nWorld"
+
+
+@pytest.mark.asyncio
+async def test_text_encoder_multiple_escapes():
+    """Test encoding of multiple escape sequences."""
+    block = TextEncoderBlock()
+    result = []
+    async for output in block.run(
+        TextEncoderBlock.Input(text="Line1\nLine2\tTabbed\rCarriage")
+    ):
+        result.append(output)
+
+    assert len(result) == 1
+    assert result[0][0] == "encoded_text"
+    assert "\\n" in result[0][1]
+    assert "\\t" in result[0][1]
+    assert "\\r" in result[0][1]
+
+
+@pytest.mark.asyncio
+async def test_text_encoder_unicode():
+    """Test that unicode characters are handled correctly."""
+    block = TextEncoderBlock()
+    result = []
+    async for output in block.run(TextEncoderBlock.Input(text="Hello 世界\n")):
+        result.append(output)
+
+    assert len(result) == 1
+    assert result[0][0] == "encoded_text"
+    # Unicode characters should be escaped as \uXXXX sequences
+    assert "\\n" in result[0][1]
+
+
+@pytest.mark.asyncio
+async def test_text_encoder_empty_string():
+    """Test encoding of an empty string."""
+    block = TextEncoderBlock()
+    result = []
+    async for output in block.run(TextEncoderBlock.Input(text="")):
+        result.append(output)
+
+    assert len(result) == 1
+    assert result[0][0] == "encoded_text"
+    assert result[0][1] == ""
+
+
+@pytest.mark.asyncio
+async def test_text_encoder_error_handling():
+    """Test that encoding errors are handled gracefully."""
+    from unittest.mock import patch
+
+    block = TextEncoderBlock()
+    result = []
+
+    with patch("codecs.encode", side_effect=Exception("Mocked encoding error")):
+        async for output in block.run(TextEncoderBlock.Input(text="test")):
+            result.append(output)
+
+    assert len(result) == 1
+    assert result[0][0] == "error"
+    assert "Mocked encoding error" in result[0][1]
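
The unicode test above only asserts that `\\n` appears in the output, although its comment mentions `\uXXXX` sequences. As a sketch of what stricter assertions could look like, the checks below call `codecs` directly rather than the block, so they run without the `backend` package. `encode_escapes` is an illustrative helper name, not something defined in this commit.

```python
import codecs


def encode_escapes(text: str) -> str:
    """Same encoding call the block uses, wrapped for direct checks."""
    return codecs.encode(text, "unicode_escape").decode("utf-8")


# Characters in the Basic Multilingual Plane above U+00FF become \uXXXX.
assert encode_escapes("Hello 世界\n") == "Hello \\u4e16\\u754c\\n"

# Characters from U+0080 to U+00FF use the shorter \xXX form instead.
assert encode_escapes("café") == "caf\\xe9"

# Characters beyond the Basic Multilingual Plane use \UXXXXXXXX.
assert encode_escapes("😀") == "\\U0001f600"

# Tabs and carriage returns get their short escapes.
assert encode_escapes("a\tb\rc") == "a\\tb\\rc"

# A literal backslash is doubled.
assert encode_escapes("C:\\temp") == "C:\\\\temp"

print("all escape-form checks passed")
```

Asserting exact strings rather than substring membership would catch a regression where non-ASCII text passes through unescaped.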

docs/integrations/README.md

Lines changed: 1 addition & 0 deletions
@@ -192,6 +192,7 @@ Below is a comprehensive list of all available blocks, categorized by their prim
 | [Get Current Time](block-integrations/text.md#get-current-time) | This block outputs the current time |
 | [Match Text Pattern](block-integrations/text.md#match-text-pattern) | Matches text against a regex pattern and forwards data to positive or negative output based on the match |
 | [Text Decoder](block-integrations/text.md#text-decoder) | Decodes a string containing escape sequences into actual text |
+| [Text Encoder](block-integrations/text.md#text-encoder) | Encodes a string by converting special characters into escape sequences |
 | [Text Replace](block-integrations/text.md#text-replace) | This block is used to replace a text with a new text |
 | [Text Split](block-integrations/text.md#text-split) | This block is used to split a text into a list of strings |
 | [Word Character Count](block-integrations/text.md#word-character-count) | Counts the number of words and characters in a given text |

docs/integrations/block-integrations/text.md

Lines changed: 36 additions & 0 deletions
@@ -380,6 +380,42 @@ This is useful when working with data from APIs or files where escape sequences
 
 ---
 
+## Text Encoder
+
+### What it is
+Encodes a string by converting special characters into escape sequences
+
+### How it works
+<!-- MANUAL: how_it_works -->
+The Text Encoder takes the input string and applies Python's `unicode_escape` encoding (equivalent to `codecs.encode(text, "unicode_escape").decode("utf-8")`) to transform special characters like newlines, tabs, and backslashes into their escaped forms.
+
+The block relies on the input schema to ensure the value is a string; non-string inputs are rejected by validation, and any encoding failures surface as block errors. Non-ASCII characters are emitted as `\uXXXX` sequences, which is useful for ASCII-only payloads.
+<!-- END MANUAL -->
+
+### Inputs
+
+| Input | Description | Type | Required |
+|-------|-------------|------|----------|
+| text | A string containing special characters to be encoded | str | Yes |
+
+### Outputs
+
+| Output | Description | Type |
+|--------|-------------|------|
+| error | Error message if encoding fails | str |
+| encoded_text | The encoded text with special characters converted to escape sequences | str |
+
+### Possible use case
+<!-- MANUAL: use_case -->
+**JSON Payload Preparation**: Encode multiline or quoted text before embedding it in JSON string fields to ensure proper escaping.
+
+**Config/ENV Generation**: Convert template text into escaped strings for `.env` or YAML values that require special character handling.
+
+**Snapshot Fixtures**: Produce stable escaped strings for golden files or API tests where consistent text representation is needed.
+<!-- END MANUAL -->
+
+---
+
 ## Text Replace
 
 ### What it is
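
The JSON use case in the Text Encoder section added by this hunk is worth checking against the block's own `test_output`, which shows double quotes passing through unescaped. The sketch below compares the `unicode_escape` result with `json.dumps` from the standard library. It is an illustration under that observation, not a statement about how the platform builds JSON payloads, and the sample text is made up.

```python
import codecs
import json

text = 'She said "hi"\nthen left'

# What the Text Encoder's encoding call produces: quotes are left as-is.
escaped = codecs.encode(text, "unicode_escape").decode("utf-8")
print(escaped)  # She said "hi"\nthen left

# What a JSON serializer produces: quotes are escaped and the value is wrapped.
json_literal = json.dumps(text)
print(json_literal)  # "She said \"hi\"\nthen left"

# Wrapping the unicode_escape output in quotes does not yield valid JSON here.
try:
    json.loads('"' + escaped + '"')
    wrapped_is_valid = True
except json.JSONDecodeError:
    wrapped_is_valid = False

print(wrapped_is_valid)  # False
```

For text without double quotes the two forms can coincide, but quoted text appears to need a JSON-aware escaping step in addition to, or instead of, the encoder.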
