Commit d15634a

feat: Add query length validation and input sanitization
Add configurable query expression length limits to prevent resource exhaustion from overly complex queries, and implement comprehensive input validation to reject control characters in client library parameters.

Changes:
- Add api.max_query_length config parameter (default: 128, 0 = unlimited)
- Implement query length calculation across search text, AND/NOT terms, filters, and ORDER BY
- Add QueryParser::SetMaxQueryLength() for runtime configuration
- Apply length validation in both SEARCH and COUNT commands
- Add control character detection in MygramClient for all string inputs
- Update config schema, examples, and documentation (EN/JA)
- Add serialization support for new config parameter in dump format
- Add comprehensive unit tests for validation logic
- Update documentation to explain query length limits and error handling

This improves security by preventing injection attacks through control characters and guards against resource exhaustion from unbounded query complexity.
1 parent f73cc6b commit d15634a

23 files changed

+334
-6
lines changed
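
The length calculation itself is not part of the hunks shown on this page, so the following is only a minimal sketch of how such a guard could work. `QueryParts`, `QueryExpressionLength`, and `CheckQueryLength` are hypothetical names, lengths are counted in bytes, and only filter values (not keys) are counted; all of these are assumptions. The commit message lists ORDER BY as part of the length while the documentation omits it; the sketch follows the commit message. The server's exact error wording is elided in the docs, so the message below is illustrative only.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed query; the real parser types are not shown here.
struct QueryParts {
  std::string search_text;
  std::vector<std::string> and_terms;
  std::vector<std::string> not_terms;
  std::vector<std::pair<std::string, std::string>> filters;  // (column, value)
  std::string order_by;                                       // empty when absent
};

// Sum the byte lengths of every user-supplied piece of the expression.
// Assumption: filter values count, filter column names do not.
std::size_t QueryExpressionLength(const QueryParts& q) {
  std::size_t total = q.search_text.size();
  for (const auto& term : q.and_terms) total += term.size();
  for (const auto& term : q.not_terms) total += term.size();
  for (const auto& filter : q.filters) total += filter.second.size();
  total += q.order_by.size();
  return total;
}

// Returns an illustrative error message when the limit is exceeded.
// max_length == 0 disables the guard, matching "0 = unlimited".
std::optional<std::string> CheckQueryLength(const QueryParts& q, std::size_t max_length) {
  if (max_length == 0) return std::nullopt;
  const std::size_t length = QueryExpressionLength(q);
  if (length <= max_length) return std::nullopt;
  return "Query expression length (" + std::to_string(length) +
         ") exceeds maximum allowed length (" + std::to_string(max_length) + ")";
}
```

Under these assumptions a caller would run `CheckQueryLength` once per SEARCH or COUNT command, before executing the query, and return the message as an `ERROR` response when it is set.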

docker/entrypoint.sh

Lines changed: 2 additions & 2 deletions
@@ -118,8 +118,8 @@ memory:
     width: "${MEMORY_NORMALIZE_WIDTH}"
     lower: ${MEMORY_NORMALIZE_LOWER}
 
-# Snapshot Persistence
-snapshot:
+# Dump Persistence
+dump:
   dir: "${SNAPSHOT_DIR}"
   interval_sec: ${SNAPSHOT_INTERVAL_SEC}
   retain: ${SNAPSHOT_RETAIN}

docs/en/configuration.md

Lines changed: 16 additions & 1 deletion
@@ -83,6 +83,8 @@ api:
     enable: true
     bind: "127.0.0.1"
     port: 8080
+  default_limit: 100
+  max_query_length: 128
 
 network:
   allow_cidrs: []
@@ -168,7 +170,9 @@ The same configuration in JSON format:
       "enable": true,
       "bind": "127.0.0.1",
       "port": 8080
-    }
+    },
+    "default_limit": 100,
+    "max_query_length": 128
   },
   "network": {
     "allow_cidrs": []
@@ -398,6 +402,17 @@ api:
     enable: true # Default: true
     bind: "127.0.0.1" # Default: 127.0.0.1 (localhost only)
     port: 8080 # Default: 8080
+  default_limit: 100 # Default LIMIT when not specified (5-1000)
+  max_query_length: 128 # Max query expression length (0 = unlimited)
+
+### Query Defaults
+
+- **default_limit**: Used when a SEARCH query omits `LIMIT`. Keeps pagination predictable and prevents large responses.
+  - Range 5–1000, default 100.
+  - Applies to TCP/HTTP/CLI clients alike.
+- **max_query_length**: Rejects overly long boolean expressions to avoid resource exhaustion.
+  - Default 128 characters; set to `0` to disable the guard.
+  - The length includes search text, AND/NOT terms, and FILTER values.
 ```
 
 ## Network Section (Optional)

docs/en/query_syntax.md

Lines changed: 10 additions & 0 deletions
@@ -407,6 +407,16 @@ SEARCH articles tech LIMIT 10 OFFSET 10
 SEARCH articles tech LIMIT 10 OFFSET 20
 ```
 
+### Maximum Query Length
+
+MygramDB rejects queries whose combined expression length (search text + AND/NOT terms + FILTER values) exceeds the configured limit.
+
+- **Default:** 128 characters
+- **Config:** `api.max_query_length` (`0` disables the guard)
+- **Error:** `ERROR Query expression length (...) exceeds maximum allowed length...`
+
+Keep boolean expressions compact or raise the limit in `config.yaml` if applications require longer filters.
+
 ### Complete Example with All Options
 
 ```

docs/ja/configuration.md

Lines changed: 21 additions & 0 deletions
@@ -78,6 +78,12 @@ api:
   tcp:
     bind: "0.0.0.0"
     port: 11016
+  http:
+    enable: true
+    bind: "127.0.0.1"
+    port: 8080
+  default_limit: 100
+  max_query_length: 128
 
 network:
   allow_cidrs: []
@@ -158,6 +164,14 @@ JSON 形式での同じ設定:
     "tcp": {
       "bind": "0.0.0.0",
       "port": 11016
+    },
+    "http": {
+      "enable": true,
+      "bind": "127.0.0.1",
+      "port": 8080
+    },
+    "default_limit": 100,
+    "max_query_length": 128
     }
   },
   "network": {
@@ -383,6 +397,13 @@ api:
     enable: true # デフォルト: true
     bind: "127.0.0.1" # デフォルト: 127.0.0.1(ローカルホストのみ)
     port: 8080 # デフォルト: 8080
+  default_limit: 100 # LIMIT 省略時のデフォルト (5-1000)
+  max_query_length: 128 # クエリ式の最大長 (0 = 無制限)
+
+### クエリ関連のデフォルト
+
+- **default_limit**: SEARCH で `LIMIT` を指定しない場合に自動的に適用されます。レスポンス肥大化を防ぐため、5〜1000 の範囲で設定可能(既定 100)。
+- **max_query_length**: 検索語・AND/NOT 条件・FILTER 値を合計したクエリ式の長さ上限です。既定 128 文字、`0` を設定すると無制限。非常に長いクエリによるリソース消費を抑制します。
 ```
 
 ## Network セクション(オプション)

docs/ja/query_syntax.md

Lines changed: 10 additions & 0 deletions
@@ -407,6 +407,16 @@ SEARCH articles tech LIMIT 10 OFFSET 10
 SEARCH articles tech LIMIT 10 OFFSET 20
 ```
 
+### クエリ長の上限
+
+MygramDB は、検索語・AND/NOT 条件・FILTER 値を合計したクエリ式が設定された長さを超えると `ERROR` を返します。
+
+- **デフォルト:** 128文字
+- **設定:** `api.max_query_length`(`0` で無効化)
+- **エラー例:** `ERROR Query expression length (...) exceeds ...`
+
+複雑な条件が必要な場合は、`config.yaml` で上限を調整するか、複数のクエリに分割してください。
+
 ### すべてのオプションを含む完全な例
 
 ```

examples/config.json

Lines changed: 8 additions & 1 deletion
@@ -97,7 +97,14 @@
     "tcp": {
       "bind": "0.0.0.0",
       "port": 11016
-    }
+    },
+    "http": {
+      "enable": true,
+      "bind": "127.0.0.1",
+      "port": 8080
+    },
+    "default_limit": 100,
+    "max_query_length": 128
   },
   "network": {
     "allow_cidrs": []

examples/config.yaml

Lines changed: 6 additions & 0 deletions
@@ -159,6 +159,12 @@ api:
   tcp:
     bind: "0.0.0.0" # TCP bind address (default: 0.0.0.0, all interfaces)
     port: 11016 # TCP port (default: 11016)
+  http:
+    enable: true # Enable HTTP/JSON API (default: true)
+    bind: "127.0.0.1" # HTTP bind address (default: 127.0.0.1)
+    port: 8080 # HTTP port (default: 8080)
+  default_limit: 100 # Default LIMIT when not specified (range 5-1000)
+  max_query_length: 128 # Max query expression length (0 = unlimited)
 
 # Network Security (optional)
 network:

src/client/mygramclient.cpp

Lines changed: 90 additions & 0 deletions
@@ -11,7 +11,9 @@
 #include <sys/time.h>
 #include <unistd.h>
 
+#include <cctype>
 #include <cstring>
+#include <iomanip>
 #include <sstream>
 #include <utility>
 
@@ -93,6 +95,22 @@ std::optional<DebugInfo> ParseDebugInfo(const std::vector<std::string>& tokens,
   return info;
 }
 
+/**
+ * @brief Validate that a string does not contain ASCII control characters
+ */
+std::optional<std::string> ValidateNoControlCharacters(const std::string& value, const char* field_name) {
+  for (unsigned char character : value) {
+    if (std::iscntrl(character) != 0) {
+      std::ostringstream oss;
+      oss << "Input for " << field_name << " contains control character 0x" << std::uppercase << std::hex
+          << std::setw(2) << std::setfill('0') << static_cast<int>(character) << ", which is not allowed";
+      return oss.str();
+    }
+  }
+
+  return std::nullopt;
+}
+
 /**
  * @brief Escape special characters in query strings
  */
@@ -235,6 +253,36 @@ class MygramClient::Impl {
       const std::vector<std::string>& not_terms,
       const std::vector<std::pair<std::string, std::string>>& filters,
       const std::string& sort_column, bool sort_desc) {
+    if (auto err = ValidateNoControlCharacters(table, "table name")) {
+      return Error(*err);
+    }
+    if (auto err = ValidateNoControlCharacters(query, "search query")) {
+      return Error(*err);
+    }
+    for (const auto& term : and_terms) {
+      if (auto err = ValidateNoControlCharacters(term, "AND term")) {
+        return Error(*err);
+      }
+    }
+    for (const auto& term : not_terms) {
+      if (auto err = ValidateNoControlCharacters(term, "NOT term")) {
+        return Error(*err);
+      }
+    }
+    for (const auto& [key, value] : filters) {
+      if (auto err = ValidateNoControlCharacters(key, "filter key")) {
+        return Error(*err);
+      }
+      if (auto err = ValidateNoControlCharacters(value, "filter value")) {
+        return Error(*err);
+      }
+    }
+    if (!sort_column.empty()) {
+      if (auto err = ValidateNoControlCharacters(sort_column, "sort column")) {
+        return Error(*err);
+      }
+    }
+
     // Build command
     std::ostringstream cmd;
     cmd << "SEARCH " << table << " " << EscapeQueryString(query);
@@ -330,6 +378,31 @@ class MygramClient::Impl {
       const std::vector<std::string>& and_terms,
       const std::vector<std::string>& not_terms,
       const std::vector<std::pair<std::string, std::string>>& filters) {
+    if (auto err = ValidateNoControlCharacters(table, "table name")) {
+      return Error(*err);
+    }
+    if (auto err = ValidateNoControlCharacters(query, "search query")) {
+      return Error(*err);
+    }
+    for (const auto& term : and_terms) {
+      if (auto err = ValidateNoControlCharacters(term, "AND term")) {
+        return Error(*err);
+      }
+    }
+    for (const auto& term : not_terms) {
+      if (auto err = ValidateNoControlCharacters(term, "NOT term")) {
+        return Error(*err);
+      }
+    }
+    for (const auto& [key, value] : filters) {
+      if (auto err = ValidateNoControlCharacters(key, "filter key")) {
+        return Error(*err);
+      }
+      if (auto err = ValidateNoControlCharacters(value, "filter value")) {
+        return Error(*err);
+      }
+    }
+
     // Build command
     std::ostringstream cmd;
     cmd << "COUNT " << table << " " << EscapeQueryString(query);
@@ -385,6 +458,13 @@ class MygramClient::Impl {
   }
 
   std::variant<Document, Error> Get(const std::string& table, const std::string& primary_key) {
+    if (auto err = ValidateNoControlCharacters(table, "table name")) {
+      return Error(*err);
+    }
+    if (auto err = ValidateNoControlCharacters(primary_key, "primary key")) {
+      return Error(*err);
+    }
+
     std::ostringstream cmd;
     cmd << "GET " << table << " " << primary_key;
 
@@ -505,6 +585,12 @@ class MygramClient::Impl {
   }
 
   std::variant<std::string, Error> Save(const std::string& filepath) {
+    if (!filepath.empty()) {
+      if (auto err = ValidateNoControlCharacters(filepath, "filepath")) {
+        return Error(*err);
+      }
+    }
+
     std::string cmd = filepath.empty() ? "SAVE" : "SAVE " + filepath;
 
     auto result = SendCommand(cmd);
@@ -526,6 +612,10 @@ class MygramClient::Impl {
   }
 
   std::variant<std::string, Error> Load(const std::string& filepath) {
+    if (auto err = ValidateNoControlCharacters(filepath, "filepath")) {
+      return Error(*err);
+    }
+
     auto result = SendCommand("LOAD " + filepath);
     if (auto* err = std::get_if<Error>(&result)) {
       return *err;
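
To see what the client-side check accepts and rejects, the helper can be exercised on its own. The sketch below reproduces `ValidateNoControlCharacters` from the diff so that it compiles standalone, and adds `ValidateAll`, a hypothetical helper that is not part of the commit, as one possible way to fold the repeated per-term loops in `Search` and `Count` into a single call. In the default "C" locale, `std::iscntrl` is true for 0x00 through 0x1F and for 0x7F, so tabs and newlines are rejected along with other control bytes.

```cpp
#include <cctype>
#include <iomanip>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Same logic as the helper added in the commit, copied here for a standalone build.
std::optional<std::string> ValidateNoControlCharacters(const std::string& value, const char* field_name) {
  for (unsigned char character : value) {
    if (std::iscntrl(character) != 0) {
      std::ostringstream oss;
      oss << "Input for " << field_name << " contains control character 0x" << std::uppercase << std::hex
          << std::setw(2) << std::setfill('0') << static_cast<int>(character) << ", which is not allowed";
      return oss.str();
    }
  }
  return std::nullopt;
}

// Hypothetical helper (not in the commit): check every element of a list and
// return the first error, or std::nullopt when all elements are clean.
std::optional<std::string> ValidateAll(const std::vector<std::string>& values, const char* field_name) {
  for (const auto& value : values) {
    if (auto err = ValidateNoControlCharacters(value, field_name)) {
      return err;
    }
  }
  return std::nullopt;
}
```

For instance, validating `"tech\nnews"` as the search query yields `Input for search query contains control character 0x0A, which is not allowed`, while a plain table name such as `"articles"` passes.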

src/config/config-schema.json

Lines changed: 7 additions & 0 deletions
@@ -496,6 +496,13 @@
           "default": 100,
           "minimum": 5,
           "maximum": 1000
+        },
+        "max_query_length": {
+          "type": "integer",
+          "description": "Maximum character length allowed for search query expressions (text + conditions)",
+          "default": 128,
+          "minimum": 1,
+          "maximum": 4096
         }
       }
     },

src/config/config.cpp

Lines changed: 3 additions & 0 deletions
@@ -517,6 +517,9 @@ Config ParseConfigFromJson(const json& root) {
     if (api.contains("default_limit")) {
       config.api.default_limit = api["default_limit"].get<int>();
     }
+    if (api.contains("max_query_length")) {
+      config.api.max_query_length = api["max_query_length"].get<int>();
+    }
   }
 
   // Parse network config
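
The parsing hunk stores the value without a range check, and the schema sets `"minimum": 1` while the documentation describes `0` as "unlimited". If schema validation is enforced, a value of `0` might therefore be rejected before it reaches the parser. The following is a minimal sketch of a check that accepts both readings; `ValidateMaxQueryLength` and `kMaxQueryLengthUpperBound` are hypothetical names, and the decision to allow `0` is an assumption taken from the docs rather than from the schema.

```cpp
#include <optional>
#include <string>

// Upper bound as stated in the schema; the constant name is illustrative.
constexpr int kMaxQueryLengthUpperBound = 4096;

// Returns an error message for out-of-range values, std::nullopt otherwise.
// Assumption: 0 means "unlimited", as the documentation states.
std::optional<std::string> ValidateMaxQueryLength(int value) {
  if (value == 0) {
    return std::nullopt;  // guard disabled
  }
  if (value < 0 || value > kMaxQueryLengthUpperBound) {
    return "api.max_query_length must be 0 (unlimited) or between 1 and " +
           std::to_string(kMaxQueryLengthUpperBound);
  }
  return std::nullopt;
}
```

A caller could run this right after reading the field and refuse to start when it returns a message, which would keep the schema, parser, and documentation in agreement about `0`.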
