Skip to content

Commit c00518c

Browse files
authored
MOD-14124 Handle shard error replies in internal command execution (#85)
* Handle shard error replies in internal command execution When a shard replies with a Redis error (e.g. NOPERM) to an internal command, the framework previously either silently disconnected (cluster.c rejected non-ARRAY replies) or let the error reach application reply parsers, causing assertion failures. Changes: - Allow REDIS_REPLY_ERROR through MR_OnDataResponseArrived to MR_SetInternalCommandResults instead of disconnecting (cluster.c) - Add MR_SetShardError() for top-level shard errors: captures the error in e->errors, marks all steps done with NULL result placeholders, and advances the nodesDone counter so the execution completes normally - Handle per-element errors: if an individual internal command reply is an error, capture it in e->errors instead of calling replyParser - Add NULL guard to MR_RecordFree for safe cleanup of NULL placeholders Errors propagate through the existing e->errors mechanism, which application-level done callbacks already consume via check_and_reply_on_error(). * ## Summary - Homebrew's LLVM `libclang` on macOS causes `bindgen` to generate an opaque `Record` struct (with only `_address: u8`) instead of the actual definition containing `recordType: *mut MRRecordType`, failing the Rust compilation. - Added a post-processing step in `build.rs` that detects when bindgen produces an opaque `Record` and patches the generated bindings with the correct field. - Bumped `bindgen` dependency from `0.70` to `0.72`. ## Test plan - [ ] macOS CI pipeline passes (`make run_tests`) - [ ] Linux CI pipeline still passes (no change in behavior when Record is generated correctly) * libmr error remove * remove dead code * fix remarks * remove null check
1 parent c21b080 commit c00518c

File tree

3 files changed

+16
-7
lines changed

3 files changed

+16
-7
lines changed

Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ libc = "0.2"
1515
linkme = "0.3"
1616

1717
[build-dependencies]
18-
bindgen = "0.70"
18+
bindgen = "0.72"
1919

2020
[lib]
2121
crate-type = ["rlib"]

src/mr.c

Lines changed: 13 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1180,17 +1180,25 @@ void MR_SetInternalCommandResults(unsigned short nodeIndex, redisReply* reply, E
11801180
++mrCtx.stats.nMissedExecutions;
11811181
return;
11821182
}
1183+
11831184
RedisModule_Assert(reply->type == REDIS_REPLY_ARRAY && reply->elements > 0);
11841185
RedisModule_Assert(array_len(e->steps) == reply->elements);
11851186
size_t nodesDone = 0;
11861187
for (size_t i = 0; i < reply->elements; i++) {
11871188
Step *s = e->steps + i;
1188-
s->internalCommand.reply = reply->element[i]; // First set the reply to be parsed
1189-
Record *record = s->internalCommand.replyParser(s->internalCommand.reply); // Then parse it
1190-
s->internalCommand.reply = NULL; // And keep things tidy
1189+
s->internalCommand.reply = reply->element[i];
1190+
1191+
// Per-element error: one internal command failed (e.g. NOPERM on a key).
1192+
// Capture the error instead of calling the replyParser.
1193+
if (s->internalCommand.reply->type == REDIS_REPLY_ERROR) {
1194+
e->errors = array_append(e->errors, MR_ErrorRecordCreate(s->internalCommand.reply->str));
1195+
s->internalCommand.reply = NULL;
1196+
} else {
1197+
Record *record = s->internalCommand.replyParser(s->internalCommand.reply);
1198+
s->internalCommand.reply = NULL;
1199+
e->results = array_append(e->results, record);
1200+
}
11911201

1192-
e->results = array_append(e->results, record);
1193-
// All steps should return the same number of done nodes because we update all steps of a single node in one go
11941202
if (nodesDone == 0)
11951203
nodesDone = MR_PerformStepDoneOp(e, i);
11961204
else if (nodesDone != MR_PerformStepDoneOp(e, i))
@@ -1199,7 +1207,6 @@ void MR_SetInternalCommandResults(unsigned short nodeIndex, redisReply* reply, E
11991207
}
12001208

12011209
if (nodesDone == MR_ClusterGetSize()) {
1202-
// All nodes (shards), including myself, have answered
12031210
MR_ExecutionAddTask(e, MR_RunExecution, NULL);
12041211
}
12051212
}

src/redismodule.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1379,6 +1379,7 @@ REDISMODULE_API size_t (*RedisModule_MallocSizeDict)(RedisModuleDict* dict) REDI
13791379
REDISMODULE_API RedisModuleUser * (*RedisModule_CreateModuleUser)(const char *name) REDISMODULE_ATTR;
13801380
REDISMODULE_API void (*RedisModule_FreeModuleUser)(RedisModuleUser *user) REDISMODULE_ATTR;
13811381
REDISMODULE_API void (*RedisModule_SetContextUser)(RedisModuleCtx *ctx, const RedisModuleUser *user) REDISMODULE_ATTR;
1382+
REDISMODULE_API const RedisModuleUser * (*RedisModule_GetContextUser)(RedisModuleCtx *ctx) REDISMODULE_ATTR;
13821383
REDISMODULE_API int (*RedisModule_SetModuleUserACL)(RedisModuleUser *user, const char* acl) REDISMODULE_ATTR;
13831384
REDISMODULE_API int (*RedisModule_SetModuleUserACLString)(RedisModuleCtx * ctx, RedisModuleUser *user, const char* acl, RedisModuleString **error) REDISMODULE_ATTR;
13841385
REDISMODULE_API RedisModuleString * (*RedisModule_GetModuleUserACLString)(RedisModuleUser *user) REDISMODULE_ATTR;
@@ -1778,6 +1779,7 @@ static int RedisModule_Init(RedisModuleCtx *ctx, const char *name, int ver, int
17781779
REDISMODULE_GET_API(CreateModuleUser);
17791780
REDISMODULE_GET_API(FreeModuleUser);
17801781
REDISMODULE_GET_API(SetContextUser);
1782+
REDISMODULE_GET_API(GetContextUser);
17811783
REDISMODULE_GET_API(SetModuleUserACL);
17821784
REDISMODULE_GET_API(SetModuleUserACLString);
17831785
REDISMODULE_GET_API(GetModuleUserACLString);

0 commit comments

Comments
 (0)