Skip to content

Commit 9054ca9

Browse files
authored
🆔 fix: Atomic File Dedupe, Bedrock Tokens Fix, and Allowed MIME Types (#11675)
* feat: Add support for Apache Parquet MIME types - Introduced 'application/x-parquet' to the full MIME types list and code interpreter MIME types list. - Updated application MIME types regex to include 'x-parquet' and 'vnd.apache.parquet'. - Added mapping for '.parquet' files to 'application/x-parquet' in code type mapping, enhancing file format support. * feat: Implement atomic file claiming for code execution outputs - Added a new `claimCodeFile` function to atomically claim a file_id for code execution outputs, preventing duplicates by using a compound key of filename and conversationId. - Updated `processCodeOutput` to utilize the new claiming mechanism, ensuring that concurrent calls for the same filename converge on a single record. - Refactored related tests to validate the new atomic claiming behavior and its impact on file usage tracking and versioning. * fix: Update image file handling to use cache-busting filepath - Modified the `processCodeOutput` function to generate a cache-busting filepath for updated image files, improving browser caching behavior. - Adjusted related tests to reflect the change from versioned filenames to cache-busted filepaths, ensuring accurate validation of image updates. * fix: Update step handler to prevent undefined content for non-tool call types - Modified the condition in useStepHandler to ensure that undefined content is only assigned for specific content types, enhancing the robustness of content handling. * fix: Update bedrockOutputParser to handle maxTokens for adaptive models - Modified the bedrockOutputParser logic to ensure that maxTokens is not set for adaptive models when neither maxTokens nor maxOutputTokens are provided, improving the handling of adaptive thinking configurations. - Updated related tests to reflect these changes, ensuring accurate validation of the output for adaptive models. * chore: Update @librechat/agents to version 3.1.38 in package.json and package-lock.json * fix: Enhance file claiming and error handling in code processing - Updated the `processCodeOutput` function to use a consistent file ID for claiming files, preventing duplicates and improving concurrency handling. - Refactored the `createFileMethods` to include error handling for failed file claims, ensuring robust behavior when claiming files for conversations. - These changes enhance the reliability of file management in the application. * fix: Update adaptive thinking test for Opus 4.6 model - Modified the test for configuring adaptive thinking to reflect that no default maxTokens should be set for the Opus 4.6 model. - Updated assertions to ensure that maxTokens is undefined, aligning with the expected behavior for adaptive models.
1 parent a771d70 commit 9054ca9

File tree

12 files changed

+130
-126
lines changed

12 files changed

+130
-126
lines changed

‎api/package.json‎

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,7 @@
4444
"@google/genai": "^1.19.0",
4545
"@keyv/redis": "^4.3.3",
4646
"@langchain/core": "^0.3.80",
47-
"@librechat/agents": "^3.1.37",
47+
"@librechat/agents": "^3.1.38",
4848
"@librechat/api": "*",
4949
"@librechat/data-schemas": "*",
5050
"@microsoft/microsoft-graph-client": "^3.0.7",

‎api/server/services/Files/Code/process.js‎

Lines changed: 22 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -18,9 +18,9 @@ const {
1818
getEndpointFileConfig,
1919
} = require('librechat-data-provider');
2020
const { filterFilesByAgentAccess } = require('~/server/services/Files/permissions');
21+
const { createFile, getFiles, updateFile, claimCodeFile } = require('~/models');
2122
const { getStrategyFunctions } = require('~/server/services/Files/strategies');
2223
const { convertImage } = require('~/server/services/Files/images/convert');
23-
const { createFile, getFiles, updateFile } = require('~/models');
2424
const { determineFileType } = require('~/server/utils');
2525

2626
/**
@@ -56,50 +56,6 @@ const createDownloadFallback = ({
5656
};
5757
};
5858

59-
/**
60-
* Find an existing code-generated file by filename in the conversation.
61-
* Used to update existing files instead of creating duplicates.
62-
*
63-
* ## Deduplication Strategy
64-
*
65-
* Files are deduplicated by `(conversationId, filename)` - NOT including `messageId`.
66-
* This is an intentional design decision to handle iterative code development patterns:
67-
*
68-
* **Rationale:**
69-
* - When users iteratively refine code (e.g., "regenerate that chart with red bars"),
70-
* the same logical file (e.g., "chart.png") is produced multiple times
71-
* - Without deduplication, each iteration would create a new file, leading to storage bloat
72-
* - The latest version is what matters for re-upload to the code environment
73-
*
74-
* **Implications:**
75-
* - Different messages producing files with the same name will update the same file record
76-
* - The `messageId` field tracks which message last updated the file
77-
* - The `usage` counter tracks how many times the file has been generated
78-
*
79-
* **Future Considerations:**
80-
* - If file versioning is needed, consider adding a `versions` array or separate version collection
81-
* - The current approach prioritizes storage efficiency over history preservation
82-
*
83-
* @param {string} filename - The filename to search for.
84-
* @param {string} conversationId - The conversation ID.
85-
* @returns {Promise<MongoFile | null>} The existing file or null.
86-
*/
87-
const findExistingCodeFile = async (filename, conversationId) => {
88-
if (!filename || !conversationId) {
89-
return null;
90-
}
91-
const files = await getFiles(
92-
{
93-
filename,
94-
conversationId,
95-
context: FileContext.execute_code,
96-
},
97-
{ createdAt: -1 },
98-
{ text: 0 },
99-
);
100-
return files?.[0] ?? null;
101-
};
102-
10359
/**
10460
* Process code execution output files - downloads and saves both images and non-image files.
10561
* All files are saved to local storage with fileIdentifier metadata for code env re-upload.
@@ -170,12 +126,19 @@ const processCodeOutput = async ({
170126
const fileIdentifier = `${session_id}/${id}`;
171127

172128
/**
173-
* Check for existing file with same filename in this conversation.
174-
* If found, we'll update it instead of creating a duplicate.
129+
* Atomically claim a file_id for this (filename, conversationId, context) tuple.
130+
* Uses $setOnInsert so concurrent calls for the same filename converge on
131+
* a single record instead of creating duplicates (TOCTOU race fix).
175132
*/
176-
const existingFile = await findExistingCodeFile(name, conversationId);
177-
const file_id = existingFile?.file_id ?? v4();
178-
const isUpdate = !!existingFile;
133+
const newFileId = v4();
134+
const claimed = await claimCodeFile({
135+
filename: name,
136+
conversationId,
137+
file_id: newFileId,
138+
user: req.user.id,
139+
});
140+
const file_id = claimed.file_id;
141+
const isUpdate = file_id !== newFileId;
179142

180143
if (isUpdate) {
181144
logger.debug(
@@ -184,27 +147,29 @@ const processCodeOutput = async ({
184147
}
185148

186149
if (isImage) {
150+
const usage = isUpdate ? (claimed.usage ?? 0) + 1 : 1;
187151
const _file = await convertImage(req, buffer, 'high', `${file_id}${fileExt}`);
152+
const filepath = usage > 1 ? `${_file.filepath}?v=${Date.now()}` : _file.filepath;
188153
const file = {
189154
..._file,
155+
filepath,
190156
file_id,
191157
messageId,
192-
usage: isUpdate ? (existingFile.usage ?? 0) + 1 : 1,
158+
usage,
193159
filename: name,
194160
conversationId,
195161
user: req.user.id,
196162
type: `image/${appConfig.imageOutputType}`,
197-
createdAt: isUpdate ? existingFile.createdAt : formattedDate,
163+
createdAt: isUpdate ? claimed.createdAt : formattedDate,
198164
updatedAt: formattedDate,
199165
source: appConfig.fileStrategy,
200166
context: FileContext.execute_code,
201167
metadata: { fileIdentifier },
202168
};
203-
createFile(file, true);
169+
await createFile(file, true);
204170
return Object.assign(file, { messageId, toolCallId });
205171
}
206172

207-
// For non-image files, save to configured storage strategy
208173
const { saveBuffer } = getStrategyFunctions(appConfig.fileStrategy);
209174
if (!saveBuffer) {
210175
logger.warn(
@@ -221,7 +186,6 @@ const processCodeOutput = async ({
221186
});
222187
}
223188

224-
// Determine MIME type from buffer or extension
225189
const detectedType = await determineFileType(buffer, true);
226190
const mimeType = detectedType?.mime || inferMimeType(name, '') || 'application/octet-stream';
227191

@@ -258,11 +222,11 @@ const processCodeOutput = async ({
258222
metadata: { fileIdentifier },
259223
source: appConfig.fileStrategy,
260224
context: FileContext.execute_code,
261-
usage: isUpdate ? (existingFile.usage ?? 0) + 1 : 1,
262-
createdAt: isUpdate ? existingFile.createdAt : formattedDate,
225+
usage: isUpdate ? (claimed.usage ?? 0) + 1 : 1,
226+
createdAt: isUpdate ? claimed.createdAt : formattedDate,
263227
};
264228

265-
createFile(file, true);
229+
await createFile(file, true);
266230
return Object.assign(file, { messageId, toolCallId });
267231
} catch (error) {
268232
logAxiosError({

‎api/server/services/Files/Code/process.spec.js‎

Lines changed: 42 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -61,10 +61,12 @@ jest.mock('@librechat/api', () => ({
6161
}));
6262

6363
// Mock models
64+
const mockClaimCodeFile = jest.fn();
6465
jest.mock('~/models', () => ({
65-
createFile: jest.fn(),
66+
createFile: jest.fn().mockResolvedValue({}),
6667
getFiles: jest.fn(),
6768
updateFile: jest.fn(),
69+
claimCodeFile: (...args) => mockClaimCodeFile(...args),
6870
}));
6971

7072
// Mock permissions (must be before process.js import)
@@ -119,7 +121,11 @@ describe('Code Process', () => {
119121

120122
beforeEach(() => {
121123
jest.clearAllMocks();
122-
// Default mock implementations
124+
// Default mock: atomic claim returns a new file record (no existing file)
125+
mockClaimCodeFile.mockResolvedValue({
126+
file_id: 'mock-uuid-1234',
127+
user: 'user-123',
128+
});
123129
getFiles.mockResolvedValue(null);
124130
createFile.mockResolvedValue({});
125131
getStrategyFunctions.mockReturnValue({
@@ -128,67 +134,46 @@ describe('Code Process', () => {
128134
determineFileType.mockResolvedValue({ mime: 'text/plain' });
129135
});
130136

131-
describe('findExistingCodeFile (via processCodeOutput)', () => {
132-
it('should find existing file by filename and conversationId', async () => {
133-
const existingFile = {
137+
describe('atomic file claim (via processCodeOutput)', () => {
138+
it('should reuse file_id from existing record via atomic claim', async () => {
139+
mockClaimCodeFile.mockResolvedValue({
134140
file_id: 'existing-file-id',
135141
filename: 'test-file.txt',
136142
usage: 2,
137143
createdAt: '2024-01-01T00:00:00.000Z',
138-
};
139-
getFiles.mockResolvedValue([existingFile]);
144+
});
140145

141146
const smallBuffer = Buffer.alloc(100);
142147
axios.mockResolvedValue({ data: smallBuffer });
143148

144149
const result = await processCodeOutput(baseParams);
145150

146-
// Verify getFiles was called with correct deduplication query
147-
expect(getFiles).toHaveBeenCalledWith(
148-
{
149-
filename: 'test-file.txt',
150-
conversationId: 'conv-123',
151-
context: FileContext.execute_code,
152-
},
153-
{ createdAt: -1 },
154-
{ text: 0 },
155-
);
151+
expect(mockClaimCodeFile).toHaveBeenCalledWith({
152+
filename: 'test-file.txt',
153+
conversationId: 'conv-123',
154+
file_id: 'mock-uuid-1234',
155+
user: 'user-123',
156+
});
156157

157-
// Verify the existing file_id was reused
158158
expect(result.file_id).toBe('existing-file-id');
159-
// Verify usage was incremented
160159
expect(result.usage).toBe(3);
161-
// Verify original createdAt was preserved
162160
expect(result.createdAt).toBe('2024-01-01T00:00:00.000Z');
163161
});
164162

165163
it('should create new file when no existing file found', async () => {
166-
getFiles.mockResolvedValue(null);
164+
mockClaimCodeFile.mockResolvedValue({
165+
file_id: 'mock-uuid-1234',
166+
user: 'user-123',
167+
});
167168

168169
const smallBuffer = Buffer.alloc(100);
169170
axios.mockResolvedValue({ data: smallBuffer });
170171

171172
const result = await processCodeOutput(baseParams);
172173

173-
// Should use the mocked uuid
174174
expect(result.file_id).toBe('mock-uuid-1234');
175-
// Should have usage of 1 for new file
176175
expect(result.usage).toBe(1);
177176
});
178-
179-
it('should return null for invalid inputs (empty filename)', async () => {
180-
const smallBuffer = Buffer.alloc(100);
181-
axios.mockResolvedValue({ data: smallBuffer });
182-
183-
// The function handles this internally - with empty name
184-
// findExistingCodeFile returns null early for empty filename (guard clause)
185-
const result = await processCodeOutput({ ...baseParams, name: '' });
186-
187-
// getFiles should NOT be called due to early return in findExistingCodeFile
188-
expect(getFiles).not.toHaveBeenCalled();
189-
// A new file_id should be generated since no existing file was found
190-
expect(result.file_id).toBe('mock-uuid-1234');
191-
});
192177
});
193178

194179
describe('processCodeOutput', () => {
@@ -203,7 +188,6 @@ describe('Code Process', () => {
203188
bytes: 400,
204189
};
205190
convertImage.mockResolvedValue(convertedFile);
206-
getFiles.mockResolvedValue(null);
207191

208192
const result = await processCodeOutput(imageParams);
209193

@@ -218,23 +202,29 @@ describe('Code Process', () => {
218202
expect(result.filename).toBe('chart.png');
219203
});
220204

221-
it('should update existing image file and increment usage', async () => {
205+
it('should update existing image file with cache-busted filepath', async () => {
222206
const imageParams = { ...baseParams, name: 'chart.png' };
223-
const existingFile = {
207+
mockClaimCodeFile.mockResolvedValue({
224208
file_id: 'existing-img-id',
225209
usage: 1,
226210
createdAt: '2024-01-01T00:00:00.000Z',
227-
};
228-
getFiles.mockResolvedValue([existingFile]);
211+
});
229212

230213
const imageBuffer = Buffer.alloc(500);
231214
axios.mockResolvedValue({ data: imageBuffer });
232-
convertImage.mockResolvedValue({ filepath: '/uploads/img.webp' });
215+
convertImage.mockResolvedValue({ filepath: '/images/user-123/existing-img-id.webp' });
233216

234217
const result = await processCodeOutput(imageParams);
235218

219+
expect(convertImage).toHaveBeenCalledWith(
220+
mockReq,
221+
imageBuffer,
222+
'high',
223+
'existing-img-id.png',
224+
);
236225
expect(result.file_id).toBe('existing-img-id');
237226
expect(result.usage).toBe(2);
227+
expect(result.filepath).toMatch(/^\/images\/user-123\/existing-img-id\.webp\?v=\d+$/);
238228
expect(logger.debug).toHaveBeenCalledWith(
239229
expect.stringContaining('Updating existing file'),
240230
);
@@ -335,7 +325,6 @@ describe('Code Process', () => {
335325

336326
describe('usage counter increment', () => {
337327
it('should set usage to 1 for new files', async () => {
338-
getFiles.mockResolvedValue(null);
339328
const smallBuffer = Buffer.alloc(100);
340329
axios.mockResolvedValue({ data: smallBuffer });
341330

@@ -345,8 +334,11 @@ describe('Code Process', () => {
345334
});
346335

347336
it('should increment usage for existing files', async () => {
348-
const existingFile = { file_id: 'existing-id', usage: 5, createdAt: '2024-01-01' };
349-
getFiles.mockResolvedValue([existingFile]);
337+
mockClaimCodeFile.mockResolvedValue({
338+
file_id: 'existing-id',
339+
usage: 5,
340+
createdAt: '2024-01-01',
341+
});
350342
const smallBuffer = Buffer.alloc(100);
351343
axios.mockResolvedValue({ data: smallBuffer });
352344

@@ -356,14 +348,15 @@ describe('Code Process', () => {
356348
});
357349

358350
it('should handle existing file with undefined usage', async () => {
359-
const existingFile = { file_id: 'existing-id', createdAt: '2024-01-01' };
360-
getFiles.mockResolvedValue([existingFile]);
351+
mockClaimCodeFile.mockResolvedValue({
352+
file_id: 'existing-id',
353+
createdAt: '2024-01-01',
354+
});
361355
const smallBuffer = Buffer.alloc(100);
362356
axios.mockResolvedValue({ data: smallBuffer });
363357

364358
const result = await processCodeOutput(baseParams);
365359

366-
// (undefined ?? 0) + 1 = 1
367360
expect(result.usage).toBe(1);
368361
});
369362
});

‎client/src/hooks/SSE/useStepHandler.ts‎

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -111,7 +111,7 @@ export default function useStepHandler({
111111
const updatedContent = [...(message.content || [])] as Array<
112112
Partial<TMessageContentParts> | undefined
113113
>;
114-
if (!updatedContent[index]) {
114+
if (!updatedContent[index] && contentType !== ContentTypes.TOOL_CALL) {
115115
updatedContent[index] = { type: contentPart.type as AllContentTypes };
116116
}
117117

‎package-lock.json‎

Lines changed: 5 additions & 5 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

‎packages/api/package.json‎

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -87,7 +87,7 @@
8787
"@google/genai": "^1.19.0",
8888
"@keyv/redis": "^4.3.3",
8989
"@langchain/core": "^0.3.80",
90-
"@librechat/agents": "^3.1.37",
90+
"@librechat/agents": "^3.1.38",
9191
"@librechat/data-schemas": "*",
9292
"@modelcontextprotocol/sdk": "^1.26.0",
9393
"@smithy/node-http-handler": "^4.4.5",

0 commit comments

Comments
 (0)