feat: add configurable file metadata exposure to LLMs #11094
Pull request overview
This PR adds a configurable file metadata exposure feature that allows LLMs to access file metadata (filename, type, size, etc.) when processing attachments. Metadata can be injected in markdown, JSON, or XML formats, with configurable field selection including both safe defaults and opt-in sensitive fields.
Key changes:
- Adds type definitions and Zod schemas for file metadata configuration
- Implements metadata formatting functions supporting markdown, JSON, and XML output
- Integrates metadata injection into the file context extraction pipeline
- Provides comprehensive test coverage for formatting and extraction logic
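The field selection and formatting described above might be sketched roughly as follows. This is not the PR's code: `FileInfoSketch`, `formatMetadataSketch`, and the markdown bullet layout are assumptions made for illustration, and XML output is left out for brevity. Only the default field names (filename, type, bytes) come from the PR.

```typescript
// Hypothetical shapes for illustration only; the PR's real types live in
// packages/data-provider and may differ.
type SketchFormat = 'markdown' | 'json';

interface FileInfoSketch {
  filename: string;
  type: string;
  bytes: number;
  filepath?: string; // listed as an opt-in (sensitive) field in the PR
}

type FieldName = keyof FileInfoSketch;

// Safe defaults named in the PR: filename, type, bytes.
const DEFAULT_FIELDS: FieldName[] = ['filename', 'type', 'bytes'];

function formatMetadataSketch(
  file: FileInfoSketch,
  fields: FieldName[] = DEFAULT_FIELDS,
  format: SketchFormat = 'markdown',
): string {
  // Keep only the requested fields that are actually present on the file.
  const selected: Record<string, string | number> = {};
  for (const field of fields) {
    const value = file[field];
    if (value !== undefined) {
      selected[field] = value;
    }
  }

  if (format === 'json') {
    return JSON.stringify(selected);
  }

  // Assumed markdown layout: one bullet per field.
  return Object.keys(selected)
    .map((key) => `- ${key}: ${selected[key]}`)
    .join('\n');
}
```

With the defaults, an opt-in field such as `filepath` stays out of the output unless it is listed explicitly in `fields`.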
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| packages/data-provider/src/types/files.ts | Adds FileMetadataConfig type definition for metadata configuration |
| packages/data-provider/src/file-config.ts | Defines FileMetadataFields enum, default fields, Zod schema, and merging logic |
| packages/api/src/files/context.ts | Implements metadata formatting functions and integrates metadata injection into extractFileContext |
| packages/api/src/files/context.spec.ts | Adds comprehensive unit tests for metadata formatting and file context extraction |
| librechat.example.yaml | Documents configuration options with examples of safe and opt-in fields |
@danny-avila I have addressed the Copilot comments; could you review it again?
f3d67b3 to 22d6b6d
22d6b6d to 99fb824
When users upload files, LLMs and tools/MCP servers can now access
key metadata (filename, mimeType, sizeBytes, etc.) through a new
configurable fileConfig.metadata option in librechat.yaml.
Features:
- New FileMetadataFields enum with core and opt-in fields
- Configurable output formats: markdown, json, xml
- Default fields: filename, type, bytes (safe defaults)
- Opt-in sensitive fields: filepath, conversationId, file_id
- Human-readable size formatting (e.g., "1.5 MB")
- Full test coverage for formatFileMetadata and extractFileContext
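As a rough illustration of the human-readable size formatting mentioned in the list above, a helper along these lines would produce strings such as "1.5 MB". The name `formatBytesSketch`, the 1024-based units, and the one-decimal rounding are assumptions, not the PR's actual `formatBytes`.

```typescript
// Assumed unit ladder; the last unit acts as a cap so very large sizes
// never index past the end of the array.
const SIZE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB'];

function formatBytesSketch(bytes: number): string {
  // Treat missing, negative, or non-finite sizes as zero (an assumption).
  if (!isFinite(bytes) || bytes <= 0) {
    return '0 B';
  }

  let value = bytes;
  let unitIndex = 0;
  // Divide by 1024 until the value fits the current unit or units run out.
  while (value >= 1024 && unitIndex < SIZE_UNITS.length - 1) {
    value /= 1024;
    unitIndex += 1;
  }

  // Round to one decimal place; whole numbers print without ".0".
  const rounded = Math.round(value * 10) / 10;
  return `${rounded} ${SIZE_UNITS[unitIndex]}`;
}
```

For example, `formatBytesSketch(1572864)` returns `"1.5 MB"` and `formatBytesSketch(1048576)` returns `"1 MB"`.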
Configuration example:
```yaml
fileConfig:
  metadata:
    enabled: true
    fields: [filename, type, bytes, source, filepath]
    format: json
```
This enables LLMs to reference files by name and pass metadata
to tools when invoking them.
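One way the `fileConfig.metadata` block could be merged with defaults is sketched below. The interface and function names are hypothetical, and the `enabled: false` default is an assumption; the default fields (filename, type, bytes) and the markdown default format are taken from the PR text.

```typescript
type MetadataFormat = 'markdown' | 'json' | 'xml';

// Hypothetical config shape mirroring the YAML keys shown above.
interface MetadataConfigSketch {
  enabled: boolean;
  fields: string[];
  format: MetadataFormat;
}

const METADATA_DEFAULTS: MetadataConfigSketch = {
  enabled: false, // assumption: the feature is off unless explicitly enabled
  fields: ['filename', 'type', 'bytes'],
  format: 'markdown',
};

function mergeMetadataConfigSketch(
  user?: Partial<MetadataConfigSketch>,
): MetadataConfigSketch {
  const userFields = user?.fields;
  return {
    enabled: user?.enabled ?? METADATA_DEFAULTS.enabled,
    // An empty or missing list falls back to the safe defaults (assumption).
    fields:
      userFields && userFields.length > 0
        ? [...userFields]
        : [...METADATA_DEFAULTS.fields],
    format: user?.format ?? METADATA_DEFAULTS.format,
  };
}
```

Under these assumptions, a config that only sets `enabled: true` would still expose just the three safe default fields in markdown format.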
- Fix XML injection by escaping special characters in formatAsXml
- Fix formatBytes overflow for files >= 1TB (add TB/PB units)
- Remove unused hasTextFiles variable
- Add token limit for metadata to prevent excessive context usage
- Move source field to opt-in section (may expose infrastructure details)
- Add tests for XML special characters and TB file sizes
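The XML injection fix amounts to escaping the five predefined XML entities before interpolating user-controlled values such as filenames. A minimal sketch follows; `escapeXmlSketch`, `formatAsXmlSketch`, and the `<file>` wrapper element are illustrative assumptions and not the PR's actual `formatAsXml`.

```typescript
// Escape the five characters with predefined XML entities.
// The ampersand is replaced first so that entities produced by the later
// replacements are not escaped a second time.
function escapeXmlSketch(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical XML layout: one child element per metadata field.
// Field names are assumed to be trusted identifiers; only values are escaped.
function formatAsXmlSketch(fields: Record<string, string | number>): string {
  const lines = Object.keys(fields).map(
    (key) => `  <${key}>${escapeXmlSketch(String(fields[key]))}</${key}>`,
  );
  return ['<file>', ...lines, '</file>'].join('\n');
}
```

A filename like `R&D <draft>.pdf` would then be emitted as `R&amp;D &lt;draft&gt;.pdf` rather than breaking out of its element.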
99fb824 to ced7e6e
@danny-avila could you review it again, please? I fixed all the Copilot issues and ran manual tests on my end to confirm that metadata injection works properly with different providers.
Summary
Changes
- FileMetadataFields enum with core and opt-in fields
- fileMetadataConfigSchema for YAML configuration
- formatFileMetadata() function with markdown/json/xml output formats
- extractFileContext() updated to inject metadata into messages
Configuration Example
Output Formats
Markdown (default):
JSON:
{ "filename": "report.pdf", "type": "application/pdf", "bytes": 1048576, "size_human": "1 MB" }
Available Fields
- filename
- type
- bytes
- source
- width/height
- createdAt/updatedAt
- filepath
- conversationId
- file_id
Test plan
- formatFileMetadata() covering all formats and fields
- extractFileContext() with metadata enabled/disabled