Refactor log data processing in dataLoader.ts for improved memory efficiency and error handling #73
Conversation
Summary:

- Removed console logs during NDJSON parsing to clean up output.
- Introduced a new function, `parseLogDataFromStream`, to handle line-by-line parsing of NDJSON from a stream, enhancing memory efficiency for large files.
- Updated `processArrayBuffer` to use the new streaming function for gzip files and improved error handling.
- Adjusted `loadLogDataFromFile` to use streaming for files exceeding 100 MB, ensuring better performance with large datasets.

These changes enhance the robustness and efficiency of log data processing in the application.
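The gzip change in `processArrayBuffer` isn't shown in the diff excerpts below. As a rough illustration only (the real function's shape is an assumption), detecting gzip by its magic bytes and piping through the standard `DecompressionStream` would let it reuse the streaming parser, which is itself sketched after the first review comment:

```ts
// Hypothetical sketch of processArrayBuffer's gzip path; the PR's actual
// implementation is not shown in this excerpt.
async function processArrayBuffer(buffer: ArrayBuffer): Promise<LogEntry[]> {
  const bytes = new Uint8Array(buffer);
  let stream: ReadableStream<Uint8Array> = new Blob([bytes]).stream();
  // gzip files start with the magic bytes 0x1f 0x8b
  if (bytes[0] === 0x1f && bytes[1] === 0x8b) {
    stream = stream.pipeThrough(new DecompressionStream('gzip'));
  }
  // Either way, hand the byte stream to the line-by-line NDJSON parser
  return parseLogDataFromStream(stream);
}
```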
```ts
try {
  const parsedLine: LogEntry = JSON.parse(line);
  if (parsedLine && typeof parsedLine === 'object') {
    entries.push(parsedLine);
  }
} catch (e) {
  console.warn(`Failed to parse line as JSON: ${line.substring(0, 100)}...`);
}
```
very minor nit - you can de-duplicate this with the other place (above) where you handle this logic
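For context, here is a minimal sketch of what `parseLogDataFromStream` might look like end to end; the diff only shows the inner parse loop, so the signature, the `LogEntry` shape, and the chunk-buffering details are assumptions. Extracting the per-line try/catch into a small helper (here a hypothetical `tryParseLine`) would also address the de-duplication nit above:

```ts
interface LogEntry {
  [key: string]: unknown; // assumed shape; the real type lives in dataLoader.ts
}

// Shared per-line parser; factoring this out de-duplicates the logic
function tryParseLine(line: string, entries: LogEntry[]): void {
  if (!line.trim()) return;
  try {
    const parsedLine: LogEntry = JSON.parse(line);
    if (parsedLine && typeof parsedLine === 'object') {
      entries.push(parsedLine);
    }
  } catch (e) {
    console.warn(`Failed to parse line as JSON: ${line.substring(0, 100)}...`);
  }
}

export async function parseLogDataFromStream(
  stream: ReadableStream<Uint8Array>
): Promise<LogEntry[]> {
  const entries: LogEntry[] = [];
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // hold back the trailing partial line
    for (const line of lines) tryParseLine(line, entries);
  }

  // Flush the decoder and parse whatever remains in the buffer
  tryParseLine(buffer + decoder.decode(), entries);
  return entries;
}
```

Only complete lines are parsed per chunk, so memory usage stays proportional to the longest line rather than the whole file.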
```diff
@@ -367,6 +400,18 @@ export async function loadLogData(url: string): Promise<LogEntry[]> {
  * @returns Promise resolving to an array of LogEntry objects
  */
 export function loadLogDataFromFile(file: File): Promise<LogEntry[]> {
+  // For large files, we should use streaming to avoid memory issues
+  const LARGE_FILE_THRESHOLD = 100 * 1024 * 1024; // 100 MB
```
Can you share the constant with CodeViewer.tsx?
Edit: nvm, I realize it's a different limit
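Putting the threshold to use, the large-file branch presumably looks something like this. This is a sketch, assuming `File.stream()` feeds the streaming parser; the small-file path shown is also an assumption about the pre-existing code:

```ts
export function loadLogDataFromFile(file: File): Promise<LogEntry[]> {
  // For large files, we should use streaming to avoid memory issues
  const LARGE_FILE_THRESHOLD = 100 * 1024 * 1024; // 100 MB

  if (file.size > LARGE_FILE_THRESHOLD) {
    // File.stream() yields chunks, so the whole file never sits in memory
    // (gzip detection for this streaming path is omitted in this sketch)
    return parseLogDataFromStream(file.stream());
  }
  // Smaller files keep the assumed original in-memory path, which also
  // handles gzip inside processArrayBuffer
  return file.arrayBuffer().then(processArrayBuffer);
}
```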
LGTM!
A few nits inline - otherwise looks good!