
Commit c0eebb7

fix: Resolve Web Worker error by handling File objects correctly and enhance performance with chunked processing
1 parent: 6e5bfc6

File tree

4 files changed (+129 / -68 lines)

PERFORMANCE_OPTIMIZATIONS.md

Lines changed: 17 additions & 3 deletions
````diff
@@ -3,48 +3,56 @@
 ## Issue Resolution
 
 ### Problem Identified
+
 The error "Cannot read properties of undefined (reading 'split')" was caused by the Web Worker expecting a string `fileContent` parameter, but receiving a `File` object instead.
 
 ### Root Cause
+
 The FileUpload component was passing a `File` object directly to the Web Worker, but the worker was trying to call `.split()` on `undefined` because it expected the file content as a string.
 
 ### Solution Implemented
+
 1. **Updated Web Worker**: Modified `csvWorker.js` to properly handle `File` objects by using the `file.text()` method to read file content asynchronously.
 2. **Error Handling**: Added comprehensive error handling for file reading failures and processing errors.
 3. **Proper Async Flow**: Implemented proper promise-based file reading with `.then()` and `.catch()` handlers.
 
 ## Performance Improvements Implemented
 
 ### 1. Web Worker Integration ✅
+
 - **Non-blocking CSV processing**: Large files no longer freeze the UI during upload and processing
 - **Progress tracking**: Real-time progress updates showing rows processed vs total rows
 - **Chunked processing**: Processes data in 10,000-row chunks to maintain responsiveness
 - **Memory efficient**: Processes data incrementally rather than loading everything into memory at once
 
 ### 2. DataProcessor Utility Class ✅
+
 - **Memory-efficient aggregation**: Optimized data structures for large datasets
 - **Intelligent sampling**: Automatically samples large datasets while preserving trends
 - **Efficient filtering**: Early termination and optimized filtering logic
 - **Performance-aware operations**: Limits data points and uses chunked processing
 
 ### 3. Component Optimizations ✅
+
 - **Memoized calculations**: Uses `useMemo` for expensive computations like repository aggregation
 - **Callback optimization**: Uses `useCallback` to prevent unnecessary re-renders
 - **Efficient data structures**: Pre-compiled regex patterns and optimized lookup operations
 
 ### 4. UI/UX Improvements ✅
+
 - **Progress indicators**: Visual progress bar with row count display
 - **Error recovery**: Graceful error handling with user-friendly messages
 - **Background processing**: Non-blocking file uploads maintain UI responsiveness
 
 ## Technical Implementation Details
 
 ### Web Worker Architecture
+
 ```javascript
 // File object handling
-file.text().then(fileContent => {
+file.text().then((fileContent) => {
   processCSVContent(fileContent, chunkSize);
-})
+});
 
 // Chunked processing
 function processChunk(startIndex) {
````
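For context, here is a minimal sketch of the worker flow this hunk describes. `file.text()`, `processCSVContent`, and `processChunk` are named in the document itself; the message shapes (`progress` / `complete` / `error`) and the naive CSV row splitting are assumptions, and the sketch is written in TypeScript for illustration even though the real `csvWorker.js` is plain JavaScript.

```typescript
/// <reference lib="webworker" />
declare const self: DedicatedWorkerGlobalScope;

self.onmessage = (event: MessageEvent) => {
  const { file, chunkSize = 10000 } = event.data as {
    file: File;
    chunkSize?: number;
  };

  // The fix: the worker receives a File object, so read it as text first
  // instead of calling .split() on a parameter that was never a string.
  file
    .text()
    .then((fileContent) => processCSVContent(fileContent, chunkSize))
    .catch((error) =>
      self.postMessage({ type: "error", message: String(error) })
    );
};

function processCSVContent(fileContent: string, chunkSize: number) {
  const rows = fileContent.split("\n"); // safe: fileContent is now a string
  const parsed: string[][] = [];

  function processChunk(startIndex: number) {
    const end = Math.min(startIndex + chunkSize, rows.length);
    for (let i = startIndex; i < end; i++) {
      parsed.push(rows[i].split(",")); // naive CSV split, for illustration only
    }
    // Progress message drives the "rows processed vs total rows" UI
    self.postMessage({ type: "progress", processed: end, total: rows.length });

    if (end < rows.length) {
      setTimeout(() => processChunk(end), 0); // yield between chunks
    } else {
      self.postMessage({ type: "complete", data: parsed });
    }
  }

  processChunk(0);
}
```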
````diff
@@ -54,6 +62,7 @@ function processChunk(startIndex) {
 ```
 
 ### DataProcessor Optimizations
+
 ```typescript
 // Memory-efficient repository aggregation
 static aggregateByRepository(data, topN = 10, breakdown = "quantity") {
````
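On the posting side, a main-thread counterpart to the worker sketch above might look like the following. This wiring is hypothetical; only the `File`-object posting, the chunk size, and the progress/complete/error flow are described by this commit, and the message shapes mirror the assumed ones above.

```typescript
// Hypothetical main-thread wiring for the worker; the actual FileUpload
// internals are not shown in this commit.
function startCsvParse(
  file: File,
  onProgress: (processed: number, total: number) => void,
  onComplete: (rows: string[][]) => void,
  onError: (message: string) => void
): Worker {
  const worker = new Worker(new URL("./csvWorker.js", import.meta.url));

  worker.onmessage = (event: MessageEvent) => {
    const msg = event.data;
    if (msg.type === "progress") onProgress(msg.processed, msg.total);
    else if (msg.type === "complete") onComplete(msg.data);
    else if (msg.type === "error") onError(msg.message);
  };

  // Post the File object itself; the worker reads it via file.text(),
  // which is the core of the bug fix in this commit.
  worker.postMessage({ file, chunkSize: 10000 });
  return worker;
}
```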
````diff
@@ -69,9 +78,10 @@ private static sampleData(data, targetSize) {
 ```
 
 ### Component Optimizations
+
 ```typescript
 // Memoized expensive calculations
-const { topRepos, repoTotals, dailyData } = useMemo(() => 
+const { topRepos, repoTotals, dailyData } = useMemo(() =>
   DataProcessor.aggregateByRepository(data, 10, breakdown),
   [data, breakdown]
 );
````
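The hunk above shows the memoized aggregation; the earlier "Callback optimization" bullet has no snippet of its own. A minimal sketch of the pattern, with hypothetical component and handler names and assumed import paths:

```typescript
import { useCallback, useMemo, useState } from "react";
import { DataProcessor } from "@/lib/dataProcessor"; // import path assumed
import type { ServiceData } from "@/lib/types"; // type location assumed

// Hypothetical wrapper component; names are illustrative, not from the source.
function RepositoryChartContainer({ data }: { data: ServiceData[] }) {
  const [breakdown, setBreakdown] = useState<"cost" | "quantity">("cost");

  // Recompute the aggregation only when its inputs actually change
  const { topRepos, dailyData } = useMemo(
    () => DataProcessor.aggregateByRepository(data, 10, breakdown),
    [data, breakdown]
  );

  // Stable function identity, so children memoized with React.memo
  // are not re-rendered just because the parent rendered
  const handleBreakdownChange = useCallback(
    (next: "cost" | "quantity") => setBreakdown(next),
    []
  );

  // ...render the chart from dailyData / topRepos and pass
  // handleBreakdownChange to the breakdown selector...
  return null; // placeholder for the actual JSX
}
```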
````diff
@@ -89,26 +99,30 @@ static filterData(data, filters) {
 ## Performance Benefits
 
 ### Before Optimizations
+
 - UI freezing during large file uploads
 - Slow rendering with large datasets
 - Memory issues with extensive data
 - Poor user experience during processing
 
 ### After Optimizations
+
 - ✅ Non-blocking file processing with progress tracking
 - ✅ Responsive UI even with large datasets (1000+ data points)
 - ✅ Memory-efficient processing with chunked operations
 - ✅ Optimized rendering with memoized calculations
 - ✅ Graceful error handling and recovery
 
 ## Testing Results
+
 - ✅ Build compilation successful with no errors
 - ✅ Development server running on localhost:3001
 - ✅ Web Worker properly handles File objects
 - ✅ Progress tracking functional during file processing
 - ✅ All existing functionality preserved
 
 ## Privacy-First Approach Maintained
+
 - ✅ All processing remains client-side
 - ✅ No data sent to external servers
 - ✅ Web Workers run in browser context
````

src/components/charts/ServiceChart.tsx

Lines changed: 9 additions & 2 deletions
```diff
@@ -995,14 +995,21 @@ function RepositoryBasedChart({
         <AreaChart data={orgChartData}>
           <CartesianGrid strokeDasharray="3 3" stroke="#374151" />
           <XAxis dataKey="date" stroke="#9ca3af" fontSize={12} />
-          <YAxis stroke="#9ca3af" fontSize={12} tickFormatter={getFormatter()} />
+          <YAxis
+            stroke="#9ca3af"
+            fontSize={12}
+            tickFormatter={getFormatter()}
+          />
           <Tooltip
             contentStyle={{
               backgroundColor: "#1f2937",
               border: "1px solid #374151",
               borderRadius: "8px",
             }}
-            formatter={(value: number) => [getFormatter()(value), getBreakdownLabel()]}
+            formatter={(value: number) => [
+              getFormatter()(value),
+              getBreakdownLabel(),
+            ]}
             labelStyle={{ color: "#d1d5db" }}
           />
           {orgsToShow.map((org: string, index: number) => (
```
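The multi-line layout makes the double invocation easier to see: `getFormatter()` returns a `(value: number) => string`, which Recharts' `tickFormatter` and `formatter` props then apply to each value. The helper itself is outside this diff; a hypothetical standalone version for illustration:

```typescript
// Hypothetical version of the helper; in ServiceChart.tsx it presumably
// takes no arguments and closes over the current breakdown state instead.
function getFormatter(
  breakdown: "cost" | "quantity"
): (value: number) => string {
  return breakdown === "cost"
    ? (value) => `$${value.toFixed(2)}` // cost axis: currency
    : (value) => value.toLocaleString(); // quantity axis: plain number
}

// Hence getFormatter()(value) at the call site: the first call obtains the
// formatting function, the second applies it to a tick or tooltip value.
```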

src/lib/dataProcessor.ts

Lines changed: 79 additions & 51 deletions
```diff
@@ -20,23 +20,27 @@ export class DataProcessor {
     data: ServiceData[],
     options: ProcessingOptions = {}
   ): Record<string, any> {
-    const { maxDataPoints = this.DEFAULT_MAX_DATA_POINTS, chunkSize = this.DEFAULT_CHUNK_SIZE } = options;
-
+    const {
+      maxDataPoints = this.DEFAULT_MAX_DATA_POINTS,
+      chunkSize = this.DEFAULT_CHUNK_SIZE,
+    } = options;
+
     // Sort data by date to enable efficient sampling
     const sortedData = [...data].sort((a, b) => a.date.localeCompare(b.date));
-
+
     // If data is too large, sample it intelligently
-    const sampledData = sortedData.length > maxDataPoints
-      ? this.sampleData(sortedData, maxDataPoints)
-      : sortedData;
+    const sampledData =
+      sortedData.length > maxDataPoints
+        ? this.sampleData(sortedData, maxDataPoints)
+        : sortedData;
 
     const aggregated: Record<string, any> = {};
-
+
     // Process in chunks to avoid blocking the main thread
     for (let i = 0; i < sampledData.length; i += chunkSize) {
       const chunk = sampledData.slice(i, i + chunkSize);
-
-      chunk.forEach(item => {
+
+      chunk.forEach((item) => {
         const date = item.date;
         if (!aggregated[date]) {
           aggregated[date] = {
@@ -45,15 +49,16 @@ export class DataProcessor {
             quantity: 0,
             repositories: new Set<string>(),
             organizations: new Set<string>(),
-            skus: new Set<string>()
+            skus: new Set<string>(),
           };
         }
-
+
         aggregated[date].cost += item.cost;
         aggregated[date].quantity += item.quantity;
-
+
         if (item.repository) aggregated[date].repositories.add(item.repository);
-        if (item.organization) aggregated[date].organizations.add(item.organization);
+        if (item.organization)
+          aggregated[date].organizations.add(item.organization);
         aggregated[date].skus.add(item.sku);
       });
     }
@@ -74,16 +79,19 @@ export class DataProcessor {
   /**
    * Intelligently sample large datasets while preserving trends
    */
-  private static sampleData(data: ServiceData[], targetSize: number): ServiceData[] {
+  private static sampleData(
+    data: ServiceData[],
+    targetSize: number
+  ): ServiceData[] {
     if (data.length <= targetSize) return data;
-
+
     const step = Math.ceil(data.length / targetSize);
     const sampled: ServiceData[] = [];
-
+
     for (let i = 0; i < data.length; i += step) {
       sampled.push(data[i]);
     }
-
+
     return sampled;
   }
 
@@ -114,37 +122,48 @@ export class DataProcessor {
       costCenter?: string;
     }
   ): ServiceData[] {
-    const { startDate, endDate, organization, repository, costCenter } = filters;
-
-    return data.filter(item => {
+    const { startDate, endDate, organization, repository, costCenter } =
+      filters;
+
+    return data.filter((item) => {
       // Date filtering (most selective first)
       if (startDate && item.date < startDate) return false;
       if (endDate && item.date > endDate) return false;
-
+
       // String filtering with exact matches for performance
-      if (organization && organization !== "all" && item.organization !== organization) return false;
-      if (repository && repository !== "all" && item.repository !== repository) return false;
-      if (costCenter && costCenter !== "all" && item.costCenter !== costCenter) return false;
-
+      if (
+        organization &&
+        organization !== "all" &&
+        item.organization !== organization
+      )
+        return false;
+      if (repository && repository !== "all" && item.repository !== repository)
+        return false;
+      if (costCenter && costCenter !== "all" && item.costCenter !== costCenter)
+        return false;
+
       return true;
     });
   }
 
   /**
    * Memory-efficient unique value extraction
    */
-  static getUniqueValues(data: ServiceData[], field: keyof ServiceData): string[] {
+  static getUniqueValues(
+    data: ServiceData[],
+    field: keyof ServiceData
+  ): string[] {
     const seen = new Set<string>();
     const result: string[] = [];
-
+
     for (const item of data) {
       const value = item[field] as string;
       if (value && !seen.has(value)) {
         seen.add(value);
         result.push(value);
       }
     }
-
+
     return result.sort();
   }
 
@@ -163,22 +182,28 @@ export class DataProcessor {
       actionsStorage: [] as ServiceData[],
       packages: [] as ServiceData[],
      copilot: [] as ServiceData[],
-      codespaces: [] as ServiceData[]
+      codespaces: [] as ServiceData[],
     };
 
-    data.forEach(item => {
+    data.forEach((item) => {
       const sku = item.sku.toLowerCase();
-
+
       // Simple pattern matching for basic categorization
-      if (sku.includes('storage')) {
+      if (sku.includes("storage")) {
         categories.actionsStorage.push(item);
-      } else if (sku.includes('action') || sku.includes('minute') || sku.includes('linux') || sku.includes('windows') || sku.includes('macos')) {
+      } else if (
+        sku.includes("action") ||
+        sku.includes("minute") ||
+        sku.includes("linux") ||
+        sku.includes("windows") ||
+        sku.includes("macos")
+      ) {
         categories.actionsMinutes.push(item);
-      } else if (sku.includes('package')) {
+      } else if (sku.includes("package")) {
         categories.packages.push(item);
-      } else if (sku.includes('copilot')) {
+      } else if (sku.includes("copilot")) {
         categories.copilot.push(item);
-      } else if (sku.includes('codespace')) {
+      } else if (sku.includes("codespace")) {
         categories.codespaces.push(item);
       }
     });
@@ -200,9 +225,9 @@ export class DataProcessor {
   } {
     // First pass: calculate repository totals
     const repoTotals: Record<string, { cost: number; quantity: number }> = {};
-
-    data.forEach(item => {
-      const repo = item.repository || 'Unknown';
+
+    data.forEach((item) => {
+      const repo = item.repository || "Unknown";
       if (!repoTotals[repo]) {
         repoTotals[repo] = { cost: 0, quantity: 0 };
       }
@@ -212,31 +237,34 @@ export class DataProcessor {
 
     // Get top repositories
     const topRepos = Object.entries(repoTotals)
-      .sort(([, a], [, b]) => breakdown === "cost" ? b.cost - a.cost : b.quantity - a.quantity)
+      .sort(([, a], [, b]) =>
+        breakdown === "cost" ? b.cost - a.cost : b.quantity - a.quantity
+      )
      .slice(0, topN)
       .map(([repo]) => repo);
 
     // Second pass: aggregate daily data
     const dailyData: Record<string, any> = {};
-
-    data.forEach(item => {
+
+    data.forEach((item) => {
       const date = item.date;
-      const repo = topRepos.includes(item.repository || 'Unknown')
-        ? (item.repository || 'Unknown')
-        : 'Others';
-
+      const repo = topRepos.includes(item.repository || "Unknown")
+        ? item.repository || "Unknown"
+        : "Others";
+
       if (!dailyData[date]) {
         dailyData[date] = { date, total: 0, totalQuantity: 0 };
-        topRepos.forEach(r => {
+        topRepos.forEach((r) => {
           dailyData[date][r] = 0;
           dailyData[date][`${r}_quantity`] = 0;
         });
-        dailyData[date]['Others'] = 0;
-        dailyData[date]['Others_quantity'] = 0;
+        dailyData[date]["Others"] = 0;
+        dailyData[date]["Others_quantity"] = 0;
       }
-
+
       dailyData[date][repo] = (dailyData[date][repo] || 0) + item.cost;
-      dailyData[date][`${repo}_quantity`] = (dailyData[date][`${repo}_quantity`] || 0) + item.quantity;
+      dailyData[date][`${repo}_quantity`] =
+        (dailyData[date][`${repo}_quantity`] || 0) + item.quantity;
       dailyData[date].total += item.cost;
       dailyData[date].totalQuantity += item.quantity;
     });
```
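Read together, the reformatted utility supports a filter-then-aggregate pipeline: narrow the rows first with the early-exit checks in `filterData`, then build the chart series with `aggregateByRepository`. A usage sketch — the signatures are taken from the diff above, while the import paths, `rows` variable, and filter values are illustrative:

```typescript
import { DataProcessor } from "@/lib/dataProcessor"; // import path assumed
import type { ServiceData } from "@/lib/types"; // type location assumed

declare const rows: ServiceData[]; // e.g. the parsed CSV rows from the worker

// Narrow the dataset first; per-row checks bail out on the first miss
const filtered = DataProcessor.filterData(rows, {
  startDate: "2025-01-01",
  endDate: "2025-03-31",
  organization: "all", // "all" is treated as "no filter"
});

// Aggregate the top 10 repositories by cost for the chart
const { topRepos, dailyData } = DataProcessor.aggregateByRepository(
  filtered,
  10,
  "cost"
);

// Populate a filter dropdown in one pass plus a sort
const organizations = DataProcessor.getUniqueValues(filtered, "organization");
```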
