Commit f9d3a2b

- [Bug] Fixed QueryAsDataTable throwing NotImplementedException when reading big files #360
- [Doc] Reading big files with a disk-based cache
1 parent c139f35 commit f9d3a2b

12 files changed: +107 -11 lines changed

README.md

Lines changed: 26 additions & 0 deletions
@@ -293,6 +293,32 @@ support variable length and width multi-row and column filling

 ![image](https://user-images.githubusercontent.com/12729184/117973820-6d2f1800-b35f-11eb-88d8-555063938108.png)

+#### 12. Reading big files with a disk-based cache (Disk-Based Cache - SharedString)
+
+If the SharedStrings size exceeds 5 MB, MiniExcel uses a local disk cache by default. For example, with [10x100000.xlsx](https://github.com/MiniExcel/MiniExcel/files/8403819/NotDuplicateSharedStrings_10x100000.xlsx) (one million rows of data), peak memory usage is 195 MB with the disk cache disabled but only 65 MB with it enabled. Note that this optimization trades speed for memory: in this case it increases the reading time from 7.4 seconds to 27.2 seconds. If you don't need it, you can disable the disk cache with the following code:
+
+```csharp
+var config = new OpenXmlConfiguration { EnableSharedStringCache = false };
+MiniExcel.Query(path, configuration: config);
+```
+
+You can use `SharedStringCacheSize` to change the threshold: disk caching is used only when the sharedString file size exceeds the specified size.
+```csharp
+var config = new OpenXmlConfiguration { SharedStringCacheSize=500*1024*1024 };
+MiniExcel.Query(path, configuration: config);
+```
+
+
+![image](https://user-images.githubusercontent.com/12729184/161411851-1c3f72a7-33b3-4944-84dc-ffc1d16747dd.png)
+
+![image](https://user-images.githubusercontent.com/12729184/161411825-17f53ec7-bef4-4b16-b234-e24799ea41b0.png)
+
+
+
+
+
+
+


 ### Create/Export Excel <a name="getstart2"></a>

README.zh-CN.md

Lines changed: 24 additions & 0 deletions
@@ -296,6 +296,30 @@ MiniExcel.Query(path,useHeaderRow:true,startCell:"B3")



+#### 12. Disk cache for reading big files (Disk-Base Cache - SharedString)
+
+Concept: when MiniExcel detects that a file's SharedString size exceeds 5 MB, it uses a local cache by default. For example, reading [10x100000.xlsx](https://github.com/MiniExcel/MiniExcel/files/8403819/NotDuplicateSharedStrings_10x100000.xlsx) (one million records) needs up to about 195 MB of memory without the local cache, and 65 MB with it enabled. Note, however, that this optimization `trades time for reduced memory`, so reading becomes slower: in this example the reading time rises from 7.4 seconds to 27.2 seconds. If you don't need it, you can turn off the disk cache with the following code:
+
+```csharp
+var config = new OpenXmlConfiguration { EnableSharedStringCache = false };
+MiniExcel.Query(path, configuration: config);
+```
+
+You can also use SharedStringCacheSize to adjust the threshold, so that disk caching is used only when the sharedString file size exceeds the specified size.
+```csharp
+var config = new OpenXmlConfiguration { SharedStringCacheSize=500*1024*1024 };
+MiniExcel.Query(path, configuration: config);
+```
+
+
+![image](https://user-images.githubusercontent.com/12729184/161411851-1c3f72a7-33b3-4944-84dc-ffc1d16747dd.png)
+
+![image](https://user-images.githubusercontent.com/12729184/161411825-17f53ec7-bef4-4b16-b234-e24799ea41b0.png)
+
+
+
+
+
 ### Write/Export Excel <a name="getstart2"></a>

 1. Must be a non-abstract type with a public parameterless constructor

README.zh-Hant.md

Lines changed: 19 additions & 0 deletions
@@ -295,6 +295,25 @@ MiniExcel.Query(path,useHeaderRow:true,startCell:"B3")
 ![image](https://user-images.githubusercontent.com/12729184/117973820-6d2f1800-b35f-11eb-88d8-555063938108.png)


+#### 12. Disk cache for reading big files (Disk-Base Cache - SharedString)
+
+Concept: when MiniExcel detects that a file's SharedString size exceeds 5 MB, it uses a local cache by default. For example, reading [10x100000.xlsx](https://github.com/MiniExcel/MiniExcel/files/8403819/NotDuplicateSharedStrings_10x100000.xlsx) (one million records) needs up to about 195 MB of memory without the local cache, and 65 MB with it enabled. Note, however, that this optimization `trades time for reduced memory`, so reading becomes slower: in this example the reading time rises from 7.4 seconds to 27.2 seconds. If you don't need it, you can turn off the disk cache with the following code:
+
+```csharp
+var config = new OpenXmlConfiguration { EnableSharedStringCache = false };
+MiniExcel.Query(path, configuration: config);
+```
+
+You can also use SharedStringCacheSize to adjust the threshold, so that disk caching is used only when the sharedString file size exceeds the specified size.
+```csharp
+var config = new OpenXmlConfiguration { SharedStringCacheSize=500*1024*1024 };
+MiniExcel.Query(path, configuration: config);
+```
+
+
+![image](https://user-images.githubusercontent.com/12729184/161411851-1c3f72a7-33b3-4944-84dc-ffc1d16747dd.png)
+
+![image](https://user-images.githubusercontent.com/12729184/161411825-17f53ec7-bef4-4b16-b234-e24799ea41b0.png)



docs/README.md

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@

 ---

+### 1.25.1
+- [Bug] Fixed QueryAsDataTable throwing NotImplementedException when reading big files #360
+
 ### 1.25.0
 - [New] Support SharingStrings disk cache (when the file size is >= 5 MB); reading a 2 GB SharingStrings file then needs only 1~13 MB of memory #117
 - [New] SaveAs supports an overwriteFile parameter to enable/disable overwriting an existing file #307

docs/README.zh-CN.md

Lines changed: 3 additions & 1 deletion
@@ -32,8 +32,10 @@

 ---

-### 1.25.0
+### 1.25.1
+- [Bug] Fixed QueryAsDataTable throwing NotImplementedException when reading big files #360

+### 1.25.0
 - [New] Support SharingStrings disk cache (file size >= 5 MB); reading 2 GB of SharingStrings now needs only 1~13 MB of memory #117
 - [New] SaveAs supports an overwriteFile parameter to control whether an existing file is overwritten. #307
 - [Bug] SaveAs by datareader sometimes produces an extra autoFilter column #352

docs/README.zh-Hant.md

Lines changed: 3 additions & 0 deletions
@@ -24,6 +24,9 @@

 ---

+### 1.25.1
+- [Bug] Fixed QueryAsDataTable throwing NotImplementedException when reading big files #360
+
 ### 1.25.0
 - [New] Support SharingStrings disk cache (when the file size is >= 5 MB); reading 2 GB of SharingStrings now needs only 1~13 MB of memory #117
 - [New] SaveAs supports an overwriteFile parameter to control whether an existing file is overwritten. #307
16.2 KB
Binary file not shown.
7.86 MB
Binary file not shown.

src/MiniExcel/MiniExcelLibs.csproj

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 <Project Sdk="Microsoft.NET.Sdk">
   <PropertyGroup>
     <TargetFrameworks>net45;netstandard2.0;net5.0</TargetFrameworks>
-    <Version>1.25.0</Version>
+    <Version>1.25.1</Version>
   </PropertyGroup>
   <PropertyGroup>
     <AssemblyName>MiniExcel</AssemblyName>

src/MiniExcel/OpenXml/OpenXmlConfiguration.cs

Lines changed: 1 addition & 1 deletion
@@ -12,6 +12,6 @@ public class OpenXmlConfiguration : Configuration
         public bool EnableConvertByteArray { get; set; } = true;
         public bool IgnoreTemplateParameterMissing { get; set; } = true;
         public bool EnableSharedStringCache { get; set; } = true;
-        public int SharedStringCacheSize { get; set; } = 5 * 1024 * 1024;
+        public long SharedStringCacheSize { get; set; } = 5 * 1024 * 1024;
     }
 }
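
The last hunk widens `SharedStringCacheSize` from `int` to `long`, so the property can hold byte counts above `int.MaxValue` (about 2 GB). The commit does not state why the type was changed, so the following is only a sketch of the arithmetic involved. `CacheThreshold`, `Megabytes`, and `Gigabytes` are hypothetical helper names used for illustration and are not part of MiniExcel.

```csharp
using System;

// Hypothetical helper (not part of MiniExcel): builds byte counts using
// 64-bit arithmetic so that large thresholds do not overflow.
static class CacheThreshold
{
    // The long parameter forces every multiplication to be done in 64-bit.
    public static long Megabytes(long n) => n * 1024 * 1024;
    public static long Gigabytes(long n) => n * 1024 * 1024 * 1024;
}

static class Program
{
    static void Main()
    {
        // 5 MB, the default threshold shown in the diff.
        Console.WriteLine(CacheThreshold.Megabytes(5));      // 5242880
        // 500 MB, the value used in the README example; still fits in an int.
        Console.WriteLine(CacheThreshold.Megabytes(500));    // 524288000
        // 3 GB does not fit in an int, so it needs a long-typed property.
        Console.WriteLine(CacheThreshold.Gigabytes(3) > int.MaxValue); // True
    }
}
```

Assuming version 1.25.1 or later, where the property is a `long`, a value such as `CacheThreshold.Gigabytes(3)` could be assigned to `SharedStringCacheSize`. Writing the same value as `3 * 1024 * 1024 * 1024` with plain `int` literals does not compile, because C# checks constant expressions for overflow; `3L * 1024 * 1024 * 1024` avoids that.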
