Parsing a .xlsx file with SharpZipLib fails with 'Data descriptor signature not found' error #900

@Helen21356

Describe the bug

Parsing a .xlsx file with SharpZipLib fails with a 'Data descriptor signature not found' error.

Ask:

  1. What does this error mean? What is the 'Data descriptor signature' that the code tried to read?
  2. Is there any plan to support this kind of file?
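
Some background on the first question, as a hedged sketch rather than a statement about SharpZipLib internals: in the ZIP format, when bit 3 of an entry's general-purpose flags is set, the CRC and sizes are written in a "data descriptor" after the compressed data instead of in the local file header. The descriptor may start with the signature 0x08074b50 ("PK\x07\x08"), but the ZIP specification treats that signature as optional, so some writers omit it. The stack trace suggests the streaming reader expected the signature and did not find it. The helper below uses only System.IO; the class and method names are hypothetical, and it inspects only the first local file header.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of SharpZipLib or NPOI.
public static class ZipHeaderProbe
{
    private const uint LocalHeaderSignature = 0x04034b50;    // "PK\x03\x04"
    public const uint DataDescriptorSignature = 0x08074b50;  // "PK\x07\x08" (optional per the ZIP spec)

    // Returns true when the first local file header has general-purpose
    // bit 3 set, meaning CRC and sizes follow the data in a data descriptor.
    public static bool FirstEntryUsesDataDescriptor(Stream zip)
    {
        using (var reader = new BinaryReader(zip, Encoding.ASCII, leaveOpen: true))
        {
            // ZIP fields are little-endian, which matches BinaryReader.
            if (reader.ReadUInt32() != LocalHeaderSignature)
                throw new InvalidDataException("Stream does not start with a ZIP local file header.");

            reader.ReadUInt16();                 // version needed to extract
            ushort flags = reader.ReadUInt16();  // general-purpose bit flag
            return (flags & 0x0008) != 0;        // bit 3: data descriptor present
        }
    }
}
```

If this returns true for the failing file, searching the raw bytes for the 0x08074b50 signature would indicate whether the writer omitted it.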

Exception details:
Only '.xls' and '.xlsx' format is supported in reading excel file while error is ' at ICSharpCode.SharpZipLib.Zip.ZipInputStream.ReadDataDescriptor()
at ICSharpCode.SharpZipLib.Zip.ZipInputStream.CompleteCloseEntry(Boolean testCrc)
at ICSharpCode.SharpZipLib.Zip.ZipInputStream.BodyRead(Byte[] buffer, Int32 offset, Int32 count)
at NPOI.OpenXml4Net.Util.ZipInputStreamZipEntrySource.FakeZipEntry..ctor(ZipEntry entry, ZipInputStream inp)
at NPOI.OpenXml4Net.Util.ZipInputStreamZipEntrySource..ctor(ZipInputStream inp)
at NPOI.OpenXml4Net.OPC.ZipPackage..ctor(Stream filestream, PackageAccess access)
at NPOI.OpenXml4Net.OPC.OPCPackage.Open(Stream in1)
at NPOI.Util.PackageHelper.Open(Stream is1)
at NPOI.XSSF.UserModel.XSSFWorkbook..ctor(Stream is1)
at Microsoft.DataTransfer.ClientLibrary.ExcelUtility.GetExcelWorkbook(String fileExtension, TransferStream stream)'.
Data descriptor signature not found

For comparison:

  1. Python openpyxl can parse this file successfully. It looks like SharpZipLib applies stricter verification.
    https://pypi.org/project/openpyxl/
  2. Office Excel cannot open this .xlsx file successfully. We also tried 'Save as' from the Excel application, after which SharpZipLib can handle the resulting file successfully.
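
One plausible explanation for the difference, offered as an assumption: Python's zipfile module reads entry sizes from the central directory at the end of the archive, whereas a streaming reader has to rely on local headers and data descriptors. A cross-check that does not involve SharpZipLib is to read the file with the .NET base class library's System.IO.Compression.ZipArchive, which also works from the central directory. The class and method names below are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical diagnostic helper, independent of SharpZipLib and NPOI.
public static class CentralDirectoryCheck
{
    // Opens the archive via its central directory, decompresses every entry,
    // and returns the entry names that were read without an exception.
    public static List<string> ReadAllEntries(Stream zip)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(zip, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                using (Stream body = entry.Open())
                {
                    body.CopyTo(Stream.Null); // decompress the whole entry, discarding the bytes
                }
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}
```

If every entry can be read this way, the central directory is likely intact and the failure is probably specific to sequential reading of the data descriptors.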

Reproduction Code

No response

Steps to reproduce

Use the following sample code to read the Excel file with 'bad data':

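    using System;
    using System.IO;
    using NPOI.SS.UserModel;
    using NPOI.XSSF.UserModel;

    // Hypothetical wrapper class and Main: the original snippet omitted the
    // enclosing method and the opening of the try block.
    class ExcelReadRepro
    {
        static void Main(string[] args)
        {
            string filePath = args[0]; // path to the problematic .xlsx file
            try
            {
                using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read))
                {
                    // The exception is thrown here, while NPOI opens the package from the stream.
                    XSSFWorkbook workbook = new XSSFWorkbook(file);
                    ISheet sheet = workbook.GetSheet("Page1_1");
                    if (sheet == null)
                    {
                        Console.WriteLine("Sheet 'Page1_1' not found.");
                        return;
                    }
                    // Print every cell as tab-separated text, one line per row.
                    for (int row = 0; row <= sheet.LastRowNum; row++)
                    {
                        IRow currentRow = sheet.GetRow(row);
                        if (currentRow == null) continue;
                        for (int col = 0; col < currentRow.LastCellNum; col++)
                        {
                            var cell = currentRow.GetCell(col);
                            Console.Write((cell?.ToString() ?? "") + "\t");
                        }
                        Console.WriteLine();
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error reading Excel file:");
                Console.WriteLine(ex.StackTrace);
                Console.WriteLine(ex.Message);
                Console.WriteLine(ex.InnerException);
            }
        }
    }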

Expected behavior

SharpZipLib should be able to read this type of file.

Operating System

No response

Framework Version

No response

Tags

No response

Additional context

No response
