Changes from 10 commits
37 commits
02de760
Added heuristics file content detector that determines the content ba…
Dimi1010 Sep 12, 2025
d2b6339
Merge remote-tracking branch 'upstream/dev' into feature/heuristic-fi…
Dimi1010 Sep 12, 2025
685dd9f
Moved stream checkpoint outside format detector as it is not directly…
Dimi1010 Sep 12, 2025
40dee69
Added a new factory function `createReader` that uses the new heurist…
Dimi1010 Sep 12, 2025
f1e3e18
Add <algorithm> include.
Dimi1010 Sep 12, 2025
8da1790
Added unit tests.
Dimi1010 Sep 12, 2025
3ad51e2
Deprecated old factory function.
Dimi1010 Sep 12, 2025
15c2000
Add byte-swapped zstd magic number.
Dimi1010 Sep 12, 2025
17af8d4
Lint
Dimi1010 Sep 12, 2025
46418ec
Move enum closer to first usage.
Dimi1010 Sep 12, 2025
3d713ab
Added unit tests for file reader device factory.
Dimi1010 Sep 15, 2025
a2391ec
Revert indentation.
Dimi1010 Sep 15, 2025
ea328d7
Fixed StreamCheckpoint to restore original stream state.
Dimi1010 Sep 15, 2025
db86c3e
Merge branch 'dev' into feature/heuristic-file-selection
Dimi1010 Sep 15, 2025
4aed9bd
Merge branch 'dev' into feature/heuristic-file-selection
Dimi1010 Sep 20, 2025
a83ae2b
Moved isStreamSeekable helper to inside `CaptureFileFormatDetector`.
Dimi1010 Sep 20, 2025
916e872
Added pcap magic number for Alexey Kuznetzov's modified pcap format.
Dimi1010 Sep 20, 2025
022529f
Merge remote-tracking branch 'upstream/dev' into feature/heuristic-fi…
Dimi1010 Sep 26, 2025
169fcd2
Split the unit test into multiple smaller tests.
Dimi1010 Sep 26, 2025
db8c848
Merge branch 'dev' into feature/heuristic-file-selection
Dimi1010 Oct 2, 2025
3e74912
Merge remote-tracking branch 'upstream/dev' into feature/heuristic-fi…
Dimi1010 Oct 2, 2025
f1613c4
Added helper to indicate if ZstSupport is enabled for PcapNg devices.
Dimi1010 Oct 2, 2025
bc2bacd
Split pcap microsecond and nanosecond file heuristics tests.
Dimi1010 Oct 2, 2025
58ac45d
Skipping Zst test case if zst is not supported.
Dimi1010 Oct 2, 2025
3b4b5ad
Due to file heuristics returning PcapNG format on Zstd archive, if Zs…
Dimi1010 Oct 2, 2025
18379b4
Lint
Dimi1010 Oct 2, 2025
8a4f6f8
Added invalid device factory to pcap tag.
Dimi1010 Oct 2, 2025
7776e0e
Updated static zst archives to be actual archives.
Dimi1010 Oct 2, 2025
4f52f59
Centralized PTF test name width under a macro.
Dimi1010 Oct 3, 2025
88ebfff
Add Pcap++Test header files to test sources for IDE tooling.
Dimi1010 Oct 3, 2025
41fe188
Fixed test output formatting.
Dimi1010 Oct 3, 2025
c8ae4f8
Lint
Dimi1010 Oct 3, 2025
c7cab2b
Typo fix.
Dimi1010 Oct 3, 2025
6d55077
Merge remote-tracking branch 'upstream/dev' into feature/heuristic-fi…
Dimi1010 Oct 6, 2025
682eeac
Shortened test names.
Dimi1010 Oct 6, 2025
07804da
Simplified invalid file test.
Dimi1010 Oct 6, 2025
9c4fc08
Simplified ZST tests.
Dimi1010 Oct 6, 2025
12 changes: 12 additions & 0 deletions Pcap++/header/PcapFileDevice.h
@@ -118,7 +118,19 @@ namespace pcpp
/// it returns an instance of PcapFileReaderDevice
/// @param[in] fileName The file name to open
/// @return An instance of the reader to read the file. Notice you should free this instance when done using it
/// @deprecated Prefer `createReader` due to selection of reader based on file content instead of extension.
PCPP_DEPRECATED("Prefer `createReader` due to selection of reader based on file content instead of extension.")
static IFileReaderDevice* getReader(const std::string& fileName);

/// @brief Creates an instance of the reader best fit to read the file.
///
/// The factory function uses heuristics based on the file content to decide the reader.
/// If the file type is known at compile time, it is better to construct a concrete reader instance directly.
///
/// @param[in] fileName The path to the file to open.
/// @return A unique pointer to a reader instance or nullptr if the file is not supported.
/// @throws std::runtime_error If the file could not be opened.
static std::unique_ptr<IFileReaderDevice> createReader(const std::string& fileName);
};

/// @class IFileWriterDevice
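
The documented contract of `createReader` leaves a caller with three outcomes to handle: a usable reader, `nullptr` for unsupported content, and `std::runtime_error` when the file cannot be opened. A minimal sketch of such a call site follows. To keep it compilable on its own it uses hypothetical stand-ins (`StubReader`, `createReaderStub`, `tryOpen`, `OpenOutcome`) rather than the real PcapPlusPlus types, so it only illustrates the control flow, not the library's behaviour.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for IFileReaderDevice; not the PcapPlusPlus class.
struct StubReader
{
    virtual ~StubReader() = default;
    virtual bool open() { return true; }
};

// Hypothetical factory that mimics the documented contract:
// throws if the file cannot be opened, returns nullptr if the content is unsupported.
std::unique_ptr<StubReader> createReaderStub(const std::string& fileName)
{
    if (fileName.empty())
        throw std::runtime_error("Could not open: " + fileName);
    if (fileName == "unsupported.bin")
        return nullptr;
    return std::make_unique<StubReader>();
}

enum class OpenOutcome { Opened, Unsupported, Missing };

// Call-site pattern: distinguish the three outcomes explicitly.
OpenOutcome tryOpen(const std::string& fileName)
{
    try
    {
        auto reader = createReaderStub(fileName);
        if (reader == nullptr)
            return OpenOutcome::Unsupported;  // content matched no known format
        return reader->open() ? OpenOutcome::Opened : OpenOutcome::Unsupported;
    }
    catch (const std::runtime_error&)
    {
        return OpenOutcome::Missing;  // the file itself could not be opened
    }
}
```

Because the factory returns a `std::unique_ptr`, the reader is released automatically on every path, which the raw pointer returned by `getReader` does not guarantee.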
215 changes: 196 additions & 19 deletions Pcap++/src/PcapFileDevice.cpp
@@ -1,6 +1,8 @@
#define LOG_MODULE PcapLogModuleFileDevice

#include <cerrno>
#include <array>
#include <algorithm>
#include "PcapFileDevice.h"
#include "light_pcapng_ext.h"
#include "Logger.h"
@@ -28,32 +30,186 @@ namespace pcpp
{
return reinterpret_cast<light_pcapng_t*>(pcapngHandle);
}

struct pcap_file_header
{
uint32_t magic;
uint16_t version_major;
uint16_t version_minor;
int32_t thiszone;
uint32_t sigfigs;
uint32_t snaplen;
uint32_t linktype;
};

struct packet_header
{
uint32_t tv_sec;
uint32_t tv_usec;
uint32_t caplen;
uint32_t len;
};

/// @brief Check if a stream is seekable.
/// @param stream The stream to check.
/// @return True if the stream supports seek operations, false otherwise.
bool isStreamSeekable(std::istream& stream)
{
auto pos = stream.tellg();
if (stream.fail())
{
stream.clear();
return false;
}

if (stream.seekg(pos).fail())
{
stream.clear();
return false;
}

return true;
}

class StreamPositionCheckpoint
{
public:
explicit StreamPositionCheckpoint(std::istream& stream) : m_Stream(stream), m_Pos(stream.tellg())
{}

~StreamPositionCheckpoint()
{
m_Stream.seekg(m_Pos);
}

private:
std::istream& m_Stream;
std::streampos m_Pos;
};

enum class CaptureFileFormat
{
Unknown,
Pcap,
PcapNG,
Snoop,
};
Comment on lines +95 to +101
Owner

ditto: this can be an enum inside of CaptureFileFormatDetector

Collaborator Author

@Dimi1010 Sep 15, 2025

I suppose. It has internal linkage so it doesn't really matter.

But then we would end up with really long case names: CaptureFileFormatDetector::FileFormat::Pcap?

Owner

I guess it's fine? It's all internal anyway...
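
The concern about long enumerator names can be softened with a file-local alias. The sketch below is only an illustration of that option, not what the PR does: `FormatDetectorSketch`, `fromMagic`, and `isSupported` are hypothetical names, and the classification is reduced to two magic values.

```cpp
#include <cassert>
#include <cstdint>

namespace
{
    // Hypothetical detector used only to illustrate the naming trade-off.
    class FormatDetectorSketch
    {
    public:
        // The enum is nested, as suggested in the review.
        enum class FileFormat { Unknown, Pcap, PcapNG, Snoop };

        // Toy classification of an already-decoded magic value.
        static FileFormat fromMagic(std::uint32_t magic)
        {
            switch (magic)
            {
            case 0xa1b2c3d4:
                return FileFormat::Pcap;
            case 0x0a0d0d0a:
                return FileFormat::PcapNG;
            default:
                return FileFormat::Unknown;
            }
        }
    };

    // File-local alias: call sites write FileFormat::Pcap
    // instead of FormatDetectorSketch::FileFormat::Pcap.
    using FileFormat = FormatDetectorSketch::FileFormat;

    bool isSupported(FileFormat format)
    {
        return format != FileFormat::Unknown;
    }
}  // namespace
```

The alias keeps the enum scoped to the detector while leaving `switch` statements at the call site as short as they are with a free-standing enum.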


/// @brief Heuristic file format detector that scans the magic number of the file format header.
class CaptureFileFormatDetector
{
public:
/// @brief Checks a content stream for the magic number and determines the type.
/// @param content A content stream that contains the file content.
/// @return A CaptureFileFormat value with the detected content type.
CaptureFileFormat detectFormat(std::istream& content)
{
// Check if the stream supports seeking.
if (!isStreamSeekable(content))
{
throw std::runtime_error("Heuristic file format detection requires seekable stream");
}

if (isPcapFile(content))
return CaptureFileFormat::Pcap;

// PcapNG backend can support ZstdCompressed Pcap files, so we assume an archive is compressed PcapNG.
if (isPcapNgFile(content) || isZstdArchive(content))
return CaptureFileFormat::PcapNG;

if (isSnoopFile(content))
return CaptureFileFormat::Snoop;

return CaptureFileFormat::Unknown;
}

private:
bool isPcapFile(std::istream& content)
{
constexpr std::array<uint32_t, 4> pcapMagicNumbers = {
0xa1'b2'c3'd4, // regular pcap, microsecond-precision
0xd4'c3'b2'a1, // regular pcap, microsecond-precision (byte-swapped)
0xa1'b2'3c'4d, // regular pcap, nanosecond-precision
0x4d'3c'b2'a1 // regular pcap, nanosecond-precision (byte-swapped)
};

StreamPositionCheckpoint checkpoint(content);

pcap_file_header header;
content.read(reinterpret_cast<char*>(&header), sizeof(header));
if (content.gcount() != sizeof(header))
{
return false;
}

return std::find(pcapMagicNumbers.begin(), pcapMagicNumbers.end(), header.magic) !=
pcapMagicNumbers.end();
}

bool isPcapNgFile(std::istream& content)
{
constexpr std::array<uint32_t, 1> pcapMagicNumbers = {
0x0A'0D'0D'0A, // pcapng magic number (palindrome)
};

StreamPositionCheckpoint checkpoint(content);

uint32_t magic = 0;
content.read(reinterpret_cast<char*>(&magic), sizeof(magic));
if (content.gcount() != sizeof(magic))
{
return false;
}

return std::find(pcapMagicNumbers.begin(), pcapMagicNumbers.end(), magic) != pcapMagicNumbers.end();
}

bool isSnoopFile(std::istream& content)
{
constexpr std::array<uint64_t, 2> snoopMagicNumbers = {
0x73'6E'6F'6F'70'00'00'00, // snoop magic number, "snoop" in ASCII
0x00'00'00'70'6F'6F'6E'73 // snoop magic number, "snoop" in ASCII (byte-swapped)
};

StreamPositionCheckpoint checkpoint(content);

uint64_t magic = 0;
content.read(reinterpret_cast<char*>(&magic), sizeof(magic));
if (content.gcount() != sizeof(magic))
{
return false;
}

return std::find(snoopMagicNumbers.begin(), snoopMagicNumbers.end(), magic) != snoopMagicNumbers.end();
}

bool isZstdArchive(std::istream& content)
{
constexpr std::array<uint32_t, 2> zstdMagicNumbers = {
0x28'B5'2F'FD, // zstd archive magic number
0xFD'2F'B5'28, // zstd archive magic number (byte-swapped)
};

StreamPositionCheckpoint checkpoint(content);

uint32_t magic = 0;
content.read(reinterpret_cast<char*>(&magic), sizeof(magic));
if (content.gcount() != sizeof(magic))
{
return false;
}

return std::find(zstdMagicNumbers.begin(), zstdMagicNumbers.end(), magic) != zstdMagicNumbers.end();
}
};

} // namespace

template <typename T, size_t N> constexpr size_t ARRAY_SIZE(T (&)[N])
{
return N;
}

struct pcap_file_header
{
uint32_t magic;
uint16_t version_major;
uint16_t version_minor;
int32_t thiszone;
uint32_t sigfigs;
uint32_t snaplen;
uint32_t linktype;
};

struct packet_header
{
uint32_t tv_sec;
uint32_t tv_usec;
uint32_t caplen;
uint32_t len;
};

static bool checkNanoSupport()
{
#if defined(PCAP_TSTAMP_PRECISION_NANO)
@@ -130,6 +286,27 @@ namespace pcpp
return new PcapFileReaderDevice(fileName);
}

std::unique_ptr<IFileReaderDevice> IFileReaderDevice::createReader(const std::string& fileName)
{
std::ifstream fileContent(fileName, std::ios_base::binary);
if (fileContent.fail())
{
throw std::runtime_error("Could not open: " + fileName);
}

switch (CaptureFileFormatDetector().detectFormat(fileContent))
{
case CaptureFileFormat::Pcap:
return std::make_unique<PcapFileReaderDevice>(fileName);
case CaptureFileFormat::PcapNG:
return std::make_unique<PcapNgFileReaderDevice>(fileName);
case CaptureFileFormat::Snoop:
return std::make_unique<SnoopFileReaderDevice>(fileName);
default:
return nullptr;
}
}

uint64_t IFileReaderDevice::getFileSize() const
{
std::ifstream fileStream(m_FileName.c_str(), std::ifstream::ate | std::ifstream::binary);
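
A note on `StreamPositionCheckpoint` as shown in this diff: its destructor only calls `seekg`. When a `read` comes up short, the stream has `eofbit` and `failbit` set, and `seekg` does nothing on a stream whose `failbit` is set, so the next probe would start from a failed stream. Commit ea328d7 in the list above is titled "Fixed StreamCheckpoint to restore original stream state"; its content is not part of this view. One possible shape for such a checkpoint, as a sketch with hypothetical names (`StateRestoringCheckpoint`, `peekBytes`) and assuming the stream is in a good state when the checkpoint is created:

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical checkpoint that remembers both the read position and the state flags.
class StateRestoringCheckpoint
{
public:
    explicit StateRestoringCheckpoint(std::istream& stream)
        : m_Stream(stream), m_State(stream.rdstate()), m_Pos(stream.tellg())
    {}

    ~StateRestoringCheckpoint()
    {
        m_Stream.clear();         // drop eofbit/failbit so that seekg is not a no-op
        m_Stream.seekg(m_Pos);    // rewind to the saved position
        m_Stream.clear(m_State);  // put back the flags seen at construction
    }

    StateRestoringCheckpoint(const StateRestoringCheckpoint&) = delete;
    StateRestoringCheckpoint& operator=(const StateRestoringCheckpoint&) = delete;

private:
    std::istream& m_Stream;
    std::ios_base::iostate m_State;
    std::streampos m_Pos;
};

// Reads up to `count` bytes, then rewinds; returns how many bytes were actually read.
std::streamsize peekBytes(std::istream& stream, char* buffer, std::streamsize count)
{
    StateRestoringCheckpoint checkpoint(stream);
    stream.read(buffer, count);
    return stream.gcount();
}
```

With this variant a short read in one probe leaves the stream usable for the next probe, which matters when several `is...File` checks run back to back on a small file.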
69 changes: 50 additions & 19 deletions Tests/Pcap++Test/Tests/FileTests.cpp
@@ -754,7 +754,7 @@ PTF_TEST_CASE(TestPcapNgFileReadWriteAdv)

PTF_ASSERT_EQUAL(packetCount, 161);

// -------
// ------- IFileReaderDevice::getReader() Factory

// copy the .zstd file to a similar file with .zst extension
std::ifstream zstdFile(EXAMPLE2_PCAPNG_ZSTD_WRITE_PATH, std::ios::binary);
@@ -763,26 +763,57 @@ PTF_TEST_CASE(TestPcapNgFileReadWriteAdv)
zstdFile.close();
zstFile.close();

pcpp::IFileReaderDevice* genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAP_PATH);
FileReaderTeardown genericReaderTeardown1(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapFileReaderDevice*>(genericReader));
PTF_ASSERT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAPNG_PATH);
FileReaderTeardown genericReaderTeardown2(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE_PCAPNG_ZSTD_WRITE_PATH);
FileReaderTeardown genericReaderTeardown3(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());
{
pcpp::IFileReaderDevice* genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAP_PATH);
FileReaderTeardown genericReaderTeardown1(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapFileReaderDevice*>(genericReader));
PTF_ASSERT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAPNG_PATH);
FileReaderTeardown genericReaderTeardown2(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE_PCAPNG_ZSTD_WRITE_PATH);
FileReaderTeardown genericReaderTeardown3(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAPNG_ZST_WRITE_PATH);
FileReaderTeardown genericReaderTeardown4(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());

genericReader->close();
}

genericReader = pcpp::IFileReaderDevice::getReader(EXAMPLE2_PCAPNG_ZST_WRITE_PATH);
FileReaderTeardown genericReaderTeardown4(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader));
PTF_ASSERT_TRUE(genericReader->open());
// ------- IFileReaderDevice::createReader() Factory
// TODO: Move to a separate unit test.
Owner

We should add the following to get more coverage:

  • Open a snoop file
  • Open a file that is not any of the options
  • Open pcap files with different magic numbers
  • Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions

Collaborator Author

3d713ab adds the following tests:

  • Pcap, PcapNG, Zst file with correct content + extension
  • Pcap, PcapNG file with correct content + wrong extension
  • Bogus content file with correct extension (pcap, pcapng, zst)
  • Bogus content file with wrong extension (txt)

Haven't found a snoop file to add. Do we have any?

> Open pcap files with different magic numbers

Do you mean Pcap content that has just its magic number changed? Because IMO it is reasonable to consider that invalid format and fail as regular bogus data.

> Assuming we add a version check for snoop and pcap file: create temp files with bogus data that has the magic number but wrong versions

Pending on #1962 (comment).
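
The cases discussed in this thread (each known magic number, bogus content, truncated input) can also be exercised without fixture files by feeding in-memory streams to a detector. The following is a minimal sketch of that idea and not the PR's implementation: `SketchFormat`, `classifyMagic`, and `classifyBytes` are hypothetical names, snoop is left out, and the magic is assembled byte by byte so the comparison does not depend on host byte order, whereas the PR reads the value through a `reinterpret_cast`. The magic values are the ones listed in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

enum class SketchFormat { Unknown, Pcap, PcapNG };

// Classifies a stream by its first four bytes (hypothetical helper).
SketchFormat classifyMagic(std::istream& content)
{
    std::array<unsigned char, 4> bytes{};
    content.read(reinterpret_cast<char*>(bytes.data()), bytes.size());
    if (content.gcount() != static_cast<std::streamsize>(bytes.size()))
        return SketchFormat::Unknown;  // truncated input

    // Assemble the value as big-endian, independent of the host byte order.
    const std::uint32_t magic = (std::uint32_t(bytes[0]) << 24) | (std::uint32_t(bytes[1]) << 16) |
                                (std::uint32_t(bytes[2]) << 8) | std::uint32_t(bytes[3]);

    switch (magic)
    {
    case 0xa1b2c3d4:  // pcap, microsecond precision
    case 0xd4c3b2a1:  // pcap, microsecond precision (byte-swapped)
    case 0xa1b23c4d:  // pcap, nanosecond precision
    case 0x4d3cb2a1:  // pcap, nanosecond precision (byte-swapped)
        return SketchFormat::Pcap;
    case 0x0a0d0d0a:  // pcapng
    case 0x28b52ffd:  // zstd archive, treated as compressed pcapng as in the diff
    case 0xfd2fb528:  // zstd archive (byte-swapped)
        return SketchFormat::PcapNG;
    default:
        return SketchFormat::Unknown;
    }
}

// Convenience wrapper for tests: classify raw bytes held in a string.
SketchFormat classifyBytes(const std::string& raw)
{
    std::istringstream in(raw);
    return classifyMagic(in);
}
```

A test can then pass strings such as `std::string("\xa1\xb2\x3c\x4d", 4)` or plain text to `classifyBytes` and assert on the returned format, covering the "different magic numbers" and "bogus content" cases in one place.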


genericReader->close();
{
PTF_ASSERT_RAISES(pcpp::IFileReaderDevice::createReader("BogusFile"), std::runtime_error,
"Could not open: BogusFile");

auto genericReader = pcpp::IFileReaderDevice::createReader(EXAMPLE2_PCAP_PATH);
PTF_ASSERT_NOT_NULL(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapFileReaderDevice*>(genericReader.get()));
genericReader->close();

genericReader = pcpp::IFileReaderDevice::createReader(EXAMPLE2_PCAPNG_PATH);
PTF_ASSERT_NOT_NULL(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader.get()));

genericReader = pcpp::IFileReaderDevice::createReader(EXAMPLE_PCAPNG_ZSTD_WRITE_PATH);
PTF_ASSERT_NOT_NULL(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader.get()));
PTF_ASSERT_TRUE(genericReader->open());
genericReader->close();

genericReader = pcpp::IFileReaderDevice::createReader(EXAMPLE2_PCAPNG_ZST_WRITE_PATH);
PTF_ASSERT_NOT_NULL(genericReader);
PTF_ASSERT_NOT_NULL(dynamic_cast<pcpp::PcapNgFileReaderDevice*>(genericReader.get()));
PTF_ASSERT_TRUE(genericReader->open());
genericReader->close();
}

// -------
