Remove dependency on Boost.IOStreams #189

Merged
arahlin merged 35 commits into master from no_boost_iostreams
Mar 29, 2025

@arahlin
Member

@arahlin arahlin commented Mar 22, 2025

Implement file IO functions using STL constructs.

A set of stream buffer classes is defined in the private streams.h header. The API functions in the public dataio.h header are significantly reduced and rely on the stream buffers to provide the backend implementation for flush, seek, tell, etc. The callback registration API is used to ensure that instantiated stream buffers are destroyed.

The compression implementation here is API-compatible with the boost variant, verified by writing files with this branch and reading files with the master branch, and vice versa. These compression streambuf classes could straightforwardly be used in other applications, for example to implement the hybrid bzip2/flac compression scheme used in G3SuperTimestreams.

Seek/tell efficiency of the reader is maintained, although slightly reimplemented here with a caching counter.

New features:

  • Both reading and writing streams use a (tunable) 1 MB buffer, for compressed, uncompressed, local, and remote streams.
  • Update ARCFileReader API to expose the timeout and buffersize istream options.
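
As an illustration of the write-side buffering described above, here is a minimal sketch of an output stream buffer with a tunable buffer size. The names `BufferedOutputBuf` and `write_through` are hypothetical and are not taken from streams.h; the sketch only assumes standard `std::streambuf` behavior.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical class for illustration only, not the one defined in streams.h.
// Collects writes in a fixed-size buffer and forwards them to a sink streambuf.
class BufferedOutputBuf : public std::streambuf {
public:
    explicit BufferedOutputBuf(std::streambuf* sink,
                               std::size_t buffer_size = 1024 * 1024)
        : sink_(sink), buffer_(buffer_size > 0 ? buffer_size : 1) {
        setp(buffer_.data(), buffer_.data() + buffer_.size());
    }

    // Push out anything still pending; the sink must outlive this object.
    ~BufferedOutputBuf() override { drain(); }

protected:
    // Called when the put area is full: empty it, then store the new character.
    int_type overflow(int_type ch) override {
        if (!drain()) return traits_type::eof();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::ostream::flush().
    int sync() override {
        if (!drain()) return -1;
        return sink_->pubsync();
    }

private:
    // Forward the pending bytes to the sink and reset the put area.
    bool drain() {
        const std::streamsize pending = pptr() - pbase();
        const std::streamsize written = sink_->sputn(pbase(), pending);
        setp(buffer_.data(), buffer_.data() + buffer_.size());
        return written == pending;
    }

    std::streambuf* sink_;
    std::vector<char> buffer_;
};

// Helper for the sketch: write text through a small buffer into a string.
std::string write_through(const std::string& text, std::size_t buffer_size) {
    std::ostringstream sink;
    {
        BufferedOutputBuf buf(sink.rdbuf(), buffer_size);
        std::ostream out(&buf);
        out << text;
        out.flush();
    }
    return sink.str();
}
```

A compressing variant would presumably run the pending bytes through the compressor in `drain()` before handing them to the sink; that step is omitted here.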

Fixes #185.

@arahlin arahlin requested a review from nwhitehorn March 22, 2025 20:22
@arahlin arahlin self-assigned this Mar 22, 2025
@arahlin arahlin force-pushed the no_boost_iostreams branch 2 times, most recently from 1fdfd33 to 0705167 Compare March 22, 2025 21:09
@arahlin arahlin force-pushed the no_boost_iostreams branch from 0705167 to f124112 Compare March 22, 2025 21:18
@arahlin arahlin marked this pull request as ready for review March 23, 2025 15:22
@arahlin
Member Author

arahlin commented Mar 25, 2025

Performance testing showed that the slow tellg issue cropped up again, so I implemented a byte counter that seems to work. Otherwise read/write speeds are the same as the boost implementation.
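
The byte-counter idea can be sketched roughly as follows. This is not the code from this branch: `CountingInputBuf` and `tell_after_reading` are hypothetical names, and the sketch assumes the counter only needs to answer `tellg()` (a zero-offset seek from the current position) without touching the underlying source.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical class for illustration only, not the one defined in streams.h.
// Buffers reads from a source streambuf and tracks how many bytes were consumed.
class CountingInputBuf : public std::streambuf {
public:
    explicit CountingInputBuf(std::streambuf* source,
                              std::size_t buffer_size = 1024 * 1024)
        : source_(source), buffer_(buffer_size > 0 ? buffer_size : 1), consumed_(0) {
        // Start with an empty get area so the first read triggers underflow().
        setg(buffer_.data(), buffer_.data(), buffer_.data());
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        // The previous buffer has been fully handed out; add it to the counter.
        consumed_ += egptr() - eback();
        const std::streamsize n = source_->sgetn(
            buffer_.data(), static_cast<std::streamsize>(buffer_.size()));
        if (n <= 0) {
            setg(buffer_.data(), buffer_.data(), buffer_.data());
            return traits_type::eof();
        }
        setg(buffer_.data(), buffer_.data(), buffer_.data() + n);
        return traits_type::to_int_type(*gptr());
    }

    // tellg() arrives here as seekoff(0, cur, in); answer from the counter
    // instead of querying the source. Real seeks are not supported in this sketch.
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which) override {
        if (off == 0 && dir == std::ios_base::cur && (which & std::ios_base::in))
            return pos_type(consumed_ + (gptr() - eback()));
        return pos_type(off_type(-1));
    }

private:
    std::streambuf* source_;
    std::vector<char> buffer_;
    std::streamoff consumed_;
};

// Helper for the sketch: read n bytes through the counter and report tellg().
std::streamoff tell_after_reading(const std::string& data, std::size_t n,
                                  std::size_t buffer_size) {
    assert(n < data.size());  // hitting EOF would set failbit and make tellg() fail
    std::istringstream source(data);
    CountingInputBuf buf(source.rdbuf(), buffer_size);
    std::istream in(&buf);
    std::string chunk(n, '\0');
    in.read(&chunk[0], static_cast<std::streamsize>(n));
    return in.tellg();
}
```

Because the position is the sum of the bytes already consumed and the offset within the current buffer, `tellg()` costs a few additions regardless of how the source implements its own seek.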

@arahlin arahlin force-pushed the no_boost_iostreams branch from 2a67762 to dfe851e Compare March 25, 2025 21:26
@arahlin arahlin requested a review from cnweaver March 26, 2025 16:29
@arahlin arahlin force-pushed the no_boost_iostreams branch from 00e8974 to d6fdb7f Compare March 26, 2025 17:16
Member

@nwhitehorn nwhitehorn left a comment

This looks good to me from an architectural perspective. I'm not 100% sure I understand the rationale for the file-extension arguments, but that's a minor quibble. Thanks!

@arahlin arahlin force-pushed the no_boost_iostreams branch from ae35aa4 to 5d4a1d4 Compare March 27, 2025 23:08
@arahlin arahlin force-pushed the no_boost_iostreams branch from 5d4a1d4 to e8c76e1 Compare March 27, 2025 23:18
arahlin added 11 commits March 27, 2025 19:36
Handle fstream construction/destruction in streambuf classes.
This also avoids the need for storing pointers in the pword() array.
destroyed, so dynamic casting will not work, and rdbuf() is not a public member
of std::ios_base.  The solution is to store a copy of the streambuf pointer in
the pword() array, and clear it using ios_base functions.
@arahlin arahlin force-pushed the no_boost_iostreams branch from 6959569 to a6dea50 Compare March 28, 2025 20:27
Contributor

@cnweaver cnweaver left a comment

As a general correctness issue, I don't see any calls to ios::xalloc, but it is a precondition that ios::pword can only be called with index values previously returned by xalloc. (I think the point here is that any user code is allowed to use pword and iword to attach things to ios_base, so each distinct use should call xalloc to ensure that it is using a unique index to do so.) Hard-coding it to zero everywhere is unsafe, because it can conflict with any other (rare) code using this feature. It looks like there should be a central bit of code that calls xalloc once, either the first time the index is needed or on startup, and caches the value it gets in a variable that every function calling pword reads to get the index to use.
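
One way to read this suggestion is sketched below, assuming a single slot is enough for the library. All names (`streambuf_slot`, `release_streambuf`, `attach_owned_streambuf`, `TrackedBuf`, `demo_streambuf_released`) are hypothetical and do not come from the PR; only `xalloc`, `pword`, `register_callback`, and `erase_event` are standard `std::ios_base` facilities.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <streambuf>

// Reserve one pword()/iword() index for this library, exactly once.
// A function-local static is initialized on first use (thread-safe since C++11).
inline int streambuf_slot() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Callback invoked by ios_base; on erase_event (stream destruction, or the
// stream being the target of copyfmt) delete the streambuf stored in the slot.
inline void release_streambuf(std::ios_base::event ev, std::ios_base& ios, int index) {
    if (ev != std::ios_base::erase_event) return;
    void*& slot = ios.pword(index);
    delete static_cast<std::streambuf*>(slot);
    slot = nullptr;
}

// Give the stream ownership of a heap-allocated streambuf.
// Sketch only: it does not handle copyfmt() sharing the pointer between streams.
inline void attach_owned_streambuf(std::istream& stream, std::streambuf* owned) {
    stream.pword(streambuf_slot()) = owned;
    stream.register_callback(release_streambuf, streambuf_slot());
    stream.rdbuf(owned);
}

// Test helper: a streambuf that records its own destruction.
struct TrackedBuf : std::stringbuf {
    explicit TrackedBuf(bool& flag) : destroyed(flag) {}
    ~TrackedBuf() override { destroyed = true; }
    bool& destroyed;
};

// Returns true if the streambuf was deleted when the stream went out of scope.
inline bool demo_streambuf_released() {
    bool destroyed = false;
    {
        std::istream in(static_cast<std::streambuf*>(nullptr));
        attach_owned_streambuf(in, new TrackedBuf(destroyed));
        assert(!destroyed);
    }
    return destroyed;
}
```

Routing every `pword` access through `streambuf_slot()` means the index is never hard-coded, so it cannot collide with other code that obtained its own index from `xalloc`.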

arahlin and others added 4 commits March 29, 2025 00:54
@arahlin arahlin merged commit be8441f into master Mar 29, 2025
2 checks passed
@arahlin arahlin deleted the no_boost_iostreams branch March 29, 2025 18:29

Development

Successfully merging this pull request may close these issues.

Replace Boost.IOStreams with STL constructs

3 participants