Skip to content

Commit 7929c81

Browse files
authored
perf: file read performance -- double reads, extra copies (#4507)
Doing some benchmarking and adding extra LoggedTimer instrumentation (included in another PR), I discovered that quite frequently, the in-memory flavor of ImageBuf was reading its image multiple times -- once upon construction, and then again upon later calls to read(), even though the latter should have taken early outs when the read was unnecessary. So I put some additional mechanism for the IB to internally know if it read the pixels from disk into the buffer, and more effectively make calls to IB.read() be a no-op. To help facilitate this, I also clarified in the docs that read() should not be used to move an already-read image to a different subimage or MIP level or a different data type -- reset() really should be used for that purpose. (I don't know that anybody ever relied on read() to do reset's job, but assuming that they won't helps eliminate some tricky cases.) Also, I noticed that inside ImageInput::read_image(), depending on whether or not a data type conversion is needed, it can either read directly into the user's buffer, or must read the native data into a temp buffer and subsequently do a copy-and-type-convert into the user's buffer. The latter is obviously more expensive. The direct read approach was used when the user asked for no type conversion (just to receive the native data type in the file), but it neglected to take the same shortcut if the user asked for an explicit type conversion, but it just so happened that the requested type was the type in the file. Additional changes: The evaluation of whether the buffer was contiguous (for various shortcuts) only applied to LOCALBUFFER style storage, not APPBUFFER. Fixing these two problems dramatically speeds up some reads. A benchmark I was looking at where it reads a 10k x 10k uint8 uncompressed TIFF file into an ImageBuf while requesting uint8 data went from (timings on my laptop): function calls total IB::read 2 0.505s (avg 0.25s) II::read_image 2 0.478s (avg 0.24s) to: IB::read 1 0.172s (avg 0.17s) II::read_image 1 0.172s (avg 0.17s) So almost 3x faster. Most of which was due doing 2 reads instead of 1, but still each individual read improved from 0.25s to 0.17s because of the fix that lets us avoid the unnecessary buffer copies. Grain of salt: this was a very large image, and it was stored uncompressed (so NO decompression that can sometimes be a significant amount of overhead), so you may not see this much improvement in real world scenarios. This is only scratching the surface, I believe there are lots of other areas to optimize more carefully. Signed-off-by: Larry Gritz <[email protected]>
1 parent 95d246d commit 7929c81

File tree

5 files changed

+92
-42
lines changed

5 files changed

+92
-42
lines changed

src/include/OpenImageIO/imagebuf.h

Lines changed: 39 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -416,38 +416,44 @@ class OIIO_API ImageBuf {
416416
/// @{
417417
/// @name Reading and Writing disk images
418418

419-
/// Read the particular subimage and MIP level of the image. Generally,
420-
/// this will skip the expensive read if the file has already been read
421-
/// into the ImageBuf (at the specified subimage and MIP level). It
422-
/// will clear and re-allocate memory if the previously allocated space
423-
/// was not appropriate for the size or data type of the image being
424-
/// read.
419+
/// Read the particular subimage and MIP level of the image, if it has not
420+
/// already been read. It will clear and re-allocate memory if the
421+
/// previously allocated space was not appropriate for the size or data
422+
/// type of the image being read.
425423
///
426-
/// In general, `read()` will try not to do any I/O at the time of the
427-
/// `read()` call, but rather to have the ImageBuf "backed" by an
428-
/// ImageCache, which will do the file I/O on demand, as pixel values
429-
/// are needed, and in that case the ImageBuf doesn't actually allocate
430-
/// memory for the pixels (the data lives in the ImageCache). However,
431-
/// there are several conditions for which the ImageCache will be
432-
/// bypassed, the ImageBuf will allocate "local" memory, and the disk
433-
/// file will be read directly into allocated buffer at the time of the
434-
/// `read()` call: (a) if the `force` parameter is `true`; (b) if the
435-
/// `convert` parameter requests a data format conversion to a type that
436-
/// is not the native file type and also is not one of the internal
437-
/// types supported by the ImageCache (specifically, `float` and
438-
/// `uint8`); (c) if the ImageBuf already has local pixel memory
439-
/// allocated, or "wraps" an application buffer.
424+
/// In general, calling `read()` should be unnecessary for most uses of
425+
/// ImageBuf. When an ImageBuf is created (or when `reset()` is called),
426+
/// usually the opening of the file and reading of the header is deferred
427+
/// until the spec is accessed or needed, and the reading of the pixel
428+
/// values is usually deferred until pixel values are needed, at which
429+
/// point these things happen automatically. That is, every ImageBuf
430+
/// method that needs pixel values will call `read()` itself if it has
431+
/// not previously been called.
440432
///
441-
/// Note that `read()` is not strictly necessary. If you are happy with
442-
/// the filename, subimage and MIP level specified by the ImageBuf
443-
/// constructor (or the last call to `reset()`), and you want the
444-
/// storage to be backed by the ImageCache (including storing the
445-
/// pixels in whatever data format that implies), then the file contents
446-
/// will be automatically read the first time you make any other
447-
/// ImageBuf API call that requires the spec or pixel values. The only
448-
/// reason to call `read()` yourself is if you are changing the
449-
/// filename, subimage, or MIP level, or if you want to use `force=true`
450-
/// or a specific `convert` value to force data format conversion.
433+
/// There are a few situations where you want to call read() explicitly,
434+
/// after the ImageBuf is constructed but before any other methods have
435+
/// been called that would implicitly read the file:
436+
///
437+
/// 1. If you want to request that the internal buffer be a specific
438+
/// pixel data type that might differ from the pixel data type in
439+
/// the file itself (conveyed by the `convert` parameter).
440+
/// 2. You want the ImageBuf to read and contain only a subset of the
441+
/// channels in the file (conveyed by the `chmin` and `chmax`
442+
/// parameters on the version of `read()` that accepts them).
443+
/// 3. The ImageBuf has been set up to be backed by ImageCache, but you
444+
/// want to force it to read the whole file into memory now (conveyed
445+
/// by the `force` parameter, or if the `convert` parameter specifies
446+
/// a type that is not the native file type and also cannot be
447+
/// accommodated directly by the cache).
448+
/// 4. For whatever reason, you want to force a full read of the pixels
449+
/// to occur at this point in program execution, rather than at some
450+
/// undetermined future time when you first need to access those
451+
/// pixels.
452+
///
453+
/// The `read()` function should not be used to *change* an already
454+
/// established subimage, MIP level, pixel data type, or channel range
455+
/// of a file that has already read its pixels. You should use the
456+
/// `reset()` method for that purpose.
451457
///
452458
/// @param subimage/miplevel
453459
/// The subimage and MIP level to read.
@@ -460,7 +466,7 @@ class OIIO_API ImageBuf {
460466
/// `IMAGECACHE`, if the ImageBuf was originally constructed
461467
/// or reset with an ImageCache specified).
462468
/// @param convert
463-
/// If set to a specific type (not`UNKNOWN`), the ImageBuf
469+
/// If set to a specific type (not `UNKNOWN`), the ImageBuf
464470
/// memory will be allocated for that type specifically and
465471
/// converted upon read.
466472
/// @param progress_callback/progress_callback_data
@@ -1219,6 +1225,7 @@ class OIIO_API ImageBuf {
12191225
bool contains_roi(const ROI& roi) const;
12201226

12211227
bool pixels_valid(void) const;
1228+
bool pixels_read(void) const;
12221229

12231230
/// The data type of the pixels stored in the buffer (equivalent to
12241231
/// `spec().format`).
@@ -1242,7 +1249,7 @@ class OIIO_API ImageBuf {
12421249
/// Z plane stride within the localpixels memory.
12431250
stride_t z_stride() const;
12441251

1245-
/// Is the data layout "contiguous", i.e.,
1252+
/// Is this an in-memory buffer with the data layout "contiguous", i.e.,
12461253
/// ```
12471254
/// pixel_stride == nchannels * pixeltype().size()
12481255
/// scanline_stride == pixel_stride * spec().width

src/libOpenImageIO/imagebuf.cpp

Lines changed: 42 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -317,7 +317,9 @@ class ImageBufImpl {
317317

318318
void eval_contiguous()
319319
{
320-
m_contiguous = m_localpixels && m_storage == ImageBuf::LOCALBUFFER
320+
m_contiguous = m_localpixels
321+
&& (m_storage == ImageBuf::LOCALBUFFER
322+
|| m_storage == ImageBuf::APPBUFFER)
321323
&& m_xstride == m_spec.nchannels * m_channel_stride
322324
&& m_ystride == m_xstride * m_spec.width
323325
&& m_zstride == m_ystride * m_spec.height;
@@ -347,6 +349,7 @@ class ImageBufImpl {
347349
mutable mutex_t m_mutex; ///< Thread safety for this ImageBuf
348350
mutable bool m_spec_valid; ///< Is the spec valid
349351
mutable bool m_pixels_valid; ///< Image is valid
352+
mutable bool m_pixels_read; ///< Is file already in the local pixels?
350353
bool m_readonly; ///< The bufspan is read-only
351354
bool m_badfile; ///< File not found
352355
float m_pixelaspect; ///< Pixel aspect ratio of the image
@@ -421,6 +424,7 @@ ImageBufImpl::ImageBufImpl(string_view filename, int subimage, int miplevel,
421424
, m_localpixels(NULL)
422425
, m_spec_valid(false)
423426
, m_pixels_valid(false)
427+
, m_pixels_read(false)
424428
, m_readonly(readonly)
425429
, m_badfile(false)
426430
, m_pixelaspect(1)
@@ -506,6 +510,7 @@ ImageBufImpl::ImageBufImpl(const ImageBufImpl& src)
506510
{
507511
m_spec_valid = src.m_spec_valid;
508512
m_pixels_valid = src.m_pixels_valid;
513+
m_pixels_read = src.m_pixels_read;
509514
if (src.m_localpixels) {
510515
// Source had the image fully in memory (no cache)
511516
if (m_storage == ImageBuf::APPBUFFER) {
@@ -533,6 +538,7 @@ ImageBufImpl::ImageBufImpl(const ImageBufImpl& src)
533538
m_nmiplevels = 0;
534539
m_spec.erase_attribute("oiio:subimages");
535540
m_nativespec.erase_attribute("oiio:subimages");
541+
m_pixels_read = true;
536542
}
537543
if (src.m_configspec)
538544
m_configspec.reset(new ImageSpec(*src.m_configspec));
@@ -804,6 +810,7 @@ ImageBufImpl::clear()
804810
m_spec_valid = false;
805811
m_pixels_valid = false;
806812
m_badfile = false;
813+
m_pixels_read = false;
807814
m_pixelaspect = 1;
808815
m_xstride = 0;
809816
m_ystride = 0;
@@ -856,11 +863,14 @@ ImageBufImpl::reset(string_view filename, int subimage, int miplevel,
856863
m_configspec->attribute("oiio:ioproxy", TypeDesc::PTR, &m_rioproxy);
857864
}
858865
m_bufspan = {};
866+
m_storage = ImageBuf::LOCALBUFFER;
859867
if (m_name.length() > 0) {
860-
// filename non-empty means this ImageBuf refers to a file.
861-
read(subimage, miplevel);
862-
// FIXME: investigate if the above read is really necessary, or if
863-
// it can be eliminated and done fully lazily.
868+
// For IC-backed file ImageBuf's, call read now. For other file-based
869+
// images, just init the spec.
870+
if (m_imagecache)
871+
read(subimage, miplevel);
872+
else
873+
init_spec(m_name, subimage, miplevel, DoLock(true));
864874
}
865875
}
866876

@@ -1004,7 +1014,8 @@ ImageBufImpl::realloc()
10041014
m_deepdata.init(m_spec);
10051015
m_storage = ImageBuf::LOCALBUFFER;
10061016
}
1007-
m_readonly = false;
1017+
m_readonly = false;
1018+
m_pixels_read = false;
10081019
eval_contiguous();
10091020
#if 0
10101021
std::cerr << "ImageBuf " << m_name << " local allocation: " << m_allocated_size << "\n";
@@ -1138,6 +1149,7 @@ ImageBufImpl::init_spec(string_view filename, int subimage, int miplevel,
11381149
m_badfile = true;
11391150
m_pixels_valid = false;
11401151
m_spec_valid = false;
1152+
m_pixels_read = false;
11411153
m_nsubimages = 0;
11421154
m_nmiplevels = 0;
11431155
m_badfile = false;
@@ -1207,13 +1219,27 @@ ImageBufImpl::read(int subimage, int miplevel, int chbegin, int chend,
12071219
if (do_lock)
12081220
lock.lock();
12091221

1222+
// If this doesn't reference a file in any way, nothing to do here.
12101223
if (!m_name.length())
12111224
return true;
12121225

1226+
// If the pixels have already been read and we aren't switching
1227+
// subimage/miplevel or being force to read (for example, turning a cached
1228+
// image into an in-memory image), then there is nothing to do.
12131229
if (m_pixels_valid && !force && subimage == m_current_subimage
12141230
&& miplevel == m_current_miplevel)
12151231
return true;
12161232

1233+
// If it's a local buffer from a file and we've already read the pixels
1234+
// into memory, we're done, provided that we aren't asking it to force
1235+
// a read with a different data type conversion or different number of
1236+
// channels.
1237+
if (m_storage == ImageBuf::LOCALBUFFER && m_pixels_valid && m_pixels_read
1238+
&& (convert == TypeUnknown || convert == m_spec.format)
1239+
&& subimage == m_current_subimage && miplevel == m_current_miplevel
1240+
&& ((chend - chbegin) == m_spec.nchannels || (chend <= chbegin)))
1241+
return true;
1242+
12171243
if (!init_spec(m_name.string(), subimage, miplevel,
12181244
DoLock(false) /* we already hold the lock */)) {
12191245
m_badfile = true;
@@ -1241,6 +1267,7 @@ ImageBufImpl::read(int subimage, int miplevel, int chbegin, int chend,
12411267
if (ok) {
12421268
m_spec = m_nativespec; // Deep images always use native data
12431269
m_pixels_valid = true;
1270+
m_pixels_read = true;
12441271
m_storage = ImageBuf::LOCALBUFFER;
12451272
} else {
12461273
error(input->geterror());
@@ -1348,6 +1375,7 @@ ImageBufImpl::read(int subimage, int miplevel, int chbegin, int chend,
13481375
in->close();
13491376
if (ok) {
13501377
m_pixels_valid = true;
1378+
m_pixels_read = true;
13511379
} else {
13521380
m_pixels_valid = false;
13531381
error(in->geterror());
@@ -1943,6 +1971,14 @@ ImageBuf::pixels_valid(void) const
19431971

19441972

19451973

1974+
bool
1975+
ImageBuf::pixels_read(void) const
1976+
{
1977+
return m_impl->m_pixels_read;
1978+
}
1979+
1980+
1981+
19461982
TypeDesc
19471983
ImageBuf::pixeltype() const
19481984
{

src/libOpenImageIO/imagebufalgo.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -354,7 +354,7 @@ ImageBufAlgo::IBAprep(ROI& roi, ImageBuf& dst, cspan<const ImageBuf*> srcs,
354354
KWArgs options, ImageSpec* force_spec)
355355
{
356356
// OIIO_ASSERT(dst);
357-
// Helper: ANY_SRC returns true if any of the source images s satisfy the
357+
// Helper: ANY_SRC returns true if any of the source images satisfy the
358358
// condition cond.
359359
#define ANY_SRC(cond) \
360360
std::any_of(srcs.begin(), srcs.end(), \

src/libOpenImageIO/imagebufalgo_test.cpp

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616
#include <OpenImageIO/argparse.h>
1717
#include <OpenImageIO/benchmark.h>
1818
#include <OpenImageIO/color.h>
19+
#include <OpenImageIO/filesystem.h>
1920
#include <OpenImageIO/half.h>
2021
#include <OpenImageIO/imagebuf.h>
2122
#include <OpenImageIO/imagebufalgo.h>
@@ -1109,8 +1110,10 @@ test_opencv()
11091110
OIIO_CHECK_EQUAL(comp.maxerror, 0.0f);
11101111

11111112
// Regression test: reading from ImageBuf-backed image to OpenCV
1112-
auto loaded_image = OIIO::ImageBuf("../../testsuite/common/tahoe-tiny.tif",
1113-
0, 0, ImageCache::create());
1113+
std::string filename = "testsuite/common/tahoe-tiny.tif";
1114+
if (!Filesystem::exists(filename))
1115+
filename = "../../testsuite/common/tahoe-tiny.tif";
1116+
auto loaded_image = OIIO::ImageBuf(filename, 0, 0, ImageCache::create());
11141117
OIIO_CHECK_ASSERT(loaded_image.initialized());
11151118
if (!loaded_image.initialized()) {
11161119
std::cout << loaded_image.geterror() << 'n';

src/libOpenImageIO/imageinput.cpp

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -302,6 +302,7 @@ ImageInput::read_scanlines(int subimage, int miplevel, int ybegin, int yend,
302302
// For scanline files, we also need one piece of metadata
303303
if (!spec.tile_width)
304304
rps = m_spec.get_int_attribute("tiff:RowsPerStrip", 64);
305+
// FIXME: does the above search of metadata have a significant cost?
305306
}
306307
if (spec.image_bytes() < 1) {
307308
errorfmt("Invalid image size {} x {} ({} chans)", m_spec.width,
@@ -330,7 +331,10 @@ ImageInput::read_scanlines(int subimage, int miplevel, int ybegin, int yend,
330331
bool contiguous = (xstride == (stride_t)buffer_pixel_bytes
331332
&& ystride == (stride_t)buffer_scanline_bytes);
332333

333-
if (native && contiguous) {
334+
// no_type_convert is true if asking for data in the native format
335+
bool no_type_convert = (format == spec.format
336+
&& spec.channelformats.empty());
337+
if ((native || no_type_convert) && contiguous) {
334338
if (chbegin == 0 && chend == spec.nchannels)
335339
return read_native_scanlines(subimage, miplevel, ybegin, yend, z,
336340
data);

0 commit comments

Comments
 (0)