Skip to content

Commit 07a65b0

Browse files
committed
More principled handling of string<->numeric conversions.
This refactor comes after a belated realization that our numeric conversion routines (principally from_string<> and format) are "locale-dependent" via their use of strtod, as well as atof() scattered through the code. In particular, this means that when software using this code is running with a locale in which the decimal mark is something other than '.' (e.g., many European locales that use ',' for decimal mark), these functions will incorrectly parse text floating point numbers that use the standard period as decimal mark. For example, if the locale is "fr_FR", atof("123.45") will return 123.0, rather than the correct 123.45. After much debate, and a few implementations, we are declaring that since the whole point of OIIO is about reading and writing persistent data (image files!), that in order to be consistent, ALL of OIIO's string parsing and formatting, including that done by utility library functions, will be locale-independent by using the "C" locale (in particular, we will use the usual North American default of '.' for decimal separator). Additionally: * Added a new stof(), stoi(), stoul() to replace the clunkier template syntax of from_string<>. As explained, these are hard-coded to "C" locale, not the native/global locale. * Much more attention to string_view safety, finding a number of subtle bugs (including Strutil::parse_int, etc.) where it was definitely not safe to assume that the string_view passed was the null-terminated edge of the string, and so, for example if a string_view pointed to the characters "12" but the very next characters were "345" parse_int would return 12345 instead of the correct 12. * Strutil::format and ustring::format (and our copy of tinyformat underneath) were touched up so that the versions of format() that return strings will use the "C" locale, versions added that return strings and use an explicitly-passed locale, and the versions that append their results to existing streams continue to operate as before by honoring the existing locale conventions of the streams they are using. We may revise this later depending on what the upstream tinyformat chooses to do about the issue. * Several places where we assembled strings using std::ostringstream, I forced the stream to use classic "C" locale in cases where it was likely to have any floating-point data output. * For maketx and oiiotool, globally set the locale to classic. In these cases, we control the whole app, so this is safe to do. If we ever decide that we want versions of these functions that take explicit locales, it's not hard to add.
1 parent 198aec9 commit 07a65b0

File tree

20 files changed

+631
-156
lines changed

20 files changed

+631
-156
lines changed

.travis.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ addons:
3232
- libavformat-dev
3333
- libswscale-dev
3434
- libavutil-dev
35+
- locales
3536
# - qtbase5-dev # FIXME: enable Qt5 on Linux
3637
# - bzip2
3738
# - libtinyxml-dev

src/fits.imageio/fitsinput.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -348,7 +348,7 @@ FitsInput::add_to_spec (const std::string &keyname, const std::string &value)
348348
// converting string to float or integer
349349
bool isNumSign = (value[0] == '+' || value[0] == '-' || value[0] == '.');
350350
if (isdigit (value[0]) || isNumSign) {
351-
float val = atof (value.c_str ());
351+
float val = Strutil::stof (value);
352352
if (val == (int)val)
353353
m_spec.attribute (keyname, (int)val);
354354
else

src/include/OpenImageIO/optparser.h

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,7 @@
4040
#define OPENIMAGEIO_OPTPARSER_H
4141

4242
#include <string>
43+
#include <OpenImageIO/strutil.h>
4344

4445
OIIO_NAMESPACE_BEGIN
4546

@@ -67,9 +68,9 @@ optparse1 (C &system, const std::string &opt)
6768
char v = value.size() ? value[0] : ' ';
6869
if ((v >= '0' && v <= '9') || v == '+' || v == '-') { // numeric
6970
if (strchr (value.c_str(), '.')) // float
70-
return system.attribute (name.c_str(), (float)atof(value.c_str()));
71+
return system.attribute (name, Strutil::stof(value));
7172
else // int
72-
return system.attribute (name.c_str(), (int)atoi(value.c_str()));
73+
return system.attribute (name, Strutil::stoi(value));
7374
}
7475
// otherwise treat it as a string
7576

src/include/OpenImageIO/strutil.h

Lines changed: 81 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -97,6 +97,8 @@ void OIIO_API sync_output (std::ostream &file, string_view str);
9797
///
9898
/// Uses the tinyformat library underneath, so it's fully type-safe, and
9999
/// works with any types that understand stream output via '<<'.
100+
/// The formatting of the string will always use the classic "C" locale
101+
/// conventions (in particular, '.' as decimal separator for float values).
100102
template<typename... Args>
101103
inline std::string format (string_view fmt, const Args&... args)
102104
{
@@ -105,23 +107,25 @@ inline std::string format (string_view fmt, const Args&... args)
105107

106108

107109
/// Output formatted string to stdout, type-safe, and threads can't clobber
108-
/// one another.
110+
/// one another. This will force the classic "C" locale.
109111
template<typename... Args>
110112
inline void printf (string_view fmt, const Args&... args)
111113
{
112114
sync_output (stdout, format(fmt, args...));
113115
}
114116

117+
115118
/// Output formatted string to an open FILE*, type-safe, and threads can't
116-
/// clobber one another.
119+
/// clobber one another. This will force classic "C" locale conventions.
117120
template<typename... Args>
118121
inline void fprintf (FILE *file, string_view fmt, const Args&... args)
119122
{
120123
sync_output (file, format(fmt, args...));
121124
}
122125

126+
123127
/// Output formatted string to an open ostream, type-safe, and threads can't
124-
/// clobber one another.
128+
/// clobber one another. This will force classic "C" locale conventions.
125129
template<typename... Args>
126130
inline void fprintf (std::ostream &file, string_view fmt, const Args&... args)
127131
{
@@ -254,50 +258,99 @@ std::string OIIO_API repeat (string_view str, int n);
254258
std::string OIIO_API replace (string_view str, string_view pattern,
255259
string_view replacement, bool global=false);
256260

257-
// Helper template to test if a string is a generic type
258-
template<typename T>
259-
inline bool string_is (string_view /*s*/) {
260-
return false; // Generic: assume there is an explicit specialization
261-
}
262-
// Special case for int
263-
template <> inline bool string_is<int> (string_view s) {
264-
if (s.empty())
265-
return false;
266-
char *endptr = 0;
267-
strtol (s.data(), &endptr, 10);
268-
return (s.data() + s.size() == endptr);
269-
}
270-
// Special case for float
271-
template <> inline bool string_is<float> (string_view s) {
272-
if (s.empty())
273-
return false;
274-
char *endptr = 0;
275-
strtod (s.data(), &endptr);
276-
return (s.data() + s.size() == endptr);
261+
262+
/// strtod/strtof equivalents that are "locale-independent", always using
263+
/// '.' as the decimal separator. This should be preferred for I/O and other
264+
/// situations where you want the same standard formatting regardless of
265+
/// locale.
266+
float OIIO_API strtof (const char *nptr, char **endptr = nullptr);
267+
double OIIO_API strtod (const char *nptr, char **endptr = nullptr);
268+
269+
270+
// stoi() returns the int conversion of text from a string.
271+
// No exceptions or errors -- parsing errors just return 0, over/underflow
272+
// gets clamped to int range. No locale consideration.
273+
OIIO_API int stoi (string_view s, size_t* pos=0, int base=10);
274+
275+
// stoui() returns the unsigned int conversion of text from a string.
276+
// No exceptions or errors -- parsing errors just return 0. Negative
277+
// values are cast, overflow is clamped. No locale considerations.
278+
inline unsigned int stoui (string_view s, size_t* pos=0, int base=10) {
279+
return static_cast<unsigned int>(stoi (s, pos, base));
277280
}
278281

282+
/// stof() returns the float conversion of text from several string types.
283+
/// No exceptions or errors -- parsing errors just return 0.0. These always
284+
/// use '.' for the decimal mark (versus atof and std::strtof, which are
285+
/// locale-dependent).
286+
OIIO_API float stof (string_view s, size_t* pos=0);
287+
#define OIIO_STRUTIL_HAS_STOF 1 /* be able to test this */
288+
289+
// Temporary fix: allow separate std::string and char* versions, to avoid
290+
// string_view allocation on some platforms. This will be deprecated once
291+
// we can count on all supported compilers using short string optimization.
292+
OIIO_API float stof (const std::string& s, size_t* pos=0);
293+
OIIO_API float stof (const char* s, size_t* pos=0);
294+
// N.B. For users of ustring, there's a stof(ustring) defined in ustring.h.
295+
296+
OIIO_API double stod (string_view s, size_t* pos=0);
297+
OIIO_API double stod (const std::string& s, size_t* pos=0);
298+
OIIO_API double stod (const char* s, size_t* pos=0);
299+
279300

280301

281-
// Helper template to convert from generic type to string
302+
/// Return true if the string is exactly (other than leading and trailing
303+
/// whitespace) a valid int.
304+
OIIO_API bool string_is_int (string_view s);
305+
306+
/// Return true if the string is exactly (other than leading or trailing
307+
/// whitespace) a valid float. This operations in a locale-independent
308+
/// manner, i.e., it assumes '.' as the decimal mark.
309+
OIIO_API bool string_is_float (string_view s);
310+
311+
312+
313+
// Helper template to convert from generic type to string. Used when you
314+
// want stoX but you're in a template. Rigged to use "C" locale.
282315
template<typename T>
283316
inline T from_string (string_view s) {
284317
return T(s); // Generic: assume there is an explicit converter
285318
}
286319
// Special case for int
287320
template<> inline int from_string<int> (string_view s) {
288-
return s.size() ? strtol (s.c_str(), NULL, 10) : 0;
321+
return Strutil::stoi(s);
289322
}
290323
// Special case for uint
291324
template<> inline unsigned int from_string<unsigned int> (string_view s) {
292-
return s.size() ? strtoul (s.c_str(), NULL, 10) : (unsigned int)0;
325+
return Strutil::stoui(s);
293326
}
294-
// Special case for float
327+
// Special case for float -- note that by using Strutil::strtof, this
328+
// always treats '.' as the decimal mark.
295329
template<> inline float from_string<float> (string_view s) {
296-
return s.size() ? (float)strtod (s.c_str(), NULL) : 0.0f;
330+
return Strutil::stof(s);
297331
}
298332

299333

300334

335+
// Helper template to test if a string is a generic type. Used instead of
336+
// string_is_X, but when you're inside templated code.
337+
template<typename T>
338+
inline bool string_is (string_view /*s*/) {
339+
return false; // Generic: assume there is an explicit specialization
340+
}
341+
// Special case for int
342+
template <> inline bool string_is<int> (string_view s) {
343+
return string_is_int (s);
344+
}
345+
// Special case for float. Note that by using Strutil::stof, this always
346+
// treats '.' as the decimal character.
347+
template <> inline bool string_is<float> (string_view s) {
348+
return string_is_float (s);
349+
}
350+
351+
352+
353+
301354
/// Given a string containing values separated by a comma (or optionally
302355
/// another separator), extract the individual values, placing them into
303356
/// vals[] which is presumed to already contain defaults. If only a single

src/include/OpenImageIO/tinyformat.h

Lines changed: 31 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -266,6 +266,7 @@ template<typename T>
266266
inline void formatTruncated(std::ostream& out, const T& value, int ntrunc)
267267
{
268268
std::ostringstream tmp;
269+
tmp.imbue (out.getloc());
269270
tmp << value;
270271
std::string result = tmp.str();
271272
out.write(result.c_str(), (std::min)(ntrunc, static_cast<int>(result.size())));
@@ -800,6 +801,7 @@ inline void formatImpl(std::ostream& out, const char* fmt,
800801
// it crudely by formatting into a temporary string stream and
801802
// munging the resulting string.
802803
std::ostringstream tmpStream;
804+
tmpStream.imbue (out.getloc());
803805
tmpStream.copyfmt(out);
804806
tmpStream.setf(std::ios::showpos);
805807
arg.format(tmpStream, fmt, fmtEnd, ntrunc);
@@ -931,6 +933,7 @@ TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_MAKEFORMATLIST)
931933
#endif
932934

933935
/// Format list of arguments to the stream according to the given format string.
936+
/// This honors the stream's existing locale conventions.
934937
///
935938
/// The name vformat() is chosen for the semantic similarity to vprintf(): the
936939
/// list of format arguments is held in a single function argument.
@@ -943,23 +946,40 @@ inline void vformat(std::ostream& out, const char* fmt, FormatListRef list)
943946
#ifdef TINYFORMAT_USE_VARIADIC_TEMPLATES
944947

945948
/// Format list of arguments to the stream according to given format string.
949+
/// This honors the stream's existing locale conventions.
946950
template<typename... Args>
947951
void format(std::ostream& out, const char* fmt, const Args&... args)
948952
{
949953
vformat(out, fmt, makeFormatList(args...));
950954
}
951955

952956
/// Format list of arguments according to the given format string and return
953-
/// the result as a string.
957+
/// the result as a string, using classic "C" locale conventions (e.g.,
958+
/// using '.' as decimal mark).
954959
template<typename... Args>
955960
std::string format(const char* fmt, const Args&... args)
956961
{
957962
std::ostringstream oss;
963+
oss.imbue (std::locale::classic()); // force "C" locale with '.' decimal
964+
format(oss, fmt, args...);
965+
return oss.str();
966+
}
967+
968+
/// Format list of arguments according to the given format string and return
969+
/// the result as a string, using an explicit locale. Passing loc as a
970+
/// default-constructed std::locale will result in adhering to the current
971+
/// "native" locale set by the OS.
972+
template<typename... Args>
973+
std::string format(const std::locale& loc, const char* fmt, const Args&... args)
974+
{
975+
std::ostringstream oss;
976+
oss.imbue (loc);
958977
format(oss, fmt, args...);
959978
return oss.str();
960979
}
961980

962981
/// Format list of arguments to std::cout, according to the given format string
982+
/// This honors std::out's existing locale conventions.
963983
template<typename... Args>
964984
void printf(const char* fmt, const Args&... args)
965985
{
@@ -1011,6 +1031,16 @@ template<TINYFORMAT_ARGTYPES(n)> \
10111031
std::string format(const char* fmt, TINYFORMAT_VARARGS(n)) \
10121032
{ \
10131033
std::ostringstream oss; \
1034+
oss.imbue (std::locale::classic()); \
1035+
format(oss, fmt, TINYFORMAT_PASSARGS(n)); \
1036+
return oss.str(); \
1037+
} \
1038+
\
1039+
template<TINYFORMAT_ARGTYPES(n)> \
1040+
std::string format(const std::locale& loc, const char* fmt, TINYFORMAT_VARARGS(n)) \
1041+
{ \
1042+
std::ostringstream oss; \
1043+
oss.imbue (loc); \
10141044
format(oss, fmt, TINYFORMAT_PASSARGS(n)); \
10151045
return oss.str(); \
10161046
} \

src/include/OpenImageIO/ustring.h

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -626,6 +626,8 @@ class OIIO_API ustring {
626626
/// something like:
627627
/// ustring s = ustring::format ("blah %d %g", (int)foo, (float)bar);
628628
/// The argument list is fully typesafe.
629+
/// The formatting of the string will always use the classic "C" locale
630+
/// conventions (in particular, '.' as decimal separator for float values).
629631
template<typename... Args>
630632
static ustring format (string_view fmt, const Args&... args)
631633
{
@@ -751,6 +753,12 @@ inline bool iequals (const std::string &a, ustring b) {
751753
}
752754

753755

756+
757+
// ustring variant stof from OpenImageIO/strutil.h
758+
namespace Strutil {
759+
inline int stof (ustring s) { return Strutil::stof (s.string()); }
760+
} // end namespace Strutil
761+
754762
OIIO_NAMESPACE_END
755763

756764
#endif // OPENIMAGEIO_USTRING_H

src/iv/imageviewer.cpp

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -117,12 +117,9 @@ ImageViewer::ImageViewer ()
117117
{
118118
readSettings (false);
119119

120-
const char *gamenv = getenv ("GAMMA");
121-
if (gamenv) {
122-
float g = atof (gamenv);
123-
if (g >= 0.1 && g <= 5)
124-
m_default_gamma = g;
125-
}
120+
float gam = Strutil::stof (Sysutil::getenv ("GAMMA"));
121+
if (gam >= 0.1 && gam <= 5)
122+
m_default_gamma = gam;
126123
// FIXME -- would be nice to have a more nuanced approach to display
127124
// color space, in particular knowing whether the display is sRGB.
128125
// Also, some time in the future we may want a real 3D LUT for

src/libOpenImageIO/formatspec.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -936,6 +936,7 @@ spec_to_xml (const ImageSpec &spec, ImageSpec::SerialVerbose verbose)
936936
}
937937

938938
std::ostringstream result;
939+
result.imbue (std::locale::classic()); // force "C" locale with '.' decimal
939940
doc.print (result, "");
940941
return result.str();
941942
}

src/libOpenImageIO/maketexture.cpp

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1386,6 +1386,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
13861386
// (such as filtering information) needs to be manually added into the
13871387
// hash.
13881388
std::ostringstream addlHashData;
1389+
addlHashData.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
13891390
addlHashData << filtername << " ";
13901391
float sharpen = configspec.get_float_attribute ("maketx:sharpen", 0.0f);
13911392
if (sharpen != 0.0f) {
@@ -1417,6 +1418,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
14171418

14181419
if (isConstantColor) {
14191420
std::ostringstream os; // Emulate a JSON array
1421+
os.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
14201422
for (int i = 0; i < dstspec.nchannels; ++i) {
14211423
if (i!=0) os << ",";
14221424
os << (i<(int)constantColor.size() ? constantColor[i] : 0.0f);
@@ -1436,6 +1438,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
14361438

14371439
if (compute_average_color) {
14381440
std::ostringstream os; // Emulate a JSON array
1441+
os.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
14391442
for (int i = 0; i < dstspec.nchannels; ++i) {
14401443
if (i!=0) os << ",";
14411444
os << (i<(int)pixel_stats.avg.size() ? pixel_stats.avg[i] : 0.0f);

src/libtexture/imagecache.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1610,6 +1610,7 @@ ImageCacheImpl::onefile_stat_line (const ImageCacheFileRef &file,
16101610
{
16111611
// FIXME -- make meaningful stat printouts for multi-image textures
16121612
std::ostringstream out;
1613+
out.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
16131614
const ImageSpec &spec (file->spec(0,0));
16141615
const char *formatcode = "u8";
16151616
switch (spec.format.basetype) {
@@ -1737,6 +1738,7 @@ ImageCacheImpl::getstats (int level) const
17371738
}
17381739

17391740
std::ostringstream out;
1741+
out.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
17401742
if (level > 0) {
17411743
out << "OpenImageIO ImageCache statistics (";
17421744
{

0 commit comments

Comments
 (0)