Skip to content

Commit 8cd64b5

Browse files
committed
More principled handling of float<->string
This refactor comes after a belated realization that our numeric conversion routines (principally from_string<>) are "locale-dependent" via their use of strtod, as well as atof() scattered through the code. In particular, this means that when software using this code is running with a locale in which the decimal mark is something other than '.' (e.g., many European locales that use ',' for decimal mark), these functions will incorrectly parse text floating point numbers that use the standard period as decimal mark. For example, if the locale is "fr_FR", atof("123.45") will return 123.0, rather than the correct 123.45. At the core of our fix is the introduction of Strutil::strtof/strtod which by default is "locale-independent" (i.e. always uses '.') and also has varieties that take a std::locale& explicitly. The implementation uses strtod_l/strtof_l (nonstandard, but available on Linux, OSX, BSD, and a differently-named but equivalent function on Windows), with a gross but simple fallback for any straggler platforms (MINGW?) that copies the string and replaces the '.' with the native locale's mark. This is pretty inexpensive: on Linux, the locale-independent version of our new Strutil::strtof has the same cost as the system locale-dependent atof/strtof, and the explicit locale version is only about 20% slower. On OSX, atof is remarkably fast, so our version that uses strtof_l is about 2x slower, but that seems acceptable given that it gives us all the control we want, and in comparison, another strategy involving istringstream was 10x slower. Unless this is shown to be an important bottleneck, I'm rejecting the alternative of embedding a full custom strtod implementation as being unnecessarily inviting maintenance costs and potential bugs. The approach we chose is minimally invasive and relies on system libraries for the heavy lifting. Additionally: * Added a new stof(), stoi(), stoul() to replace the clunkier template syntax of from_string<>. The stof(), like our new strtof(), has a locale-independent as well as an explicit-local versions. * A handful of places that still had old calls to std atof and strtod were changed to Strutil::stof. * Strutil::format and ustring::format (and our copy of tinyformat underneath) were touched up so that the versions of format() that return strings will use the "C" locale, versions added that return strings and use an explicitly-passed locale, and the versions that append their results to existing streams continue to operate as before by honoring the existing locale conventions of the streams they are using. * Several places where we assembled strings using std::ostringstream, I forced the stream to use classic "C" locale in cases where it was likely to have any floating-point data output. * For maketx and oiiotool, globally set the locale to classic. In these cases, we control the whole app, so this is safe to do. TIP going forward: For any I/O that must be persistent (files saved or read), or always be formatted identically regardless of the computer being used, always take care to use locale-independent (i.e. classic "C" locale) formatting functions and also making sure to initialize any std::ostringstream with stream.imbue(std::locale::classic()). For ephemeral I/O or UI elements that you want to display correctly internationalized for the user's country, then use the versions with explicit locale std::locale() and use the default initialization of output streams. As an aside, I do wish that C/C++ had adopted the convention of default using the classic (globally uniform) locale for all functions that don't take an explicit locale, and leave the explicit locale functions just for programs that are intentionally trying to do country-by-country localization. This is all just a huge mess, and I hate that our library can have subtle I/O bugs when used by someone in another country because of a particular global locale setting that we have no control over and can't modify simply without potentially breaking the surrounding app.
1 parent e950829 commit 8cd64b5

File tree

19 files changed

+408
-33
lines changed

19 files changed

+408
-33
lines changed

.travis.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ addons:
3333
- libavformat-dev
3434
- libswscale-dev
3535
- libavutil-dev
36+
- locales
3637
# - qtbase5-dev # FIXME: enable Qt5 on Linux
3738
# - bzip2
3839
# - libtinyxml-dev

src/fits.imageio/fitsinput.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -348,7 +348,7 @@ FitsInput::add_to_spec (const std::string &keyname, const std::string &value)
348348
// converting string to float or integer
349349
bool isNumSign = (value[0] == '+' || value[0] == '-' || value[0] == '.');
350350
if (isdigit (value[0]) || isNumSign) {
351-
float val = atof (value.c_str ());
351+
float val = Strutil::stof (value);
352352
if (val == (int)val)
353353
m_spec.attribute (keyname, (int)val);
354354
else

src/include/OpenImageIO/optparser.h

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,7 @@
4040
#define OPENIMAGEIO_OPTPARSER_H
4141

4242
#include <string>
43+
#include <OpenImageIO/strutil.h>
4344

4445
OIIO_NAMESPACE_BEGIN
4546

@@ -67,9 +68,9 @@ optparse1 (C &system, const std::string &opt)
6768
char v = value.size() ? value[0] : ' ';
6869
if ((v >= '0' && v <= '9') || v == '+' || v == '-') { // numeric
6970
if (strchr (value.c_str(), '.')) // float
70-
return system.attribute (name.c_str(), (float)atof(value.c_str()));
71+
return system.attribute (name, Strutil::stof(value));
7172
else // int
72-
return system.attribute (name.c_str(), (int)atoi(value.c_str()));
73+
return system.attribute (name, Strutil::stoi(value));
7374
}
7475
// otherwise treat it as a string
7576

src/include/OpenImageIO/strutil.h

Lines changed: 92 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -97,35 +97,46 @@ void OIIO_API sync_output (std::ostream &file, string_view str);
9797
///
9898
/// Uses the tinyformat library underneath, so it's fully type-safe, and
9999
/// works with any types that understand stream output via '<<'.
100+
/// The formatting of the string will always use the classic "C" locale
101+
/// conventions (in particular, '.' as decimal separator for float values).
100102
template<typename... Args>
101103
inline std::string format (string_view fmt, const Args&... args)
102104
{
103105
return tinyformat::format (fmt.c_str(), args...);
104106
}
105107

108+
/// A version of Strutil::format() that uses an explicit locale.
109+
template<typename... Args>
110+
inline std::string format (const std::locale& loc, string_view fmt,
111+
const Args&... args)
112+
{
113+
return tinyformat::format (loc, fmt.c_str(), args...);
114+
}
115+
106116

107117
/// Output formatted string to stdout, type-safe, and threads can't clobber
108-
/// one another.
118+
/// one another. This will honor the existing locale conventions of std::cout.
109119
template<typename... Args>
110120
inline void printf (string_view fmt, const Args&... args)
111121
{
112-
sync_output (stdout, format(fmt, args...));
122+
sync_output (stdout, format(std::cout.getloc(), fmt, args...));
113123
}
114124

115125
/// Output formatted string to an open FILE*, type-safe, and threads can't
116-
/// clobber one another.
126+
/// clobber one another. This will force classic "C" locale conventions.
117127
template<typename... Args>
118128
inline void fprintf (FILE *file, string_view fmt, const Args&... args)
119129
{
120130
sync_output (file, format(fmt, args...));
121131
}
122132

123133
/// Output formatted string to an open ostream, type-safe, and threads can't
124-
/// clobber one another.
134+
/// clobber one another. This will honor the existing locale conventions of
135+
/// the ostream.
125136
template<typename... Args>
126137
inline void fprintf (std::ostream &file, string_view fmt, const Args&... args)
127138
{
128-
sync_output (file, format(fmt, args...));
139+
sync_output (file, format(file.getloc(), fmt, args...));
129140
}
130141

131142

@@ -254,6 +265,23 @@ std::string OIIO_API repeat (string_view str, int n);
254265
std::string OIIO_API replace (string_view str, string_view pattern,
255266
string_view replacement, bool global=false);
256267

268+
269+
/// strtod/strtof equivalents that are "locale-independent", always using
270+
/// '.' as the decimal separator. This should be preferred for I/O and other
271+
/// situations where you want the same standard formatting regardless of
272+
/// locale.
273+
float OIIO_API strtof (const char *nptr, char **endptr = nullptr);
274+
double OIIO_API strtod (const char *nptr, char **endptr = nullptr);
275+
276+
/// strtod/strtof equivalents that take an explicit locale. This may be
277+
/// useful for UI or I/O that you specifically want to use the conventions
278+
/// of a localized country.
279+
double OIIO_API strtod (const char *nptr, char **endptr, const std::locale& loc);
280+
float OIIO_API strtof (const char *nptr, char **endptr, const std::locale& loc);
281+
#define OIIO_STRUTIL_HAS_STRTOF 1 /* be able to test this */
282+
283+
284+
257285
// Helper template to test if a string is a generic type
258286
template<typename T>
259287
inline bool string_is (string_view /*s*/) {
@@ -267,33 +295,85 @@ template <> inline bool string_is<int> (string_view s) {
267295
strtol (s.data(), &endptr, 10);
268296
return (s.data() + s.size() == endptr);
269297
}
270-
// Special case for float
298+
// Special case for float. Note that by using Strutil::strtof, this always
299+
// treats '.' as the decimal character.
271300
template <> inline bool string_is<float> (string_view s) {
272301
if (s.empty())
273302
return false;
274303
char *endptr = 0;
275-
strtod (s.data(), &endptr);
304+
Strutil::strtof (s.data(), &endptr);
276305
return (s.data() + s.size() == endptr);
277306
}
278307

279308

280309

281-
// Helper template to convert from generic type to string
310+
// stoi() returns the int conversion of text from several string types.
311+
// No exceptions or errors -- parsing errors just return 0.
312+
inline int stoi (const char* s) {
313+
return s && s[0] ? strtol (s, nullptr, 10) : 0;
314+
}
315+
inline int stoi (const std::string& s) { return Strutil::stoi(s.c_str()); }
316+
inline int stoi (string_view s) { return Strutil::stoi (std::string(s)); }
317+
318+
319+
320+
// stoul() returns the unsigned int conversion of text from several string
321+
// types. No exceptions or errors -- parsing errors just return 0.
322+
inline unsigned int stoul (const char* s) {
323+
return s && s[0] ? strtoul (s, nullptr, 10) : 0;
324+
}
325+
inline unsigned int stoul (const std::string& s) { return Strutil::stoul(s.c_str()); }
326+
inline unsigned int stoul (string_view s) { return Strutil::stoul (std::string(s)); }
327+
328+
329+
330+
/// stof() returns the float conversion of text from several string types.
331+
/// No exceptions or errors -- parsing errors just return 0.0. These always
332+
/// use '.' for the decimal mark (versus atof and std::strtof, which are
333+
/// locale-dependent).
334+
inline float stof (const std::string& s) {
335+
return s.size() ? Strutil::strtof (s.c_str(), nullptr) : 0.0f;
336+
}
337+
inline float stof (const char* s) {
338+
return Strutil::strtof (s, nullptr);
339+
}
340+
inline float stof (string_view s) {
341+
return Strutil::strtof (std::string(s).c_str(), nullptr);
342+
}
343+
344+
// stof() version that takes an explicit locale (for example, if you pass a
345+
// default-constructed std::locale, it will use the current native locale's
346+
// decimal conventions).
347+
inline float stof (const std::string& s, const std::locale& loc) {
348+
return s.size() ? Strutil::strtof (s.c_str(), nullptr, loc) : 0.0f;
349+
}
350+
inline float stof (const char* s, const std::locale& loc) {
351+
return Strutil::strtof (s, nullptr, loc);
352+
}
353+
inline float stof (string_view s, const std::locale& loc) {
354+
return Strutil::strtof (std::string(s).c_str(), nullptr, loc);
355+
}
356+
357+
358+
359+
// Helper template to convert from generic type to string (using default
360+
// locale).
282361
template<typename T>
283362
inline T from_string (string_view s) {
284363
return T(s); // Generic: assume there is an explicit converter
285364
}
286365
// Special case for int
287366
template<> inline int from_string<int> (string_view s) {
288-
return s.size() ? strtol (s.c_str(), NULL, 10) : 0;
367+
return s.size() ? strtol (s.c_str(), nullptr, 10) : 0;
289368
}
290369
// Special case for uint
291370
template<> inline unsigned int from_string<unsigned int> (string_view s) {
292-
return s.size() ? strtoul (s.c_str(), NULL, 10) : (unsigned int)0;
371+
return s.size() ? strtoul (s.c_str(), nullptr, 10) : (unsigned int)0;
293372
}
294-
// Special case for float
373+
// Special case for float -- note that by using Strutil::strtof, this
374+
// always treats '.' as the decimal mark.
295375
template<> inline float from_string<float> (string_view s) {
296-
return s.size() ? (float)strtod (s.c_str(), NULL) : 0.0f;
376+
return s.size() ? Strutil::strtof (s.c_str(), nullptr) : 0.0f;
297377
}
298378

299379

src/include/OpenImageIO/tinyformat.h

Lines changed: 31 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -266,6 +266,7 @@ template<typename T>
266266
inline void formatTruncated(std::ostream& out, const T& value, int ntrunc)
267267
{
268268
std::ostringstream tmp;
269+
tmp.imbue (out.getloc());
269270
tmp << value;
270271
std::string result = tmp.str();
271272
out.write(result.c_str(), (std::min)(ntrunc, static_cast<int>(result.size())));
@@ -800,6 +801,7 @@ inline void formatImpl(std::ostream& out, const char* fmt,
800801
// it crudely by formatting into a temporary string stream and
801802
// munging the resulting string.
802803
std::ostringstream tmpStream;
804+
tmpStream.imbue (out.getloc());
803805
tmpStream.copyfmt(out);
804806
tmpStream.setf(std::ios::showpos);
805807
arg.format(tmpStream, fmt, fmtEnd, ntrunc);
@@ -931,6 +933,7 @@ TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_MAKEFORMATLIST)
931933
#endif
932934

933935
/// Format list of arguments to the stream according to the given format string.
936+
/// This honors the stream's existing locale conventions.
934937
///
935938
/// The name vformat() is chosen for the semantic similarity to vprintf(): the
936939
/// list of format arguments is held in a single function argument.
@@ -943,23 +946,40 @@ inline void vformat(std::ostream& out, const char* fmt, FormatListRef list)
943946
#ifdef TINYFORMAT_USE_VARIADIC_TEMPLATES
944947

945948
/// Format list of arguments to the stream according to given format string.
949+
/// This honors the stream's existing locale conventions.
946950
template<typename... Args>
947951
void format(std::ostream& out, const char* fmt, const Args&... args)
948952
{
949953
vformat(out, fmt, makeFormatList(args...));
950954
}
951955

952956
/// Format list of arguments according to the given format string and return
953-
/// the result as a string.
957+
/// the result as a string, using classic "C" locale conventions (e.g.,
958+
/// using '.' as decimal mark).
954959
template<typename... Args>
955960
std::string format(const char* fmt, const Args&... args)
956961
{
957962
std::ostringstream oss;
963+
oss.imbue (std::locale::classic()); // force "C" locale with '.' decimal
964+
format(oss, fmt, args...);
965+
return oss.str();
966+
}
967+
968+
/// Format list of arguments according to the given format string and return
969+
/// the result as a string, using an explicit locale. Passing loc as a
970+
/// default-constructed std::locale will result in adhering to the current
971+
/// "native" locale set by the OS.
972+
template<typename... Args>
973+
std::string format(const std::locale& loc, const char* fmt, const Args&... args)
974+
{
975+
std::ostringstream oss;
976+
oss.imbue (loc);
958977
format(oss, fmt, args...);
959978
return oss.str();
960979
}
961980

962981
/// Format list of arguments to std::cout, according to the given format string
982+
/// This honors std::out's existing locale conventions.
963983
template<typename... Args>
964984
void printf(const char* fmt, const Args&... args)
965985
{
@@ -1011,6 +1031,16 @@ template<TINYFORMAT_ARGTYPES(n)> \
10111031
std::string format(const char* fmt, TINYFORMAT_VARARGS(n)) \
10121032
{ \
10131033
std::ostringstream oss; \
1034+
oss.imbue (std::locale::classic()); \
1035+
format(oss, fmt, TINYFORMAT_PASSARGS(n)); \
1036+
return oss.str(); \
1037+
} \
1038+
\
1039+
template<TINYFORMAT_ARGTYPES(n)> \
1040+
std::string format(const std::locale& loc, const char* fmt, TINYFORMAT_VARARGS(n)) \
1041+
{ \
1042+
std::ostringstream oss; \
1043+
oss.imbue (loc); \
10141044
format(oss, fmt, TINYFORMAT_PASSARGS(n)); \
10151045
return oss.str(); \
10161046
} \

src/include/OpenImageIO/ustring.h

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -626,12 +626,22 @@ class OIIO_API ustring {
626626
/// something like:
627627
/// ustring s = ustring::format ("blah %d %g", (int)foo, (float)bar);
628628
/// The argument list is fully typesafe.
629+
/// The formatting of the string will always use the classic "C" locale
630+
/// conventions (in particular, '.' as decimal separator for float values).
629631
template<typename... Args>
630632
static ustring format (string_view fmt, const Args&... args)
631633
{
632634
return ustring (Strutil::format (fmt, args...));
633635
}
634636

637+
/// ustring formatting with an explicit locale.
638+
template<typename... Args>
639+
static ustring format (const std::locale& loc, string_view fmt,
640+
const Args&... args)
641+
{
642+
return ustring (Strutil::format (loc, fmt, args...));
643+
}
644+
635645
/// Generic stream output of a ustring.
636646
///
637647
friend std::ostream & operator<< (std::ostream &out, const ustring &str) {
@@ -751,6 +761,16 @@ inline bool iequals (const std::string &a, ustring b) {
751761
}
752762

753763

764+
765+
// add ustring variants of stoi and stof from OpenImageIO/strutil.h
766+
namespace Strutil {
767+
inline int stoi (ustring s) { return Strutil::stoi (s.c_str()); }
768+
inline int stof (ustring s) { return Strutil::strtof (s.c_str()); }
769+
inline int stof (ustring s, const std::locale& loc) {
770+
return Strutil::strtof (s.c_str(), nullptr, loc);
771+
}
772+
} // end namespace Strutil
773+
754774
OIIO_NAMESPACE_END
755775

756776
#endif // OPENIMAGEIO_USTRING_H

src/iv/imageviewer.cpp

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -117,12 +117,9 @@ ImageViewer::ImageViewer ()
117117
{
118118
readSettings (false);
119119

120-
const char *gamenv = getenv ("GAMMA");
121-
if (gamenv) {
122-
float g = atof (gamenv);
123-
if (g >= 0.1 && g <= 5)
124-
m_default_gamma = g;
125-
}
120+
float gam = Strutil::stof (Sysutil::getenv ("GAMMA"));
121+
if (gam >= 0.1 && gam <= 5)
122+
m_default_gamma = gam;
126123
// FIXME -- would be nice to have a more nuanced approach to display
127124
// color space, in particular knowing whether the display is sRGB.
128125
// Also, some time in the future we may want a real 3D LUT for

src/libOpenImageIO/formatspec.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -924,6 +924,7 @@ spec_to_xml (const ImageSpec &spec, ImageSpec::SerialVerbose verbose)
924924
}
925925

926926
std::ostringstream result;
927+
result.imbue (std::locale::classic()); // force "C" locale with '.' decimal
927928
doc.print (result, "");
928929
return result.str();
929930
}

src/libOpenImageIO/maketexture.cpp

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1389,6 +1389,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
13891389
// (such as filtering information) needs to be manually added into the
13901390
// hash.
13911391
std::ostringstream addlHashData;
1392+
addlHashData.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
13921393
addlHashData << filtername << " ";
13931394
float sharpen = configspec.get_float_attribute ("maketx:sharpen", 0.0f);
13941395
if (sharpen != 0.0f) {
@@ -1420,6 +1421,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
14201421

14211422
if (isConstantColor) {
14221423
std::ostringstream os; // Emulate a JSON array
1424+
os.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
14231425
for (int i = 0; i < dstspec.nchannels; ++i) {
14241426
if (i!=0) os << ",";
14251427
os << (i<(int)constantColor.size() ? constantColor[i] : 0.0f);
@@ -1439,6 +1441,7 @@ make_texture_impl (ImageBufAlgo::MakeTextureMode mode,
14391441

14401442
if (compute_average_color) {
14411443
std::ostringstream os; // Emulate a JSON array
1444+
os.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
14421445
for (int i = 0; i < dstspec.nchannels; ++i) {
14431446
if (i!=0) os << ",";
14441447
os << (i<(int)pixel_stats.avg.size() ? pixel_stats.avg[i] : 0.0f);

src/libtexture/imagecache.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1610,6 +1610,7 @@ ImageCacheImpl::onefile_stat_line (const ImageCacheFileRef &file,
16101610
{
16111611
// FIXME -- make meaningful stat printouts for multi-image textures
16121612
std::ostringstream out;
1613+
out.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
16131614
const ImageSpec &spec (file->spec(0,0));
16141615
const char *formatcode = "u8";
16151616
switch (spec.format.basetype) {
@@ -1737,6 +1738,7 @@ ImageCacheImpl::getstats (int level) const
17371738
}
17381739

17391740
std::ostringstream out;
1741+
out.imbue (std::locale::classic()); // Force "C" locale with '.' decimal
17401742
if (level > 0) {
17411743
out << "OpenImageIO ImageCache statistics (";
17421744
{

0 commit comments

Comments
 (0)