Skip to content

Sort out locales for OSL #795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 28, 2017
Merged

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Oct 12, 2017

I got quite the education this week, realizing just how much of C++'s I/O functionality implicitly takes the system's "locale" into consideration. Of particular concern is the conversion of text to floating point numeric values and vice versa, for which much C++ functionality will happily swap the role of '.' and, say, ',' (comma) for the decimal mark, depending on the user's country and whether they are savvy enough to set their locale to the classic "C" (essentially US-style) version. On the other hand, who can blame somebody for not blindly accepting US hegemony these days?

But dammit, OSL is a programming language, and so '.' has a specific role in numeric constants and ',' does something quite different (comma operator!), and if you start mixing up the two when parsing programs, it will all end in tears. All other major programming languages just adopt this convention as well.

Now, if you were writing a full app, you could set the locale to the classic one globally (to that process) upon startup, and then never worry about it. Which we can do for oslc itself. But much of OSL is actually a library that can be embedded into other libraries or apps, so globally setting the locale can mess up UI internationalization or the intentional localization of other app I/O. So we have to handle this deftly throughout.

The fix comes in three parts:

  1. I've submitted a separate review to OIIO here: More principled handling of float<->string OpenImageIO#1785 which tries to straighten up that library, and in particular the various Strutil and ustring utilities that we liberally use throughout OSL. And patrol some of the places where we directly call the locale-sensitive C++ functions like atof and strtod, and replace them with the now-safe OIIO::Strutil equivalents. We'll be able to count on all this working for OIIO >= 1.8.6.

  2. In a variety of places that don't rely on OIIO (for example, various outputs via std::ostringstream), we need to take care to explicitly set the stream's locale to produce the output we expect.

  3. For people building OSL against a too-old version of OIIO to contain the fixes described above, when compiling a buffer of OSL source code, do the gross but necessary safeguard of saving the prior locale, globally changing to "C" locale while we compile the OSL, then restoring the locale. And HOPE that the surrounding application isn't simultaneously counting on the original locale-based internationalization still being in effect for something it is doing in another thread that's unaware that we've temporarily altered a global setting. That's inherently unsafe, but remember it only affects OSL builds against pre-1.8 OIIO (once that patch is applied on the OIIO side), and only will be symptomatic for systems using a non-. locale (which is, empirically speaking, rare enough that nobody noticed this issue until this week).

@aconty
Copy link
Contributor

aconty commented Oct 12, 2017

LGTM, but why not using the safe float parsing and serializing function everywhere and bypass locale entirely? I still have nightmares with the locale thing from my early days. If the problem is ostreams, can those safe functions be used there somehow as well?

@lgritz
Copy link
Collaborator Author

lgritz commented Oct 12, 2017

Using the safe version "everywhere" means that you can never write this natural thing:

// ostream out;
out << my_float;

Instead, you have to remember to do this awkward thing:

out << Strutil::format("%g", my_float);

That seems very error prone, so in addition to making sure we always use the "safe" versions of string->float and float->string, we also initialize every stream upon creation, so that ordinary stream output also works properly. That's really all this patch is doing.

@aconty
Copy link
Contributor

aconty commented Oct 12, 2017

Ok, fair enough. Still LGTM

@lgritz
Copy link
Collaborator Author

lgritz commented Oct 12, 2017

This whole issue makes me want to tear my hair out!

Looking through the tickets of some other languages (Python, Julia, etc.) it seems they all have to grapple with this mess at some point.

I got quite the education this week, realizing just how much of C++'s
I/O functionality implicitly takes the system's "locale" into
consideration.  Of particular concern is the conversion of text to
floating point numeric values and vice versa, for which much C++
functionality will happily swap the role of '.' and, say, ',' (comma)
for the decimal mark, depending on the user's country and whether they
are savvy enough to set their locale to the classic "C" (essentially
US-style) version. On the other hand, who can blame somebody for not
blindly accepting US hegemony these days?

But dammit, OSL is a programming language, and so '.' has a specific
role in numeric constants and ',' does something quite different (comma
operator!), and if you start mixing up the two when parsing programs,
it will all end in tears. All other major programming languages just
adopt this convention as well.

Now, if you were writing a full app, you could set the locale to the
classic one globally (to that process) upon startup, and then never
worry about it. Which we can do for oslc itself. But much of OSL is
actually a library that can be embedded into other libraries or apps,
so globally setting the locale can mess up UI internationalization or
the intentional localization of other app I/O. So we have to handle
this deftly throughout.

The fix comes in three parts:

1. I've submitted a separate review to OIIO here:
AcademySoftwareFoundation/OpenImageIO#1785 which tries to straighten
up that library, and in particular the various Strutil and ustring
utilities that we liberally use throughout OSL. And patrol some of the
places where we directly call the locale-sensitive C++ functions like
atof and strtod, and replace them with the now-safe OIIO::Strutil
equivalents. We'll be able to count on all this working for OIIO >= 1.8.6.

2. In a variety of places that don't rely on OIIO (for example, various
outputs via std::ostringstream), we need to take care to explicitly set
the stream's locale to produce the output we expect.

3. For people building OSL against a too-old version of OIIO to contain
the fixes described above, when compiling a buffer of OSL source code,
do the gross but necessary safeguard of saving the prior locale,
globally changing to "C" locale while we compile the OSL, then restoring
the locale.  And *HOPE* that the surrounding application isn't
simultaneously counting on the original locale-based
internationalization still being in effect for something it is doing in
another thread that's unaware that we've temporarily altered a global
setting. That's inherently unsafe, but remember it only affects OSL builds
against pre-1.8 OIIO (once that patch is applied on the OIIO side), and
only will be symptomatic for systems using a non-. locale (which is,
empirically speaking, rare enough that nobody noticed this issue until
this week).
@lgritz lgritz merged commit 0f33917 into AcademySoftwareFoundation:master Oct 28, 2017
@lgritz lgritz deleted the lg-locale branch October 28, 2017 03:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants