-
Notifications
You must be signed in to change notification settings - Fork 377
Sort out locales for OSL #795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
LGTM, but why not using the safe float parsing and serializing function everywhere and bypass locale entirely? I still have nightmares with the locale thing from my early days. If the problem is ostreams, can those safe functions be used there somehow as well? |
Using the safe version "everywhere" means that you can never write this natural thing:
Instead, you have to remember to do this awkward thing:
That seems very error prone, so in addition to making sure we always use the "safe" versions of string->float and float->string, we also initialize every stream upon creation, so that ordinary stream output also works properly. That's really all this patch is doing. |
Ok, fair enough. Still LGTM |
This whole issue makes me want to tear my hair out! Looking through the tickets of some other languages (Python, Julia, etc.) it seems they all have to grapple with this mess at some point. |
I got quite the education this week, realizing just how much of C++'s I/O functionality implicitly takes the system's "locale" into consideration. Of particular concern is the conversion of text to floating point numeric values and vice versa, for which much C++ functionality will happily swap the role of '.' and, say, ',' (comma) for the decimal mark, depending on the user's country and whether they are savvy enough to set their locale to the classic "C" (essentially US-style) version. On the other hand, who can blame somebody for not blindly accepting US hegemony these days? But dammit, OSL is a programming language, and so '.' has a specific role in numeric constants and ',' does something quite different (comma operator!), and if you start mixing up the two when parsing programs, it will all end in tears. All other major programming languages just adopt this convention as well. Now, if you were writing a full app, you could set the locale to the classic one globally (to that process) upon startup, and then never worry about it. Which we can do for oslc itself. But much of OSL is actually a library that can be embedded into other libraries or apps, so globally setting the locale can mess up UI internationalization or the intentional localization of other app I/O. So we have to handle this deftly throughout. The fix comes in three parts: 1. I've submitted a separate review to OIIO here: AcademySoftwareFoundation/OpenImageIO#1785 which tries to straighten up that library, and in particular the various Strutil and ustring utilities that we liberally use throughout OSL. And patrol some of the places where we directly call the locale-sensitive C++ functions like atof and strtod, and replace them with the now-safe OIIO::Strutil equivalents. We'll be able to count on all this working for OIIO >= 1.8.6. 2. In a variety of places that don't rely on OIIO (for example, various outputs via std::ostringstream), we need to take care to explicitly set the stream's locale to produce the output we expect. 3. For people building OSL against a too-old version of OIIO to contain the fixes described above, when compiling a buffer of OSL source code, do the gross but necessary safeguard of saving the prior locale, globally changing to "C" locale while we compile the OSL, then restoring the locale. And *HOPE* that the surrounding application isn't simultaneously counting on the original locale-based internationalization still being in effect for something it is doing in another thread that's unaware that we've temporarily altered a global setting. That's inherently unsafe, but remember it only affects OSL builds against pre-1.8 OIIO (once that patch is applied on the OIIO side), and only will be symptomatic for systems using a non-. locale (which is, empirically speaking, rare enough that nobody noticed this issue until this week).
I got quite the education this week, realizing just how much of C++'s I/O functionality implicitly takes the system's "locale" into consideration. Of particular concern is the conversion of text to floating point numeric values and vice versa, for which much C++ functionality will happily swap the role of '.' and, say, ',' (comma) for the decimal mark, depending on the user's country and whether they are savvy enough to set their locale to the classic "C" (essentially US-style) version. On the other hand, who can blame somebody for not blindly accepting US hegemony these days?
But dammit, OSL is a programming language, and so '.' has a specific role in numeric constants and ',' does something quite different (comma operator!), and if you start mixing up the two when parsing programs, it will all end in tears. All other major programming languages just adopt this convention as well.
Now, if you were writing a full app, you could set the locale to the classic one globally (to that process) upon startup, and then never worry about it. Which we can do for oslc itself. But much of OSL is actually a library that can be embedded into other libraries or apps, so globally setting the locale can mess up UI internationalization or the intentional localization of other app I/O. So we have to handle this deftly throughout.
The fix comes in three parts:
I've submitted a separate review to OIIO here: More principled handling of float<->string OpenImageIO#1785 which tries to straighten up that library, and in particular the various Strutil and ustring utilities that we liberally use throughout OSL. And patrol some of the places where we directly call the locale-sensitive C++ functions like atof and strtod, and replace them with the now-safe OIIO::Strutil equivalents. We'll be able to count on all this working for OIIO >= 1.8.6.
In a variety of places that don't rely on OIIO (for example, various outputs via std::ostringstream), we need to take care to explicitly set the stream's locale to produce the output we expect.
For people building OSL against a too-old version of OIIO to contain the fixes described above, when compiling a buffer of OSL source code, do the gross but necessary safeguard of saving the prior locale, globally changing to "C" locale while we compile the OSL, then restoring the locale. And HOPE that the surrounding application isn't simultaneously counting on the original locale-based internationalization still being in effect for something it is doing in another thread that's unaware that we've temporarily altered a global setting. That's inherently unsafe, but remember it only affects OSL builds against pre-1.8 OIIO (once that patch is applied on the OIIO side), and only will be symptomatic for systems using a non-. locale (which is, empirically speaking, rare enough that nobody noticed this issue until this week).