Environment Variables Library?
StackOverflow regarding the (non) standardisation of the "setenv" function, does Boost provide a library to interact with environment variables? http://stackoverflow.com/questions/30292642/c-standard-library-stdsetenv-vs-... Michael
On May 18, 2015 4:40:13 AM EDT, Michael Ainsworth
StackOverflow regarding the (non) standardisation of the "setenv" function, does Boost provide a library to interact with environment variables?
http://stackoverflow.com/questions/30292642/c-standard-library-stdsetenv-vs-...
I'm not aware of one, which isn't the same as saying there isn't one. What would you expect it to do? ::setenv() is already standardized by POSIX, so I assume you are looking for some support on non-POSIX systems, but specifics would be helpful. ___ Rob (Sent from my portable computation engine)
Two primary purposes, two ideas for additional possible uses.
Primary purposes:
1. To provide cross-platform (e.g., POSIX plus Windows) equivalents of setenv, getenv, "unsetenv" and "issetenv".
2. To allow iteration over all defined environment variables.
Additional possible uses:
1. To create a new environment to pass to a new process.
2. To provide non-"char"-based environment variables (e.g., std::wstring).
All input appreciated.
Michael
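A rough sketch of what the first primary purpose could look like, to make the request concrete. The namespace `envsketch` and all four function names are hypothetical and not part of Boost or the standard; the sketch assumes `setenv`/`unsetenv` on POSIX and the Microsoft CRT function `_putenv_s` on Windows.

```cpp
#include <cstdlib>   // std::getenv (ISO C++)
#include <stdlib.h>  // setenv/unsetenv on POSIX, _putenv_s on the Microsoft CRT
#include <string>

// Hypothetical names, for illustration only.
namespace envsketch {

// Copies the value into `out` and returns true if the variable is defined.
inline bool get(const std::string& name, std::string& out) {
    const char* value = std::getenv(name.c_str());
    if (value == nullptr) return false;
    out = value;
    return true;
}

// The "issetenv" from the message above: true if the variable is defined.
inline bool is_set(const std::string& name) {
    return std::getenv(name.c_str()) != nullptr;
}

// Sets (or overwrites) a variable for the current process.
inline bool set(const std::string& name, const std::string& value) {
#ifdef _WIN32
    return _putenv_s(name.c_str(), value.c_str()) == 0;
#else
    return ::setenv(name.c_str(), value.c_str(), 1) == 0;
#endif
}

// Removes a variable from the current process environment.
inline bool unset(const std::string& name) {
#ifdef _WIN32
    // With the Microsoft CRT, an empty value removes the variable.
    return _putenv_s(name.c_str(), "") == 0;
#else
    return ::unsetenv(name.c_str()) == 0;
#endif
}

}  // namespace envsketch
```

One design consequence of the Windows branch: through this route an empty value and an unset variable cannot be told apart, which a real library would have to document or work around.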
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On Mon, 18 May 2015 11:37:00 +0200, Michael Ainsworth
There is a so-called environment iterator in Boost.ProgramOptions (see http://www.boost.org/doc/libs/1_58_0/doc/html/boost/environment_iterator.htm...). And the library that should have become Boost.Process has support for passing environment variables to a new process (including std::wstring; see http://www.highscore.de/boost/process0.5/boost_process/tutorial.html#boost_p...). Boris
Thanks for the pointers. Such a library would be reasonably small, but do you think it would be a useful addition to Boost? I'm still shaking my head that environment variables aren't better supported by standard C++.
On 19/05/2015 09:23, Michael Ainsworth wrote:
Such a library would be reasonably small, but do you think it would be a useful addition to Boost? I'm still shaking my head that environment variables aren't better supported by standard C++.
They're more popular in POSIX; but at least on Windows, environment variables are considered old-fashioned and discouraged from use without significant compelling reason.
On 19 May 2015, at 9:27 am, Gavin Lambert
wrote: They're more popular in POSIX; but at least on Windows, environment variables are considered old-fashioned and discouraged from use without significant compelling reason.
Are you saying Windows doesn't use environment variables to the extent that POSIX systems do, therefore investing the time into writing such a library is not worthwhile? (Genuine question, not sarcastic. I'm just trying to see what the need is. I'm writing some code where I have to iterate over environment variables in a cross platform way, including Windows, so I know I need this. I just want to know if C++/Boost needs/wants this.)
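For the iteration use case mentioned here, a minimal sketch might take a snapshot of the environment as name/value pairs. The function name `list_environment` is hypothetical. The sketch assumes the POSIX global `environ` and, on Windows, the Microsoft CRT global `_environ`; it handles narrow strings only, not the std::wstring case.

```cpp
#include <stdlib.h>  // _environ on the Microsoft CRT
#include <string>
#include <utility>
#include <vector>

#ifdef _WIN32
static char** environment_array() { return _environ; }
#else
extern char** environ;  // POSIX: null-terminated array of "NAME=value" strings
static char** environment_array() { return environ; }
#endif

// Hypothetical helper: returns a copy of the current environment.
inline std::vector<std::pair<std::string, std::string> > list_environment() {
    std::vector<std::pair<std::string, std::string> > result;
    for (char** p = environment_array(); p != nullptr && *p != nullptr; ++p) {
        const std::string entry(*p);
        // Search from index 1 so an entry starting with '=' keeps a non-empty name.
        const std::string::size_type eq = entry.find('=', 1);
        if (eq == std::string::npos) continue;  // skip malformed entries
        result.push_back(std::make_pair(entry.substr(0, eq), entry.substr(eq + 1)));
    }
    return result;
}
```

Returning a copy rather than exposing pointers into the live array sidesteps invalidation if the environment is modified while iterating, at the cost of one allocation per entry.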
On 19/05/2015 17:47, Michael Ainsworth wrote:
On 19 May 2015, at 9:27 am, Gavin Lambert wrote:
They're more popular in POSIX; but at least on Windows, environment variables are considered old-fashioned and discouraged from use without significant compelling reason.
Are you saying Windows doesn't use environment variables to the extent that POSIX systems do, therefore investing the time into writing such a library is not worthwhile?
Yes to the first part, not necessarily to the second. I was trying to offer an explanation as to why someone else may not have attempted this before now, not trying to discourage you from doing so yourself if you perceive a need for it.
On Tue, May 19, 2015 at 2:27 AM, Gavin Lambert
On 19/05/2015 09:23, Michael Ainsworth wrote:
Such a library would be reasonably small, but do you think it would be a useful addition to Boost? I'm still shaking my head that environment variables aren't better supported by standard C++.
They're more popular in POSIX; but at least on Windows, environment variables are considered old-fashioned and discouraged from use without significant compelling reason.
Is there a reference for such a discouragement? Sure, most Windows-only programs just use registry but I wouldn't say environment variables are discouraged from being used. For instance, most development software, including MSVC, use environment for configuration, and registry is not seen as a replacement.
On Tue, May 19, 2015 at 8:38 AM, Andrey Semashev
I'm not sure about other devs, but my personal guideline so far is:
1. Don't use the registry unless you must (mostly because of requirements from the kind of application you are making).
2. Don't use environment variables unless you must (mostly for dev tools helping locate each other).
3. Prefer user-specific or shared directories to put config data in (config files or databases).
The reasons for the first two are mainly to avoid issues like "pollution" of the system with mountains of useless data (in particular on computers used by non-developers). Part of the source of the issue is the lack of a common, good uninstallation system; part of it is just historical scars itching (from the Windows 98/ME era).
There is also the cross-platform code issue: if you want to provide data to the whole system in a cross-platform way, environment variables are the only way to do so, because there is no registry on non-Windows platforms (that I know of).
Environment variables on Windows have a limited size, which is small enough to be hit very quickly if you abuse it. For example, the famous PATH can be filled with paths to tools like git, hg, svn, python, ruby and your favorite C++ compilers until you hit the point where you can't add anything anymore. Of course, this can be worked around, but it's still an annoying limitation.
Finally, there is the "it's a public global state" issue, which is kind of like public global non-const variables in a program. It's just problematic.
I'm not a specialist in Windows specifics, so any of my assumptions might be wrong.
On 19 May 2015, at 7:18 pm, Klaim - Joël Lamotte
wrote:
Your comparison of environment variables in an operating system to global non-const variables in a software library is a good one: their usage appears contrary to encapsulation principles. I can think of two generalised uses of environment variables:
1. They give users the ability to pass configuration information to a process, such as the $CXX variable used in Makefiles, or $GIT_DIR for Git.
2. They give processes the ability to communicate with other processes (a rudimentary form of inter-process communication), such as an HTTP server implementing the CGI specification, or SVN executing $EDITOR for commit messages.
It'd be nice to have an alternative that provides a cleaner implementation with less "global namespace pollution", but regardless, it appears that in the current situation:
1. Windows does not use environment variables as much as POSIX, but still does use them.
2. There is not great support for environment variables in the current C++ standard.
3. Therefore, there is a need for such a library.
Would you agree with the above from your background and experience?
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Michael Ainsworth Sent: 19 May 2015 10:51 To: boost@lists.boost.org Subject: Re: [boost] Environment Variables Library?
Yes. Bad in theory, useful in practice. Paul --- Paul A. Bristow Prizet Farmhouse Kendal UK LA8 8AB +44 (0) 1539 561830
On Tue, May 19, 2015 at 12:54 PM, Paul A. Bristow
Yes.
Bad in theory, useful in practice.
Exactly.
On Tue, May 19, 2015 at 11:51 AM, Michael Ainsworth < michael@michaelainsworth.id.au> wrote:
Would you agree with the above from your background and experience?
Yes, I agree.
On May 19, 2015 5:51:14 AM EDT, Michael Ainsworth
It'd be nice for an alternative that provides a cleaner implementation with less "global namespace pollution", but regardless, it appears that in the current situation:
1. Windows does not use environment variables as much as POSIX, but still does use them.
2. There is not great support for environment variables in the current C++ standard.
3. Therefore, there is a need for such a library.
In my experience, non-developer tools on Windows only use PATH and, perhaps, a few system e-vars like SystemRoot. Any tools, including scripts, I've written that use the environment have done nothing more than the equivalent of getenv() and setenv(). In the end, I don't see value in the library you suggest, but there may well be enough others who do to justify it. ___ Rob (Sent from my portable computation engine)
On 19/05/2015 18:38, Andrey Semashev wrote:
On Tue, May 19, 2015 at 2:27 AM, Gavin Lambert wrote:
They're more popular in POSIX; but at least on Windows, environment variables are considered old-fashioned and discouraged from use without significant compelling reason.
Is there a reference for such a discouragement? Sure, most Windows-only programs just use registry but I wouldn't say environment variables are discouraged from being used. For instance, most development software, including MSVC, use environment for configuration, and registry is not seen as a replacement.
There's a distinction between command line and GUI tools. Environment variables are more likely to be used by command line tools, as they tend to be more script-friendly. It's rarer for environment variables to be used by GUI tools, and it's discouraged for a GUI tool to add itself to the global PATH on install, for example. (Even command line tools are encouraged to make shortcuts or scripts that temporarily add themselves to the PATH or set other variables for a particular command line session, rather than doing so globally.)
On Tue, May 19, 2015 at 7:01 PM, Gavin Lambert
There's a distinction between command line and GUI tools. Environment variables are more likely to be used by command line tools, as they tend to be more script-friendly.
Although it is common for GUI applications written in Java to use the environment for setup and startup, IIRC. -- -- Rene Rivera -- Grafik - Don't Assume Anything -- Robot Dreams - http://robot-dreams.net -- rrivera/acm.org (msn) - grafikrobot/aim,yahoo,skype,efnet,gmail
On 18. mai 2015 11:37, Michael Ainsworth wrote:
Two primary purposes, two ideas for additional possible uses.
Primary purposes:
1. To provide cross platform (e.g., POSIX plus Windows) equivalents of setenv, getenv, "unsetenv" and "issetenv". 2. To allow iteration over all defined environment variables.
Additional possible uses:
1. To create a new environment to pass to a new process. 2. To provide non "char" based environment variables (e.g., std::wstring).
I am interested.
I think at least the primary purposes should be in the standard. So why not start with a small Boost library that as a minimum provides that, try out the response on this list, then extend services based on demand and your preferences.
I am surprised by the focus of others on why environment variables are bad or out of fashion in some environments. While this may be true, I think it is rather irrelevant. Environment variables are a useful service provided in the runtime environment of C++ programs and should be standardized. POSIX sadly does not reach all C++ programs, so it would be useful to have similar facilities in the C++ standard.
There are valid and sensible use cases for environment variables, and I do not see them going away any time soon. Just as with regular C++ variables, they may be used in less optimal ways, possibly even insecure ways.
However, note that POSIX setenv(...) and Windows SetEnvironmentVariable(...) do not pollute the global system or user environment. They do not have global side effects outside the process or possibly its subsequently created child processes. Thus processes (C++ programs) provide a sort of scope for environment variables.
Arguments that what we normally need is simple enough with POSIX or Windows APIs just hint that this may be simple to provide as standardized facilities, so that it becomes trivial to write portable standard C++ code using environment variables with no #ifdefs or other ugliness. It may prove to be less simple than we expect; multi-byte characters and OS size limitations come to mind as obstacles, but I think this is worth trying out.
The facilities for manipulating the environment for child processes may be a better fit for a separate library, e.g. Boost.Process, but there may be a case for something simple in the proposed library that can be managed in the process and passed when creating child processes as an alternative to inheriting the current environment.
One facility that could be nice is portable methods to convert environment variable values into basic types as well as some more complex types; in particular, a file system path and a sequence of file system paths come to mind. E.g.:
    namespace env = boost::environment;
    namespace fs = boost::filesystem;
    std::vector<fs::path> paths = env::get_filesystem_paths("PATH");
    fs::path home = env::get("HOME");
    if (fs::is_directory(home)) {
        paths.push_back(home / "bin");
        env::set_filesystem_paths("PATH", paths);
    }
may conveniently get and set the variable correctly, using the correct path separators for the operating system's conventions.
Beyond the simple generic handling of environment variables above, a library could provide truly portable accessors to some system-wide and user environment settings. This may deal with variable name variations, and recommended use of platform APIs for each runtime environment, see: http://en.wikipedia.org/wiki/Environment_variable
Such pure access methods, like
    fs::path home = env::user_home_directory();
could provide values from $HOME on Unix, %HOMEDIR% on DOS, %USERPROFILE% on Windows, or values from whatever environment variable or runtime API is appropriate for the value in the actual runtime environment.
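To illustrate the separator handling behind the proposed get_filesystem_paths/set_filesystem_paths, here is a minimal sketch of the split and join steps. The names `split_path_list`, `join_path_list` and `kPathListSeparator` are hypothetical, and plain std::string stands in for fs::path so the example has no Boost dependency.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Conventional separator for PATH-like lists: ';' on Windows, ':' elsewhere.
#ifdef _WIN32
const char kPathListSeparator = ';';
#else
const char kPathListSeparator = ':';
#endif

// Hypothetical helper: splits a PATH-like value into its entries.
// Empty entries are dropped in this sketch (on POSIX an empty PATH entry
// conventionally means the current directory, so a real library may keep them).
inline std::vector<std::string> split_path_list(const std::string& value,
                                                char sep = kPathListSeparator) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(sep, start);
        if (end == std::string::npos) end = value.size();
        if (end > start) parts.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

// Hypothetical helper: joins entries back into a single PATH-like value.
inline std::string join_path_list(const std::vector<std::string>& parts,
                                  char sep = kPathListSeparator) {
    std::string joined;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) joined += sep;
        joined += parts[i];
    }
    return joined;
}
```

A caller would read the variable with std::getenv, split it, append an entry such as the user's bin directory, join the result, and write it back with the platform's set function.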
Some other similar settings that come to mind as missing pure C++ solutions are listed below:
    // user login data
    std::string user_name();    // return login
    std::string user_domain();  // return login domain where appropriate
    // common storage locations
    fs::path user_download_directory();
    fs::path user_documents_directory();
    fs::path user_pictures_directory();
    fs::path user_music_directory();
    fs::path user_movies_directory();
    fs::path user_desktop_directory();
    fs::path temp_directory();
    // paths to configuration files
    fs::path user_application_data_directory();
    fs::path common_application_data_directory();
    // executable directory, useful to access installed data
    fs::path executable_directory();
    // path to running program file
https://msdn.microsoft.com/en-us/library/windows/desktop/bb762181%28v=vs.85%...
http://doc.qt.io/qt-4.8/qdesktopservices.html
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
-- Bjørn
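As a sketch of how one such accessor might be layered on environment variables alone, the function below tries a per-platform list of candidate names. It returns std::string rather than fs::path to stay self-contained, and the lookup order is an assumption made for illustration; a production version would more likely consult platform APIs (for example getpwuid on POSIX) when the variables are missing.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical accessor: returns the user's home directory as seen through
// environment variables, or an empty string if none of the candidates is set.
inline std::string user_home_directory() {
#ifdef _WIN32
    const char* const candidates[] = {"USERPROFILE", "HOME"};
#else
    const char* const candidates[] = {"HOME"};
#endif
    const std::size_t count = sizeof(candidates) / sizeof(candidates[0]);
    for (std::size_t i = 0; i < count; ++i) {
        const char* value = std::getenv(candidates[i]);
        if (value != nullptr && *value != '\0') return std::string(value);
    }
    return std::string();
}
```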
On 21 May 2015, at 2:07 pm, Bjørn Roald
wrote: <snip>
Hi Bjorn, Thank you for a well thought out response. I appreciate the time you’ve given it.
I think at least the primary purposes should be in the standard. So why not start with a small Boost library that as a minimum provide that, try out the response on this list then extend services based on demand and your preferences.
I agree.
I am surprised by the focus of others on why environment variables are bad or out of fashion in some environments. While this may be true, I think it is rather irrelevant. Environment variables are a useful service provided in the runtime environment of C++ programs and should be standardized. POSIX sadly does not reach all C++ programs, so it would be useful to have similar facilities in the C++ standard.
While I agree with others that the use and appropriateness of environment variables varies across systems, I think it’s still common enough to warrant such a library, especially given that ISO standard C++ provides a partial implementation (e.g., std::getenv in <cstdlib>, but not ::unsetenv, ::setenv, etc, which are found in
However, note that Posix setenv(...) and Windows SetEnvironmentVariable(...) does not pollute the global system or user environment. They do not have global side effects outside the process or possibly its subsequently created child processes. Thus processes (C++ programs) provides a sort of scope for environment variables.
That’s a good point, which I hadn’t considered.
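The scoping point can be checked with a small experiment: set a variable in the current process, then ask a child process what it sees. The helper names and the variable name ENVSKETCH_SCOPE are hypothetical. The sketch assumes `popen` (POSIX) or `_popen` (Microsoft CRT) and a command shell being available.

```cpp
#include <cstdio>
#include <stdio.h>   // popen/pclose (POSIX), _popen/_pclose (Microsoft CRT)
#include <stdlib.h>  // setenv (POSIX), _putenv_s (Microsoft CRT)
#include <string>

#ifdef _WIN32
#define ENVSKETCH_POPEN _popen
#define ENVSKETCH_PCLOSE _pclose
#else
#define ENVSKETCH_POPEN popen
#define ENVSKETCH_PCLOSE pclose
#endif

// Runs a shell command in a child process and returns its standard output
// with trailing newline characters removed.
inline std::string run_and_capture(const char* command) {
    std::string output;
    FILE* pipe = ENVSKETCH_POPEN(command, "r");
    if (pipe == nullptr) return output;
    char buffer[256];
    while (std::fgets(buffer, sizeof buffer, pipe) != nullptr) output += buffer;
    ENVSKETCH_PCLOSE(pipe);
    while (!output.empty() && (output.back() == '\n' || output.back() == '\r')) {
        output.pop_back();
    }
    return output;
}

// Sets a variable in this process only, then reports the value a child sees.
inline std::string value_seen_by_child() {
#ifdef _WIN32
    _putenv_s("ENVSKETCH_SCOPE", "from-parent");
    return run_and_capture("echo %ENVSKETCH_SCOPE%");
#else
    ::setenv("ENVSKETCH_SCOPE", "from-parent", 1);
    return run_and_capture("printf '%s' \"$ENVSKETCH_SCOPE\"");
#endif
}
```

The child inherits the value, while the shell that launched the program does not see ENVSKETCH_SCOPE after it exits, which is the process-level scope described above.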
Arguments that what we normally need is simple enough with POSIX or Windows APIs just hint that this may be simple to provide as standardized facilities, so that it becomes trivial to write portable standard C++ code using environment variables with no #ifdefs or other ugliness. It may prove to be less simple than we expect; multi-byte characters and OS size limitations come to mind as obstacles, but I think this is worth trying out.
I agree. The library would be rather small but it would encourage DRY principles and be cross-platform (to the extent that systems are supported).
The facilities for manipulating the environment for child processes may be a better fit for a separate library, e.g. Boost.Process, but there may be a case for something simple in the proposed library that can be managed in the process and passed when creating child processes as alternative to inheriting the current environment.
1. You might be right, but my first instinct was that environment variables and processes “fit” together. 2. On the other hand, a process management library (a Boost.Process revision?) that made use of the environment variable library would create a library dependency.
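One way to keep the two libraries decoupled is to exchange only a plain data representation of an environment. The sketch below is an illustration under that assumption: `environment_map` and the three functions are hypothetical names. It builds the two layouts that process-creation calls conventionally expect: a null-terminated array of "NAME=value" strings (execve, posix_spawn) and a single block of null-terminated strings with a final extra null (CreateProcess).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical shared type: an environment as a plain name -> value map.
typedef std::map<std::string, std::string> environment_map;

// Produces "NAME=value" strings, one per variable.
inline std::vector<std::string> to_entries(const environment_map& env) {
    std::vector<std::string> entries;
    for (environment_map::const_iterator it = env.begin(); it != env.end(); ++it) {
        entries.push_back(it->first + "=" + it->second);
    }
    return entries;
}

// POSIX layout: array of char* ending with a null pointer.
// The pointers stay valid only while `entries` is alive and unmodified.
inline std::vector<char*> to_envp(std::vector<std::string>& entries) {
    std::vector<char*> envp;
    for (std::size_t i = 0; i < entries.size(); ++i) envp.push_back(&entries[i][0]);
    envp.push_back(nullptr);
    return envp;
}

// Windows layout: "A=1\0B=2\0\0" in one contiguous block.
// Windows also expects entries sorted by name case-insensitively; std::map
// sorts case-sensitively, which this sketch does not correct.
inline std::string to_windows_block(const environment_map& env) {
    std::string block;
    for (environment_map::const_iterator it = env.begin(); it != env.end(); ++it) {
        block += it->first + "=" + it->second;
        block.push_back('\0');
    }
    block.push_back('\0');
    return block;
}
```

A process library could accept either layout without including any header from the environment library, which is the "no real dependency" arrangement being discussed.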
One facility that could be nice is portable methods to convert environment variable values into basic types as well as some more complex types, in particular file system path and a sequence of file system paths come to my mind. E.g.:
    namespace env = boost::environment;
    namespace fs = boost::filesystem;
    std::vector<fs::path> paths = env::get_filesystem_paths("PATH");
    fs::path home = env::get("HOME");
    if (fs::is_directory(home)) {
        paths.push_back(home / "bin");
        env::set_filesystem_paths("PATH", paths);
    }
The same arguments we’ve raised regarding the integration of environment variables and processes would also apply here. That is: filesystems are logically independent of environment variables; the only commonality is the use of strings to denote file system locations (e.g., $PATH or %PATH%). Integrating the two creates an (IMO) unnecessary dependency.
I do like the idea of providing dedicated functions for specific environment variables. For example, “boost::environment::get_path()”.
Regarding the PATH variable (and other “array-like” character-delimited environment variables), I was thinking of implementing this with iterators. For example:
    namespace env = boost::environment;
    env::environment& current = env::current_environment();
    env::variable& path = current.get("PATH");
    for (env::variable::iterator it = path.begin(), end = path.end(); it != end; ++it) {
        std::cout << *it << std::endl;
    }
    // Possible output:
    // /bin
    // /usr/bin
    // /usr/local/bin
    // /root/bin
    path += "/var/lib/pgsql-9.4/bin";
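A minimal, self-contained sketch of the iterator idea follows. The class name `list_variable` is hypothetical and only loosely follows the env::variable interface in the message above: it parses a value once, exposes begin()/end() over the entries, and supports += for appending. It does not write anything back to the process environment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class: an "array-like" environment variable value.
class list_variable {
public:
    typedef std::vector<std::string>::const_iterator iterator;

    // Parses `value` into entries separated by `separator`; empty entries are skipped.
    list_variable(const std::string& value, char separator) : separator_(separator) {
        std::string::size_type start = 0;
        while (start <= value.size()) {
            std::string::size_type end = value.find(separator_, start);
            if (end == std::string::npos) end = value.size();
            if (end > start) items_.push_back(value.substr(start, end - start));
            start = end + 1;
        }
    }

    iterator begin() const { return items_.begin(); }
    iterator end() const { return items_.end(); }

    // Appends one entry, mirroring `path += "..."` in the message above.
    list_variable& operator+=(const std::string& item) {
        items_.push_back(item);
        return *this;
    }

    // Serialises the entries back into a single delimited string.
    std::string str() const {
        std::string joined;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            if (i != 0) joined += separator_;
            joined += items_[i];
        }
        return joined;
    }

private:
    char separator_;
    std::vector<std::string> items_;
};
```

Keeping the parsed entries in a vector makes iteration trivial, but a real design would still need to decide when a modified value is written back to the environment.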
Beyond the simple generic handling of environment variables above, a library could provide truly portable accessors to some system wide and user environment settings. This may deal with variable name variations, and recommended use of platform API for each runtime environment, see:
http://en.wikipedia.org/wiki/Environment_variable
Agreed. I thought of accessing other “environment” variables as well, such as the operating system category, name and version (e.g., “windows”, “Windows XP” and “Windows XP Service Pack 3 Build 12345”). Once again, thank you for your time in thinking the matter over. Michael.
On 21. mai 2015 12:59, Michael Ainsworth wrote:
On 21 May 2015, at 2:07 pm, Bjørn Roald wrote:
The facilities for manipulating the environment for child processes may be a better fit for a separate library, e.g. Boost.Process, but there may be a case for something simple in the proposed library that can be managed in the process and passed when creating child processes as alternative to inheriting the current environment.
1. You might be right, but my first instinct was that environment variables and processes “fit” together.
agree
2. On the other hand, a process management library (a Boost.Process revision?) that made use of the environment variable library would create a library dependency.
Right, it may just be a header-only declaration of a few common types that are used in both libraries, or some other way of providing compatible or shared types for transfer of data. With some care there is no real dependency either way.
One facility that could be nice is portable methods to convert environment variable values into basic types as well as some more complex types, in particular file system path and a sequence of file system paths come to my mind. E.g.:
namespace env = boost::environment;
namespace fs = boost::filesystem;

std::vector<fs::path> paths = env::get_filesystem_paths("PATH");
fs::path home = env::get("HOME");
if (fs::is_directory(home))
{
    paths.push_back(home / "bin");
    env::set_filesystem_paths("PATH", paths);
}
The same arguments we’ve raised regarding the integration of environment-variables/processes would also apply here. That is:
Filesystems are logically independent of environment variables - the only commonality is the use of strings to denote file system locations (e.g., $PATH or %PATH%). Integrating the two creates an (IMO) unnecessary dependency.
Maybe. There certainly is a cost and it may not be worth it; however, in 2+ years we will have C++17 with std::filesystem::path, so that may make the dependency less concerning. Sure, users that do not use the path class should not pay the overhead of a dependency on the filesystem library.
I do like the idea of providing dedicated functions for specific environment variables. For example, “boost::environment::get_path()”.
Regarding the PATH variable (and other “array-like” character-delimited environment variables), I was thinking of implementing this with iterators. For example:
namespace env = boost::environment;
env::environment& current = env::current_environment();
env::variable& path = current.get("PATH");
for (env::variable::iterator it = path.begin(), end = path.end(); it != end; ++it)
{
    std::cout << *it << std::endl;
}

// Possible output:
// /bin
// /usr/bin
// /usr/local/bin
// /root/bin
This could be nice, but how does the iterator or environment container determine the colon separator for Unix and the semicolon for Windows? This sort of assumes the iterator is only used for PATH-like variables. This may be OK; however, the name should reflect it.
path += "/var/lib/pgsql-9.4/bin";
// would you not need platform specific code here? If nothing else for
// the colon separator on Posix and semicolon separator on Windows
#ifdef POSIX
path += ":/var/lib/pgsql-9.4/bin";
#elif Windows
path += ";C:\\Program Files\\pgsql-9.4\\bin";
#else
#error platform not supported
#endif

Not all that great, could be a portable one-liner:

path.add_path( system_program_dir() / "pgsql-9.4/bin" );

or something like that.
-- Bjørn
On 21/05/2015 22:59, Michael Ainsworth wrote:
Regarding the PATH variable (and other “array-like” character-delimited environment variables), I was thinking of implementing this with iterators. For example:
namespace env = boost::environment;
env::environment& current = env::current_environment();
env::variable& path = current.get("PATH");
for (env::variable::iterator it = path.begin(), end = path.end(); it != end; ++it)
{
    std::cout << *it << std::endl;
}
[...]
path += "/var/lib/pgsql-9.4/bin";
I would recommend against this particular style. Variables like PATH might be accessed either as a complete string (in which case += should do a normal string append without any separator, and the iterator would be the string's character iterator) or as individual components; it should be obvious which of these is being done.

If the goal is to have a cross-platform library, then I think that one of the following should be chosen:

1. A standalone library that only provides strings and makes no attempt to separate PATH into components.
2. A library that depends on Filesystem and provides both full-string access as well as a cross-platform way to split PATH (and other arbitrary variables) into Filesystem path components.

I don't think a half-measure (splitting paths but not using Filesystem) is desirable because it compromises the goal of being a portable library, since the path separators are different.
I would recommend against this particular style. Variables like PATH might be accessed either as a complete string (in which case += should do a normal string append without any separator, and the iterator would be the string's character iterator) or as individual components; it should be obvious which of these is being done.
It sounds like you're assuming the iterator to point to a string. The iterator could actually point to a 'variable' type which has knowledge of the path separator built in.
If the goal is to have a cross-platform library, then I think that one of the following should be chosen:
1. A standalone library that only provides strings and makes no attempt to separate PATH into components.
2. A library that depends on Filesystem and provides both full-string access as well as a cross-platform way to split PATH (and other arbitrary variables) into Filesystem path components.
I don't think a half-measure (splitting paths but not using Filesystem) is desirable because it compromises the goal of being a portable library, since the path separators are different
On 22/05/2015 13:09, Michael Ainsworth wrote:
I would recommend against this particular style. Variables like PATH might be accessed either as a complete string (in which case += should do a normal string append without any separator, and the iterator would be the string's character iterator) or as individual components; it should be obvious which of these is being done.
It sounds like you're assuming the iterator to point to a string. The iterator could actually point to a 'variable' type which has knowledge of the path separator built in.
To clarify, I was saying that since most environment variables are not multi-path format, just simple strings, then the "common" method of accessing environment variables should return a simple string, or at most a wrapper object that has the simple string as the most easily accessible content. Having that wrapper object also contain an iterator that is *not* a char-iterator of that simple string seems confusing at best. (The wrapper object could have a split_path() method that returns something else that has iterators though -- the simplest choice being a vector.)

Also note that for the environment variables that do contain multi-paths, there are two separate delimiters in play -- one is the separator between entire paths, and the other is the separator between components of a single path. Both of them are platform-specific.

It seems reasonable (though not essential) for an Environment library to take care of the former for you by providing split/rejoin methods that hide the difference between platforms.

It seems out of scope for it to do the latter, as this is already handled by the Filesystem library, and duplicating it seems wasteful.

Hence the suggestion to have Environment provide split/join methods that use Filesystem path objects as a basis (i.e. vector<path>), as this allows the most platform abstraction. And as others have already pointed out, Filesystem has been standardised already.
On Friday 22 May 2015 14:52:30 Gavin Lambert wrote:
Also note that for the environment variables that do contain multi-paths, there are two separate delimiters in play -- one is the separator between entire paths, and the other is the separator between components of a single path. Both of them are platform-specific.
It seems reasonable (though not essential) for an Environment library to take care of the former for you by providing split/rejoin methods that hide the difference between platforms.
+1
It seems out of scope for it to do the latter, as this is already handled by the Filesystem library, and duplicating it seems wasteful.
Hence the suggestion to have Environment provide split/join methods that use Filesystem path objects as a basis (ie. vector<path>), as this allows the most platform abstraction. And as others have already pointed out, Filesystem has been standardised already.
Note that not only paths can be present in multi-element env. variables. For example, LS_COLORS contains a list of color specifications for ls command. I think, it's better for the environment library to perform a split to a sequence of strings and let the user interpret those strings as filesystem paths or something else.
Note that not only paths can be present in multi-element env. variables. For example, LS_COLORS contains a list of color specifications for ls command. I think, it's better for the environment library to perform a split to a sequence of strings and let the user interpret those strings as filesystem paths or something else.
Exactly. $PATH isn't the only "array-like" variable, hence my reasoning for encapsulating the raw name/value string pair in a 'variable' class that can iterate over the substrings in the value string. It may be confusing though. Will think more on it. Do all variables on the one platform have the same separator?
On Friday 22 May 2015 18:35:09 Michael Ainsworth wrote:
Note that not only paths can be present in multi-element env. variables. For example, LS_COLORS contains a list of color specifications for ls command. I think, it's better for the environment library to perform a split to a sequence of strings and let the user interpret those strings as filesystem paths or something else.
Exactly. $PATH isn't the only "array-like" variable, hence my reasoning for encapsulating the raw name/value string pair in a 'variable' class that can iterate over the substrings in the value string. It may be confusing though. Will think more on it.
I think it's better to have a separate interface for splitting/joining - preferably, an algorithm (a free function).

I'm not sure a 'variable' class which contains both name and value is a good thing, at least as the main interface. When I get a variable, I expect to have to pass only its name and receive only its value, like:

std::string value = env["FOO"];

Also, common operators become either unclear or confusing on the 'variable' class:

variable var1, var2;
var1 = var2; // assigns value or both name and value?
var1 == var2; // compares value or both name and value?

And certainly I don't think it's a good idea to collect algorithms in the 'variable' interface, like it's done in std::string.
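For illustration, the name-in, value-out interface described above might look roughly like the following. This is only a sketch: the namespace env_sketch, the class name and the is_set() helper are made up for this example and are not part of any existing Boost library, and returning an empty string for an unset variable is an assumption of the sketch, not something agreed in this thread.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

namespace env_sketch {  // hypothetical namespace, for illustration only

// Read-only view of the current process environment.
class environment {
public:
    // Pass only the name, receive only the value.
    // Assumption of this sketch: an unset variable yields an empty string,
    // so callers that care about the difference should use is_set().
    std::string operator[](const std::string& name) const {
        assert(!name.empty());
        const char* raw = std::getenv(name.c_str());
        return raw ? std::string(raw) : std::string();
    }

    // Distinguishes "unset" from "set to the empty string".
    bool is_set(const std::string& name) const {
        assert(!name.empty());
        return std::getenv(name.c_str()) != nullptr;
    }
};

}  // namespace env_sketch
```

With this shape, `env_sketch::environment env; std::string value = env["FOO"];` reads the same as the one-liner above, and there is no name/value pair object whose assignment or comparison semantics would need explaining.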
Do all variables on the one platform have the same separator?
There are application-specific variables that contain different separators, but I don't think you have to support them in your library. You can, however, provide a separator argument to the splitting/joining algorithms, which by default would be the system-specific default. It would be nice if that default were visible to the user (e.g. as a character constant) in case he wants to do some string processing himself.
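A minimal sketch of what such free functions could look like, assuming ':' as the default separator on POSIX-like systems and ';' on Windows. The names env_sketch, list_separator, split and join are hypothetical, chosen only for this example; preserving empty elements is likewise a choice of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace env_sketch {  // hypothetical namespace, for illustration only

// System-specific default separator, exposed as a character constant so
// users can do their own string processing with it.
#ifdef _WIN32
constexpr char list_separator = ';';
#else
constexpr char list_separator = ':';
#endif

// Split a variable value into its elements. Empty elements are preserved,
// so join(split(v, sep), sep) gives back v unchanged.
inline std::vector<std::string> split(const std::string& value,
                                      char sep = list_separator) {
    assert(sep != '\0');
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = value.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(value.substr(start));  // last element
            return parts;
        }
        parts.push_back(value.substr(start, pos - start));
        start = pos + 1;
    }
}

// Join elements back into a single value using the same separator.
inline std::string join(const std::vector<std::string>& parts,
                        char sep = list_separator) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out += sep;
        out += parts[i];
    }
    return out;
}

}  // namespace env_sketch
```

The elements stay plain strings, so the caller decides whether to interpret them as filesystem paths, colour specifications or something else.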
By the way, what would be the encoding of the strings returned by or passed to the Environment library?
Given that std::getenv returns a char*, I think the library should work with std::string, although we did discuss supporting std::wstring using templates. Whether std::string is encoded in ASCII or UTF8 would be an OS specific thing I imagine. Someone with more experience with character encodings might want to weigh in here. Michael
On 22 May 2015, at 8:21 pm, Klaim - Joël Lamotte wrote:
By the way, what would be the encoding of the strings returned by or passed to the Environment library?
On 23. mai 2015 02:18, Michael Ainsworth wrote:
On 22 May 2015, at 8:21 pm, Klaim - Joël Lamotte wrote:
By the way, what would be the encoding of the strings returned by or passed to the Environment library?
Given that std::getenv returns a char*, I think the library should work with std::string, although we did discuss supporting std::wstring using templates. Whether std::string is encoded in ASCII or UTF8 would be an OS specific thing I imagine.
Someone with more experience with character encodings might want to weigh in here.
[Michael, I took the liberty of rearranging your response a bit as you are top posting, see http://www.boost.org/community/policy.html]

Disclaimer: I am no character encoding expert, so take care to verify claims by me here.

I think encoding is going to be a challenge.

On Posix I think you are right that one can assume the character encoding is defined by the system, and that may be multi-byte or single-byte character strings, whatever is defined in the locale. As the Posix getenv and setenv functions are simply char* based with no statements on encoding, it is possible to let the system determine the encoding. UTF-8 will likely be used for UNICODE support, as other options make little sense.

On Windows, however, there are variants of the Windows API for environment variables:

BOOL WINAPI SetEnvironmentVariable(
  _In_     LPCTSTR lpName,
  _In_opt_ LPCTSTR lpValue
);

Unicode and ANSI names: SetEnvironmentVariableW (Unicode) and SetEnvironmentVariableA (ANSI)

The regular SetEnvironmentVariable uses LPCTSTR, and according to https://msdn.microsoft.com/en-us/library/windows/desktop/aa383751%28v=vs.85%... LPCTSTR is an LPCWSTR if UNICODE is defined, an LPCSTR otherwise.

#ifdef UNICODE
typedef LPCWSTR LPCTSTR;
#else
typedef LPCSTR LPCTSTR;
#endif

File paths in Windows are stored in double-byte character strings encoded as UCS-2, which is the fixed-width 2-byte predecessor of UTF-16. Other string data may not be double-byte character strings, and ASCII and ANSI strings will certainly exist in C++ code. Nevertheless, it seems the conversions should happen when the API is setting or getting the variables. I am not sure how these Unicode and ANSI variants of the API functions interact with the actual storage of the variables in the environment block, but it makes sense that program code needs to use them to convert when a conversion is needed. A standard C++ library needs to facilitate these conversions as well. I am not sure how that is best done, but I can imagine the Boost.Filesystem library has considered options for a very similar problem.

As the UNICODE macro determines whether your Windows program has single or double-byte characters in its environment block, with ANSI or UNICODE UCS-2 value encoding respectively, a conversion may be needed when creating child processes. The CreateProcess function seems to support that, see the section on the lpEnvironment argument here: https://msdn.microsoft.com/en-us/library/windows/desktop/ms682425%28v=vs.85%...

It is annoying that Microsoft ended up using UCS-2. Other operating systems waited a bit longer to decide how to support UNICODE, I think, and thus had a better option available with UTF-8. But the situation is what it is and we have to deal with it.
-- Bjørn
Bjørn Roald wrote:
I think encoding is going to be a challenge.
On Posix I think you are right that one can assume the character encoding is defined by the system and that may be a multi or a single byte character strings, whatever is defined in the locale.
On POSIX, the system doesn't care about encodings. You get from getenv exactly the byte string you passed to setenv.
File paths in Windows are stored in double byte character strings encoded as UCS-2 which is fixed width 2 byte predecessor of UTF-16.
No, file paths on Windows are UTF-16.

I'm not quite sure how SetEnvironmentVariableA and SetEnvironmentVariableW interact though, I don't see it documented. The typical behavior for an A/W pair is for the A function to be implemented in terms of the W one, using the current system code page for converting the strings.

The C runtime getenv/_putenv functions actually maintain two separate copies of the environment, one narrow, one wide.

https://msdn.microsoft.com/en-us/library/tehxacec.aspx

The problem therefore is that it's not quite possible to provide a portable interface.

On POSIX, programs have to use the char* functions, because they don't encode/decode and therefore guarantee a perfect round-trip. Using wchar_t* may fail if the contents of the environment do not correspond to the encoding that the library uses.

On Windows, programs have to use the wchar_t* versions, for the same reason. Using char* may give you a mangled result in the case the environment contains a file name that cannot be represented in the current encoding.

(If the library uses the C runtime getenv/_putenv functions, those will likely guarantee a perfect round-trip, but this will not solve the problem with a preexisting wide environment that is not representable.)

Many people - me included - have adopted a programming model in which char[] strings are assumed to be UTF-8 on Windows, and the char[] API calls the wide Windows API internally, then converts between UTF-16 and UTF-8 as appropriate. Since the OS X POSIX API is UTF-8 based and most Linux systems are transitioning or have already transitioned to UTF-8 as default, using UTF-8 and char[] results in reasonably portable programs.

This however doesn't appeal to people who prefer to use another encoding, and makes the char[] API not correspond to the Windows char[] API (the A functions) as those use the "ANSI code page" which can't be UTF-8.

Boost.Filesystem sidesteps the problem by letting you choose whatever encoding you wish. I don't particularly like this approach.
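As a rough illustration of the "char[] is UTF-8 everywhere" model described above, a lookup function could be sketched as follows. The name env_sketch::get_utf8 and its bool-plus-out-parameter shape are invented for this example. The Windows branch uses GetEnvironmentVariableW, MultiByteToWideChar and WideCharToMultiByte with CP_UTF8; the POSIX branch simply returns the bytes as-is, on the assumption that they are already UTF-8. Error handling is minimal.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

namespace env_sketch {  // hypothetical namespace, for illustration only

// Looks up `name` (UTF-8) and stores the value (UTF-8) in `out`.
// Returns false if the variable is unset or a conversion fails.
inline bool get_utf8(const std::string& name, std::string& out) {
#ifdef _WIN32
    // Convert the name from UTF-8 to UTF-16 (length includes the null).
    const int wlen =
        MultiByteToWideChar(CP_UTF8, 0, name.c_str(), -1, nullptr, 0);
    if (wlen <= 0) return false;
    std::wstring wname(static_cast<std::size_t>(wlen), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, name.c_str(), -1, &wname[0], wlen);

    // First call asks for the required buffer size (including the null);
    // a return value of 0 means the variable was not found.
    const DWORD needed = GetEnvironmentVariableW(wname.c_str(), nullptr, 0);
    if (needed == 0) return false;
    std::wstring wvalue(needed, L'\0');
    const DWORD written =
        GetEnvironmentVariableW(wname.c_str(), &wvalue[0], needed);
    if (written >= needed) return false;  // value changed between calls
    wvalue.resize(written);
    if (wvalue.empty()) { out.clear(); return true; }

    // Convert the value from UTF-16 to UTF-8.
    const int len = WideCharToMultiByte(
        CP_UTF8, 0, wvalue.data(), static_cast<int>(wvalue.size()),
        nullptr, 0, nullptr, nullptr);
    if (len <= 0) return false;
    out.assign(static_cast<std::size_t>(len), '\0');
    WideCharToMultiByte(CP_UTF8, 0, wvalue.data(),
                        static_cast<int>(wvalue.size()),
                        &out[0], len, nullptr, nullptr);
    return true;
#else
    // POSIX: no conversion at all; the bytes are assumed to be UTF-8.
    const char* raw = std::getenv(name.c_str());
    if (raw == nullptr) return false;
    out = raw;
    return true;
#endif
}

}  // namespace env_sketch
```

Note that this reads the OS environment block on Windows, not the C runtime's copies, so values set through _putenv may not be visible to it.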
The C runtime getenv/_putenv functions actually maintain two separate copies of the environment, one narrow, one wide.
... and those appear to be distinct from the OS environment block that SetEnvironmentVariable modifies. So that makes three (or maybe four) copies in total. It's fun for the whole family.
On 23. mai 2015 15:50, Peter Dimov wrote:
Bjørn Roald wrote:
I think encoding is going to be a challenge.
On Posix I think you are right that one can assume the character encoding is defined by the system and that may be a multi or a single byte character strings, whatever is defined in the locale.
On POSIX, the system doesn't care about encodings. You get from getenv exactly the byte string you passed to setenv.
File paths in Windows are stored in double byte character strings encoded as UCS-2 which is fixed width 2 byte predecessor of UTF-16.
No, file paths on Windows are UTF-16.
OK, in that case that is good. One reference I found states that UTF-16 has been supported since Windows 2000. I must have based my misled mind on some pretty dated information then. Possibly I also mixed this up with the fact that the two encodings are so similar for normal use that UCS-2 is often mistakenly referred to as UTF-16. So it is hard to know for sure what statements to trust without testing. I am glad I put a disclaimer at the top.
I'm not quite sure how SetEnvironmentVariableA and SetEnvironmentVariableW interact though, I don't see it documented. The typical behavior for an A/W pair is for the A function to be implemented in terms of the W one, using the current system code page for converting the strings.
The C runtime getenv/_putenv functions actually maintain two separate copies of the environment, one narrow, one wide.
https://msdn.microsoft.com/en-us/library/tehxacec.aspx
The problem therefore is that it's not quite possible to provide a portable interface.
One possible, but certainly not perfect, approach is to convert in the interface as needed from an external to the internal encoding. The external encoding is explicitly requested by the user, or UTF-8 is assumed. The internal encoding will always be UTF-16 on Windows and UTF-8 on Posix. How bad would that be?

If the Windows implementation converts to/from UTF-16 when needed and then uses Set/GetEnvironmentVariableW, then the Windows back-end is taken care of, simple enough.

However, with this scheme, on Posix systems it is harder to assure a formal guarantee of correctness. But it is hard to see how just assuming stored environment variables are UTF-8 is any worse than the alternatives, unless you know the variable producer used another encoding. If you know, it should be possible to convert anyway. Non-UTF-8 variables will likely be a less and less common problem with time. You will still have the same abilities to recover as before with the current Posix char* interface with no statements of expected encoding.

The external encoding (used in API parameters) can depend on the width of the character type used in the API; the library could have functions using both char and wchar_t based strings. The char based string parameters assume UTF-8, and wchar_t based parameters assume UTF-16 or UTF-32 depending on how many bits wchar_t has on the platform.
On POSIX, programs have to use the char* functions, because they don't encode/decode and therefore guarantee a perfect round-trip.
Right, but I question how much value that perfect round-trip has if the consumer have to guess the encoding. That is basically saying that I kept the encoding, therefore I am happy even if I may have lost the correct value.
Using wchar_t* may fail if the contents of the environment do not correspond to the encoding that the library uses.
On Windows, programs have to use the wchar_t* versions, for the same reason. Using char* may give you a mangled result in the case the environment contains a file name that cannot be represented in the current encoding.
(If the library uses the C runtime getenv/_putenv functions, those will likely guarantee a perfect round-trip, but this will not solve the problem with a preexisting wide environment that is not representable.)
Many people - me included - have adopted a programming model in which char[] strings are assumed to be UTF-8 on Windows, and the char[] API calls the wide Windows API internally, then converts between UTF-16 and UTF-8 as appropriate. Since the OS X POSIX API is UTF-8 based and most Linux systems are transitioning or have already transitioned to UTF-8 as default, using UTF-8 and char[] results in reasonably portable programs.
I have also followed this pattern for portable code in the past, and I think it is a good pattern to support in a new library.
This however doesn't appeal to people who prefer to use another encoding, and makes the char[] API not correspond to the Windows char[] API (the A functions) as those use the "ANSI code page" which can't be UTF-8.
I thought at least some ANSI and ISO code pages were ASCII based, are they not? Given that all values in the range 0 through 127 are the same as in ASCII, then those encodings are just as much UTF-8 as pure ASCII texts.
Boost.Filesystem sidesteps the problem by letting you choose whatever encoding you wish. I don't particularly like this approach.
I guess it adds complexity to the API that possibly could discourage users that only need 1 or 2 common UTF encoding(s). A separate string conversion library could do the rest of the job when the odd encodings are needed. Are there any other disadvantages? -- Bjørn
Bjørn Roald wrote:
On POSIX, programs have to use the char* functions, because they don't encode/decode and therefore guarantee a perfect round-trip.
Right, but I question how much value that perfect round-trip has if the consumer have to guess the encoding. That is basically saying that I kept the encoding, therefore I am happy even if I may have lost the correct value.
You don't need to guess the encoding. When you get the contents of an environment variable that identifies a file, you can then pass the string exactly as-is to fopen, and it works regardless of encoding. That's because POSIX functions are completely encoding-agnostic; they use null-terminated byte sequences and do not interpret them as characters. Of course, if you want to print the string, you need to guess an encoding.
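The round-trip property is easy to check on a POSIX system. The helper below is only a sketch (the name round_trips is made up): it stores a byte string with ::setenv and compares what std::getenv hands back. It is POSIX-only, since ::setenv is not part of ISO C++, and it cannot cover values with embedded null bytes, which the char* interface cannot represent.

```cpp
#include <cstdlib>
#include <string>
#include <stdlib.h>  // ::setenv (POSIX, not ISO C++)

// Returns true if `value` comes back from the environment byte-for-byte.
// No encoding is assumed or checked; the bytes are compared as-is.
inline bool round_trips(const char* name, const std::string& value) {
    if (::setenv(name, value.c_str(), 1) != 0) return false;  // 1 = overwrite
    const char* back = std::getenv(name);
    return back != nullptr && value == back;
}
```

For example, `round_trips("ENV_SKETCH_RT", "caf\xE9")` passes a Latin-1 encoded string that is not valid UTF-8, and the comparison is still expected to succeed because no decoding takes place on the way in or out.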
On 24. mai 2015 20:08, Peter Dimov wrote:
Bjørn Roald wrote:
On POSIX, programs have to use the char* functions, because they don't encode/decode and therefore guarantee a perfect round-trip.
Right, but I question how much value that perfect round-trip has if the consumer have to guess the encoding. That is basically saying that I kept the encoding, therefore I am happy even if I may have lost the correct value.
You don't need to guess the encoding. When you get the contents of an environment variable that identifies a file, you can then pass the string exactly as-is to fopen, and it works regardless of encoding. That's because POSIX functions are completely encoding-agnostic; they use null-terminated byte sequences and do not interpret them as characters.
Good point
Of course, if you want to print the string, you need to guess an encoding.
That is probably not a big problem either, as the locale is likely to be defined with the correct encoding - normally UTF-8.

Given this situation, with very different requirements on encoding on Windows and POSIX, it is tricky to make an interface providing simple means of writing portable code. I do not see a straightforward approach unless we make some assumptions on encoding in POSIX environments.
-- Bjørn
On May 22, 2015 4:59:13 AM EDT, Andrey Semashev wrote:
On Friday 22 May 2015 18:35:09 Michael Ainsworth wrote:
Note that not only paths can be present in multi-element env. variables. For example, LS_COLORS contains a list of color specifications for ls command. I think, it's better for the environment library to perform a split to a sequence of strings and let the user interpret those strings as filesystem paths or something else.
Exactly. $PATH isn't the only "array-like" variable, hence my reasoning for encapsulating the raw name/value string pair in a 'variable' class that can iterate over the substrings in the value string. It may be confusing though. Will think more on it.
I think it's better to have a separate interface for splitting/joining - preferably, an algorithm (a free function).
+1 I was going to say the same.
I'm not sure a 'variable' class which contains both name and value is a good thing, at least as the main interface. When I get a variable, I expect to have to pass only its name and receive only its value, like:
std::string value = env["FOO"];
+1
___ Rob
(Sent from my portable computation engine)
participants (10)
- Andrey Semashev
- Bjørn Roald
- Boris Schäling
- Gavin Lambert
- Klaim - Joël Lamotte
- Michael Ainsworth
- Paul A. Bristow
- Peter Dimov
- Rene Rivera
- Rob Stewart