[filesystem] revised function to determine available file system space

I'm planning to add a filesystem function to determine available disk space. Perhaps something like:

    struct space
    {
      boost::uintmax_t available; // bytes available to user
      boost::uintmax_t total;     // total bytes on volume
    };

    space filesystem_space( const path & p );

Comments?

How safe is the assumption that uintmax_t is large enough nowadays for any modern compiler on a system supporting large disks?

Better suggestions for the names?

--Beman

PS: this is slightly revised from a post on the users list.

On 9/12/05, Beman Dawes <bdawes@acm.org> wrote:
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
    struct space
    {
      boost::uintmax_t available; // bytes available to user
      boost::uintmax_t total;     // total bytes on volume
    };
space filesystem_space( const path & p );
Comments?
Seems like a perfectly usable API.
How safe is the assumption that uintmax_t is large enough nowadays for any modern compiler on a system supporting large disks.
Well, you need 50 bits for a petabyte. I don't think people are using/exceeding that much these days, so platforms supporting long long should keep you in space for a few years. I'd think you'll find the answer to that as you write platform specific versions. As long as uintmax_t is larger than the parameter to the function, you'll be sitting pretty. I suppose a platform supporting insane amounts of diskspace could report with a low-longlong and hi-longlong, but I'm sure we'll be able to ifdef apm_unsigned by then.
Better suggestions for the names?
I think filesystem::filesystem_space is a bit redundant. If we want to be forward looking we should avoid the word disk since flash HD's are becoming a reality. This information will return free space on the partition path lives on. Unix often calls partitions volumes. I'm not really brainstorming myself into any great names.

    space usage(path const&);             // kinda like it
    drive_space drive_usage(path const&); // bad - not info for the whole drive
    volume_info stats(path const&);       // bad - too confusing with stat

Oh well, whatever the name, it'll be a good addition.

Tom
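
The "50 bits for a petabyte" estimate above can be checked at compile time. The following is only a sketch: bits_needed and one_petabyte are hypothetical names introduced for illustration, not part of Boost, and a decimal petabyte (10^15 bytes) is assumed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: number of bits needed to represent `value`
// (0 needs 0 bits, 1 needs 1 bit, 4 needs 3 bits, and so on).
constexpr unsigned bits_needed(std::uintmax_t value)
{
    return value == 0 ? 0u : 1u + bits_needed(value >> 1);
}

// Assumption: "petabyte" means 10^15 bytes.
constexpr std::uintmax_t one_petabyte = 1000000000000000ULL;

// 2^49 < 10^15 < 2^50, so a petabyte byte count needs 50 bits.
static_assert(bits_needed(one_petabyte) == 50, "a petabyte fits in 50 bits");

// The language guarantees that uintmax_t has at least 64 value bits.
static_assert(std::numeric_limits<std::uintmax_t>::digits >= 64,
              "uintmax_t is at least 64 bits wide");
```

With a 64-bit uintmax_t that leaves 14 bits of headroom, so byte counts several thousand times larger than a petabyte are still representable.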

"Thomas Matelich" <matelich@gmail.com> wrote in message news:3944d458050912191464699fcb@mail.gmail.com...
On 9/12/05, Beman Dawes <bdawes@acm.org> wrote:
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
    struct space
    {
      boost::uintmax_t available; // bytes available to user
      boost::uintmax_t total;     // total bytes on volume
    };
space filesystem_space( const path & p );
Comments?
Seems like a perfectly usable API.
How safe is the assumption that uintmax_t is large enough nowadays for any
modern compiler on a system supporting large disks.
Well, you need 50 bits for a petabyte. I don't think people are using/exceeding that much these days, so platforms supporting long long should keep you in space for a few years. I'd think you'll find the answer to that as you write platform specific versions. As long as uintmax_t is larger than the parameter to the function, you'll be sitting pretty.
I suppose a platform supporting insane amounts of diskspace could report with a low-longlong and hi-longlong, but I'm sure we'll be able to ifdef apm_unsigned by then.
I've now looked at the config_info results for every compiler reported in the Boost regression tests. All support a 64-bit type, so we are home free with uintmax_t.
Better suggestions for the names?
I think filesystem::filesystem_space is a bit redundant. If we want to be forward looking we should avoid the word disk since flash HD's are becoming a reality. This information will return free space on the partition path lives on. Unix often calls partitions volumes. I'm not really brainstorming myself into any great names.
    space usage(path const&);             // kinda like it
    drive_space drive_usage(path const&); // bad - not info for the whole drive
    volume_info stats(path const&);       // bad - too confusing with stat
Oh well, whatever the name, it'll be a good addition.
Or maybe:

    space_info space(path const&);

Thanks for the feedback; I'll go ahead and implement something.

--Beman

Beman Dawes wrote:
How safe is the assumption that uintmax_t is large enough nowadays for any modern compiler on a system supporting large disks.
Under IRIX, you would use statfs() to get the free space from a path. That returns a structure with two relevant parts, both of type long:

    long f_bsize; /* Block size */
    long f_bfree; /* Count of free blocks */

where a long may be either 32 or 64 bits wide... but an unsigned int is only 32 bits, so your assumption doesn't hold for more than 4GB as a byte count... which in my business is not enough (~50 MB per frame, 24 frames per second, 60 seconds per minute... 2-3 hours per finished film, 10 times that for raw footage; even at 1/2 resolution it's ~12MB/frame).

So I'd favour something a little larger... ssize_t keeps pace with the size of a long.

Our largest single volume is above 6TB, admittedly our *free* space is never that large :-)

Kevin

--
| Kevin Wheatley, Cinesite (Europe) Ltd  | Nobody thinks this      |
| Senior Technology                      | My employer for certain |
| And Network Systems Architect          | Not even myself         |

"Kevin Wheatley" <hxpro@cinesite.co.uk> wrote in message news:43269DAA.3667BE36@cinesite.co.uk...
Beman Dawes wrote:
How safe is the assumption that uintmax_t is large enough nowadays for any modern compiler on a system supporting large disks.
Under IRIX, you would use statfs() to get the free space from a path. That returns a structure with two relevant parts, both of type long:
    long f_bsize; /* Block size */
    long f_bfree; /* Count of free blocks */
where a long may be either 32 or 64 bits wide... but an unsigned int is only 32 bits, so your assumption doesn't hold for more than 4GB as a byte count... which in my business is not enough (~50 MB per frame, 24 frames per second, 60 seconds per minute... 2-3 hours per finished film, 10 times that for raw footage; even at 1/2 resolution it's ~12MB/frame).
So I'd favour something a little larger... ssize_t keeps pace with the size of a long
For all practical purposes, uintmax_t is 64-bits, so that should be enough.
Our largest single volume is above 6TB, admittedly our *free* space is never that large :-)
Don't we wish:-) Thanks, --Beman
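
The 4GB concern only bites if the block size and block count are multiplied in a 32-bit type. A minimal sketch of the difference follows; the function names are hypothetical and the numbers (4096-byte blocks, 2,000,000 free blocks) are invented for illustration.

```cpp
#include <cstdint>

// Widen to uintmax_t *before* multiplying, so the byte count is not
// truncated to the width of the block-count fields.
constexpr std::uintmax_t blocks_to_bytes(std::uint32_t block_size,
                                         std::uint32_t block_count)
{
    return static_cast<std::uintmax_t>(block_size) * block_count;
}

// For comparison only: the same product squeezed into 32 bits,
// which keeps just the low 32 bits (wraps modulo 2^32).
constexpr std::uint32_t blocks_to_bytes_narrow(std::uint32_t block_size,
                                               std::uint32_t block_count)
{
    return static_cast<std::uint32_t>(
        static_cast<std::uintmax_t>(block_size) * block_count);
}
```

For 4096-byte blocks and 2,000,000 free blocks, blocks_to_bytes gives 8,192,000,000 bytes, while the narrow version reports 3,897,032,704, which is the true value minus 2^32.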

Hi,

statfs is a BSD creature, provided by BSD & BSD-derived (like Darwin) alike.

Maybe use the

    long f_bavail; /* free blocks avail to non-superuser */

field rather than the

    long f_bfree; /* free blocks in fs */

Best wishes,
Kon

On Sep 13, 2005, at 4:46 AM, Beman Dawes wrote:
"Kevin Wheatley" <hxpro@cinesite.co.uk> wrote in message news:43269DAA.3667BE36@cinesite.co.uk...
Beman Dawes wrote:
How safe is the assumption that uintmax_t is large enough nowadays for any modern compiler on a system supporting large disks.
Under IRIX, you would use statfs() to get the free space from a path. That returns a structure with two relevant parts, both of type long:
    long f_bsize; /* Block size */
    long f_bfree; /* Count of free blocks */
<snip>

Kon Lovett wrote:
Maybe use the
    long f_bavail; /* free blocks avail to non-superuser */
field rather than the
    long f_bfree; /* free blocks in fs */
f_bavail is not a documented feature of IRIX's statfs, but actually, thinking about it, it would be better to use statvfs(), which does document it (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/statvfs.h.html).

However, do all OS's have a concept of two different answers to how much free space there is, and what would an application developer want to know? If I'm writing a 'df' style application then I might want both. I guess the available space for the user is more interesting... but then you get into quotas etc.

Kevin

--
| Kevin Wheatley, Cinesite (Europe) Ltd  | Nobody thinks this      |
| Senior Technology                      | My employer for certain |
| And Network Systems Architect          | Not even myself         |
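
As a rough illustration of the statvfs() route, the sketch below reads all three numbers in one call on a POSIX system. The space_info struct, to_space_info and query_space are hypothetical names for this example, not the interface under discussion; only statvfs() and its f_frsize, f_blocks, f_bfree and f_bavail fields come from POSIX.

```cpp
#include <cstdint>
#include <sys/statvfs.h>  // POSIX only

// Hypothetical result type; the member names are illustrative.
struct space_info
{
    std::uintmax_t capacity;   // total bytes on the volume
    std::uintmax_t free;       // free bytes, including any reserved for privileged users
    std::uintmax_t available;  // free bytes available to an unprivileged caller
};

// Pure conversion from statvfs-style block counts to byte counts.
constexpr space_info to_space_info(std::uintmax_t fragment_size,
                                   std::uintmax_t blocks,
                                   std::uintmax_t blocks_free,
                                   std::uintmax_t blocks_available)
{
    return space_info{fragment_size * blocks,
                      fragment_size * blocks_free,
                      fragment_size * blocks_available};
}

// Query the volume that `path` lives on. Returns false if statvfs() fails.
inline bool query_space(const char* path, space_info& out)
{
    struct statvfs vfs;
    if (::statvfs(path, &vfs) != 0)
        return false;
    out = to_space_info(vfs.f_frsize, vfs.f_blocks, vfs.f_bfree, vfs.f_bavail);
    return true;
}
```

A 'df' style tool could print both free and available from the same result, while most applications would look only at available. As noted in the thread, quotas are not reflected in these numbers.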

On Sep 14, 2005, at 2:14 AM, Kevin Wheatley wrote:
Kon Lovett wrote:
Maybe use the
    long f_bavail; /* free blocks avail to non-superuser */
field rather than the
    long f_bfree; /* free blocks in fs */
f_bavail is not a documented feature of IRIX's statfs, but actually, thinking about it, it would be better to use statvfs(), which does document it (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/statvfs.h.html).
Sorry, I see only thru my tiny window. No IRIX exposure, except SGI demos.
However, do all OS's have a concept of two different answers to how much free space there is, and what would an application developer want to know? If I'm writing a 'df' style application then I might want both. I guess the available space for the user is more interesting... but then you get into quotas etc.
Yes, I wanted to point out the different answers issue. For server & desktop OS, probably yes, >1 answer to the space question, depending on who is asking the question. For embedded OS, w/ backing store of some kind, running a specialized, contained, app, maybe only 1 answer. I think the free space answer will range in exactness - race issues, quotas (as you point out), etc. Possibly one can only rely on the value as a hint, and just use it to postpone failsafe.
Kevin
--
| Kevin Wheatley, Cinesite (Europe) Ltd  | Nobody thinks this      |
| Senior Technology                      | My employer for certain |
| And Network Systems Architect          | Not even myself         |
Best wishes, Kon

| -----Original Message-----
| From: boost-bounces@lists.boost.org
| [mailto:boost-bounces@lists.boost.org] On Behalf Of Kevin Wheatley
| Sent: 13 September 2005 10:37
| To: boost@lists.boost.org
| Subject: Re: [boost] [filesystem] revised function to
| determine available filesystem space
|
| Our largest single volume is above 6TB, admittedly our *free* space is
| never that large :-)

Please, this is Boost, not Boast ;-)

Paul

From: "Beman Dawes" <bdawes@acm.org>
I'm planning to add a filesystem function to determine available disk space.
Great!
Perhaps something like:
    struct space
    {
      boost::uintmax_t available; // bytes available to user
      boost::uintmax_t total;     // total bytes on volume
    };
space filesystem_space( const path & p );
As previously mentioned, the "filesystem_" prefix is redundant.

Also, I wonder about the value of returning both values simultaneously. I don't expect that all platforms, if any do, provide both values in a single API call. Thus, your one call would frequently result in multiple API calls, in my imagination. I'll grant you that your function wouldn't be called with high frequency, so that's not too much of a concern.

I propose this interface:

    boost::uintmax_t free_space(path const &);
    boost::uintmax_t capacity(path const &);

That leaves open the possibility of additional queries in the future without having to change the return type. Also, the return type could grow too large to like returning it by value, so you might wind up changing the interface altogether when the space type grows too large.

Another approach is to create a device type that can provide member functions to query a variety of attributes. There could even be a hierarchy of types. Since this kind of information isn't queried with high frequency, there shouldn't be any complaints about virtual functions, for example.

The hierarchy would permit derivates to encapsulate the logic that no doubt differs for the various devices one would like to query. Windows probably uses a different API for every kind of device, for example.

So, I'm thinking of something like the following list of types:

    device
      disk : device
        hard_disk : disk
        floppy_disk : disk
        network_disk : disk
      usb_drive : device
      pccard : device
      etc.

Whether all of those (or any, for that matter) are needed is debatable.

--
Rob Stewart                           stewart@sig.com
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;
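
To make the device-hierarchy idea above concrete, here is a small sketch. Everything in it is hypothetical: the class names follow the list in the message, but the members, the kind() function and the stored numbers are invented for illustration, and a real derived class would call a platform API instead of returning fixed values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical base class: every device can answer the same two queries.
class device
{
public:
    virtual ~device() {}
    virtual std::string kind() const = 0;
    virtual std::uintmax_t capacity() const = 0;    // total bytes
    virtual std::uintmax_t free_space() const = 0;  // free bytes
};

// Intermediate type for disk-like devices. This sketch stores fixed
// numbers where a real implementation would query the operating system.
class disk : public device
{
public:
    disk(std::uintmax_t capacity_bytes, std::uintmax_t free_bytes)
        : capacity_(capacity_bytes), free_(free_bytes)
    {
        assert(free_ <= capacity_);  // free space cannot exceed capacity
    }
    std::uintmax_t capacity() const override { return capacity_; }
    std::uintmax_t free_space() const override { return free_; }

private:
    std::uintmax_t capacity_;
    std::uintmax_t free_;
};

class hard_disk : public disk
{
public:
    using disk::disk;  // inherit the (capacity, free) constructor
    std::string kind() const override { return "hard_disk"; }
};

class floppy_disk : public disk
{
public:
    using disk::disk;
    std::string kind() const override { return "floppy_disk"; }
};

// Code that only needs the numbers can work through the base class.
inline std::uintmax_t used_space(const device& d)
{
    return d.capacity() - d.free_space();
}
```

The cost of this design is one virtual call per query, which matches the observation that such queries are infrequent; whether the extra types earn their keep is the open question raised in the message.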

"Rob Stewart" <stewart@sig.com> wrote in message news:200509131320.j8DDKJ8K031913@shannonhoon.balstatdev.susq.com...
From: "Beman Dawes" <bdawes@acm.org>
I'm planning to add a filesystem function to determine available disk space.
Great!
Perhaps something like:
    struct space
    {
      boost::uintmax_t available; // bytes available to user
      boost::uintmax_t total;     // total bytes on volume
    };
space filesystem_space( const path & p );
As previously mentioned, the "filesystem_" prefix is redundant.
Yep. The implementation (committed to CVS branch i18n yesterday) uses space().
Also, I wonder about the value of returning both values simultaneously. I don't expect that all platforms, if any do, provide both values in a single API call. Thus, your one call would frequently result in multiple API calls, in my imagination. I'll grant you that your function wouldn't be called with high frequency, so that's not too much of a concern.
I propose this interface:
    boost::uintmax_t free_space(path const &);
    boost::uintmax_t capacity(path const &);
Interesting. Windows and POSIX supply the result via a single call. But other systems might have separate calls. Separate functions are a bit easier to specify.
That leaves open the possibility of additional queries in the future without having to change the return type. Also, the return type could grow too large to like returning it by value, so you might wind up changing the interface altogether when the space type grows too large.
Excellent points.
Another approach is to create a device type that can provide member functions to query a variety of attributes. There could even be a hierarchy of types. Since this kind of information isn't queried with high frequency, there shouldn't be any complaints about virtual functions, for example.
The hierarchy would permit derivates to encapsulate the logic that no doubt differs for the various devices one would like to query. Windows probably uses a different API for every kind of device, for example.
So, I'm thinking of something like the following list of types:
    device
      disk : device
        hard_disk : disk
        floppy_disk : disk
        network_disk : disk
      usb_drive : device
      pccard : device
      etc.
Whether all of those (or any, for that matter) are needed is debatable.
While we've often talked about a more flexible way to deal with attributes, particularly operating system specific attributes, I've not yet been convinced any proposed scheme is worth the effort.

I'll give your suggestion regarding separate free_space() and capacity() functions some more thought tomorrow morning when my mind is functioning a bit better than it does at night (Of all the things I've ever lost, I miss my mind the most:-)

--Beman

"Rob Stewart" <stewart@sig.com> wrote in message news:200509131320.j8DDKJ8K031913@shannonhoon.balstatdev.susq.com...
I propose this interface:
    boost::uintmax_t free_space(path const &);
    boost::uintmax_t capacity(path const &);
In looking at this a bit further, both POSIX and Windows distinguish between total free space and free space available "to nonprivileged process" (POSIX) or "to caller" (Windows). To accommodate this, we might have:

    boost::uintmax_t capacity(path const &);
    boost::uintmax_t total_free_space(path const &);
    boost::uintmax_t available_free_space(path const &);

Comments?

--Beman

participants (6)
- Beman Dawes
- Kevin Wheatley
- Kon Lovett
- Paul A Bristow
- Rob Stewart
- Thomas Matelich