[filesystem] function to determine available space

I'm planning to add a filesystem function to determine available disk space. Perhaps something like:

struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};

space_status filesystem_space( const path & p );

Comments? Better suggestions for the names?

--Beman
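One way the proposed function might be implemented on a POSIX system is sketched below. This is only an illustration under stated assumptions, not the library's implementation: it uses std::uintmax_t in place of boost::uintmax_t, takes a std::string rather than a path object, and reports failure by throwing, none of which the proposal specifies.

```cpp
#include <sys/statvfs.h>  // POSIX only; other platforms need a different query

#include <cstdint>
#include <stdexcept>
#include <string>

// Mirrors the struct proposed above, with std::uintmax_t standing in
// for boost::uintmax_t so the sketch has no Boost dependency.
struct space_status {
    std::uintmax_t available;  // free space available to the calling user
    std::uintmax_t total;      // total space on the volume
};

// Hypothetical implementation: the std::string parameter and the
// exception on failure are assumptions made for this sketch.
space_status filesystem_space(const std::string& p) {
    struct statvfs vfs;
    if (::statvfs(p.c_str(), &vfs) != 0) {
        throw std::runtime_error("statvfs failed for " + p);
    }
    space_status s;
    // f_frsize is the unit in which f_bavail and f_blocks are counted.
    s.available = static_cast<std::uintmax_t>(vfs.f_bavail) * vfs.f_frsize;
    s.total = static_cast<std::uintmax_t>(vfs.f_blocks) * vfs.f_frsize;
    return s;
}
```

A caller would write something like filesystem_space("/home") and compare the available member against the number of bytes it intends to write.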

"Beman Dawes" wrote:
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};
space_status filesystem_space( const path & p );
The single uintmax_t may not be able to capture all those free gigabytes. Better use two values.

The name may be "available_disk_space".

/Pavel

"Pavel Vozenilek"
"Beman Dawes" wrote:
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};
space_status filesystem_space( const path & p );
The single uintmax_t may not be able to capture all those free gigabytes. Better use two values.
Hum... Are there any systems anymore that support disk drives that don't have 64-bit uintmax_t?
The name may be "available_disk_space".
But lots of file systems aren't disks. Flash memory for example.

Thanks,

--Beman

"Beman Dawes" wrote:
The single uintmax_t may not be able to capture all those free gigabytes. Better use two values.
Hum... Are there any systems anymore that support disk drives that don't have 64-bit uintmax_t?
You are right, all the compilers I could find do have such support. /Pavel

Hello Beman Dawes,

Beman Dawes wrote:
"Pavel Vozenilek" wrote in message news:dg4gv5$mji$1@sea.gmane.org...
"Beman Dawes" wrote:
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};
space_status filesystem_space( const path & p );
The single uintmax_t may not be able to capture all those free gigabytes. Better use two values.
Hum... Are there any systems anymore that support disk drives that don't have 64-bit uintmax_t?
Although I don't see **currently** a problem with usual uintmax_t definitions, I have a more defensive view on this:

It has happened in history that hardware development was quicker than software development, and it was not unusual for a system to suddenly need to replace the "atomic" type used to represent system-related properties (e.g. the address space of memory) with a "conglomerated" type with internal logic to handle the situation. This might happen sooner than we believe if a sudden technological jump invalidates current extrapolations of disk sizes.

A safer method would be to provide a simple type(def), e.g. disk_size_t, with a described set of operations that are guaranteed to be provided (at least EqualityComparable and LessThanComparable ;-)). This makes it easier to handle special operations and costs nothing if, for the moment, it is simply a typedef for uintmax_t.

Just my 0.02 Euro cent

Daniel Krügler
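The typedef idea could look roughly like the sketch below. The name disk_size_t comes from the message above; the helper fits() is hypothetical and is there only to show client code restricting itself to the two guaranteed operations, so that it would keep compiling if the alias were later replaced by a class type.

```cpp
#include <cstdint>

// Today this is only an alias, so it costs nothing at run time.
// std::uintmax_t is used here in place of boost::uintmax_t.
typedef std::uintmax_t disk_size_t;

// Hypothetical client code: relies only on equality and less-than,
// the operations the suggestion asks to be guaranteed.
inline bool fits(disk_size_t needed, disk_size_t available) {
    return needed < available || needed == available;
}
```

Writing `needed <= available` would also work with the alias, but it would assume an operator that the suggested guarantee does not mention.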

On 15/09/05, Daniel Krügler
Although I don't see **currently** a problem with usual uintmax_t definitions, I have a more defensive view on this:
It has happened in history that hardware development was quicker than software development, and it was not unusual for a system to suddenly need to replace the "atomic" type used to represent system-related properties (e.g. the address space of memory) with a "conglomerated" type with internal logic to handle the situation.
This might happen sooner than we believe if a sudden technological jump invalidates current extrapolations of disk sizes.
A safer method would be to provide a simple type(def), e.g.
disk_size_t
with a described set of operations that are guaranteed to be provided (at least EqualityComparable and LessThanComparable ;-)). This makes it easier to handle special operations and costs nothing if, for the moment, it is simply a typedef for uintmax_t.
Just my 0.02 Euro cent
Daniel Krügler
Note that this issue is also handled elegantly by the double option, perhaps more so, because if suddenly you install a YottaByte drive, you'll just lose precision--no recompiling would be necessary.

Some further numbers about the double precision: Using the 24-bit mantissa floats, that means +/- 16 KB for a 16 GB measurement, or +/- 16 MB on a 16 TB measurement. For the 53-bit mantissa double, that same 16-TB measurement could be accurate to +/-0.001953125 bytes :P

The one drawback would be that summing entries could be victim to round-off error, of course. Once at that 16 GB mark with the 24-bit floats, no matter how many 4 KB amounts you add, the number won't change. I'd argue that this oughtn't be too much of an issue, since it seems as though the lib is concerned with free, used, and total space of a path::root() and not with individual files.

Perhaps, however, it should also provide that information for arbitrary paths for which is_directory returns true, although I don't think it would be possible to provide any meaningful performance guarantees if that path is not just a root.

Scott McMurray
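Figures like these are easy to check on a given platform. The sketch below assumes IEEE 754 float and double; the helper names spacing_above and increment_is_lost are made up for this illustration. The first reports the gap between a value and the next representable one, the second reports whether a small addition is swallowed by rounding.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable value above it.
template <typename T>
T spacing_above(T x) {
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}

// True if adding `increment` to `total` leaves `total` unchanged,
// i.e. the increment is lost to rounding.
template <typename T>
bool increment_is_lost(T total, T increment) {
    volatile T sum = total + increment;  // volatile: force rounding to T
    return sum == total;
}
```

For example, spacing_above(17179869184.0f) gives the float spacing at 16 GB, and increment_is_lost(total, 4096.0f) tells whether a 4 KB addition still registers at a given running total.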

me22 wrote:
Some further numbers about the double precision: Using the 24-bit mantissa floats, that means +/- 16 KB for a 16 GB measurement, or +/- 16 MB on a 16 TB measurement. For the 53-bit mantissa double, that same 16-TB measurement could be accurate to +/-0.001953125 bytes :P
The one drawback would be that summing entries could be victim to round-off error, of course.
I wonder why people want to be able to find out how much disk space is available, programmatically. Presumably the intent is to find out whether there will be enough space to create the files they intend to. In that case, round-off error is an unimportant source of error, compared to:

1. The existence of quotas that limit disk usage further
2. Overheads for file metadata
3. Other concurrent processes using or freeing space on the volume
4. Resizing of the volume

I'm really unconvinced that this function is useful as-is in portable programs. It seems to me that it would be more useful to provide a function that makes a best guess as to whether files of given names/paths and sizes could be created successfully. That could take into account 1 and 2 on at least some platforms.

Ben.
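A best-guess check of the kind suggested above might be sketched as follows on a POSIX system. Both function names are hypothetical, and rounding each file up to whole allocation blocks is only a rough stand-in for metadata overhead. The result remains a guess, because another process can use or free space between the check and the write.

```cpp
#include <sys/statvfs.h>  // POSIX only

#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bytes the files would occupy if each one is rounded up to a whole
// number of allocation blocks. Real per-file overhead varies by file system.
std::uintmax_t block_rounded_size(const std::vector<std::uintmax_t>& file_sizes,
                                  std::uintmax_t block_size) {
    assert(block_size > 0);
    std::uintmax_t total = 0;
    for (std::uintmax_t size : file_sizes) {
        std::uintmax_t blocks = (size + block_size - 1) / block_size;
        total += blocks * block_size;
    }
    return total;
}

// Hypothetical helper: returns false if the volume cannot be queried
// or the block-rounded total exceeds the space available to the user.
bool probably_fits(const std::string& dir,
                   const std::vector<std::uintmax_t>& file_sizes) {
    struct statvfs vfs;
    if (::statvfs(dir.c_str(), &vfs) != 0 || vfs.f_frsize == 0) {
        return false;
    }
    std::uintmax_t available =
        static_cast<std::uintmax_t>(vfs.f_bavail) * vfs.f_frsize;
    return block_rounded_size(file_sizes, vfs.f_frsize) <= available;
}
```

Keeping the arithmetic in block_rounded_size separate from the system query makes the estimate testable without touching a real volume.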

On 16/09/05, Ben Hutchings
1. The existence of quotas that limit disk usage further
2. Overheads for file metadata
3. Other concurrent processes using or freeing space on the volume
4. Resizing of the volume
I think simply having the structure contain free, used, and total space can take into account 1 quite easily, so long as we don't specify free+used=total, which would make one of them extraneous in any case. Total could of course also be dropped entirely, since I can't think of a use case where used and free wouldn't be sufficient ( especially since any total space above free+used would logically be inaccessible anyways ).

As for point 2, including a function for "how much space is the tree rooted at this path" would take into account that already, which could be useful as a quick sanity check before beginning a large copy operation, for example. This information is also often a nice convenience for users of a program, especially in the case of the possibility of writing to USB flash drives, for example, in a music program.

3 is not an issue to be overly concerned about. As with the rest of the library, the function would simply make no assurances. There is the precedent for functions like this though, such as is_symbolic_link.

I also don't think 4 would be an issue, because it's rare that a volume can be resized while still being writable. Additionally it's not a common operation and anyone with the knowledge to do it should have the knowledge to not be surprised about temporarily odd measurements.

One important issue, however, is that there is a need for listing which directories are their own roots, or at least figuring out if 2 directories are part of the same tree. Single-root operating systems will often have a vastly different amount of free space on /mnt/ and /mnt/cdrom/ if a CD is mounted there. ( Which raises another race condition possibility. )

- Scott McMurray
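On a POSIX system, the "same tree" question in the last point can be approximated by comparing device IDs. The sketch below is one possible approach, not something proposed in the thread; the name same_volume is hypothetical, and the comparison cannot distinguish bind mounts of one device.

```cpp
#include <sys/stat.h>  // POSIX only

#include <stdexcept>
#include <string>

// Hypothetical helper: true if both paths report the same device ID,
// which usually means they live on the same mounted file system.
bool same_volume(const std::string& a, const std::string& b) {
    struct stat sa, sb;
    if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0) {
        throw std::runtime_error("stat failed");
    }
    return sa.st_dev == sb.st_dev;
}
```

With a CD mounted at /mnt/cdrom, same_volume("/mnt", "/mnt/cdrom") would be expected to return false, although the answer can change the moment the disc is unmounted.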

I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};
space_status filesystem_space( const path & p );
Comments?
Better suggestions for the names?
I think the word "status" in space_status is misleading since available/total space is not really a status. Maybe space_stats or simply space would be more appropriate? -delfin

"Delfin Rojas"
I'm planning to add a filesystem function to determine available disk space. Perhaps something like:
struct space_status
{
  boost::uintmax_t available; // free space available to user
  boost::uintmax_t total;     // total space on volume
};
space_status filesystem_space( const path & p );
Comments?
Better suggestions for the names?
I think the word "status" in space_status is misleading since available/total space is not really a status. Maybe space_stats or simply space would be more appropriate?
I agree with you, but am worried that "space" is a very common name. OTOH, the use is in a sub-namespace, so no harm done by using a common name. Thanks, --Beman

I think the word "status" in space_status is misleading since available/total space is not really a status.
What makes you say that?
Ben.
It is just my interpretation of the word "status": the status of a patient (stable, critical, etc.), the status of an electronic device (powered, running, connected, disconnected, etc.). If I see a function or structure that is called filesystem_status I would think it represents one or more states in which the file system can be, like empty/full or online/offline. I guess each possible total_size / used_size combination is a possible filesystem status, but still it doesn't sound good to me.

At the end of the day it is up to whoever implements the function to decide. I was just giving my suggestion because I think it is important that programming constructs have a name from which their usage/meaning can be easily guessed or deduced.

-delfin
participants (6)
- Beman Dawes
- Ben Hutchings
- Daniel Krügler
- Delfin Rojas
- me22
- Pavel Vozenilek