[filesystem] Some questions about string use

Hi to all,

While reviewing the boost::filesystem error/exception usage, looking for a model for the Interprocess library, and after reading N1934 "Filesystem Library Proposal for TR2 (Revision 2)", I'm a bit concerned about the heavy use of std::string/string_type in the library. The following functions return some kind of path by value:

// observers
const string_type string() const;
const string_type file_string() const;
const string_type directory_string() const;
const external_string_type external_file_string() const;
const external_string_type external_directory_string() const;
string_type root_name() const;
string_type root_directory() const;
basic_path root_path() const;
basic_path relative_path() const;
string_type leaf() const;
basic_path branch_path() const;

I know that syntactically this is nicer and we can have RVO:

string_type str = path.string();

But when iterating through directories I find that returning a temporary object that must allocate/copy/deallocate hurts my performance paranoia. Even with move semantics we have an overhead:

std::vector<std::path> paths;
//fill with paths
std::vector<std::path>::iterator beg = paths.begin(), end = paths.end();
for(; beg != end; ++beg){
   std::path::string_type str = beg->root_name(); //temporary created
   str += "append some data";
   std::cout << str;
}

Wouldn't it be better (although uglier) to take a reference to a string that will be filled?

void fill_root_name(string_type &root_name) const;
...
////////////////////////////////////////////////////////////////////
std::vector<std::path> paths;
//fill with paths
std::path::string_type root_name;
root_name.reserve(PATH_LENGTH);
std::vector<std::path>::iterator beg = paths.begin(), end = paths.end();
for(; beg != end; ++beg){
   beg->fill_root_name(root_name);
   root_name += "append some data";
   std::cout << root_name;
}

This way we only allocate memory if we don't have enough space in our string. We can also reserve it beforehand to speed up the code.
Apart from this, I see that path::iterator has a string member. dereference() will return a reference to that member, but an iterator is supposed to be a "lightweight" pointer-like abstraction that is value-copied between functions. A string member, in my opinion, turns the iterator into a heavy class (how heavy depends on the string length, but a small string optimization of 16 bytes is not going to help much). Now that filesystem is proposed for the standard, I would like to ask boosters (and Beman, of course) whether they find these performance concerns serious enough. Regards, Ion

On 3/4/06, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Now that filesystem is proposed for the standard I would like to ask boosters (and Beman, of course) if they find these performance concerns serious enough.
Perhaps if they were accompanied by some comparative performance benchmarks or profile analysis? -- Caleb Epstein caleb dot epstein at gmail dot com

On 3/4/06, Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 3/4/06, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Now that filesystem is proposed for the standard I would like to ask boosters (and Beman, of course) if they find these performance concerns serious enough.
Perhaps if they were accompanied by some comparative performance benchmarks or profile analysis?
In the interests of science, I wrote a small "file finder" using Boost.Filesystem and a comparable version using POSIX functions (e.g. stat, readdir, etc). The POSIX version runs MANY times faster than the Boost.Filesystem version (code attached). Note that the Boost version makes use of the "status" member of the directory iterator, which is in CVS and is aimed at reducing the number of operating system calls the library needs to make. The test programs take an optional -R (recursive) flag and a list of directories and filename extensions on the command line. They walk each of the directories (recursively) searching for files with matching extensions and tally their number and size. At the end, they generate a summary report by extension. Here is some sample output after priming the buffer cache by running each of the programs several times (yes, I have a lot of music):

[9:48] cae @ tela 740% time ~/finder-fs -R /raid/shn .mp3 .flac .shn ~/src/c++
.flac: 5550 files, 243.918 GiB
.mp3: 30364 files, 232.695 GiB
.shn: 152 files, 6.90744 GiB
~/finder-fs -R /raid/shn .mp3 .flac .shn  0.31s user 3.97s system 97% cpu 4.369 total

Once the buffer cache has been primed, the results do not vary much. All runs are on the order of 4.3 seconds. This is using an optimized version of the filesystem library, with my code compiled with -g -O2 -pg. Removing the profiling options reduces the runtime to approximately 4.1 seconds, so the profiling overhead is relatively small. Here's the output from the POSIX version:

[9:49] cae @ tela 741% time ~/finder-posix -R /raid/shn .mp3 .flac .shn
.flac: 5550 files, 243.918 GiB
.mp3: 30364 files, 232.695 GiB
.shn: 152 files, 6.90744 GiB
~/finder-posix -R /raid/shn .mp3 .flac .shn  0.18s user 0.64s system 99% cpu 0.832 total

Looking at the profiling output from the "finder-fs" program, it appears the bulk of the time is spent in fs::basic_path::operator/=(), which may bear out Ion's fears.
The profiling output for "finder-posix" shows that the bulk of the time (66.6%) is in std::map::find and the finder function. In the Boost.Filesystem version, this amounts to only 12.5% of the runtime. -- Caleb Epstein caleb dot epstein at gmail dot com

In the interests of science, I wrote a small "file finder" using Boost.Filesystem and a comparable version using POSIX functions (e.g. stat, readdir, etc).
Wow, that's a fast test implementation!
Looking at the profiling output from the "finder-fs" program, it appears a bulk of the time is spent in fs::basic_path::operator/=() which may bear out Ion's fears.
Even if the string manipulation was not the heaviest part of the test, I would like to point out two ideas:

-> Even if the filesystem access is slower than returning a temporary object, that doesn't mean you shouldn't optimize it. Your program is not alone in the computer: while your program is waiting for I/O (surely via DMA), other processes are taking CPU, but when you create a lot of temporary strings you are stressing the memory allocator (so you can have more fragmentation in a long-lived program) and degrading overall system performance. Maybe your program is not much slower, but its CPU consumption is much higher.

-> The good old C++ principle "always pass heavy objects by reference" is very good advice for avoiding performance problems. You can pass strings by value, but you never do that. In my opinion this advice is also valid in the inverse sense: "always obtain heavy values by reference". You never know how your code will be used or how many times your functions will be called. If you want your library to be useful even in situations you don't predict, it's always helpful to use references. The difference is that passing parameters by reference does not change the syntax, whereas obtaining values by reference is uglier.

Caleb, I don't know if this is too much work, but can we find out how many calls to "new" this application makes, or even better, how many std::strings it constructs? Thanks and great job! Ion

On 3/6/06, Ion Gaztañaga <igaztanaga@gmail.com> wrote:
Caleb, I don't know if this is too much work but, can we know how many calls to "new" this application makes or even better, how many std::strings it constructs?
Any suggestions on how to accomplish that? I've tried using the Google "tcmalloc" library, but the results show the #1 heap-user is some function that can't be determined, so I'm not sure how helpful this information is. In case it is, I'm attaching the output from "pprof --pdf --alloc_objects" for the Boost.Filesystem "finder" program. Google tcmalloc library: http://goog-perftools.sourceforge.net/doc/tcmalloc.html TCMalloc Heap Profiler: http://goog-perftools.sourceforge.net/doc/heap_profiler.html Some docs that might help interpreting the output: http://goog-perftools.sourceforge.net/doc/cpu_profiler.html -- Caleb Epstein caleb dot epstein at gmail dot com

Can you send the code of your "finder-fs" so that I can check it in Visual C++ 7.1 (a non-reference-counted implementation with a 16-character small string optimization)? I suspect string creation will be costlier there, since you surely won't modify the strings returned by path, and reference counting improves that case. It's a shame I haven't got your 200 GB mp3 collection! Regards, Ion

On 3/6/06, Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 3/4/06, Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 3/4/06, Ion Gaztañaga <igaztanaga@gmail.com > wrote:
Now that filesystem is proposed for the standard I would like to ask boosters (and Beman, of course) if they find these performance concerns serious enough.
Perhaps if they were accompanied by some comparative performance benchmarks or profile analysis?
In the interests of science, I wrote a small "file finder" using Boost.Filesystem and a comparable version using POSIX functions (e.g. stat, readdir, etc). The POSIX version runs MANY times faster than the Boost.Filesystem version (code attached). Note that the Boost version makes use of the "status" member of the directory iterator which is in CVS and is aimed at reducing the number of operating system calls that the library needs to make.
It seems that this profiling output might be slightly misleading. I don't think it is taking into account the time spent in system calls. Using the Google CPU Profiler, I reran my tests on the Boost.Filesystem version of the file finder and I find that the bulk of the runtime is spent not in string manipulation, but in calls to "statfs". It appears that this is coming from basic_directory_iterator::m_init, where it is calling pathconf:

#0 0x4020d9c0 in statfs () from /lib/tls/libc.so.6
#1 0x401dfb72 in pathconf () from /lib/tls/libc.so.6
#2 0x40022f89 in boost::filesystem::detail::dir_itr_first (handle=@0x8051184, buffer=@0x8051188, dir=@0xbfd8f79c, target=@0xbfd8f798) at libs/filesystem/src/operations.cpp:1178
#3 0x0804d231 in boost::filesystem::basic_directory_iterator<boost::filesystem::basic_path<std::string, boost::filesystem::path_traits> >::m_init (this=0xbfd8f85c, dir_path=@0x80510d0) at operations.hpp:881
#4 0x0804d903 in basic_directory_iterator (this=0xbfd8f85c, dir_path=@0x80510d0) at operations.hpp:911
#5 0x0804ad72 in finder (root=@0xbfd8f57c, stats=@0xbfd8f8e0, recursive=true) at finder-fs.cpp:39
#6 0x0804b07d in main (argc=6, argv=0xbfd8f9b4) at finder-fs.cpp:90

This call accounts for 75% of the program's runtime according to Google's profiler (see attached PDF). If I change the code in dir_itr_first to use the constant value NAME_MAX instead of retrieving this value via pathconf, the runtimes for the Boost.Filesystem test and my "pure POSIX" version are nearly identical. Both run in about 0.8 seconds once the buffer cache has been seeded. I understand why pathconf is being used for portability's sake, but it seems like it's a real performance killer. Perhaps the NAME_MAX or MAXNAMLEN value could be used on platforms where it is defined? -- Caleb Epstein caleb dot epstein at gmail dot com

Beman, is this on your radar screen at all? I last tried to ping you about it in March. To refresh, Boost.Filesystem seems to use the POSIX pathconf call excessively, which hurts performance when doing recursive operations like UNIX's find(1) command. On 3/7/06, Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 3/6/06, Caleb Epstein <caleb.epstein@gmail.com> wrote:
On 3/4/06, Caleb Epstein <caleb.epstein@gmail.com > wrote:
On 3/4/06, Ion Gaztañaga <igaztanaga@gmail.com > wrote:
Now that filesystem is proposed for the standard I would like to ask boosters (and Beman, of course) if they find these performance concerns serious enough.
Perhaps if they were accompanied by some comparative performance benchmarks or profile analysis?
In the interests of science, I wrote a small "file finder" using Boost.Filesystem and a comparable version using POSIX functions (e.g. stat, readdir, etc). The POSIX version runs MANY times faster than the Boost.Filesystem version (code attached). Note that the Boost version makes use of the "status" member of the directory iterator which is in CVS and is aimed at reducing the number of operating system calls that the library needs to make.
It seems that this profiling output might be slightly misleading. I don't think it is taking into account the time spent in system calls.
Using the Google CPU Profiler, I reran my tests on the Boost.Filesystem version of the file finder and I find that the bulk of the runtime is spent not in string manipulation, but in calls to "statfs". It appears that this is coming from basic_directory_iterator::m_init, where it is calling pathconf:
#0 0x4020d9c0 in statfs () from /lib/tls/libc.so.6
#1 0x401dfb72 in pathconf () from /lib/tls/libc.so.6
#2 0x40022f89 in boost::filesystem::detail::dir_itr_first (handle=@0x8051184, buffer=@0x8051188, dir=@0xbfd8f79c, target=@0xbfd8f798) at libs/filesystem/src/operations.cpp:1178
#3 0x0804d231 in boost::filesystem::basic_directory_iterator<boost::filesystem::basic_path<std::string, boost::filesystem::path_traits> >::m_init (this=0xbfd8f85c, dir_path=@0x80510d0) at operations.hpp:881
#4 0x0804d903 in basic_directory_iterator (this=0xbfd8f85c, dir_path=@0x80510d0) at operations.hpp:911
#5 0x0804ad72 in finder (root=@0xbfd8f57c, stats=@0xbfd8f8e0, recursive=true) at finder-fs.cpp:39
#6 0x0804b07d in main (argc=6, argv=0xbfd8f9b4) at finder-fs.cpp:90
This call accounts for 75% of the program's runtime according to Google's profiler (see attached PDF)
If I change the code in dir_itr_first to use the constant value NAME_MAX instead of retrieving this value via pathconf, the runtimes for the Boost.Filesystem test and my "pure POSIX" version are nearly identical. Both run in about 0.8 seconds once the buffer cache has been seeded.
I understand why pathconf is being used for portability's sake, but it seems like it's a real performance killer. Perhaps the NAME_MAX or MAXNAMLEN value could be used on platforms where it is defined?
-- Caleb Epstein caleb dot epstein at gmail dot com

On Thu, Apr 20, 2006 at 03:21:47PM -0400, Caleb Epstein wrote:
Beman, is this on your radar screen at all? I last tried to ping you about it in March.
To refresh, Boost.Filesystem seems to use the POSIX pathconf call excessively, which hurts performance when doing recursive operations like UNIX's find(1) command.
Performance has been a problem with Boost::Filesystem for at least a year. Back when I mentioned it, the "party line" :-) was that the API was top priority and performance would be something that could be looked at later. With the standards committees looking at including Boost::Filesystem in TR2, I think performance is important enough to be looked at by now. You can read some early testing from June 2005 here: http://lists.boost.org/Archives/boost/2005/06/87696.php Using POSIX calls or native directory calls isn't that hard, and people can code this themselves with greater speed than Boost::Filesystem. I know that I did this for my application, which I think is unfortunate, since a fast, cross-platform filesystem API would be useful, especially in the standard. But it loses its utility when it can't keep up with find(1). - Chris

"Chris Frey" <cdfrey@foursquare.net> wrote in message news:20060429021716.GA26100@foursquare.net...
On Thu, Apr 20, 2006 at 03:21:47PM -0400, Caleb Epstein wrote:
Beman, is this on your radar screen at all? I last tried to ping you about it in March.
To refresh, Boost.Filesystem seems to use the POSIX pathconf call excessively, which hurts performance when doing recursive operations like UNIX's find(1) command.
Performance has been a problem with Boost::Filesystem for at least a year. Back when I mentioned it, the "party line" :-) was that the API was top priority and performance would be something that could be looked at later.
With the standards committees looking at including Boost::Filesystem in TR2, I think performance is important enough to be looked at by now.
AFAIK, all performance issues have been resolved. The pathconf issue was only resolved recently, but the need for directory iteration to cache status was fixed quite a while ago, as were interface changes so that several status queries regarding the same file don't result in disk accesses unless so desired. So you may want to check out the current CVS head to see if whatever performance problem you were seeing is still an issue. Part of the problem may be that the current Boost release process is so lengthy that changes made to CVS last Fall still haven't appeared in an official Boost release. I've got a proposal about ready to address that problem, and will post something in 10 days or so. --Beman

On Sat, Apr 29, 2006 at 04:28:39PM -0400, Beman Dawes wrote:
AFAIK, all performance issues have been resolved. The pathconf issues was only resolved recently, but the need for directory iteration to cache status was fixed quite a while ago, as were interface changes so that several status queries regarding the same file don't result in disk accesses unless so desired.
So you may want to check out the current CVS head to see if whatever performance problem you were seeing is still an issue.
I happily stand corrected. I checked out the CVS tree and ran a modified recursive version of the simple_ls.cpp example against the boost source tree. The resulting run times on my system were on par with find(1). Nice work, and thank you. - Chris

On 4/29/06, Beman Dawes <bdawes@acm.org> wrote:
"Chris Frey" <cdfrey@foursquare.net> wrote in message
With the standards committees looking at including Boost::Filesystem in TR2, I think performance is important enough to be looked at by now.
So you may want to check out the current CVS head to see if whatever performance problem you were seeing is still an issue.
I have also found the current CVS code to be on par with my "100% pure POSIX" recursive file-finder. The run-times of the two implementations are identical. In fact, I even found the Boost.Filesystem implementation to be faster in certain cases, since it makes use of the dirent.d_type member to detect the is_directory-ness of an entry, where my naive POSIX code was always calling stat(2). This means fewer syscalls for the Boost version, which is a win. Thanks for the great library, Beman! -- Caleb Epstein caleb dot epstein at gmail dot com

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:4409CCA4.6050107@gmail.com...
Hi to all,
While reviewing boost::filesystem's error/exception usage, looking for a model for the Interprocess library, and after reading N1934 "Filesystem Library Proposal for TR2 (Revision 2)", I'm a bit concerned about the heavy use of std::string/string_type in the library.
The following functions return some kind of path as value type:
// observers
const string_type string() const;
const string_type file_string() const;
const string_type directory_string() const;
const external_string_type external_file_string() const;
const external_string_type external_directory_string() const;
They are specified to be able to return either a const value or const ref. The Boost implementation is more concerned with proving the interface than blind efficiency, so it returns by const value. A POSIX narrow character paths implementation, for example, need perform no conversion at all, so it could use const ref returns.
string_type root_name() const;
string_type root_directory() const;
basic_path root_path() const;
basic_path relative_path() const;
string_type leaf() const;
basic_path branch_path() const;
I know that syntactically this is nicer and we can have RVO:
string_type str = path.string();
But when iterating through directories I find that returning a temporary object that must allocate/copy/deallocate hurts my performance paranoia.
As Caleb points out, it is premature optimization to talk about "hurting performance" in the absence of timings in realistic use scenarios. That said, if you can come up with a realistic use case that really does show significant slow-down compared to some alternate interface, it would be worth talking about.
...
Apart from this, I see that path::iterator has a string member. dereference() will return a reference to that member, but an iterator is supposed to be a "lightweight" pointer-like abstraction that is value-copied between functions. A string member, in my opinion, turns the iterator into a heavy class (how heavy depends on the string length, but a small string optimization of 16 bytes is not going to help much).
That's an implementation detail. It isn't required by the spec, although it may be the most obvious way to implement the spec. An alternate implementation would be to keep a pool of directory entry objects and recycle them if performance was a concern. It would be great if Boost had a cache library to make such a strategy trivial to implement. If there is a way to modify the interface to make it easier to create high-performance implementations, that would be of interest as long as it didn't do too much violence to the usual STL idioms.
Now that filesystem is proposed for the standard I would like to ask boosters (and Beman, of course) if they find these performance concerns serious enough.
The more scrutiny the better! Thanks for the comments, --Beman

Hi Beman,
As Caleb points out, it is premature optimization to talk about "hurting performance" in the absence of timings in realistic use scenarios.
That said, if you can come up with a realistic use case that really does show significant slow-down compared to some alternate interface, it would be worth talking about.
Ok. I see it as "premature pessimization", but you are right about needing a realistic use case. Recursively scanning a directory looking for files with a given extension (say, looking for mp3 files) is in my opinion a realistic use case. Obviously, looking up the files will be slower than returning "path.leaf()" (although maybe the OS caches directory entries in memory), but apart from the speed, I think the important point is the memory stress you cause by creating a temporary every time you want to obtain the name of a file. The filesystem operations are maybe slower, but the OS is surely carefully avoiding heap fragmentation by using internal pools, while the user is creating a lot of temporaries. I will try to implement this use case if you agree. The "path" class is also not tied to disk operations (by the way, we can mount a filesystem in memory so operations can be fast, or it can represent shared memory, following the "everything is a file" UNIX philosophy). Is it realistic to store a lot of path objects in containers and request operations like leaf(), root(), etc.? I don't know. But I see the path class as a pseudo-container of strings representing a hierarchy. A path could represent a file or any other hierarchy in the operating system, because it is quite generic.
Apart from this, I see that path::iterator has a string member. dereference() will return a reference to that member, but an iterator is supposed to be a "lightweight" pointer-like abstraction that is value-copied between functions. A string member, in my opinion, turns the iterator into a heavy class (how heavy depends on the string length, but a small string optimization of 16 bytes is not going to help much).
That's an implementation detail. It isn't required by the spec, although that may be the most obvious way to implement the spec. An alternate implementation would be to keep a pool of directory entry objects and recycle them if performance was a concern. It would be great if Boost had a cache library to make such a strategy trivial to implement.
You are right. I need to concentrate on the interface. As a comment from looking at the code: since the iterator returns a const string reference, you could also add a vector of strings to the path class, so that the iterator could be the const_iterator of the vector. You would avoid the string member and have trivial increment/copy operations. You are asking the path class to hold more memory, though.
Thanks for the comments,
Thanks for your quick reply, Ion

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:440ABAE3.8080007@gmail.com...
Hi Beman,
As Caleb points out, it is premature optimization to talk about "hurting performance" in the absence of timings in realistic use scenarios.
That said, if you can come up with a realistic use case that really does show significant slow-down compared to some alternate interface, it would be worth talking about.
Ok. I see it as "premature pessimization", but you are right about needing a realistic use case. Recursively scanning a directory looking for files with a given extension (say, looking for mp3 files) is in my opinion a realistic use case. Obviously, looking up the files will be slower than returning "path.leaf()" (although maybe the OS caches directory entries in memory), but apart from the speed, I think the important point is the memory stress you cause by creating a temporary every time you want to obtain the name of a file. The filesystem operations are maybe slower, but the OS is surely carefully avoiding heap fragmentation by using internal pools, while the user is creating a lot of temporaries. I will try to implement this use case if you agree.
That's a use case that I would be interested in. But also remember that the objectives of the library include ease-of-use for script-like programs, and in general valuing clean design over the last iota of efficiency. I'd also like it to feel familiar to standard library users. I have a long personal history of adding kinky little bits and pieces to designs (mostly for efficiency) and then regretting it later.
The "path" class is also not tied to disk operations (by the way, we can mount a filesystem in memory so operations can be fast, or it can represent shared memory, following the "everything is a file" UNIX philosophy). Is it realistic to store a lot of path objects in containers and request operations like leaf(), root(), etc.? I don't know. But I see the path class as a pseudo-container of strings representing a hierarchy. A path could represent a file or any other hierarchy in the operating system, because it is quite generic.
Apart from this, I see that path::iterator has a string member. dereference() will return a reference to that member, but an iterator is supposed to be a "lightweight" pointer-like abstraction that is value-copied between functions. A string member, in my opinion, turns the iterator into a heavy class (how heavy depends on the string length, but a small string optimization of 16 bytes is not going to help much).
That's an implementation detail. It isn't required by the spec, although that may be the most obvious way to implement the spec. An alternate implementation would be to keep a pool of directory entry objects and recycle them if performance was a concern. It would be great if Boost had a cache library to make such a strategy trivial to implement.
You are right. I need to concentrate on the interface. As a comment from looking at the code: since the iterator returns a const string reference, you could also add a vector of strings to the path class, so that the iterator could be the const_iterator of the vector. You would avoid the string member and have trivial increment/copy operations. You are asking the path class to hold more memory, though.
Yes, if you are willing to expend more memory in the path class itself, you can gain a lot of theoretical speed, particularly on non-POSIX implementations where the portable syntax and the native syntax differ. --Beman

That's an implementation detail. It isn't required by the spec, although that may be the most obvious way to implement the spec. An alternate implementation would be to keep a pool of directory entry objects and recycle them if performance was a concern. It would be great if Boost had a cache library to make such a strategy trivial to implement.
What kind of cache did you have in mind? Regex has an "Object Cache" (see boost/regex/pending/object_cache.hpp) that I always meant to submit for full Boost status but never got around to :-( It simply maps a "key" to an instance of an object constructed from the key: if the object already exists it returns it, otherwise it constructs a new one and caches it. Objects are returned as a shared_ptr<const Object> (the shared_ptr is needed for memory management, since neither the cache nor the client has exclusive ownership of the object). Usage is simply:

shared_ptr<const Object> obj = object_cache<Key, Object>::get(my_key, max_number_of_objects_to_cache);

To be honest, its usefulness is pretty limited, and threading issues add a small performance hurdle, which means that "Objects" have to be fairly expensive to construct before it becomes useful. Still, there you go ;-) John.

"John Maddock" <john@johnmaddock.co.uk> wrote in message news:006a01c6403f$b0760640$caed1b52@fuji...
That's an implementation detail. It isn't required by the spec, although that may be the most obvious way to implement the spec. An alternate implementation would be to keep a pool of directory entry objects and recycle them if performance was a concern. It would be great if Boost had a cache library to make such a strategy trivial to implement.
What kind of cache did you have in mind? Regex has an "Object Cache" (see boost/regex/pending/object_cache.hpp) that I always meant to submit for full Boost status but never got around to :-(
It simply maps a "key" to an instance of an object constructed from the key: if the object already exists it returns it, otherwise it constructs a new one and caches it. Objects are returned as a shared_ptr<const Object> (the shared_ptr is needed for memory management, since neither the cache nor the client has exclusive ownership of the object). Usage is simply:
shared_ptr<const Object> obj = object_cache<Key, Object>::get( my_key, max_number_of_objects_to_cache);
To be honest, its usefulness is pretty limited, and threading issues add a small performance hurdle, which means that "Objects" have to be fairly expensive to construct before it becomes useful.
Still, there you go ;-)
Rather than a cache that does A or a cache that does B, I'd rather see someone do a review of various cache features in the literature (or at least whatever Google can find), identify the most useful ones, and then design one or more classes (probably template based) that delivered that useful feature set. For example, "Least Recently Used", "Least Frequently Used", and "Least Recently Added" are common replacement policies, but there are probably some others that are occasionally useful. I've used a scheme where entries are never flushed as long as there are any live references to them. Another way to approach a cache library design would be to collect a set of use cases, and then see what feature set would be needed to satisfy those use cases. Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general. That isn't a criticism of your object_cache; it is just that I think the topic deserves more than a single-solution approach. --Beman

Still, there you go ;-)
Rather than a cache that does A or a cache that does B, I'd rather see someone do a review of various cache features in the literature (or at least whatever Google can find), identify the most useful ones, and then design one or more classes (probably template based) that delivered that useful feature set.
For example, "Least Recently Used", "Least Frequently Used", and "Least Recently Added" are common replacement policies, but there are probably some others that are occasionally useful. I've used a scheme where entries are never flushed as long as there are any live references to them. Another way to approach a cache library design would be to collect a set of use cases, and then see what feature set would be needed to satisfy those use cases.
Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general.
That isn't a criticism of your object_cache; it is just that I think the topic deserves more than a single-solution approach.
Yep, I couldn't agree more: since I only needed one use case I only wrote one, and that's really why I haven't brought it forward for review. Or, to put it another way, I couldn't be bothered to go through all the possible caching policies ;-)
John.

Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general.
Here is an application for a C++ LRU datastructure in computer graphics:
http://aim.adc.rmit.edu.au/phd/sgreuter/papers/graphite2003.pdf
In a nutshell, the datastructure is a fairly generic arrangement of std::list and std::map, but the replacement policy is fairly application-oriented.
Related to this was some activity towards a boost circular buffer container... which is something I intend to revisit, some shiny day...
Nigel
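The list-plus-map arrangement described above can be sketched in a few lines. This is an illustrative minimal sketch only: the names (lru_cache, get, put) are invented here and are not taken from the paper or from boost/pending/object_cache.hpp.

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <utility>

template <class Key, class Value>
class lru_cache {
    typedef std::list<std::pair<Key, Value> > item_list;
    item_list items_;                                   // front = most recently used
    std::map<Key, typename item_list::iterator> index_; // key -> list node
    std::size_t capacity_;
public:
    explicit lru_cache(std::size_t capacity) : capacity_(capacity) {}

    // Returns a pointer to the cached value and promotes it to most
    // recently used; returns null if the key is absent.
    Value* get(const Key& k) {
        typename std::map<Key, typename item_list::iterator>::iterator
            it = index_.find(k);
        if (it == index_.end()) return 0;
        // splice moves the node to the front in O(1) without
        // invalidating the list iterator stored in the map.
        items_.splice(items_.begin(), items_, it->second);
        return &it->second->second;
    }

    void put(const Key& k, const Value& v) {
        if (Value* p = get(k)) { *p = v; return; }      // overwrite + promote
        if (items_.size() == capacity_) {               // evict least recently used
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.push_front(std::make_pair(k, v));
        index_[k] = items_.begin();
    }
};
```

Lookup, promotion, and eviction are all O(log n) due to the map, with the list providing O(1) reordering; the same shape works with a hash map for O(1) average lookup.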

Nigel Stewart wrote:
Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general.
Here is an application for a C++ LRU datastructure in computer graphics: http://aim.adc.rmit.edu.au/phd/sgreuter/papers/graphite2003.pdf
In a nutshell, the datastructure is a fairly generic arrangement of std::list and std::map, but the replacement policy is fairly application-oriented.
Have you considered using Boost.MultiIndex as the core of your lru class? There's an MRU list class example in the same spirit at http://boost.org/libs/multi_index/doc/examples.html#example9
Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

At 7:17 PM +0100 3/6/06, Joaquín Mª López Muñoz wrote:
Have you considered using Boost.MultiIndex as the core of your lru class? There's a MRU list class example in the same spirit at
http://boost.org/libs/multi_index/doc/examples.html#example9
This paper should also be of interest if one is thinking about replacement policies for caches: General adaptive replacement policies Yannis Smaragdakis ISMM 2004 Proceedings http://www-static.cc.gatech.edu/~yannis/adaptive2.pdf

Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general.
Here is an application for a C++ LRU datastructure in computer graphics: http://aim.adc.rmit.edu.au/phd/sgreuter/papers/graphite2003.pdf
In a nutshell, the datastructure is a fairly generic arrangement of std::list and std::map, but the replacement policy is fairly application-oriented.
<Chuckles> That's exactly the same data structure I used to cache regex-traits classes and locales in boost/pending/object_cache.hpp. So it looks like at least two of us came up with the same idea... John.

Here is an application for a C++ LRU datastructure in computer graphics: http://aim.adc.rmit.edu.au/phd/sgreuter/papers/graphite2003.pdf
In a nutshell, the datastructure is a fairly generic arrangement of std::list and std::map, but the replacement policy is fairly application-oriented.
<Chuckles> That's exactly the same data structure I used to cache regex-traits classes and locales in boost/pending/object_cache.hpp. So it looks like at least two of us came up with the same idea...
It's a fairly well cited paper now after three years; I don't recall boost having the MultiIndex stuff back when that work was being done. It would be nice to revisit all that stuff with boost added to the toolchain... I'm also involved in applying graph-based optimisation to computer graphics, but the boost graph stuff turned up a little bit late for my purposes... :-) Glad we didn't do anything too strange for our LRU cache!
Nigel

Glad we didn't do anything too strange for our LRU cache!
Nope, actually I vaguely recall reading about something similar years ago when the STL was first released to the world, but no longer have the reference :-(
Well, we have about 14 days before Stefan submits his PhD thesis, so if anyone knows of relevant STL-based caching literature, speak now or forever hold your peace... :-)
Nigel

On Mon, 06 Mar 2006 11:25:49 -0600, Nigel Stewart wrote:
Caching is one of the fundamental ways to achieve improved performance. I'm a bit surprised it hasn't been more of a topic in Boost discussions, or in computer science in general.
Here is an application for a C++ LRU datastructure in computer graphics: http://aim.adc.rmit.edu.au/phd/sgreuter/papers/graphite2003.pdf
In a nutshell, the datastructure is a fairly generic arrangement of std::list and std::map, but the replacement policy is fairly application-oriented.
Related to this was some activity towards a boost circular buffer container... Which is something I intend to revisit, some shiny day...
Nigel
I developed a similar data structure for an LRU cache a while back. It is fairly basic - all you can do with it is check if a value exists, and add a value, but that was all that was required when I wrote it. The maximum size of the cache is determined at construction time. I did consider adding the ability to chain caches together, so you could have a small primary cache to test first, with a bigger secondary cache backing it up, but the amount of work involved compared to the prospective gain didn't seem worth it at the time.
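A structure matching that description might look like the following sketch; the names (seen_cache, contains, insert) are invented here. It supports only a membership test and insertion, with the maximum size fixed at construction, and drops the oldest entry once full ("least recently added" replacement).

```cpp
#include <cstddef>
#include <deque>
#include <set>

template <class T>
class seen_cache {
    std::set<T> present_;   // fast membership test
    std::deque<T> order_;   // insertion order, front = oldest
    std::size_t capacity_;
public:
    explicit seen_cache(std::size_t capacity) : capacity_(capacity) {}

    bool contains(const T& v) const { return present_.count(v) != 0; }

    void insert(const T& v) {
        if (!present_.insert(v).second) return;   // already cached
        order_.push_back(v);
        if (order_.size() > capacity_) {          // evict oldest entry
            present_.erase(order_.front());
            order_.pop_front();
        }
    }
};
```

Chaining a small primary cache in front of a larger secondary one, as Spencer considered, would amount to trying contains on the first cache and falling through to the second on a miss.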

"Ion Gaztañaga" <igaztanaga@gmail.com> wrote in message news:4409CCA4.6050107@gmail.com...
Hi to all,
While reviewing boost::filesystem's error/exception use, looking for a model for the interprocess library, and after reading N1934 "Filesystem Library Proposal for TR2 (Revision 2)", I'm a bit concerned about the heavy use of std::string/string_type in the library.
The following functions return some kind of path as value type:
// observers
const string_type string() const;
const string_type file_string() const;
const string_type directory_string() const;
const external_string_type external_file_string() const;
const external_string_type external_directory_string() const;
string_type root_name() const;
string_type root_directory() const;
basic_path root_path() const;
basic_path relative_path() const;
string_type leaf() const;
basic_path branch_path() const;
I know that syntactically this is nicer and we can have RVO:
string_type str = path.string();
But when iterating through directories I find that returning a temporary object that must allocate/copy/deallocate hurts my performance paranoia. Even with move semantics we have an overhead:
On the other hand, I would guess that the actual OS and hard drive / network level operations take a lot more time than copying a few strings.
std::vector<std::path> paths; //fill with paths
std::path::iterator beg = paths.begin(), end = paths.end();
for(; beg != end; ++beg){
   std::path::string_type str = beg->root_name(); //temporary created
   str += "append some data";
   std::cout << str;
}
or
std::cout << it->root_name() << "append some data";
Who says you want to store the result? A temporary might be just fine.
Couldn't be better (although uglier) to take a reference to a string that will be filled?
void fill_root_name(string_type &root_name) const;
Really ugly! :-) It also assumes that the caller has somewhere to store the result. And the intention to keep it.
...
////////////////////////////////////////////////////////////////////
std::vector<std::path> paths; //fill with paths
std::path::string_type root_name;
root_name.reserve(PATH_LENGTH);
So now you are wasting space instead. :-)
This way we only allocate memory if we don't have enough place in our string. We can also reserve it beforehand to speed up code.
How do we know that speed is gained (and important), while space is cheaper? Bo Persson

Hi,
But when iterating through directories I find that returning a temporary object that must allocate/copy/deallocate hurts my performance paranoia. Even with move semantics we have an overhead:
On the other hand, I would guess that the actual OS and hard drive / network level operations take a lot more time than copying a few strings.
Yes, it's a bad example, since a path has nothing to do with filesystem operations. A path (in the current implementation) is just a concatenation of strings representing a file or directory. Once the path is formed there is no access to the filesystem. For example, the path.leaf() function is implemented as:
return m_path.substr( leaf_pos( m_path, m_path.size() ) );
So path is an in-memory class, a "pseudo-container" of strings. Using path::iterator you can traverse the strings forming the path. Since you have to make conversions you can't always return references as normal containers do, but returning values always forces a temporary string.
std::cout << it->root_name() << "append some data";
Who says you want to store the result? A temporary might be just fine.
But what if you want to store it?
Couldn't be better (although uglier) to take a reference to a string that will be filled?
void fill_root_name(string_type &root_name) const;
Really ugly! :-) It also assumes that the caller has somewhere to store the result. And the intention to keep it.
Couldn't we have both functions? Returning a copy always forces an extra allocation/deallocation and you can't avoid it. So maybe instead of replacing that function we can add a function saying we want to store the result in our own string.
////////////////////////////////////////////////////////////////////
std::vector<std::path> paths; //fill with paths
std::path::string_type root_name;
root_name.reserve(PATH_LENGTH);
So now you are wasting space instead. :-)
This way we only allocate memory if we don't have enough place in our string. We can also reserve it beforehand to speed up code.
How do we know that speed is gained (and important), while space is cheaper?
Don't reserve anything, if you don't want to:
std::path::string_type root_name;
std::path::iterator beg = paths.begin(), end = paths.end();
for(; beg != end; ++beg){
   beg->fill_root_name(root_name);
}
This will save (end-beg) string constructions and destructions. Is this a speed improvement? Well, that depends on:
-> what ++it does (the current boost path::iterator increment is expensive, but it does not need to be; this is an implementation detail)
-> what dereferencing the iterator does (currently it returns the address of an internal string)
-> the depth of the path
But I think that filling a string instead of returning a temporary one is always faster. And there is another reason for avoiding allocation apart from speed: memory fragmentation. But as Beman has pointed out, I should not worry about implementation details, but about the interface. And the interface is forcing me to have temporaries in my code (like the stringstream.str() function, for example). How often will I call "root_name()", "leaf()" and the others in my code? That will depend on the application. My application maybe uses them once, but an antivirus scanner or a music jukebox constructing a song database will surely use them a lot.
Regards,
Ion
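The two interface styles Ion contrasts can be illustrated with a toy class; mini_path and its slash-splitting logic are invented here purely for illustration, while fill_root_name is the name Ion proposes.

```cpp
#include <string>

class mini_path {
    std::string m_path;
public:
    explicit mini_path(const std::string& s) : m_path(s) {}

    // N1934-style observer: builds and returns a fresh temporary
    // string on every call.
    std::string root_name() const {
        return std::string(m_path, 0, prefix_len());
    }

    // Ion's proposed alternative: write into a caller-supplied
    // string, so a new allocation happens only when the string's
    // existing capacity is too small.
    void fill_root_name(std::string& out) const {
        out.assign(m_path, 0, prefix_len());
    }

private:
    // Length of the component before the first '/' (hypothetical
    // splitting rule, just for the sketch).
    std::string::size_type prefix_len() const {
        std::string::size_type pos = m_path.find('/');
        return pos == std::string::npos ? m_path.size() : pos;
    }
};
```

In a loop over many paths, a single string passed to fill_root_name can keep its capacity across iterations, which is exactly the saving of (end-beg) constructions and destructions Ion describes.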

But I think that filling a string instead of returning a temporary one is always faster. And there is another reason for avoiding allocation apart from speed: memory fragmentation.
Something not yet mentioned in this thread - the Qt QString provides a reference-counted "shallow-copy", (deep-copy on modification) mechanism that makes QString a lot more efficient than std::string in this kind of situation. Copying a QString is cheap, and sizeof(QString)==sizeof(int). Has there been any boost proposal for alternative strings to std::string? Nigel
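The shallow-copy mechanism Nigel describes can be sketched with a toy copy-on-write string; this is not QString's actual implementation, the class and member names are invented here, and (like the reference-counted strings of that era) this naive form is not thread-safe.

```cpp
#include <memory>
#include <string>

class cow_string {
    std::shared_ptr<std::string> data_; // copies share this buffer
public:
    cow_string(const char* s) : data_(std::make_shared<std::string>(s)) {}

    // The implicit copy constructor only bumps a reference count, so
    // copying is cheap regardless of the text length, and the object
    // itself is just one pointer-sized handle.

    const std::string& str() const { return *data_; }

    void append(const char* s) {
        if (data_.use_count() != 1)                     // shared: detach first
            data_ = std::make_shared<std::string>(*data_);
        *data_ += s;                                    // deep copy happened
    }

    bool shares_buffer_with(const cow_string& o) const {
        return data_ == o.data_;
    }
};
```

Returning such a string by value from an observer like root_name() would cost only a reference-count bump instead of an allocation, which is the efficiency argument being made for QString.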

On 3/6/06, Nigel Stewart <ns@fluent.com> wrote:
But I think that filling a string instead of returning a temporary one is always faster. And there is another reason for avoiding allocation apart from speed: memory fragmentation.
Something not yet mentioned in this thread - the Qt QString provides a reference-counted "shallow-copy", (deep-copy on modification) mechanism that makes QString a lot more efficient than std::string in this kind of situation. Copying a QString is cheap, and sizeof(QString)==sizeof(int).
GNU's libstdc++'s std::string is reference counted as well.
Has there been any boost proposal for alternative strings to std::string?
There was a fixed_string library that was reviewed recently, but it was rejected.
Caleb Epstein
participants (10)
-
Beman Dawes
-
Bo Persson
-
Caleb Epstein
-
Chris Frey
-
Ion Gaztañaga
-
Joaquín Mª López Muñoz
-
John Maddock
-
Kim Barrett
-
Nigel Stewart
-
Spencer Collyer