why boost is so slow for file search?
I have two functions that read the list of files in a directory. One uses Win32 and one uses Boost:

    void GetFilesWin32(std::string dir)
    {
        std::vector<std::string> vFiles;
        std::string f = dir + "*.*";
        std::string file;
        WIN32_FIND_DATA findFileData;
        HANDLE h = FindFirstFile(f.c_str(), &findFileData);
        if(h != INVALID_HANDLE_VALUE)
        {
            do
            {
                if((findFileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != FILE_ATTRIBUTE_DIRECTORY)
                {
                    file = findFileData.cFileName;
                    vFiles.push_back( file);
                }
            } while( FindNextFile( h, &findFileData) != 0);
            FindClose(h);  // close only a valid search handle
        }
    }

    void GetFilesBoost(std::string dir)
    {
        namespace fs = boost::filesystem;
        std::vector<std::string> vFiles;
        fs::path path(dir);
        fs::directory_iterator end_dir;
        for(fs::directory_iterator it(path); it != end_dir; it++)
        {
            if(!(fs::is_directory(it->status())))
            {
                vFiles.push_back(it->path().filename().string());
            }
        }
    }

Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?

--
View this message in context: http://boost.2283326.n4.nabble.com/why-boost-is-so-slow-for-file-search-tp46...
Sent from the Boost - Users mailing list archive at Nabble.com.
Yes, same settings for both.
On 12 June 2012 23:04, young wrote:
I have two functions that read the list of files in a directory. One uses Win32 and one uses Boost: [...] Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
Would you mind posting a complete, compilable program rather than a couple of dangling functions? Please also include the compiler version and the command line / compiler options you used to build it. That would make it easier to reproduce your tests closely. A complete post would also save folks time and effort, and would have avoided the 3 posts that followed yours asking about and discussing obvious things.

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
young wrote:
I have two functions that read the list of files in a directory. One uses Win32 and one uses Boost:
[...]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
How about trying out the profiling facilities available with your development system? I've used both the ones available with gcc and those in recent versions of the VC IDE, and found them very useful for answering such questions.

Robert Ramey
I could be wrong, but it looks like you might be doing a wide -> narrow string conversion each time a string is added to the vector in the Boost version. IIUC, Boost.Filesystem (v3, the default) works with wide strings all the time on Windows, whereas the accessor you are using always returns a narrow string. Try using "native()" instead of "string()" to get the string from the path and see if that makes things any better (obviously you will also need to change the type of the strings you use elsewhere).

HTH
Iain

On 12/06/2012 23:04, young wrote:
I have two functions that read the list of files in a directory. One uses Win32 and one uses Boost:
[...]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
I have two functions that read the list of files in a directory. One uses Win32 and one uses Boost:
FWIW, directory_iterator_increment() looks like this:

    void directory_iterator_increment(directory_iterator& it,
      system::error_code* ec)
    {
      BOOST_ASSERT_MSG(it.m_imp.get(), "attempt to increment end iterator");
      BOOST_ASSERT_MSG(it.m_imp->handle != 0, "internal program error");

      path::string_type filename;
      file_status file_stat, symlink_file_stat;
      system::error_code temp_ec;

      for (;;)
      {
        temp_ec = dir_itr_increment(it.m_imp->handle,
    #     if defined(BOOST_POSIX_API)
          it.m_imp->buffer,
    #     endif
          filename, file_stat, symlink_file_stat);

        if (temp_ec)  // happens if filesystem is corrupt, such as on a damaged optical disc
        {
          path error_path(it.m_imp->dir_entry.path().parent_path());  // fix ticket #5900
          it.m_imp.reset();
          if (ec == 0)
            BOOST_FILESYSTEM_THROW(
              filesystem_error("boost::filesystem::directory_iterator::operator++",
                error_path,
                error_code(BOOST_ERRNO, system_category())));
          ec->assign(BOOST_ERRNO, system_category());
          return;
        }
        else if (ec != 0) ec->clear();

        if (it.m_imp->handle == 0)  // eof, make end
        {
          it.m_imp.reset();
          return;
        }

        if (!(filename[0] == dot  // !(dot or dot-dot)
          && (filename.size()== 1
            || (filename[1] == dot
              && filename.size()== 2))))
        {
          it.m_imp->dir_entry.replace_filename(
            filename, file_stat, symlink_file_stat);
          return;
        }
      }
    }

...and the inner dir_itr_increment() function involves the following:

    perms make_permissions(const path& p, DWORD attr)
    {
      perms prms = fs::owner_read | fs::group_read | fs::others_read;
      if ((attr & FILE_ATTRIBUTE_READONLY) == 0)
        prms |= fs::owner_write | fs::group_write | fs::others_write;
      if (BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".exe") == 0
        || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".com") == 0
        || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".bat") == 0
        || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".cmd") == 0)
        prms |= fs::owner_exe | fs::group_exe | fs::others_exe;
      return prms;
    }

...where each of the 4 comparisons invokes a conversion from wchar_t* to std::string.
So, there's obviously a lot of overhead, but the question is whether it's really a bottleneck in a real-life application.
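For illustration only, the repeated conversion in make_permissions() could in principle be avoided by converting the extension once and comparing the result against all four candidates. The sketch below is not Boost's code and not a proposed patch; to_lower_ascii and is_executable_extension are hypothetical names, and the caller is assumed to pass in the already-converted extension string (for example, the result of a single p.extension().string() call).

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case an ASCII string. This is enough for the
// four extensions checked below; it is not a general Unicode case fold.
std::string to_lower_ascii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: returns true if `extension` (including the leading
// dot) is one of the four executable extensions listed in make_permissions().
// The extension is normalised once and then compared four times.
bool is_executable_extension(const std::string& extension)
{
    static const std::array<const char*, 4> executable = {{".exe", ".com", ".bat", ".cmd"}};
    const std::string lower = to_lower_ascii(extension);
    return std::any_of(executable.begin(), executable.end(),
                       [&lower](const char* candidate) { return lower == candidate; });
}
```

With a helper like this, the cost per directory entry would be one conversion instead of four, though only profiling would show how much of the measured difference that accounts for.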
On 12 June 2012 23:04, young wrote:
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
In spite of the overhead of the character conversions mentioned by others, my results are not that far apart (number of files in dir = 2338, times in seconds):

    run  win32       boost       files
    0    0.00175283  0.00527004  2338
    1    0.00159307  0.00464673  2338
    2    0.00151897  0.0045553   2338
    3    0.00154784  0.00427396  2338
    4    0.00152827  0.00400867  2338
    5    0.00155811  0.00416361  2338
    6    0.00175219  0.00411325  2338
    7    0.00153629  0.00410458  2338
    8    0.00153501  0.00405454  2338
    9    0.00155907  0.0039984   2338

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
participants (6)
- Iain Denniston
- Igor R
- Mateusz Loskot
- Nathan Ridge
- Robert Ramey
- young