Discussion:
Why is Boost so slow for file search?
young
2012-06-12 22:04:36 UTC
I have two functions that read the list of files in a directory. One uses Win32 and
one uses Boost:

void GetFilesWin32(std::string dir)
{
    std::vector<std::string> vFiles;
    // dir must end with a path separator, e.g. "C:\\temp\\"
    std::string f = dir + "*.*";
    std::string file;
    WIN32_FIND_DATA findFileData;
    HANDLE h = FindFirstFile(f.c_str(), &findFileData);
    if(h != INVALID_HANDLE_VALUE)
    {
        do
        {
            // skip directories, keep everything else
            if((findFileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) !=
                FILE_ATTRIBUTE_DIRECTORY)
            {
                file = findFileData.cFileName;
                vFiles.push_back( file);
            }
        } while( FindNextFile( h, &findFileData) != 0);
        FindClose(h); // close only a valid search handle
    }
}

void GetFilesBoost(std::string dir)
{
    namespace fs = boost::filesystem;
    std::vector<std::string> vFiles;
    fs::path path(dir);
    fs::directory_iterator end_dir;
    for(fs::directory_iterator it(path); it != end_dir; it++)
    {
        if(!(fs::is_directory(it->status())))
        {
            vFiles.push_back(it->path().filename().string());
        }
    }
}

Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?


--
View this message in context: http://boost.2283326.n4.nabble.com/why-boost-is-so-slow-for-file-search-tp4631199.html
Sent from the Boost - Users mailing list archive at Nabble.com.
Nathan Ridge
2012-06-13 01:32:00 UTC
Post by young
I have two functions that read the list of files in a directory. One uses Win32 and
[snip]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
Are you compiling with optimizations enabled?

Regards,
Nate
young
2012-06-13 13:32:52 UTC
Yes, the same settings for both.

Robert Ramey
2012-06-13 16:21:18 UTC
Post by young
I have two functions that read the list of files in a directory. One uses Win32 and
[snip]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
How about trying out the profiling facilities available with your development
system? I've used both the ones available with gcc and recent versions
of the VC IDE and found them very useful for answering such questions.

Robert Ramey
Mateusz Loskot
2012-06-13 15:22:46 UTC
Post by young
I have two functions that read the list of files in a directory. One uses Win32 and
[...]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
Would you mind posting a complete, compilable program rather than
a couple of dangling functions?
Please also include the compiler version and the command line / compiler
options you used to build it.
That would make it easier to reproduce your tests closely.

Making your post complete would also save folks time and effort and avoid
the 3 posts that followed yours asking about and discussing obvious things.

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Iain Denniston
2012-06-13 17:32:47 UTC
I could be wrong but it looks like you might be doing a wide -> narrow
string conversion each time a string is added to the vector for the
boost version.

IIUC boost filesystem (v3 - the default) works with wide strings all the
time on Windows, however the accessor you are using always returns a
narrow string.

Try using "native()" instead of "string()" for getting the string from
the path and see if that makes things any better (obviously you will
also need to change the type of strings you use elsewhere).

HTH

Iain
Post by young
I have two functions that read the list of files in a directory. One uses Win32 and
[snip]
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
Igor R
2012-06-13 18:18:46 UTC
Post by young
I have two functions that read the list of files in a directory. One uses Win32 and
FWIW, directory_iterator_increment() looks like this:

void directory_iterator_increment(directory_iterator& it,
  system::error_code* ec)
{
  BOOST_ASSERT_MSG(it.m_imp.get(), "attempt to increment end iterator");
  BOOST_ASSERT_MSG(it.m_imp->handle != 0, "internal program error");

  path::string_type filename;
  file_status file_stat, symlink_file_stat;
  system::error_code temp_ec;

  for (;;)
  {
    temp_ec = dir_itr_increment(it.m_imp->handle,
#     if defined(BOOST_POSIX_API)
      it.m_imp->buffer,
#     endif
      filename, file_stat, symlink_file_stat);

    if (temp_ec)  // happens if filesystem is corrupt, such as on a damaged optical disc
    {
      path error_path(it.m_imp->dir_entry.path().parent_path());  // fix ticket #5900
      it.m_imp.reset();
      if (ec == 0)
        BOOST_FILESYSTEM_THROW(
          filesystem_error("boost::filesystem::directory_iterator::operator++",
            error_path,
            error_code(BOOST_ERRNO, system_category())));
      ec->assign(BOOST_ERRNO, system_category());
      return;
    }
    else if (ec != 0) ec->clear();

    if (it.m_imp->handle == 0)  // eof, make end
    {
      it.m_imp.reset();
      return;
    }

    if (!(filename[0] == dot // !(dot or dot-dot)
      && (filename.size()== 1
        || (filename[1] == dot
          && filename.size()== 2))))
    {
      it.m_imp->dir_entry.replace_filename(
        filename, file_stat, symlink_file_stat);
      return;
    }
  }
}



...and the inner dir_itr_increment() function involves the following:

perms make_permissions(const path& p, DWORD attr)
{
  perms prms = fs::owner_read | fs::group_read | fs::others_read;
  if ((attr & FILE_ATTRIBUTE_READONLY) == 0)
    prms |= fs::owner_write | fs::group_write | fs::others_write;
  if (BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".exe") == 0
    || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".com") == 0
    || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".bat") == 0
    || BOOST_FILESYSTEM_STRICMP(p.extension().string().c_str(), ".cmd") == 0)
    prms |= fs::owner_exe | fs::group_exe | fs::others_exe;
  return prms;
}

...where each of the 4 comparisons invokes conversion from wchar_t* to
std::string.

So, there's obviously a lot of overhead, but the question is whether
it's really a bottleneck in a real-life application.
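
To make the repeated-conversion point concrete: the check above converts the extension to a narrow string up to four times per directory entry. A minimal sketch of the alternative, converting once and comparing the result, might look like the following. This is not the actual Boost code nor a proposed patch; has_exe_extension is a hypothetical name, and std::filesystem stands in for boost::filesystem so the snippet compiles without Boost.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;  // assumption: stand-in for boost::filesystem

// Hypothetical helper: true if the path ends in .exe, .com, .bat or .cmd
// (case-insensitive), mirroring the four comparisons in make_permissions().
bool has_exe_extension(const fs::path& p)
{
    // Convert the extension once instead of once per comparison.
    std::string ext = p.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == ".exe" || ext == ".com" || ext == ".bat" || ext == ".cmd";
}
```

This still performs one conversion per entry; comparing the wide extension directly would avoid it entirely, at the cost of platform-specific string literals.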
Mateusz Loskot
2012-06-13 21:45:12 UTC
Post by young
Boost takes 1.00 ms while Win32 takes 0.15 ms for the same directory. Why?
In spite of the overhead of the character conversions mentioned by others,
my timings are not that far apart (number of files in dir = 2338, time in secs):

0-------
win32: 0.00175283 2338
boost: 0.00527004 2338
1-------
win32: 0.00159307 2338
boost: 0.00464673 2338
2-------
win32: 0.00151897 2338
boost: 0.0045553 2338
3-------
win32: 0.00154784 2338
boost: 0.00427396 2338
4-------
win32: 0.00152827 2338
boost: 0.00400867 2338
5-------
win32: 0.00155811 2338
boost: 0.00416361 2338
6-------
win32: 0.00175219 2338
boost: 0.00411325 2338
7-------
win32: 0.00153629 2338
boost: 0.00410458 2338
8-------
win32: 0.00153501 2338
boost: 0.00405454 2338
9-------
win32: 0.00155907 2338
boost: 0.0039984 2338

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net