Rush Manbert wrote:
I have written a class around the glob_iterator class that W. Richard Johnson put into the file vault last year. My class finds all the paths and/or files matching a multi-level wildcard pattern (i.e., one where the wildcard expressions may appear at more than one level, such as "/Users/*/MySource/??foo/*.cpp").
I need to detect loops caused by links between directories, and I also want to detect multiple paths to the same file and keep only one of them. The only method I have come up with so far is to keep the paths I have found in a collection and test each new path against every member by calling equivalent() once per member. Needless to say, this gets slow when the number of files or directories becomes large, since every new path requires a full linear scan. It could be faster if I only tested symlinks, but I think the existence of hard links forces me to test everything.
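For concreteness, a minimal sketch of that linear-scan check, assuming Boost.Filesystem's equivalent(); the class and member names here are made up for illustration:

#include <vector>
#include <boost/filesystem.hpp>

namespace fs = boost::filesystem;

// Keep every accepted path; a new candidate is kept only if it is not
// filesystem-equivalent to any path already stored.  Each insertion is a
// linear scan, so the whole run is quadratic in the number of matches.
class unique_paths
{
public:
    // Returns true if 'p' was new and has been kept, false if it is
    // equivalent to (refers to the same file as) a path we already have.
    bool add(fs::path const & p)
    {
        for (std::vector<fs::path>::const_iterator it = found_.begin();
             it != found_.end(); ++it)
        {
            if (fs::equivalent(*it, p))   // same underlying file or directory?
                return false;
        }
        found_.push_back(p);
        return true;
    }

private:
    std::vector<fs::path> found_;
};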
My first question: Can anyone suggest a better way to detect the duplicates?
Maybe you could use the graph library to build a graph of your directory structure and then work with that to find duplicate paths, loops, etc. I know next to nothing about the library, but this sounds like a graph-theoretic problem to me.

Paul
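For what it's worth, a rough sketch of that idea with the Boost Graph Library might look like the following. Directories become vertices, "reachable from" relations (subdirectories or symlink targets) become edges, and a back edge found during a depth-first search signals a loop. The directory structure here is invented, and this only covers the loop-detection half of the problem:

#include <iostream>
#include <string>
#include <map>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/depth_first_search.hpp>

typedef boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS> DirGraph;
typedef boost::graph_traits<DirGraph>::vertex_descriptor Vertex;
typedef boost::graph_traits<DirGraph>::edge_descriptor Edge;

// DFS visitor: a back edge means we reached a directory that is still on
// the current DFS path, i.e. the directory structure contains a loop.
struct loop_detector : public boost::dfs_visitor<>
{
    explicit loop_detector(bool & found) : found_(found) {}
    void back_edge(Edge, DirGraph const &) { found_ = true; }
    bool & found_;
};

int main()
{
    DirGraph g;
    std::map<std::string, Vertex> index;   // one vertex per resolved path

    // Hypothetical structure: /a contains /a/b, and /a/b holds a symlink
    // back to /a, forming a loop.
    const char * dirs[] = { "/a", "/a/b" };
    for (int i = 0; i < 2; ++i)
        index[dirs[i]] = boost::add_vertex(g);

    boost::add_edge(index["/a"], index["/a/b"], g);   // subdirectory
    boost::add_edge(index["/a/b"], index["/a"], g);   // symlink back to /a

    bool loop = false;
    boost::depth_first_search(g, boost::visitor(loop_detector(loop)));
    std::cout << (loop ? "loop detected" : "no loops") << std::endl;
    return 0;
}

Populating the graph from a real directory tree (and keying the vertices on resolved paths so hard links and symlinks map to the same vertex) is the part you would still have to work out.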