Dne 18.4.2013 05:59, Michael Marcin napsal(a):
On 4/17/2013 4:54 PM, Jan Strnad wrote:
Dne 17.4.2013 20:00, Michael Marcin napsal(a):
On 4/17/2013 12:40 PM, Jan Strnad wrote:
Finally, I would implement functions template <typename TRange> find_iterator<TRange> approximate_find (TRange text, TRange pattern, approximate_distance<TRange>& distance);
template <typename TRange> find_iteratorstd::size_t approximate_find_index (TRange text, TRange pattern, approximate_distance<TRange>& distance);
These would return all approximate matches of pattern int text. The first one return the approximate string itself, the later one returns its index.
These functions are for finding substrings in the text?
This seems to me like how the regex_search interface works. Is it possible to make the interface similar?
Something like
string pattern = "acbdef"; string text = "aacdefbcdefabccef"; boost::amatch m; boost::approx e( hamming_distance, pattern, 1 ); while( boost::approx_search( text, m, e ) { cout << m.str() << endl; text = m.suffix().str(); } // prints aacdef, fbcdef abccef
Although I could be seeing similarities where they don't exist.
Well, the problem with this interface is that may not find all occurrences of pattern. Consider text="ababa" and pattern "aba". There are clearly two matches, the first starts at index 0 and the second at index 2. Using the interface above, I only get the one at index 0.
The point is that I can not just drop all matched characters. I may still need them.
So, I think the interface could look like this: string pattern = "acbdef"; string text = "aacdefbcdefabccef"; boost::amatches m; boost::approx e( hamming_distance, pattern, 1 ); boost::approx_search(text, m, e); while(m.next()) { cout << m.str() << endl; } // prints aacdef, fbcdef abccef
Therefore, boost:approx_search should return an interator-like structure.
I see.
In that case try to use an iterator interface that plays more nicely with the STL. Maybe something closer to regex_iterator. Specifically look at the ForwardIterator concept.
Maybe something like:
string pattern = "acbdef"; string text = "aacdefbcdefabccef"; boost::approx_pattern ap( hamming_distance, pattern, 1 ); boost::sapprox_match_iterator begin( text, ap ); boost::sapprox_match_iterator end; std::for_each( begin, end, []( const boost::sapprox_match_result& what ) { cout << what.str() << " starting at " << what.position() << endl; } );
This seems as the best choice to me right now.