Re: extract url with boost::regex
No, I hadn't seen that example. I was reading the documentation and examples for regex_iterator, not regex_token_iterator. I searched other websites but didn't find this example. I will study it.
Thanks, John, for always answering, and for answering so quickly. I was completely discouraged. Thanks again!
----- Original message ----
From: John Maddock
Hello,
I am trying to extract URLs from a web page. It's almost done, but completely unoptimised:
Before that I tried with a regex iterator, but I don't understand the documentation.
:-( Did you see this example: http://www.boost.org/libs/regex/example/snippets/regex_token_iterator_eg_2.c... It does exactly what you want - it extracts all the URLs from an HTML file.
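For anyone reading along, the token-iterator approach mentioned above can be sketched roughly like this. This is not the linked Boost example, just a minimal illustration under a few assumptions: std::regex is used in place of boost::regex so it builds without Boost (boost::regex and boost::sregex_token_iterator take the same arguments for these calls), and the function name extract_hrefs and the href pattern are made up for this sketch. The pattern only handles double-quoted href attributes and is not a real HTML parser.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the value of every href="..." attribute in `html`.
// std::regex stands in for boost::regex; the pattern is a simplification.
std::vector<std::string> extract_hrefs(const std::string& html) {
    // Sub-expression 1 is the text between the double quotes after href=.
    static const std::regex href_re(
        "href\\s*=\\s*\"([^\"]*)\"",
        std::regex::ECMAScript | std::regex::icase);

    std::vector<std::string> urls;
    // The last argument (1) tells the iterator to yield sub-expression 1
    // of each match instead of the whole match.
    std::sregex_token_iterator it(html.begin(), html.end(), href_re, 1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        urls.push_back(it->str());
    }
    return urls;
}
```

Calling extract_hrefs on the whole page text returns the URLs in document order, without the surrounding quotes, so no cutting afterwards should be needed.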
boost::regex rexp(".*(http:\\/\\/.+)\"*.*");
and I get this result:
http://www.nolife-tv.com/" http://www.nolife-tv.com"> http://www.nolife-tv.com/images/stories/noiz/1.jpg"
http://www.nolife-tv.com/component/option,com_poll/task,results/id,16/Itemid...';"
http://www.joomla.org" http://www.google-analytics.com/urchin.js" http://www.omniture.com
and so on...
I want to cut this down and get only the URL, without the " or '. Why does this regex capture the " along with it? I put the closing parenthesis before the ", so why? I already tried \\" instead of \".
Because the .* on the end of the expression will match whatever text follows the ". The grouping construct (...) spits out a *sub-expression*, which you can access via the match_results::operator[] or match_results::str(i) methods.
HTH, John.
[...]
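To make the sub-expression point concrete, here is a small sketch, again using std::regex as a stand-in for boost::regex (boost::smatch and boost::regex_search are used the same way). The helper first_capture and the second pattern are assumptions for illustration, not something from the thread. It also shows one likely reason the quote ends up inside the group: the .+ inside the parentheses is greedy, and \"* is allowed to match zero quotes, so .+ is free to swallow the quote and everything after it.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return sub-expression 1 of the first match of
// `pattern` in `line`, or an empty string if nothing matches.
std::string first_capture(const std::string& line, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_search(line, m, re)) {
        return m.str(1);  // same text as m[1].str()
    }
    return std::string();
}

// The pattern from the thread: the greedy .+ in the group runs to the end
// of the line, so the closing quote is captured along with the URL.
const char* const greedy_pattern = ".*(http:\\/\\/.+)\"*.*";

// Assumed alternative (not from the thread): stop the group at the first
// quote, whitespace, or angle bracket instead of using .+
const char* const bounded_pattern = "(http://[^\"'\\s<>]+)";
```

With a line such as <a href="http://www.joomla.org">, first_capture with greedy_pattern returns text that still contains the ", while bounded_pattern returns only http://www.joomla.org.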
participants (1)
-
hallouina-ml@yahoo.fr