Grabbing web page content
I'm looking for an easy way to grab the contents of a web page, i.e., something like this (ignoring error handling):
std::string pageContent = getPageContent("http://www.interestingsite.com/funkypage.html");
Is there a Boost library that will give me this kind of interface? I know I can build this on top of asio, but I'd prefer something where I don't have to do much more work than the above. If there's no Boost library, I'd welcome suggestions for other cross-platform ways to achieve this.

Thanks,
Scott
Most Un*x systems have libcurl, which will do this. I don't know about Windows, but this looks promising: http://sourceforge.net/projects/curl/

--
-- Marshall
Marshall Clow, Idio Software, mailto:marshall@idio.com

It is by caffeine alone I set my mind in motion.
It is by the beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.
on Fri Jul 06 2007, Scott Meyers wrote:
> I'm looking for an easy way to grab the contents of a web page, i.e., something like this (ignoring error handling):
> std::string pageContent = getPageContent("http://www.interestingsite.com/funkypage.html");
> Is there a Boost library that will give me this kind of interface? I know I can build this on top of asio, but I'd prefer something where I don't have to do much more work than the above. If there's no Boost library, I'd welcome suggestions for other cross-platform ways to achieve this.
Python is your friend.

import urllib
f = urllib.urlopen("http://www.interestingsite.com/funkypage.html")
pageContent = f.read()

Yes, we need C++ versions of all these things.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
The Astoria Seminar ==> http://www.astoriaseminar.com
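The snippet above uses the Python 2 spelling; in Python 3 the same call lives in `urllib.request`. A minimal sketch, assuming Python 3 and assuming UTF-8 as the fallback when the server declares no charset. The name `get_page_content` is hypothetical and only mirrors the interface asked for in the question, and error handling is left out, as in the question:

```python
import urllib.request


def get_page_content(url: str, timeout: float = 10.0) -> str:
    """Return the body of `url` as text (hypothetical helper, no error handling)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Prefer the charset declared in the Content-Type header;
        # falling back to UTF-8 is an assumption, not something the server promised.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")
```

Usage would then be a one-liner, e.g. `page_content = get_page_content("http://www.interestingsite.com/funkypage.html")`. A real caller would probably want to catch `urllib.error.URLError` as well.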
How cross-platform do you need to be? The .NET streams will read from a URL. That could give you Windows and Unix (via Mono).

- R

-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Scott Meyers
Sent: 07 July 2007 00:55
To: boost-users@lists.boost.org
Subject: [Boost-users] Grabbing web page content

I'm looking for an easy way to grab the contents of a web page, i.e., something like this (ignoring error handling):
std::string pageContent = getPageContent("http://www.interestingsite.com/funkypage.html");
Is there a Boost library that will give me this kind of interface? I know I can build this on top of asio, but I'd prefer something where I don't have to do much more work than the above. If there's no Boost library, I'd welcome suggestions for other cross-platform ways to achieve this.

Thanks,
Scott
Scott Meyers wrote:
> I'm looking for an easy way to grab the contents of a web page, i.e., something like this (ignoring error handling):
> std::string pageContent = getPageContent("http://www.interestingsite.com/funkypage.html");
> Is there a Boost library that will give me this kind of interface? I know I can build this on top of asio, but I'd prefer something where I don't have to do much more work than the above. If there's no Boost library, I'd welcome suggestions for other cross-platform ways to achieve this.
Why don't you take this example from asio:
http://asio.sourceforge.net/boost_asio_0_3_7/libs/asio/doc/examples/httpclie...
and factor the guts into a function, say getPageContent, just to make something up ;-)

Jeff
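Whatever the socket library, the guts of a basic synchronous HTTP client tend to be the same few steps: connect, send a GET request, read until the server closes the connection, and drop the headers. The sketch below shows that shape in Python with plain sockets. It is not the linked asio example, and all function names are hypothetical. It assumes plain HTTP on port 80 and does no redirect handling, TLS, or status checking:

```python
import socket
from typing import Tuple


def build_request(host: str, path: str) -> bytes:
    """Compose a minimal HTTP/1.0 GET request that asks the server to close when done."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "Accept: */*",
        "Connection: close",
        "",  # the two empty items produce the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes) -> Tuple[bytes, bytes]:
    """Split a raw HTTP response into (header block, body) at the first blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


def fetch_page_body(host: str, path: str = "/", port: int = 80) -> bytes:
    """Fetch http://host:port/path and return the body bytes (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an empty read means the server closed the connection
                break
            chunks.append(chunk)
    _, body = split_response(b"".join(chunks))
    return body
```

Keeping request building and response splitting as separate pure functions means only `fetch_page_body` touches the network, which makes the rest easy to check in isolation.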
participants (5)
- David Abrahams
- Jeff Garland
- Marshall Clow
- Richard
- Scott Meyers