Search and replace in text files with Boost.Xpressive
What is the best approach with Boost to take a file containing variables prefixed with a marker character ($) and replace them with numerical values? Example:

$ cat file.tmpl
============
var1 = $something
...
...
$coord1 $coord2 ... $coordN
...
...
============

$ cat file.dat
============
var1 = 2.35323
...
...
1.4 3.5 124 910
...
...
============

The Boost libraries I know for regular expressions are Boost.Regex and Boost.Xpressive. I prefer the second because it is header-only, supports both static and dynamic regular expressions, and uses its own namespace. In other words, it incorporates the features of Boost.Regex in a more elegant way.

As far as I know, Boost.Xpressive handles only std::string and C-style strings. What do I need to do to work with text files? Put them inside a std::string?

As for performance, the files have approximately 10^3 lines. Will the search and replace be an issue?

Regards,
Júlio.
On 4/2/2011 8:23 AM, Júlio Hoffimann wrote: <snip>
As far as I know, Boost.Xpressive handles only std::string and C-style strings.
Not so. Both Boost.Xpressive and Boost.Regex work on any pair of bidirectional iterators.
What do I need to do to work with text files? Put them inside a std::string?
That's one way. A more efficient way would be to memory-map the file, which I think you can do portably using Boost.Interprocess. That should give you a random-access iterator to the source file.
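As a minimal sketch of the first option (putting the file inside a std::string), the file can be read through stream-buffer iterators using only the standard library. The helper name read_file is hypothetical and not part of Boost; for files of roughly 10^3 lines this is likely to be fast enough.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost): read a whole text file into memory.
std::string read_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    // istreambuf_iterator copies raw characters and does not skip whitespace.
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The resulting string exposes bidirectional (in fact random-access) iterators, so its begin() and end() can be handed to regex_replace directly.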
As for performance, the files have approximately 10^3 lines. Will the search and replace be an issue?
You should use the version of regex_replace that takes an output iterator, and either write into a string or vector. They both grow their capacity exponentially (or should if your STL implementation is worth anything), so this is quite efficient. Or, with an ofstream iterator, you can write directly to a temporary output file. I think the latest version of Boost.Filesystem has a portable way to create temporary files. Then you can do whatever you want with the temp file.
HTH,
--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
Hi Eric,
Thank you for the perfect answer! Could you please point me to the relevant
class and function names in the libraries you mentioned?
To memory-map the file with Boost.Interprocess, which class do you have
in mind? I found basic_managed_mapped_file, but how can I use it with
the Boost.Xpressive regex_replace() function?
http://www.boost.org/doc/libs/1_46_1/doc/html/interprocess/managed_memory_se...
I looked at Boost.Filesystem the other day, and someone told me about the
unique_path() function. Can I use it even with concurrent tasks
(Boost.MPI)? Would including the rank of the process in the unique_path()
model prevent the generated files from being overwritten?
http://www.boost.org/doc/libs/1_44_0/libs/filesystem/v3/doc/reference.html#u...
Best regards,
Júlio.
Júlio Hoffimann wrote:
Hi Eric,
Thank you for the perfect answer! Could you please point me to the relevant class and function names in the libraries you mentioned?
To memory-map the file with Boost.Interprocess, which class do you have in mind? I found basic_managed_mapped_file, but how can I use it with the Boost.Xpressive regex_replace() function?
http://www.boost.org/doc/libs/1_46_1/doc/html/interprocess/managed_memory_se...
You need a boost::interprocess::file_mapping and a boost::interprocess::mapped_region. The example at your link above should be enough to get you going.
I've found boost::iostreams::mapped_file[_source/_sink] to be a little more straightforward, and it directly supports using boost::filesystem::paths. See:
http://www.boost.org/doc/libs/1_45_0/libs/iostreams/doc/classes/mapped_file....
Jeff
Hi Jeff,
Thank you, I'll check all the possibilities; Boost has many. :-)
Regards,
Júlio.
participants (3): Eric Niebler, Jeff Flinn, Júlio Hoffimann